Top 3 Most Popular Posts of 2020

So, I realize it’s now almost April of 2021 when I’m posting this. Normally this sort of post would come out in January for the previous year. However, I have a reason for this being late and I’ll actually post about that separately.

Showing My Blog Some Love

In mid-2020 I decided to give my blog more attention. there were a number of reasons for this. First, I’d already spent a fair amount of money over the years on hosting and related things so I didn’t want to waste that. Second, I really wanted to try to start giving back to the SQL Server community. I have been the recipient of a lot of learning and help through a number of regular, and occasional, bloggers. I have also purchased and mostly read my fair share of SQL Server books, which was another way I had been helped by the community. Third, I wanted a place to record my own troubleshooting and learning so when I needed to do something again, I would have a record of what to do. By the way, I have seen this repeatedly listed as a reason that someone blogs.

Blog Posts by Month

So, here is the break down of the number of posts by month for 2020:

  • April – 1
  • July – 1 This was the last week of the month and marks my decision to start blogging more consistently.
  • Aug – 3
  • Sept – 4
  • Oct – 4
  • Nov – 7 – I had built a good backlog of posts that were ready to publish and nearly reached my goal of posting twice a week for the entire month
  • Dec – 4

To get the list of the most popular blog posts I looked at Google Analytics PageView data for Jan 1, 2020 – Dec 31, 2020.

Popular Post # 1

The most viewed page based on this data had 354 views and was part 1 of a series on PowerShell for the DBA. The series was meant to show some “getting started” type techniques so that someone could open PowerShell and start exploring what the tool could do for them as a DBA. I explored how to find commands that might be of interest and how to use the help system in PowerShell to figure out how to use the command. I also provided some practical examples of PowerShell to use in every day scenarios.

https://leemarkum.com/archive/2020/11/introduction-to-powershell-for-the-dba-part-1/ 

Popular Post #2

The second most viewed page had 332 views and covered 5 different ways to make a SQL Server backup. Backups are so important, especially in the world of data. They give you and your business a way to recover data that has been lost or damaged in some way. As a Database Administrator, you need to be able to recover and that all starts with taking backups. That seems obvious but I still read stories of businesses that aren’t even backing up their databases. If you do have backups, you also need also to test whether you can use those backups to actually restore data.

https://leemarkum.com/archive/2020/09/5-ways-to-make-sql-server-backups/ 

Popular Post #3

The third most viewed post had 273 views and was about using a new feature in SQL Server 2019. This feature is based on the polybase technology that arrived in SQL Server 2016. However, the feature has now been extended to allow SQL Server to have external tables to ole db and odbc sources, such as SQL Server, MS Access, and, yes, even things like Oracle and IBM iSeries/AS400. The external, or virtual table, uses mostly built-in drivers. With this feature yo can run SELECT statements against remote data sources without a linked server. The environment I’m in has an AS400/IBM iSeries so I wanted to see if I could get the feature to work with that data source.

https://leemarkum.com/archive/2020/04/querying-an-as400-using-sql-server-2019-data-virtualization/

There you have it. I hope you enjoyed these posts in 2020, and if you haven’t seen them, give them a click and read through the information. I hope you learn something from the posts and if you have questions or comments reach out to me here, or on Twitter, and I’ll be happy to talk to you.

Querying An AS400 Using SQL Server 2019 Data Virtualization

 

For those who have taken an interest in the latest SQL Server release, SQL Server 2019, you may have noticed a new feature called Data Virtualization. Perhaps you’ve seen a presentation about it, or maybe you’ve read about it in Bob Ward’s book entitled “SQL Server 2019 Revealed.”

I myself was curious about this feature and so I set out to see if I could use it to connect to an AS400 iSeries machine. Tom Willwerth, my friend at StraightPathSQL, and I worked on this process together. This post is primarily about walking you through the steps we recently took to get this configured.

SQL Server Polybase and SQL Server 2019 Data Virtualization

Before I dive into that, why was I interested in this feature? What did I hope to gain? Well, first of all, there was definitely the motivation of wondering, “Can we get this to work?” Secondly, and more practically, the promise of SQL Server 2019 Data Virtualization is to make other data sources available without using a Linked Server and without the time it takes to develop an ETL process to move the data. On a related note, you can cut out the time it takes for an ETL job to actually move the data somewhere like a data warehouse or flattened tables for reporting. Third, the Polybase feature has a built in engine that can provide query performance not available via Linked Servers. Fourth, I wanted to provide a way for developers to query data in the AS400 without having to learn the different syntax required by the AS400 iSeries. Fifth, query writers can also join the external able to local SQL Server data.

How To Set Up SQL Server 2019 Data Virtualiztion

Now let’s move forward with the how of this process. I started by installing SQL Server 2019, and in that process there are some steps to select for installing the Polybase feature. If you’re not familiar with that, it came out in SQL Server 2016 and is the foundation of data virtualization. During the install process, SQL Server suggested a port range that it would use as part of configuring Polybase. I wasn’t sure if SQL Server knew that the suggested port range was open or not. As a result, I actually stopped to download the SysInternals tool called TCPView and run it on the server to confirm that the suggested port range was open.

After the SQL Server install completed, I ran the code below in SSMS to enable the Polybase feature.

exec sp_configure 'Show Advanced Options', 1
reconfigure
EXEC sp_configure 'polybase enabled', 1
reconfigure
EXEC sp_configure 'polybase enabled'

 

The next thing I did was create a database master key. This has to exist in order to encrypt the credential that will be stored in SQL Server later on in the process.

CREATE MASTER KEY WITH ENCRYPTION BY PASSWORD = 'S0meStr0ngP@55w0rd1234xYz!'

 

I then set my test database to compatibility level 150 for SQL Server 2019 and made my database scoped credential, like so.

ALTER DATABASE MyTestSQLDatabase SET COMPATIBILITY_LEVEL = 150

CREATE DATABASE SCOPED CREDENTIAL MySQLCred WITH IDENTITY = 'MYAS400CRED', SECRET = 'S0m3Str0ngP@55w0rd';

The value for IDENTITY corresponds to a user in the source database, in this case, the AS400. SECRET, of course, is the password for the user name stored in IDENTITY.

I believe after this Tom joined the process and we began together by installing version 1.8 of the 64 bit iSeries Client Access driver for Windows onto the SQL Server instance I was testing from. Going into the details of that are a bit beyond the scope of this post so I won’t describe the process here.

Creating an external data source was the next part of the journey. Syntax for that is here.

After a lot of wrangling, searching, and looking here for an AS400 connection string example, Tom and I arrived at this code sample that we ran successfully. Obviously I’ve changed some details to anonymize this, but you get the idea.

CREATE EXTERNAL DATA SOURCE MYAS400DEV WITH (LOCATION = 'odbc://MYAS400DEV.LeeMarkum.com', CONNECTION_OPTIONS = 'Driver={iSeries Access ODBC Driver}; System = MYAS400DEV'; ServerNode = MYAS400DEV.LeeMarkum.com', PUSHDOWN = ON, CREDENTIAL = MySQLCred);

Next up we took the optional step of creating a schema called EXT for holding our external tables. Afterward, we moved on to the business of actually making our first external table. This was very much a multi-step, trial and error process.

Here is the process we eventually found that worked:

  • Get SQL definition from AS400 via the iSystem Navigator, minus trigger definitions, and paste into SSMS
  • Remove any index or key definitions. Without this change SQL Server throws an error
  • Remove any defined defaults. Without this change SQL Server throws an error
  • Change the CHAR data types to NCHAR. Without this change SQL Server throws an error
  • Replace DB2 schema name with EXT, our previously created SQL Server schema.
  • Change CREATE TABLE  to CREATE EXTERNAL TABLE
  • Add this at the bottom: WITH ( LOCATION=’MYAS400DEV.SomeSchemaName.DB2TABLENAMEHERE’, DATA_SOURCE= MYDEVAS400)

This left us with code like the below:

CREATE EXTERNAL TABLE EXT.Customer 
(CUST# NUMERIC(7,0)NOT NULL,
CUSTNAME NCHAR(30) NOT NULL,
CUSTADDRS NCHAR(50) NOT NULL,
CUSTCITY NCHAR(30) NOT NULL,
CUSTSTATE NCHAR(2) NOT NULL,
CUSTZIP NCHAR(2) NOT NULL
)
WITH ( LOCATION='MYAS400DEV.SomeSchemaName.DB2TABLENAMEHERE', DATA_SOURCE= MYDEVAS400)

As part of this process, we had to keep the same column names as what existed in the AS400. Otherwise, an error was thrown, and this makes sense given that we’re looking at the AS400 table in the WITH (LOCATION = ”, data_source = ) part of the syntax.

After this step there is the optional step of creating statistics on columns of the virtual table. You might want to do this, for instance, on columns that might be involved in joins as a performance enhancement. For the purposes of this work, I skipped this step for the time being because Tom and I just wanted to see some query results from all this work! We could return later to this item to try to tweak performance.

The moment of truth had finally arrived. We loaded up a query against our EXT.Customer table in SQL Server 2019 and, to our delight, we received the expected result set back using standard T-SQL.

I’m not gonna lie, this was a fair amount of work and took some time to set up. However, I’m hoping this blog post, and others like it, will save you some pain and time.

Next Steps To Take

  1. Get a copy of SQL Server 2019 Developer Edition and install it and the Polybase feature. Then enable Polybase.
  2. Identify a data source in your environment that you would like to use data virtualization to query. It could be Oracle, Microsoft Access, PostGres, MySQL or any number of things.
  3. Look through the available documentation on the connection strings for the data source you’ve chosen and figure out what the connection string should look like.
  4. Look through the rest of the walk-through in this article and give the feature a try.
  5. If you would like help with anything in this post, or with something else related to SQL Server, reach out to me here, or on Twitter, and I’ll be glad to offer assistance.