Querying An AS400 Using SQL Server 2019 Data Virtualization

For those who have taken an interest in the latest SQL Server release, SQL Server 2019, you may have noticed a new feature called Data Virtualization. Perhaps you’ve seen a presentation about it, or maybe you’ve read about it in Bob Ward’s book entitled “SQL Server 2019 Revealed.”

I myself was curious about this feature and so I set out to see if I could use it to connect to an AS400 iSeries machine. Tom Willwerth, my friend at StraightPathSQL, and I worked on this process together. This post is primarily about walking you through the steps we recently took to get this configured.

Before I dive into that, why was I interested in this feature? What did I hope to gain? Well, first of all, there was definitely the motivation of wondering, “Can we get this to work?” Secondly, and more practically, the promise of SQL Server Data Virtualization is to make other data sources available without using a Linked Server and without the time it takes to develop an ETL process to move the data. On a related note, you can cut out the time it takes for an ETL job to actually move the data somewhere like a data warehouse or flattened tables for reporting. Third, the Polybase feature has a built in engine that can provide query performance not available via Linked Server. Fourth, I wanted to provide a way for developers to query data in the AS400 without having to learn the different syntax required by the AS400 iSeries. Fifth, query writers can also join the external able to local SQL Server data.

Now let’s move forward with the how of this process. I started by installing SQL Server 2019, and in that process there are some steps to select for installing the Polybase feature. If you’re not familiar with that, it came out in SQL Server 2016 and is the foundation of data virtualization. During the install process, SQL Server suggested a port range that it would use as part of configuring Polybase. I wasn’t sure if SQL Server knew that the suggested port range was open or not. As a result, I actually stopped to download the SysInternals tool called TCPView and run it on the server to confirm that the suggested port range was open.

After the SQL Server install completed, I ran the code below in SSMS to enable the Polybase feature.

exec sp_configure 'Show Advanced Options', 1
reconfigure
EXEC sp_configure 'polybase enabled', 1
reconfigure
EXEC sp_configure 'polybase enabled'

The next thing I did was create a database master key. This has to exist in order to encrypt the credential that will be stored in SQL Server later on in the process.

CREATE MASTER KEY WITH ENCRYPTION BY PASSWORD = 'S0meStr0ngP@55w0rd1234xYz!'

I then set my test database to compatibility level 150 for SQL Server 2019 and made my database scoped credential, like so.

ALTER DATABASE MyTestSQLDatabase SET COMPATIBILITY_LEVEL = 150

CREATE DATABASE SCOPED CREDENTIAL MySQLCred WITH IDENTITY = 'MYAS400CRED', SECRET = 'S0m3Str0ngP@55w0rd';

The value for IDENTITY corresponds to a user in the source database, in this case, the AS400. SECRET, of course, is the password for the user name stored in IDENTITY.

I believe after this Tom joined the process and we began together by installing version 1.8 of the 64 bit iSeries Client Access driver for Windows onto the SQL Server instance I was testing from. Going into the details of that are a bit beyond the scope of this post so I won’t describe the process here.

Creating an external data source was the next part of the journey. Syntax for that is here.

After a lot of wrangling, searching, and looking here for an AS400 connection string example, Tom and I arrived at this code sample that we ran successfully. Obviously I’ve changed some details to anonymize this, but you get the idea.

CREATE EXTERNAL DATA SOURCE MYAS400DEV WITH (LOCATION = 'odbc://MYAS400DEV.LeeMarkum.com', CONNECTION_OPTIONS = 'Driver={iSeries Access ODBC Driver}; System = MYAS400DEV'; ServerNode = MYAS400DEV.LeeMarkum.com', PUSHDOWN = ON, CREDENTIAL = MySQLCred);

Next up we took the optional step of creating a schema called EXT for holding our external tables. Afterward, we moved on to the business of actually making our first external table. This was very much a multi-step, trial and error process.

Here is the process we eventually found that worked:

  • Get SQL definition from AS400 via the iSystem Navigator, minus trigger definitions, and paste into SSMS
  • Remove any index or key definitions. Without this change SQL Server throws an error
  • Remove any defined defaults. Without this change SQL Server throws an error
  • Change the CHAR data types to NCHAR. Without this change SQL Server throws an error
  • Replace DB2 schema name with EXT, our previously created SQL Server schema.
  • Change CREATE TABLE  to CREATE EXTERNAL TABLE
  • Add this at the bottom: WITH ( LOCATION=’MYAS400DEV.SomeSchemaName.DB2TABLENAMEHERE’, DATA_SOURCE= MYDEVAS400)

This left us with code like the below:

CREATE EXTERNAL TABLE EXT.Customer 
(CUST# NUMERIC(7,0)NOT NULL,
CUSTNAME NCHAR(30) NOT NULL,
CUSTADDRS NCHAR(50) NOT NULL,
CUSTCITY NCHAR(30) NOT NULL,
CUSTSTATE NCHAR(2) NOT NULL,
CUSTZIP NCHAR(2) NOT NULL
)
WITH ( LOCATION='MYAS400DEV.SomeSchemaName.DB2TABLENAMEHERE', DATA_SOURCE= MYDEVAS400)

As part of this process, we had to keep the same column names as what existed in the AS400. Otherwise, an error was thrown, and this makes sense given that we’re looking at the AS400 table in the WITH (LOCATION = ”, data_source = ) part of the syntax.

After this step there is the optional step of creating statistics on columns of the virtual table. You might want to do this, for instance, on columns that might be involved in joins as a performance enhancement. For the purposes of this work, I skipped this step for the time being because Tom and I just wanted to see some query results from all this work! We could return later to this item to try to tweak performance.

The moment of truth had finally arrived. We loaded up a query against our EXT.Customer table in SQL Server 2019 and, to our delight, we received the expected result set back using standard T-SQL.

I’m not gonna lie, this was a fair amount of work and took some time to set up. However, I’m hoping this blog post, and others like it, will save you some pain and time.

One Reply to “Querying An AS400 Using SQL Server 2019 Data Virtualization”

Leave a Reply

Your email address will not be published. Required fields are marked *