ADO Best Practices
For Development of iSeries
n-tier Applications

By Howard F. Arner, Jr.
for Client Server Development, Inc.

Introduction

ADO and ADO.Net are technologies for developing distributed applications that reside on Microsoft platforms and communicate with the iSeries. The purpose of this document is to outline the best policies for developing with ADO, when ADO is not appropriate and how to use a development methodology with the iSeries that ensures the data access layer of your distributed application is scalable and reliable. This document reveals when a programmer should use the Client Access OLEDB provider, when to use the Client Access ODBC provider via MSDASQL and when to use direct programming to the ODBC interface. The intended audience for this paper is a programmer familiar with ADO, VBScript or Visual Basic and ADO.Net.

Which Access Method

At the time of this writing, there is not a managed provider available for accessing the iSeries via ADO.Net. However, there is an OLEDB provider for accessing the iSeries via ADO or OLEDB. This provider allows you to access specific capabilities of the iSeries, some of which are not readily available via an ODBC provider. These are:

o Access to data queues

o Record level access to physical and logical files

o Access to iSeries API code

o Access to CL Commands

o Running SQL statements

There is no support in the ODBC provider for record level access to physical and logical files. If this is important to the section of code you are working on, it would be appropriate to use the OLEDB provider. If you need to access data queues, you can do this with the ODBC provider as long as you have created stored procedures for manipulating data queues. In addition, you can access and use iSeries API’s via the OLEDB provider, whereas via ODBC you would need to call a native program as a stored procedure. For these types of accesses, the OLEDB provider gives good performance and integrity.

However, the OLEDB provider lacks good support for manipulating the capabilities of DB2 when using SQL. In tests performed by Client Server Development, we have found that using the ODBC provider via the MSDASQL provider provides a 45% improvement in performance when fetching more than one record. In addition, the OLEDB provider does not support the packaging of SQL statements and instead relies on the system and job level SQL statement cache. Because most SQL statements are against legacy tables with a large number of physical files, we feel the packaging of SQL statements is of paramount importance in producing a scalable and reliable n-tier system. Finally, the OLEDB provider only supports the SQL naming convention for execution of SQL statements against the iSeries. Sometimes, a program needs the *SYS naming convention which is supported by ODBC in order to avoid effects programmed into the *SQL naming convention for objects. It is for these reasons that we recommend all SQL access to the iSeries be accomplished via the MSDASQL provider against the Client Access ODBC driver.

If programming in the .Net framework, there is an ODBC provider available in version 1.1, system.data.odbc, which allows managed code access to ODBC drivers. In our tests, this provider has performed quite well when connecting to the iSeries and we feel that is should be considered if you need managed code access to the iSeries.

However, one area where we recommend programming directly to the ODBC API is when you must write more than one record to a table. ODBC supports the binding of arrays of records to parameter markers in SQL statements. This feature can significantly improve the performance of database writes by decreasing the number round trips between the client and server. In a recent test on a client connected to the iSeries via a 256KB Internet connection we were able to write 14 records per second using ADO/ODBC. On the same connection using direct calls to ODBC and binding arrays of records we were able to achieve writes of 400 records per second. This is a significant performance difference.

Developing for Performance (Methodology)

The following is the recommended methodology for developing distributed applications with the iSeries.

Data centric design

The programmer should begin by designing and coding the SQL statement in a tool like SQLThing Enterprise Edition, before any client code is written. The statement should be tested against the iSeries as the statement will be used in the component, (i.e. using parameter markers rather than hard coded values in the where and having clauses). When testing the developed statement, the programmer should place the iSeries server job into debug mode so that they can view optimization messages in the job log. Care should be taken to look for messages that can negatively impact performance.

The following metrics should be obtained against statements:

o Prepare time

o Execution time

o Records per second

o Overall execution time

If there are no contraindications in the job log and times appear adequate to support the suspected load, the programmer should copy the statement directly from the testing tool and place it into his component. At this point, the programmer knows that if there is a performance problem with the fielded application, it is not in the data access layer as a quality assurance process has vetted the statements.

The programmer should have the following data at hand during this process

o Index selectivity reports for each logical file associated with the physical files

o A report describing all physical files and columns that are in the target tables

The programmer can use the job log output to see which indexes, (logical files), are being used in the resolution of the query and why other logical files are not being accessed. This information can be valuable to spotting inconsistent data values or comparisons between files and other actions that can negatively impact performance.

Client Programming Best Practices

The following information recounts best programming practices for achieving optimal performance when accessing iSeries data.

Use of Prepared Statements

When programming in ADO, many programmers use a technique called SQL mashing to put a select statement together. This involves taking a stub of the SQL statement and concatenating together the required where clause using string concatenation techniques and then opening the recordset object using the concatenated statement. This causes the iSeries to have to optimize the SQL statement when it is received, which is a very bad thing when accessing legacy files. Unlike most relational databases, iSeries databases support record level access techniques and direct reading of indexes. Whereas a typical relational database might have two or three indexes over a table, the iSeries might have seventy or more indexes over a physical file in order to support various programs that need to access the data in different sequences. Because the query optimizer attempts to evaluate each index available, the optimization process can take an inordinate amount of time and becomes very expensive. Therefore, use of dynamic SQL should be avoided.

The alternative to dynamic SQL is to use static SQL. This is similar to the creating a stored procedure in a product like SQL server. When the iSeries first sees a static statement, it will completely optimize the statement to find the best access plan and then store a copy of the execution plan in an SQL Package file. When you attempt to use the statement again, the iSeries will first search the package file for the occurrence of the statement, and if found it will forgo the optimization process and use the stored execution plan. This is very efficient and saves both CPU and disk IO access on the iSeries.

In order to use the packaging facility, you must use a JDBC or ODBC driver to access the iSeries as packaged statements are not currently supported by the OLEDB or DRDA access methods. In addition, when programming in ADO the programmer should make use of parameter markers in the statements as the iSeries will only package statements that have no hard-coded values in the where or having clauses. Here is an example of a dynamic SQL statement:

SELECT * FROM PARTS WHERE PART_CLASS = ‘Filters’

Here is an example of a static SQL statement

SELECT * FROM PARTS WHERE PART_CLASS = ?

In order to use static SQL, the programmer should use the ADO Command object. Here is a code snippet that will execute the above statement.

Dim Con1 as New ADODB.Command

Dim Cmd1 as New ADODB.Connection

Dim Rs as New ADODB.Recordset

Con1.Open “DSN=MYAS400;UID=HOWIE;PWD=SECRET;”

Cmd1.ActiveConnection = Con1

Cmd1.CommandText = “SELECT * FROM PARTS WHERE PART_CLASS = ?”

Cmd1.Parameters.Refresh

Cmd1.Parameters(0).Value = “Filters”

Rs.CacheSize = 200

Rs.Open Cmd1

While Not Rs.Eof()

…

Rs.MoveNext

Wend

Rs.Close

Con1.Close

The above is close to best practice code, as it uses parameter markers and adjusts the cachesize of the recordset before opening the recordset to anticipate the number of records being returned by the client. Setting the cachesize allows the iSeries to send up to 200 records per server fetch, thus reducing round trips between client and server and increasing throughput. The DSN used in this connection has packaging turned on, which will cause the iSeries to store the execution plan of the command so that subsequent executions of the statement need not be optimized.

To make this code into best practice code, the programmed should use a connection string that does not incorporate a DSN. DSNless connection string avoid a hit against the Windows registry and allow for greater control over connection options.

Using DSNLess Connection Strings

The following information relates how to create a connection string and use the connection string with an ADO or ODBC.Net provider.

A DSN, Data Source Name, is defined via the ODBC Driver manager in the Windows Control Panel. This causes an entry to be made into the system registry recording all of the connection information that you enter into the Control Panel. The problem with using a DSN is that every time you connect to the AS/400, the target computer must look up the connection information in the Windows registry. This is not a factor if you are casually connecting from applications like Excel, Access or SQLThing, but if you are doing Active Server Pages these registry lookups do add a small amount of time to the connection process. Also, using a DSN-less connection can give you finer control of connection attributes; in fact you can control your connection attributes on a page-by-page basis. In addition, by not requiring a DSN you make you application or web pages easier to distribute to target computers, i.e. you do not have to go to the target computer and use the control panel to create a DSN.

There are a host of connectivity options that you can use when connecting to the AS/400, and these can lead your connection string to become quite complex. Instead of typing all of the options in, you can let Windows create a connection string for you by creating a Universal Data Link (UDL) file. A UDL file is a file persisted set of information on how to connect to a data source. In this case, we will be using the UDL file as a quick way to build an ODBC or OLEDB connection string. To start, open a Command Prompt window, switch to a directory you wish to use and type COPY CON A.UDL and press Enter. Next, press the F6 key to send a ^Z command to the console. You should see the text 1 file copied and then be returned to a command prompt. You have just created a 0 length file called A.UDL in your target directory.

Next, pull up the Windows Explorer and right click on the A.UDL file and select the Properties menu item. This will bring up an interactive window where you can set the provider and all connection options for that provider. Once you have your settings correct, press Apply then OK to save the UDL information. Next, open the UDL file with notepad, or some other editor, and Volla, you have a preformed almost-perfect connection string. The only problem with the string is that it still points to an ODBC DSN. To make the string perfect, remove the DSN information from the extended properties and replace the quote (“) characters with double quotes (“”) so that the string will not be confused in your ASP page. Finally, after the Provider=MSDASQL.1; place the following information: Driver=Client Access ODBC Driver (32-bit); This string lets the MSDASQL OLEDB provider know to use the Client Access ODBC driver when it attempts to connect. Here is a code snippet that shows a proper connection string with lots of options:

ConStr = "Provider=MSDASQL.1;Driver=”
ConStr = ConStr & “Client Access ODBC Driver (32-bit);Password=SECRET;”
ConStr = ConStr & “Persist Security Info=True;User ID=HFA;Extended”
ConStr = ConStr & “Properties=""SYSTEM=172.16.2.200;CMT=0;”
ConStr = ConStr & “DBQ=WEBOE,HD1100PD,HD1100PO;NAM=0;”
ConStr = ConStr & “DFT=5;DSP=1;TFT=0;TSP=0;DEC=0;XDYNAMIC=1;”
ConStr = ConStr & “RECBLOCK=2;BLOCKSIZE=32;SCROLLABLE=0;”
ConStr = ConStr & “TRANSLATE=1;LAZYCLOSE=1;LIBVIEW=0;
ConStr = ConStr & “REMARKS=1;CONNTYPE=0;SORTTYPE=0;
ConStr = ConStr & “PREFETCH=0;DFTPKGLIB=QGPL;LANGUAGEID=ENU;”
ConStr = ConStr & “SORTWEIGHT=0;SSL=2;MAXFIELDLEN=32;”
ConStr = ConStr & “COMPRESSION=1;ALLOWUNSCHAR=0;”
ConStr = ConStr & “SEARCHPATTERN=0;MGDSN=0;"""

Set Con1 = Server.CreateObject(“ADODB.Connection”)
Con1.Open ConStr

Note that the connection string keyword parameters randomly change for no apparent reason between versions of Client Access, why is left as a mystery to be contemplated by minds superior to ours. Because these connection keywords change, it is recommended that you use the above method to create the desired DSNless connection string. This gives you complete access to the GUI interface IBM providers for setting up ODBC data sources, without having to hit the registry to retrieve the information.

Bulk Insertion of Records

When writing record to the iSeries the best practice is to use a command object and parameter markers so that the insert statement is optimized and packaged once. However, in some cases you are not writing one or two records during a transaction, but many records. In these cases, it is desirable to use bulk insert techniques in order to avoid multiple round trips between client and server. Since the MSDASQL provider does not support this technique, you must program directly to the ODBC API to take advantage of bulk inserts.

Client Server Development provides a COM object that eases the use of ODBC for performing bulk insert operations. This COM object is written in VB6 and has a small companion DLL written in C++ for memory management. The object has been commercially available since 1996 and is currently in use by over 3,000 organizations to abstract the VB and COM programmer from the intricacies of ODBC API manipulation.

The COM object contains three objects representing the ODBC Environment handle, the ODBC Connection handle and the ODBC Statement handle. Each object exposes methods and properties that allow the programmer to control how ODBC behaves and invoke ODBC API functions. To install the object, the programmer copies the two DLL files to the Windows/System32 folder and then uses the REGSVR32 program to register the ODBC400 DLL. Once installed, the COM object appears as a reference-able object in the list of references in VB and other COM programming environments. The following is a set of code used to bind multiple records to an insert statement using the COM object. This program opens a dBase file and reads records and binds them to the SQL statement. When the number of records read is equal to the number of records to bind per execute, the statement handle is executed and the records are sent to the iSeries. The logic is continued until all records are read from the dBase file and sent to the iSeries.

Dim OEnv As New sqltODBC400.odbcenv

Dim OCon As sqltODBC400.odbccon

Dim rc As Integer

Dim ADC As New ADODB.Connection

Dim rs As New ADODB.Recordset

Dim i As Long, ii As Long, e As Long

Dim eRcol As New Collection

Dim RW As Long, rr As Long

Dim begtime, endtime

Dim NBR

‘initialize ODBC objects

OEnv.Sql_Init

‘connect to the iSeries.

Set OCon = New sqltODBC400.odbccon

X = OCon.Connect(OEnv, txtDSN.Text, txtUID.Text, txtPWD.Text)

‘Set number of records to bind before execute

NBR = CInt(250)

‘set recordset cachesize for performance

rs.CacheSize = 2500

‘open dBase file

ADC.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & App.Path & ";Extended Properties=""dBASE 5.0;"";Persist Security Info=False"

rs.Open "GLSUM", ADC

Dim Stmt As String

Dim BinAR() As Variant

‘prepare odbc statement handle

Dim s1 As New sqltODBC400.odbcstmt

s1.Make OCon

‘make a table to hold information

rc = s1.Prep("drop table YYGLSUM")

rc = s1.Execute()

Stmt = ""

Stmt = Stmt & "CREATE TABLE YYGLSUM ("

Stmt = Stmt & " ACCOUNT CHAR(10) not null,"

Stmt = Stmt & " COMPANY integer not null,"

Stmt = Stmt & " BRANCH integer not null,"

Stmt = Stmt & " RMN integer not null,"

Stmt = Stmt & " PRMN integer not null,"

Stmt = Stmt & " YR integer not null,"

Stmt = Stmt & " MO integer not null,"

Stmt = Stmt & " AMOUNT decimal(12,2) not null,"

Stmt = Stmt & " YTD decimal(12,2) not null,"

Stmt = Stmt & " PYAMOUNT decimal(12,2) not null,"

Stmt = Stmt & " PYTD decimal(12,2) not null,"

Stmt = Stmt & " PK integer not null primary key)"

‘prepare create table statement

rc = s1.Prep(Stmt)

If rc <> True Then

OEnv.Errd

Exit Sub

End If

‘execute statement

rc = s1.Execute()

If rc <> True Then

OEnv.Errd

End If

‘kill the statement object

s1.Kill

‘make a new one

Set s1 = New sqltODBC400.odbcstmt

s1.Make OCon

begtime = Timer()

‘note use of parameter markers

Stmt = "insert into YYGLSUM values (?,?,?,?,?,?,?,?,?,?,?,?)"

‘the mrprep method prepares the statement and indicates

‘the maximum number of rows we will bind per execute

rc = s1.MrPrep(Stmt, NBR)

‘mrpreprows prepares the memory buffers

rc = s1.MrPrepRows()

Dim RowRead As Long

RowRead = 1

RW = 0

rr = 0

While Not rs.EOF

rr = rr + 1

For i = 1 To 11

‘use MRBindParm method to place data into buffer

rc = s1.MrBindParm(rs.Fields(i - 1).Value, i, RowRead - 1)

If rc <> True Then

s1.Errd

End If

'add primary key

rc = s1.MrBindParm(rr, 12, RowRead - 1)

If rc <> True Then

s1.Errd

End If

If RowRead = NBR Then

rc = s1.MrExecute(NBR)

If rc <> -1 Then

If s1.Errnum = 170 Then

‘ this is to counter a bug sometimes found

‘when communicating with SQL Server

s1.MrExecute (NBR)

Else

s1.Errd

End If

DoEvents

If Not rc Then

e = e + 1

End If

RowRead = 0

End If

rs.MoveNext

DoEvents

RowRead = RowRead + 1

RW = RW + 1

Wend

‘commit the transactions

OCon.Commit

endtime = Timer()

MsgBox "ET: " & endtime - begtime & " RPS " & RW / (endtime - begtime) & " for " & RW & " rows"

Performance Verification

Finally, it is important that we verify the actions of the completed component against the iSeries. It is my recommendation that the component be run while the IBM Database Monitor is active and journaling SQL information. Once all aspects of the component have been exercised the monitor is turned off and data in the monitor file is examined using SQL statements and programs provided by Client Server Development. By examining the database monitor, we can spot potential performance problems like creation of temporary indexes, unnecessary code-page sorts, use of dynamic SQL or dynamic rebuilding of access plans. This audit will ensure that the system is robust, scalable and have little impact on normal iSeries operations.

ADO Best Practices For Development of iSeriesn-tier Applications

By Howard F. Arner, Jr. for Client Server Development, Inc.

Introduction

Which Access Method

Developing for Performance (Methodology)

Data centric design

Client Programming Best Practices

Use of Prepared Statements

Using DSNLess Connection Strings

Bulk Insertion of Records

Performance Verification

ADO Best Practices
For Development of iSeries
n-tier Applications

By Howard F. Arner, Jr.
for Client Server Development, Inc.