Problem I have a number of DTS packages that I want to rewrite to use SSIS. One problem I have is that these packages make use of the ActiveX Script task in order to run parameterized queries against an Oracle database. Any ideas on how best to convert these to work in an SSIS package?
Solution The best bet for reworking your ActiveX Script tasks would be to use the Script Component Data Flow task. The Script Component allows you to execute VB.Net code within your data flow. When you add a Script Component to a data flow, you have to choose one of the following options:
Source - your code will retrieve data from somewhere and add rows to the data flow
Destination - your code will retrieve data from columns in the data flow and save that data somewhere
Transformation - your code will do some sort of transformation on columns in the data flow
To execute parameterized queries against an Oracle database, you would choose Source. The following is the data flow that we will build as an example:
The first step in the above data flow is a Script Component Source that will execute a parameterized query against an Oracle database and add rows to the data flow. The Row Count assigns the number of rows to a package variable and the Flat File Destination writes out each row to a text file, allowing us to easily see the data extracted. We will focus all of our attention on configuring the Script Component Source and the VB.Net code we need to retrieve data from Oracle.
The following three steps are involved in configuring the Script Component Source:
Inputs and Outputs - define the output columns to be added to the data flow.
Connection Managers - specify any connection managers defined in the SSIS package that we will use in the VB.Net code.
Script - specify any package variables that we will use in the VB.Net code and enter the VB.Net code.
Inputs and Outputs
When you begin to edit the Script Component Source properties, the first page in the editor is Inputs and Outputs as shown below:
Use this page to specify the output columns that you want to add to the data flow. Right click on Output Columns to add columns. In the VB.Net code you will add a single row at a time to the data flow and specify the values for each output column defined here.
The Connection Managers page is shown below:
Use this page to specify any connection managers defined in the SSIS package that you want to use in the VB.Net code. Fill in the Name column with an identifier of your choice; the name you enter will be how you reference the connection manager in the VB.Net code. The Connection Manager column is a drop down list where you can select any connection manager defined in the SSIS package. In our example we will be writing some ADO.Net code to access the Oracle database. While you could define the connection, open it, close it, etc. in the code, it makes sense to let the SSIS package handle those details. In the Script section below I'll show how to access the connection manager in code.
The Script page is shown below:
List any variables defined in the SSIS package that you want to access in the VB.Net code in either the ReadOnlyVariables or the ReadWriteVariables. In this example we'll use the v_dept variable in the where clause of our SQL statement.
There is a Design Script button on this page (not shown in the above screen shot); click it to launch the Microsoft Visual Studio for Applications design environment where you can enter the VB.Net code to access the Oracle database and populate rows and columns in the data flow.
Now let's walk through writing the code for the Script Component Source. Add a reference to the System.Data.OracleClient assembly as shown below. Right click on the References node, choose References from the context menu, select the assembly from the list of available assemblies, then click the Add button. There are a number of assembly references that are added automatically; the Oracle one isn't so you have to do that.
The VB.Net code to execute the parameterized query against the Oracle database and add the results to the data flow is shown below:
' STEP 1: add import of OracleClient
Public Class ScriptMain
' STEP 2: add variables
Private connMgr As IDTSConnectionManager90
Private oracleConn As OracleConnection
Private oracleCmd As OracleCommand
Private oracleRdr As OracleDataReader
' STEP 3: add override to get connection
Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
connMgr = Me.Connections.ORCL
oracleConn = CType(connMgr.AcquireConnection(Transaction), _
' STEP 4: add override to setup the oracle command
Public Overrides Sub PreExecute()
Dim sql As String = "SELECT ENAME, JOB FROM SCOTT.EMP " + _
"WHERE DEPTNO = :DEPT"
oracleCmd = New OracleCommand(sql, oracleConn)
.CommandType = CommandType.Text
' STEP 5: Add override to dispose of the oracle command
Public Overrides Sub PostExecute()
If Not oracleCmd Is Nothing Then
' STEP 6: Add override to release the oracle connection
Public Overrides Sub ReleaseConnections()
' STEP 7: Add code to execute the query and add the results
' to the data flow
Public Overrides Sub CreateNewOutputRows()
Dim DEPT As Integer = Me.Variables.vdept
.Parameters(":DEPT").Value = DEPT
oracleRdr = oracleCmd.ExecuteReader()
OutputBuffer.EmployeeName = oracleRdr.GetString(0)
OutputBuffer.Job = oracleRdr.GetString(1)
The following are the main points about the code shown above:
STEP 1 - adds an import statement so that we can reference classes in the System.Data.OracleClient namespace without specifying the full name; e.g. in STEP 2 the oracleConn variable is defined as an OracleConnection type; without the imports it would have to be defines as System.Data.OracleClient.OracleConnection.
STEP 2 - defines some variables that are used throughout the code.
STEP 3 - contains the code to use the Oracle connection manager defined in the SSIS package.
STEP 4 - sets up the oracleCmd variable, specifying the query containing a parameter for the DEPT value. Note that the parameter is prefixed with a colon in the query; this is required Oracle syntax. You can have multiple parameters; just add them to the query and call the Parameters.Add method as shown for each one.
STEP 5 and STEP 6 - are both performing cleanup functions.
STEP 7 - contains the code to execute the query and add each row returned to the data flow. The oracleCmd and oracleRdr variables were defined in STEP 2. Note that the vdept variable is used to supply the value for the DEPT parameter that was defined in STEP 4. The ExecuteReader method is used where you want to execute a query that will return rows; it returns an OracleReader. The Read method is called to read each row in the result set. For each row that is read, we add a row to the data flow by calling the AddRow method on the OutputBuffer, then assign values to each of the output columns defined on the Inputs and Outputs page of the Script Component Source. The SetEndOfRowSet method is called on the OutputBuffer to signal that we are done adding rows.
Note that while there is a fair amount of code, the majority of it would be the same for any query. For the most part only STEP 4 and STEP 7 would need to be changed based on the query to be run.
The Script Component is a very useful when you want to write custom VB.Net code and execute it in the data flow.
You can use the Script Component Source to retrieve data from just about anywhere.
Download the sample SSIS package and experiment with it. See the README.txt file for some configuration details about running the sample. The sample assumes you have the Oracle client installed and access to an Oracle instance with the sample data loaded and available.
Last Update: 7/6/2009
About the author
Ray Barley is a Principal Architect at RDA Corporation and a MSSQLTips.com BI Expert.