When selecting data from a database using T-SQL, one can use CASE statements to derive new columns based on data in the database. However, when reading data from files in SQL Server Integration Services (SSIS) there isn't a CASE statement readily available when new columns need to be derived or existing values need to be replaced.
The solution to this problem is to use the Derived Column transformation in SSIS. The image below shows a simple SSIS package that utilizes a Flat File Source, Derived Column Transform, and OLE DB Destination.
The data used for this example is shown below.
The data types assigned for each column are represented below. This image is from the SQL Server Management Studio's display of the destination table.
Double-clicking on the Derived Column transformation brings up the Derived Column Transformation Editor.
For the first example, we will create a new column named PriceCode with a data type of string with a length of 4. If the value of the Amount column is less than 300, then we set PriceCode equal to "Low", else we set PriceCode equal to "High".
- We type "PriceCode" in the "Derived Column Name" column.
- We leave the "Derived Column" column set to the default of "add as new column".
- Drag the Amount column from the Columns folder above to the "Expression" column text box. Notice how SSIS puts square brackets around the column name by default. Also, notice how the Data Type column defaults to the type specified in the column metadata, which is currency [DT_CY].
At this point we have just copied the data as-is from the Amount column to the newly created PriceCode column. Now, we need to add our logic to set the price code.
We need to add the following code to the Expression text box to set the value of the PriceCode column to "Low" if the Amount is less than 300, else we set PriceCode equal to "High".
(DT_STR, 4, 1252)([Amount] < 300 ? "Low" : "High" )
The syntax in the above statement is (condition_to_be_tested ? value_if_true : value_if_false). In the above statement, "Amount < 200" is the condition to be tested, the expression following the question mark, "Low", is the true condition while the expression following the colon, "(Amount < 500 ? "Med" : "High")", is the false condition which happens to be another conditional test nested in the statement.
If there is a syntax error with the Expression, the text in the Expression box will be red. If the text is black, then the syntax is correct. By default, the data type for strings is set to Unicode string [DT_WSTR]. We use the Type Cast (DT_STR, 4, 1252) to cast the derived value to a string with a length of 4 using a code page of 1252.
Next, we have to add the PriceCode column to the destination table.
alter table [dbo].[tblTips] add PriceCode varchar(4) null
After adding the column to the destination table, we need to map the PriceCode from the "Available Input Columns" to the "Available Destination Columns" in the OLE DB Destination Editor. Failing to perform this mapping will result in NULL values in the PriceCode column in our destination table. The following image shows the editor before the mapping.
The following image shows the editor after the mapping.
After executing the SSIS package, we execute a query to view the contents of the destination table. We can see in the image below that the PriceCode column is populated as desired.
Taking this example a step further, we can add the following code to the Expression text box of the Derived Column Transformation Editor to set the value of the PriceCode column to:
(DT_STR,4,1252)(Amount < 200 ? "Low" : (Amount < 500 ? "Med" : "High"))
The results of this change are shown below.
Taking this example another step further, we can add the following code to the Expression text box of the Derived Column Transformation Editor to set the value of the PriceCode column to
(DT_STR,4,1252)((Amount < 200 || Category == "A") ? "X" : (Amount >= 200 && Amount < 400 && Category == "B" ? "Y" : (Amount >= 500) ? "Z" : "U"))
In the above statement, the double pipe "||" is a logical OR operator, the double ampersand "&&" is a logical AND operator, while the double equal sign "==" is the test for equality operator. If you are familiar with C/C++, then these operators have the same functionality.
The results of this change are shown below.
- Once you get this example working explore the changing of data types, using different logic statements, and utilizing variables and parameters.
- Please refer to the following tips for further assistance with SSIS:
Last Update: 4/16/2013
About the author
View all my tips