How to add NULL Defense to your SSIS packages for Operators and Functions
In the previous two parts of the NULL Defense series, we saw how to add NULL Defense at the database level for dates and how to add NULL Defense within our package for scripting. We discussed earlier that NULLs can cause undesirable results in data and can cause transformations to fail. Functions and operators are no exception. NULLs will easily by-pass operators and/or functions without any exception until we investigate the results and finally figure out that NULL values were the cause.
In this tip, we will address the issue of how to add NULL Defense to defend our transformations from breaking and shield our operators from generating undesirable results due to NULL values.
For this discussion, we will leverage the package we built in the first part of the NULL Defense Series. We will edit the DataFlow task and make a few changes to simulate the problem that we are going to discuss.
Follow the steps below after you have created the base package.
Add a Derived Column Transformation after the OLE DB Source Adapter and add a Column "FullName" as shown in the picture below.
Now let's say we are interested only in records that have a FirstName value of "Siddharth". So change the split condition of the Conditional Split transformation as shown in the below picture. Notice also that I changed the Default output name to "Invalid Records".
Now execute the package and if you have followed the steps as explained above, your package should fail as shown in the below picture. Also, note that I have changed the Conditional Split output to have Invalid Records go to Invalid Records and Sid go to Valid Records.
There are actually two issues in the package, one which you can apparently see by the red color of the failed transformation and there is another issue which we will discuss shortly.
If you open the Progress Tab, you will find the cause of the exception with a error message. In my package I get the below message.
"[Split valid and invalid rec ] Error: The expression "FirstName == "Siddharth"" on "output "Sid" (42)" evaluated to NULL, but the "component "Split valid and invalid rec" (31)" requires a Boolean results. Modify the error row disposition on the output to treat this result as False (Ignore Failure) or to redirect this row to the error output (Redirect Row). The expression results must be Boolean for a Conditional Split. A NULL expression result is an error."
The cause of the error is very apparent from the error message, that for the comparison a boolean value is expected. You might think that NULL is not equal to the value "Siddharth" so the result should be False. But that is not the case, as NULL cannot be compared to a value, because NULL is something that can be thought of as non-existent.
So now edit the Conditional Split transformation as shown in the below picture using the code below. Basically this is saying if the FirstName is not null and FirstName equals Siddharth then this is True.
!ISNULL(FirstName) && FirstName == "Siddharth"
The issue I was pointing out is not apparent, but can be visualized by placing a Data Viewer between the Derived Column transformation and the Conditional Split transformation. To do this, double click on the line between the Derived Column and Conditional Split transforms, select Data Viewers and add a Grid data viewer.
Execute the package and you should be able to see, that the FullName field that we created in the Derived Column transformation is not providing the desired output as shown below. We can see that two of our FullName columns have NULL values. You would find similar results when you use functions with fields containing NULL values.
Edit the Derived Column transformation and change it as shown in the below picture using the below code. This is saying if the FirstName is NULL then make it blank otherwise use the FirstName, then add a space between the FirstName and LastName and then do the same NULL check on the LastName.
(ISNULL(FirstName) ? "" : FirstName + " ") + (ISNULL(LastName) ? "" : LastName)
Now execute the package and if you have made the changes as explained above, your package should execute successfully with the desired results.
- Add more columns in the derived column transformation and use different functions to learn how NULL values can cause different results with different functions.
- Implement this safeguard in your existing solution to make your code more robust and defensive against NULL values
- Use Data Viewers to check data between each step.
- Review these other related tips:
About the author
View all my tips