By: Koen Verbeeck | Last Updated: 2015-01-28 | Comments | Integration Services NULL Management
An important task in data warehouse projects is incorporating data cleansing in the ETL flows. A very common issue there is dealing with NULL values. In SQL Server 2012, a new function called REPLACENULL was added to the Integration Services (SSIS) expression language to simplify derived column constructs. This tip will take a closer look at this new function.
SQL Server 2012 released a lot of new features for Integration Services. A lot has already been written about them, but somehow the new SSIS expression functions are somewhat left behind and REPLACENULL is one of them. It isn't even mentioned in the What's New page about SSIS. It's a bit of a shame, because it's a really useful function.
SSIS REPLACENULL Syntax and Example
The purpose of the function is pretty straightforward: check if a value is NULL and if it is, replace it with another value. The syntax is as follows:
So if expression1 is NULL, it will be replaced by expression2 by the derived column transformation. If it is not NULL, expression1 itself is returned. If both are NULL, NULL is returned. If the data types of both arguments are different, SSIS will try to convert the data type of the 2nd expression to that of the 1st expression. If the data types are incompatible, an error will be returned.
For example, suppose we have the following expression in a derived column inside a data flow, where columnA is of the DT_I4 data type (an integer):
Note that SSIS does not stop us from creating an expression with incompatible data types. However, if we run the package, we are confronted with an error:
Some of you might already have noticed that this new function resembles the T-SQL function ISNULL a lot and that is correct. The behavior of REPLACENULL can exactly be compared to that of ISNULL. So why not call this function ISNULL instead of REPLACENULL? Because there is already an SSIS function called ISNULL! However, this function is a Boolean function. It takes only one argument and it returns true or false whether the argument is NULL or not. The SSIS ISNULL function can lead to a bit of confusion since it does not behave the same way as its T-SQL counterpart.
Checking for NULL values before Integration Services 2012
Since REPLACENULL was only introduced since SQL Server 2012, how can you check for NULL values in SSIS 2005/2008/2008R2? You had to use a combination of the ISNULL function and the conditional operator. A typical expression would look like this:
ISNULL(columnA) ? "Hello World" : columnA
It's easy to see REPLACENULL allows for more elegant expressions.
REPLACENULL and Divide by Zero
With the REPLACENULL function, a convenient expression can be written that checks for NULL values and for divide by zero errors at the same time. Suppose we want to calculate the ratio columnB / columnA of the integer columns A and B. If we just have the following derived column expression, the code is vulnerable for a divide by zero error when columnA contains the value 0.
columnB / columnA
So we have to verify columnA is not 0. But we also want to check if columnA is NULL or not, because if it is NULL, the business requirements state the result should be 0. With the following expression, we can do both checks in a single line:
(REPLACENULL(columnA,0) == 0) ? 0.0 : columnB / columnA
If columnA is 0, the Boolean expression will return 0. If columnA is NULL, REPLACENULL will replace it by 0 and the condition is true, resulting in 0 yet again. If column A is neither 0 nor NULL, the ratio is calculated. If you also want to verify columnB, the expression can easily be expanded with the || (logical OR) operator.
(REPLACENULL(columnA,0) == 0 || REPLACENULL(columnB,0) == 0) ? 0.0 : columnB / columnA
In this tip, we have introduced the REPLACENULL function which was introduced in SQL Server 2012. We saw how we could use the function and how it can help us dealing with NULL values. There is also an alternative expression given for SSIS versions earlier than 2012.
- Try it out yourself! Think about your SSIS packages and how REPLACENULL can make your life (and expressions) easier.
- Another tip about dealing with NULL values: How to add NULL defense for dates in SSIS packages.
- Also check out: Deciding between COALESCE and ISNULL in SQL Server.
- For more tips about the new features of Integration Services 2012:
Last Updated: 2015-01-28
About the author
View all my tips