When writing T-SQL, a lot of developers use either COALESCE or ISNULL in order to provide a default value in cases where the input is NULL. They have various reasons for their choice, though sometimes this choice may be based on false assumptions. Some think that ISNULL is always faster than COALESCE. Some think that the two are functionally equivalent and therefore interchangeable. Some think that you need to use COALESCE because it is the only one that adheres to the ANSI SQL standard. The two functions do have quite different behavior and it is important to understand the qualitative differences between them when using them in your code.
The following differences should be considered when choosing between COALESCE and ISNULL:
The COALESCE and ISNULL SQL Server statements handle data type precedence differently
COALESCE determines the type of the output based on data type precedence. Since DATETIME has a higher precedence than INT, the following queries both yield DATETIME output, even if that is not what was intended:
If you uncomment the second SELECT, the batch terminates with the following error, since you can't implicitly convert a DATETIME to INT:
Msg 257, Level 16, State 3, Line 3 Implicit conversion from data type datetime to int is not allowed. Use the CONVERT function to run this query.
While in some cases this can lead to errors, and that is usually a good thing as it allows you to correct the logic, you should also be aware about the potential for silent truncation. I consider this to be data loss without an error or any hint whatsoever that something has gone wrong. For example:
DECLARE @c5 VARCHAR(5);
SELECT 'COALESCE', COALESCE(@c5, 'longer name')
SELECT 'ISNULL', ISNULL(@c5, 'longer name');
COALESCE longer name
This happens because ISNULL takes the data type of the first argument, while COALESCE inspects all of the elements and chooses the best fit (in this case, VARCHAR(11)). You can test this by performing a SELECT INTO:
DECLARE @c5 VARCHAR(5);
c = COALESCE(@c5, 'longer name'),
i = ISNULL(@c5, 'longer name')
SELECT name, t = TYPE_NAME(system_type_id), max_length, is_nullable
WHERE [object_id] = OBJECT_ID('dbo.testing');
name system_type_id max_length is_nullable
---- -------------- ---------- -----------
c varchar 11 1
i varchar 5 0
As an aside, you might notice one other slight difference here: columns created as the result of COALESCE are NULLable, while columns created as a result of ISNULL are not. This is not really an endorsement one way or the other, just an acknowledgement that they behave differently. The biggest impact you'll see from this difference is if you use a computed column and try to create a primary key or other non-null constraint on a computed column defined with COALESCE, you will receive an error:
CREATE TABLE dbo.works
b AS ISNULL(a, 15) PRIMARY KEY
CREATE TABLE dbo.breaks
b AS COALESCE(a, 15) PRIMARY KEY
Msg 1711, Level 16, State 1, Line 1 Cannot define PRIMARY KEY constraint on column 'b' in table 'breaks'. The computed column has to be persisted and not nullable. Msg 1750, Level 16, State 0, Line 1 Could not create constraint. See previous errors.
Using ISNULL, or defining the computed column as PERSISTED, alleviates the problem. Trying again, this works fine:
CREATE TABLE dbo.breaks
b AS COALESCE(a, 15) PERSISTED PRIMARY KEY
Just be aware that if you try to insert more than one row where a is either NULL or 15, you will receive a primary key violation error.
One other slight difference due to data type conversion can be demonstrated with the following query:
Both columns are converted to VARCHAR(12), but COALESCE ignores the padding implicitly associated with concatenating a CHAR(10), while ISNULL obeys the specification for the first input and converts the empty string to a CHAR(10).
The SQL Server COALESCE statement supports more than two arguments
Consider that if you are trying to evaluate more than two inputs, you'll have to nest ISNULL calls, while COALESCE can handle any number. The upper limit is not explicitly documented, but the point is that, for all intents and purposes, COALESCE will better handle your needs in this case. Example:
SELECT COALESCE(a, b, c, d, e, f, g) FROM dbo.table;
-- to do this with ISNULL, you need:
SELECT ISNULL(a, ISNULL(b, ISNULL(c, ISNULL(d, ISNULL(e, ISNULL(f, g)))))) FROM dbo.table;
The two queries produce absolutely identical plans; in fact, the output is extrapolated to the exact same expression for both queries:
CASE WHEN [tempdb].[dbo].[table].[a] IS NOT NULL THEN [tempdb].[dbo].[table].[a]
ELSE CASE WHEN [tempdb].[dbo].[table].[b] IS NOT NULL THEN [tempdb].[dbo].[table].[b]
ELSE CASE WHEN [tempdb].[dbo].[table].[c] IS NOT NULL THEN [tempdb].[dbo].[table].[c]
ELSE CASE WHEN [tempdb].[dbo].[table].[d] IS NOT NULL THEN [tempdb].[dbo].[table].[d]
ELSE CASE WHEN [tempdb].[dbo].[table].[e] IS NOT NULL THEN [tempdb].[dbo].[table].[e]
ELSE CASE WHEN [tempdb].[dbo].[table].[f] IS NOT NULL THEN [tempdb].[dbo].[table].[f]
ELSE [tempdb].[dbo].[table].[g] END END END END END END
So the main point here is that performance will be identical in this case and that the T-SQL itself is the issue, it becomes needlessly verbose. And these are very simple, single-letter column names, so imagine how much longer that second query would look if you were dealing with meaningful column or variable names.
COALESCE and ISNULL perform about the same (in most cases) in SQL Server
Different people have run different tests comparing ISNULL and COALESCE, and have come up with surprisingly different results. I thought I would introduce a new test based on SQL Server 2012 to see if my results show anything different. So I created a simple test with two variables, and tested the speed of COALESCE and ISNULL in four scenarios: (1) both arguments NULL; (2) first argument NULL; (3) second argument NULL; and, (4) neither argument NULL. I simply assigned the result of COALESCE or ISNULL to another variable, in a loop, 500,000 times, and measured the duration of each loop in milliseconds. This was on SQL Server 2012, so I was able to use combined declaration / assignment and a more precise data type than DATETIME:
@a VARCHAR(5), -- = 'str_a', -- this line changed per test
@b VARCHAR(5), -- = 'str_b', -- this line changed per test
@x INT = 0,
@time DATETIME2(7) = SYSDATETIME();
WHILE @x <= 500000
SET @v = COALESCE(@a, @b); --ISNULL --this line changed per test
SET @x += 1;
SELECT DATEDIFF(MILLISECOND, @time, SYSDATETIME());
I ran each test 10 times, recorded the duration in milliseconds, and then averaged the results:
This demonstrates that, at least when we're talking about evaluating constants (and here I only evaluated two possibilities), the difference between COALESCE and ISNULL is not worth worrying about.
Where performance can play an important role, and hopefully this scenario is uncommon, is when the result is not a constant, but rather a query of some sort. Consider the following:
SELECT COALESCE((SELECT MAX(index_id) FROM sys.indexes WHERE [object_id] = t.[object_id]), 0)
FROM sys.tables AS t;
SELECT ISNULL((SELECT MAX(index_id) FROM sys.indexes WHERE [object_id] = t.[object_id]), 0)
FROM sys.tables AS t;
If you look at the execution plans (with some help from SQL Sentry Plan Explorer), the plan for COALESCE is slightly more complex, most noticeably with an additional Stream Aggregate operator and a higher number of reads. The plan for COALESCE:
And the plan for ISNULL:
The COALESCE plan is actually evaluated as something like:
SELECT CASE WHEN (SELECT index_id FROM sys.indexes WHERE [object_id] = s.[object_id]) IS NOT NULL
THEN (SELECT MAX(index_id) FROM sys.indexes WHERE [object_id] = s.[object_id]) ELSE 0 END;
In other words, it is evaluating at least part of the subquery twice. To me, this is kind of like selecting the number of rows of a table to determine if the number is greater than zero, then as a result of that, computing the count again. ISNULL, on the other hand, somehow has the smarts to only evaluate the subquery once. To be honest, I think this is often an edge case, but the sentiment seems to proliferate into all discussions that involve the two functions.
If you are writing complex expressions using ISNULL, COALESCE or CASE where the output is either a query or a call to a user-defined function, it is important to test all of the variants to be sure that performance will be what you expect. If you are using simple constant, expression or column outputs, the performance difference is almost certainly going to be negligible. But if every last nanosecond is important, the only way you can know for sure which will be faster, is to test for yourself, on your hardware, against your schema and data.
ISNULL is not consistent across Microsoft products/languages
ISNULL can be confusing for developers from other languages, since in T-SQL it is an expression that returns a context-sensitive data type based on exactly two inputs, while in - for example - MS Access, it is a function that always returns a Boolean based on exactly one input. In some languages, you can say:
-- do something
In SQL Server, you have to compare the result to something, since there are no Boolean types. So you have to write the same logic in one of the following ways:
IF something IS NULL
-- do something
IF ISNULL(something, NULL) IS NULL
-- do something
IF ISNULL(something, '') = ''
-- do something
Of course you have to do the same thing with COALESCE, but at least it's not different depending on where you're using the function. Now you could also argue the other way - and I'm trying hard to not be biased against ISNULL here. Since COALESCE isn't available in other languages or within MS Access at all, it can be confusing for those developers to have to learn about COALESCE when they realize that ISNULL does not work the same way.
Another interesting note is that SQL Server Compact Edition does not support ISNULL, so if you are working with both SQL Server and SQL Server Compact, you're going to have to use COALESCE or CASE when using the latter.
COALESCE is ANSI standard
COALESCE is part of the ANSI SQL standard, and ISNULL is not. Adhering to the standard is not a top priority for me personally; I will use proprietary features if there are performance gains to take advantage of outside of the strict standard (e.g. a filtered index), if there isn't an equivalent in the standard (e.g. GETUTCDATE()), or if the current implementation doesn't quite match the functionality and/or performance of the standard (again a filtered index can be used to honor true unique constraints). But when there is nothing to be gained from using proprietary functionality or syntax, I will lean toward following the standard.
Developers should be well aware of the different programmability characteristics of COALESCE and ISNULL, and should be careful not to draw any general conclusions about performance from hearsay or from isolated observations.
Personally I always use COALESCE both because it is compliant to the SQL standard and because it supports more than two arguments. Also I have yet to write a query that uses an atomic subquery as one of the possible outcomes of CASE or COALESCE, so the obscure scenario where performance can matter has not been a concern to date.
Take an inventory of your T-SQL codebase to see if you are using one or both functions consistently.
Test some of those scenarios to see if using the other function will work better (particularly cases where the result is a subquery or function call).
You write: "Where performance can play an important role, and hopefully this scenario is uncommon, is when the result is not a constant, but rather a query of some sort." Unfortunately, this scenario is very common, if you have subqueries evaluating SUM. For reasons nobody really understands, SELECT SUM(...) FROM ...empty set... returns NULL, not 0. Thus, if you use such a sum in a more complex expression, e.g.
(SELECT SUM(...) FROM ...) + (SELECT SUM(...) FROM ...)
it is necessary in virtually all cases to write
COALESCE((SELECT SUM(...) FROM ...),0) + COALESCE((SELECT SUM(...) FROM ...), 0)
ISNULL((SELECT SUM(...) FROM ...),0) + ISNULL((SELECT SUM(...) FROM ...), 0)
Rewriting this to the respective CASE statements shows that using COALESCE will typically give you a severe performance penalty.
Tuesday, October 07, 2014 - 11:14:41 AM - Aaron Bertrand
Hi Thomas, yes, you are right - it seems the COALESCE version in this case forces parameterization and also throws in a few implicit converts for good measure - I believe these are primarily what lead to the scan, but it is probably also due to the predicate expansion into a series of CASE expressions (which isn't exposed by the plan for the isnull() variation). I can get a seek with coalesce() if I use OPTION (RECOMPILE) on the query (which eliminates the parameterization and makes the seek predicates much simpler).
Tuesday, October 07, 2014 - 5:22:29 AM - Thomas Franz
Thanks Aaron for your help and sorry for the delay.
In my scenario I have 2 scalar functions. Lets's call them fn_Scalar_1 and fn_Scalar_2.
fn_Scalar_1 calls fn_Scalar_2. Now fn_Scalar_2 uses COALESCE functions with 3 arguments. something like COALESCE(TableA.Value, TableB.Value1, TableB.DefaultValue). Here the first argument to the function comes from Table A which has a LEFT JOIN to TABLE B (eg. FROM Table B LEFT JOIN Table A ). Not sure if that left join converted to subquery by the optimizer.
My select query is doing CROSS APPLY to fn_Scalar_1. Is it the CROSS APPLY that is causing optimizer to treat arguments of the COALESCE as subqueries ?
I replaced CROSS APPLY with in-line scalar function call (eg. select fn_Scalar_1), but I still see the same issue.
My thoughts are
1) CROSS APPLY causing optimizer to treat arguments of the COALESCE as subqueries
2) LEFT JOIN in fn_Scalar_2 causing subquery
3) In-Line fn_Scalar_1 call is also shows the same behavior. That suggests it is not the cross apply to fn_Scalar_1 causing this behavior but LEFT join within fn_Scalar_2 causing this issue
Tuesday, February 12, 2013 - 3:51:00 PM - Aaron Bertrand
Devesh can you post your actual query somewhere (like StackOverflow) so I have some way to determine exactly what is going on? If you do not have a subquery then I suspect your result is a misunderstanding of how your joins should work, not a bug and not symptom of the Connect item you referenced.
Thanks Devesh, this is only a concern where a subquery is involved (in which case ISNULL, or probably even a re-write, is a better choice anyway). The problem is due to the same reason I pointed out in the plan differences above - the subquery is evaluated twice.
So is it worth the "risk"? There should be no "risk" at all if you aren't using a subquery.
Jeff, sure in absolute isolation you can determine some slight difference. But you've kind of proven my point: in practical usage, where you're never simply performing this operation millions of times in isolation, the delta is such a miniscule difference that it really becomes negligible. To draw a comparison, who cares how fast a human would fall 100 feet if it weren't for gravity and wind resistance?
Thank for the article, Aaron. Although I personally prefer COALESCE because it's the ANSI standard, I've found ISNULL is a better alternative in order to control nullability when using SELECT...INTO.
One minor clarification is that ISNULL can return NULL. Nullability with ISNULL is determined by the nullability of the second function parameter unless the first parameter is non-nullable (which of course eliminates the need for ISNULL).
SELECT ISNULL(NULL, NULL) AS Col1 --int NULL ,ISNULL(NULL, 1) AS Col2 --int NOT NULL ,ISNULL(NULL, CAST(NULL AS int)) AS Col3 --int NULL ,ISNULL(1, CAST(NULL AS int)) AS Col4 --int NOT NULL INTO dbo.IsNullExample;
EXEC sp_help 'dbo.IsNullExample';
Tuesday, May 08, 2012 - 6:03:17 AM - Paul St., Amant
Thnaks for the explanation. I always wondered if there was much of a difference. I use ISNULL mainly because I'm using it in the same context as when I use != null in code to set a default value. Plus it's less typing and the word coalesce is a pretty uncommon and one I've never had the chance to use in conversation (even with coders). Since there isn't much to worry about I suppose those reasons are good enough. Thanks.
Tuesday, May 08, 2012 - 3:34:22 AM - Wilfred van Dijk
Gosh... be careful folks. Using a While Loop to test like this makes things come out pretty darned equal because most of the time is spent calulating the next value of @X. Try something like this instead and see that ISNULL is actually twice as fast as COALESCE in some cases especially in the NULL/NULL case.
DBCC DROPCLEANBUFFERS; DECLARE @a VARCHAR(5), -- = 'str_a', -- this line changed per test @b VARCHAR(5), -- = 'str_b', -- this line changed per test @v VARCHAR(5), @time DATETIME; SELECT @time = GETDATE(); SELECT TOP (1000000) @v = coalesce(@a, @b) -- Change to ISNULL for other half of testing FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2 ; SELECT DATEDIFF(ms, @time, GETDATE()); ;