SQL Server COUNT() Function Performance Comparison


By:   |   Updated: 2016-09-19   |   Comments (7)   |   Related: More > Functions - System

Problem

SQL Server Developers have a variety of innovative ways to use the COUNT function in SQL Server.  Often times the assumption is that one syntax provides better performance than the others. This tip will explain the differences between the following COUNT function varieties: COUNT(*) vs. COUNT(1) vs. COUNT(column_name) to determine if there is a performance difference.

Solution

There are more efficient ways than using the COUNT() function if the goal is just to retrieve the total row count from a table. One way is to retrieve the row count directly from SQL Server sys.partitions DMV. But most of the time, the COUNT function is still used when counting a subset of rows based on filter criteria specified with in the WHERE clause of a T-SQL statement.

In terms of behavior, COUNT(1) gets converted into COUNT(*) by SQL Server, so there is no difference between these. The 1 is a literal, so a COUNT('whatever') is treated as equivalent.

COUNT(column_name) behaves differently. If the column_name definition is NOT NULL, this gets converted to COUNT(*). If the column_name definition allows NULLs, then SQL Server needs to access the specific column to count the non-null values on the column.

COUNT(*) vs. COUNT(1) vs. COUNT(column_name) Performance Comparison

A test table is created in SQL Server 2016 Developer Edition on RTM CU1.

SET NOCOUNT ON;
CREATE TABLE #Count 
(ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
, ColumnWithNulls CHAR(100) NULL DEFAULT 'Allow NULLs'
, ColumnWithNoNulls CHAR(100) NOT NULL DEFAULT 'Do not allow NULLs'
, LargeColumn CHAR(5000) NOT NULL DEFAULT 'Large column');
GO

CREATE NONCLUSTERED INDEX NCI_Count_ColumnWithNoNulls ON #Count (ColumnWithNoNulls)
GO

INSERT INTO #Count DEFAULT VALUES 
GO 10000

UPDATE #Count SET  ColumnWithNulls = NULL WHERE ID % 2 = 0
GO

Here are a few COUNT query runs with row counts.  Note: the above code creates 5000 records and 2500 of the records have a NULL value for column ColumnWithNulls.

Query

Row Count

SELECT COUNT(*) FROM #Count WHERE ID > 5000 5000
SELECT COUNT(1) FROM #Count WHERE ID > 5000 5000
SELECT COUNT(LargeColumn) FROM #Count WHERE ID > 5000 5000
SELECT COUNT(ColumnWithNulls) FROM #Count WHERE ID > 5000 2500

Interpretation of Results

SELECT COUNT(*)

Looking at the execution plan, the Aggregate operation AggType is countstar and the ScalarString is Count(*). SQL Server is actually pretty smart as it can choose an efficient route to get the record count. In this case, there is a clustered index and a non-clustered index which does not allow NULL. SQL Server uses index NCI_Count_ColumnWithNonNulls and performs a smaller index scan to get the table row count.

SQL Server Execution Plan for COUNT(*)

SELECT COUNT(1)

In this execution plan, COUNT(1) gets converted to use AggType countstar and ScalarString is Count(*).

SQL Server Execution Plan for COUNT(1)

SELECT COUNT(LargeColumn)

There is no index defined on column LargeColumn, but this column definition does not allow NULLs. When a count is performed on this column, SQL Server then uses COUNT(*) and has the flexibility to choose a path it deems efficient to return the row count.

In this case, SQL Server can still maintain a scan on the nonclustered index NCI_Count_ColumnWithNoNulls to get the table row count.

SQL Server Execution Plan for COUNT(LargeColumn)

SELECT COUNT(ColumnWithNulls)

There is no index defined on column ColumnWithNulls and the column definition allows NULLs. SQL Server performs a count on the column using the clustered index. This would make sense because SQL Server needs to exclude counting rows with NULLs, so it will need to access all rows in the column.

The COUNT function itself is converted to use AggType COUNT_BIG. Typically, you would want to create a nonclustered index here, so that SQL Server would be able to perform a scan on a smaller index.

SQL Server Execution Plan for COUNT(Column With Nulls)

Summary

Typically I would recommend using COUNT(*) instead of many other options as it is also what SQL Server interprets. There is no performance gain by specifying a literal in the COUNT function.

COUNT(column_name) is applicable when the row count should exclude null values from the table, but if the column_name definition does not allow NULLs, then it is equivalent to COUNT(*).

Next Steps


Last Updated: 2016-09-19


get scripts

next tip button



About the author
MSSQLTips author Simon Liew Simon Liew is an independent SQL Server Consultant in Sydney, Australia. He is a Microsoft Certified Master for SQL Server 2008 and holds a Masterís Degree in Distributed Computing.

View all my tips





Comments For This Article




Wednesday, December 18, 2019 - 7:57:16 PM - Simon Liew Back To Top

Hi Jeff,

The tip is about COUNT(*) vs COUNT(1) vs COUNT(column_name) performance. Internally, SQL Server converts COUNT(1) and COUNT(column_name) into COUNT(*) when the column definition is NOT NULL.

I'm not sure if CPU, I/O stats, duration, reads and writes are necessary when its obvious SQL Server is using the same COUNT(*) function underneath despite the variance in code which is evident in the query plan.

But I take your point to provide measurement as proof to be clearer and thanks for that.

Just curious, the example that you've provided doesn't seem to have reference to COUNT() function?


Wednesday, December 11, 2019 - 10:39:54 AM - Jeff Moden Back To Top

I realize that I'm a bit late to this thread but you've not actually proven your conclusion of no performance advantage because you've not actually measured CPU usage, duration, reads, or writes.  There's no question that execution plans can aid you in figuring out what needs to be tuned but you MUST NOT EVER make a decision as to which code is better based soley on execution plans.

In fact, here's a simple example that has led thousands of people into thinking that using a Recursive CTE (rCTE) to produce incremental counts is a "Best Practice".  Turn on the Actual Execution plan, run it, and see.  Then, turn off the exectuion plan (it actually imparts some delays in running code) and see messages tab where you'll see proof that the exact opposite of what the execution plan seems to say is the actual truth.

DECLARE @BitBucket INT
;
-------------------------------------------------------------------------------
        RAISERROR('--===== Reursive CTE =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;

   WITH cte AS
(
 SELECT N = 1
  UNION ALL
 SELECT N = N + 1
   FROM cte
  WHERE N < 1000000
)
 SELECT @BitBucket = N
   FROM cte
 OPTION (MAXRECURSION 0)
;
    SET STATISTICS TIME,IO OFF;
  PRINT REPLICATE('-',119)
;
-------------------------------------------------------------------------------
        RAISERROR('--===== Pseudo-Cursor =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;
 SELECT TOP 1000000
        @BitBucket  = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
   FROM      sys.all_columns ac1
  CROSS JOIN sys.all_columns ac2
;
    SET STATISTICS TIME,IO OFF;
PRINT REPLICATE('-',119)
;

Friday, September 23, 2016 - 10:45:06 AM - Simon Liew Back To Top

Hi Sahir,

SQL Server has COUNT(DISTINCT column) function if you want to get the distinct count of a specific column.


Tuesday, September 20, 2016 - 7:59:15 AM - sahir Back To Top

What's the best way to get a count of distinct values in the column? 


Monday, September 19, 2016 - 6:30:28 PM - Simon Liew Back To Top

Hi Ryan,

COUNT(*) and COUNT(1) behaviour has been the same since SQL Server 2008 to the current SQL Server 2016. 

 


Monday, September 19, 2016 - 3:12:12 PM - Ryan Cerney Back To Top

Great article, thank you!  Quick question, does the way SQL handles the count(1) vs count(*) the same for all versions or is the engine slightly different in 2008R2 and newer, etc?   

 


Monday, September 19, 2016 - 5:25:25 AM - VJ Back To Top

 "COUNT(column_name) is applicable when the row count should exclude non-null values from the table"

Mayby quite opposite ...?

 



download


Recommended Reading

Concatenate SQL Server Columns into a String with CONCAT()

SQL Server Rounding Functions - Round, Ceiling and Floor

SQL Server 2016 STRING_SPLIT Function

Using FOR XML PATH and STRING_AGG() to denormalize SQL Server data

Different ways to get random data for SQL Server data sampling





get free sql tips
agree to terms


Learn more about SQL Server tools