By: Aaron Bertrand | Updated: 2023-12-01 | Comments | Related: > TSQL
Problem
GROUP BY
queries can
become overly convoluted if your grouping column is a complex expression. Because
of the logical processing order of a query, you're often forced to repeat
such an expression since its alias can't be used in the
GROUP BY
clause.
Oracle recently solved this in their 23c release by adding the ability to
GROUP BY column_alias
. This
is such simple but powerful syntax, and I'm hoping we can get SQL Server to
follow Oracle's lead.
Solution
Reviewing the Logical Processing Order section of the official documentation:
Since the GROUP BY
is
processed first, aliases defined in the
SELECT
list are simply unavailable.
This means you can't say, for example:
SELECT TheMonth = DATEFROMPARTS(YEAR(create_date), MONTH(create_date), 1), MonthlyDBs = COUNT(*) FROM sys.databases GROUP BY TheMonth;
With this query, the following error is returned:
Msg 207, Level 16, State 1 Invalid column name 'TheMonth'.
To group by the TheMonth column, we need to use one of the following workarounds:
Repeating the expression
SELECT TheMonth = DATEFROMPARTS(YEAR(create_date), MONTH(create_date), 1), MonthlyDBs = COUNT(*) FROM sys.databases GROUP BY DATEFROMPARTS(YEAR(create_date), MONTH(create_date), 1);
Using a common table expression (CTE)
WITH cte(TheMonth) AS ( SELECT DATEFROMPARTS(YEAR(create_date), MONTH(create_date), 1) FROM sys.databases ) SELECT TheMonth, MonthlyDBs = COUNT(*) FROM cte GROUP BY TheMonth;
Using a subquery
SELECT TheMonth, MonthlyDBs = COUNT(*) FROM ( SELECT DATEFROMPARTS(YEAR(create_date), MONTH(create_date), 1) FROM sys.databases ) AS sq(TheMonth) GROUP BY TheMonth;
Using CROSS APPLY
SELECT TheMonth, MonthlyDBs = COUNT(*) FROM sys.databases CROSS APPLY ( SELECT DATEFROMPARTS(YEAR(create_date), MONTH(create_date), 1) ) AS ca(TheMonth) GROUP BY TheMonth;
Note: In SQL Server 2022 or Azure SQL Database, you could simplify the expression
toDATETRUNC(MONTH,
create_date)
. But you still can't avoid one of the above workarounds.
In Oracle (again, starting with 23c), you can simply say (forgetting about SQL Server-specific metadata):
SELECT TRUNC(DateTimeColumnName, 'MONTH') AS TheMonth, COUNT(*) AS MonthlyWhatever FROM TableName GROUP BY TheMonth;
This form is much more convenient and removes the need to understand logical
processing order – including clearing up the often-confusing fact that aliases
from SELECT
are
available in ORDER BY
.
Wouldn't it be great to have this syntax support in SQL Server?
On the other hand…
I don't want SQL Server to blindly follow Oracle's lead.
Another recent addition was GROUP
BY column_position
, which allows you to reference a grouping column by its
ordinal (which has been in MySQL for some time). Like with
ORDER BY column_position
,
I consider this
a ticking time bomb, and I do not want this parity at all.
Next Steps
Vote for this feature suggestion.
Also, see these tips and other resources:
- Learning the SQL GROUP BY Clause
- SQL GROUP BY Queries to Summarize and Rollup Sets of Records
- SQL Server DATEFROMPARTS Function
- New Date and Time Functions in SQL Server 2012
- SELECT (Transact-SQL)
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2023-12-01