In this part of the tutorial, we’ll look at how window functions work. What is their syntax and what are those “windows” exactly?
The backbone of every window function is the OVER clause, which has the following structure:
window function OVER ( [PARTITION BY expression] [ORDER BY expression] [ROWS or RANGE clause] )
Depending on which function is used, some clauses are optional, mandatory or not allowed. Some functions, like PERCENTILE_DISC have even more clauses. We’ll explore all of the options in later parts of this tutorial.
The PARTITION BY clause defines the segment of rows over which a window function will operate. You can think of it that it does a grouping of your rows, like the GROUP BY clause, but without actually grouping them. The ORDER BY clause sorts the rows inside the segment defined by the PARTITION BY clause. If you use the ROWS or RANGE clause (named frame extents), these limit the number of rows within the segment over which the function will be applied.
Let’s illustrate with an example. The following query sums the current and the previous row together:
WITH [CTE_source] AS ( SELECT [Month] = YEAR([OrderDate]) * 100 + MONTH([OrderDate]) ,[SalesTerritoryKey] ,[SalesAmount] = SUM([SalesAmount]) FROM [dbo].[FactResellerSales] WHERE YEAR([OrderDate]) = 2013 AND MONTH([OrderDate]) <= 6 GROUP BY YEAR([OrderDate]) * 100 + MONTH([OrderDate]) ,[SalesTerritoryKey] ) SELECT [Month] ,[SalesTerritoryKey] ,[SalesAmount] ,[WindowFunction] = SUM([SalesAmount]) OVER (PARTITION BY [SalesTerritoryKey] ORDER BY [Month] ROWS BETWEEN 1 PRECEDING AND CURRENT ROW ) FROM [CTE_source] ORDER BY [SalesTerritoryKey],[Month];
Let’s analyze the result set:
In the windows defined by the PARTITION BY, SQL Server will calculate the sum of the previous row and the current row for each record of the window.
A T-SQL query has a logical order in which all of the clauses are processed:
Window functions are part of the SELECT or ORDER BY clause, meaning they are calculated in step 5 OR 6 of the process. In other words, rows can already be filtered out by the WHERE clause or the HAVING clause, or grouped by the GROUP BY clause.
Keep in mind that sometimes you have the same function appear twice in the same expression: once when used in conjunction with a GROUP BY (step 3) and once as a window function (step 5). For example, in the following query we calculate the grand total of the sales amount per year:
SELECT [Year] = YEAR([OrderDate]) ,[Sales Amount] = SUM([SalesAmount]) ,GrandTotal = SUM(SUM([SalesAmount])) OVER() FROM [dbo].[FactResellerSales] GROUP BY YEAR([OrderDate]);