SQL Server 2005 and 2008 Ranking Functions DENSE_RANK and NTILE
SQL Server introduced four new ranking functions either to rank records in a result-set or to rank records within groups of records of a result-set. With these new functions, we are no longer required to write several lines of code to get ranking data. It does not only help in simplifying the query but also improves the performance of the query. So now the questions are: what are these ranking functions, how they work and how they differ from each others?
In my last tip (Ranking Functions in SQL Server 2005 and 2008 - Part 1) of this series I discussed the ROW_NUMBER and RANK ranking functions. In this tip, I am going to discuss the DENSE_RANK and NTILE ranking functions, how they work and how they differ from each other.
As you have seen in the last example of my last tip on this series, the RANK function does not give the consecutive ranking number, so what if you have a requirement to have ranks but no skipping in between? In other words, you want to have consecutive ranking within the result-set or within partitions (groups of records). For this kind of scenario you can use DENSE_RANK function, which has the same syntax as of RANK function but unlike the RANK function it gives consecutive ranking. So next record will get a new rank which would be (previous rank + 1). All the examples in this tip will run on AdventureWorks database.
DENSE_RANK() OVER ( [PARTITION BY partition_value_expression , ... [ n ]] ORDER BY order_value_expression , ... [ n ])
Script #4 - DENSE_RANK Function
--This script assigns a consecutive rank to each row of result-set which --is ordered by Title column. If two or more records happen to have --same value for Title Column they will get the same rank SELECT DENSE_RANK() OVER ( ORDER BY Title) AS [RecordRank], LoginID, ManagerID, Title, BirthDate, MaritalStatus, Gender FROM HumanResources.Employee
--This script assign consecutive rank to each row of a partition --(group of records) which is partitioned by the Gender column value and --which is ordered by the Title column. If two or more records happen to --have same value for Title Column in the partition they will get the same rank SELECT DENSE_RANK() OVER ( PARTITION BY Gender ORDER BY TITLE) AS [RecordRank], LoginID, ManagerID, Title, BirthDate, MaritalStatus, Gender FROM HumanResources.Employee
The NTILE function differs from other ranking functions in terms of how it breaks the result-set or partitions (groups of records) into specified number of sub-partitions (sub-groups of records). For each row in the sub partition (sub group) it returns the number of the sub partition (sub group) in which the row falls. If the NTILE can not divide the rows evenly to the specified number of sub groups, then larger sub groups will come before smaller groups. For example, lets assume you have 11 records in a table and you use NTILE with parameter value of 2, then NTILE will divide rows into two sub-groups. The first sub group will have 6 records whereas the second sub group will have 5 records only.
The syntax of NTILE also differs in terms of how it accepts a positive integer (of type int or bigint) constant expression that specifies the number of sub partitions (sub groups) into which each partition (group of records) must be divided.
NTILE (<integer_constant_expression>) OVER ( [PARTITION BY partition_value_expression , ... [ n ]] ORDER BY order_value_expression , ... [ n ])
Script #5 - NTILE Function
--This script divides the result-set which is considered --to be a single partition into 50 sub partitions --Because this table has 290 records, first 40 sub partitions --will have 6 records whereas the last 10 sub partitions will --have 5 records each, hence total = (40*6) + (10*5) SELECT NTILE(50) OVER ( ORDER BY Title) AS [SubPartition#], LoginID, ManagerID, Title, BirthDate, MaritalStatus, Gender FROM HumanResources.Employee
--This script divides each partition, which has been partitioned --on Gender column, into 50 sub partitions --Because the first partition for "F" has 84 records, first 34 sub partitions --will have 2 records whereas the last 16 sub partitions will have 1 records each, --hence total = (34*2) + (16*1), the same rule applies for other partitions as well SELECT NTILE(50) OVER ( PARTITION BY Gender ORDER BY TITLE) AS [SubPartition#], LoginID, ManagerID, Title, BirthDate, MaritalStatus, Gender FROM HumanResources.Employee
Common Table Expression (CTE)
With Common Table Expression you basically create a derived table which is available within the scope of a single, SELECT, INSERT, UPDATE, DELETE or MERGE statement. It helps you to simplify the process of achieving the same goal for which you had to write complex/recursive queries in prior versions of SQL Server. Many times you might be required to use ranking functions along with a CTE. For example you can combine these two commands to delete duplicate records from a table with just one query. Refer "Different strategies for removing duplicate records in SQL Server" for more details. For details about Common Table Expressions on msdn, click here.
- Review Ranking Functions in SQL Server 2005 and 2008 - Part 1 tip of this series for ROW_NUMBER and RANK functions.
- Review Ranking Functions on msdn.
- Check out these related tips
About the author
View all my tips