Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Scripts          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

SQL Server Stored Procedure to get every Nth row of a Query Result Set


By:   |   Updated: 2017-12-13   |   Comments (7)   |   Related: More > Testing

Problem

The requirement is to create simple and dynamic T-SQL code in SQL Server that will get a query string and return every Nth row from the result set of that query.  For example, if N equals 2 then the procedure should return every even row (second, fourth, sixth, etc.).  Both the query and the N should be set dynamically in the T-SQL code.  Another requirement mainly for performance purposes is to avoid loops, complex OFFSET-FETCH statements or cursors.  How can we accomplish this?

Solution

The solution involves creating a T-SQL stored procedure in SQL Server that gets an input SQL query string and the Nth parameter then produces the expected result.

  1. The procedure gets a valid query @qry and @step parameters
  2. The procedure checks whether the objects called ##RESULT and table_sequence exists and if so it drops them.
  3. The procedure creates a global temporary table ##RESULT by constructing a SELECT INTO statement from the given query and executes it dynamically.
  4. The procedure adds the table_sequence column to the newly created temporary table as a not null identity column and by that forcing it to count all rows from 1 to the number of rows in the temporary table.
  5. The results of the procedure are all the rows where the modulus of the table_sequence column and the @step parameter equals zero.

Here is the T-SQL code for the stored procedure

-- =====================================================
-- Author:  Eli Leiba
-- Create date: 11-2017
-- Description: Get every nth row from a given query
-- =====================================================
CREATE PROC usp_PickEveryNthRow (@qry VARCHAR (2000), @step INT)
AS
BEGIN
   DECLARE @CreateGlobalTemporaryTBTSQL VARCHAR (2000)
   DECLARE @FromClausePosition INT

   -- If objects ##result and/or table_sequence exists in Database then drop them.
   IF Object_Id('##RESULT') IS NOT NULL
      DROP TABLE ##RESULT

   IF Object_Id('table_sequence') IS NOT NULL
      DROP sequence table_sequence

   SET @FromClausePosition = CHARINDEX ('FROM', @qry, 1)
   -- construct the global temporary table creation statement
   SET @CreateGlobalTemporaryTBTSQL = CONCAT (
         LEFT (@qry, @FromClausePosition - 1),
         '  INTO ##RESULT ',
         SUBSTRING (@qry, @FromClausePosition, LEN (@qry) - @FromClausePosition + 1)
         )

   EXEC (@CreateGlobalTemporaryTBTSQL)

   -- add the running sequence to the global temporary table
   ALTER TABLE ##RESULT ADD table_sequence INT IDENTITY NOT NULL

   -- pull every Nth row by using the modulus operator in the where clause
   SELECT *
   FROM ##RESULT
   WHERE table_sequence % @step = 0
   DROP TABLE ##RESULT
END
GO
   

Keep in mind the following with this code:

  • What are the use cases for using the stored procedure from an IT and business perspective?
    • The ability to pick every Nth row of a given result set can be used for statistical applications, partitioning purposes and more.
  • What are the limitations to using this stored procedure?
    • The limitations of using this procedure are that it always returns a running sequence in its result set.
    • It uses dynamic SQL execution that is checked only upon execution and usually the SQL plan is not in cache.
  • How does this stored procedure perform faster than a loop, cursor, etc.?
    • The procedure uses a single SELECT statement from a global temporary table filtering the rows by using the modulus operator with a SELECT INTO statement that does not use logging to fill the global temporary table.
    • These are simple SQL statements that by default are faster that using loops and cursors in T-SQL code.
  • What happens to the stored procedure if the input logic is wrong?
    • The input parameters for this procedure are a query string an integer numbered as the step N for each Nth row.
    • The query string that is supposed to be a valid query. If the string contains an invalid query then the procedure will fail in the dynamic execution statement.
    • The integer number should be less than the number of returned rows in order to return results.

Here are some examples for using the procedure 

Letís take a look at some examples from Microsoft's demonstration database - NORTHWIND.

Get every tenth row of the products table ordered by the product name

Exec dbo.usp_PickEveryNthRow 'SELECT productname from products order by 1', 10  
Product name                     table_sequence
===============================  ==============
Chef Anton's Cajun Seasoning     10 
Grandma's Boysenberry Spread     20 
Jack's New England Clam Chowder  30 
Mishi Kobe Niku                  40 
Queso Cabrales                   50 
Scottish Longbreads              60 
Tofu                             70

Get every fifth row of the products table for category 1 ordered by unit price

Exec dbo.usp_PickEveryNthRow 'SELECT productname,unitprice from products where categoryid = 1 order by 2',5

   

And the result is:

productname           unitprice     table_sequence 
============          =========     ============== 
Cate de Blaye           263.50       5 
Rhanbrau Klosterbier      7.75      10 
Next Steps
  • You can create this simple procedure in your application database and use it for similar functionality in your applications queries.
  • The input query is assumed to be a valid simple SELECT query and validity should be checked prior to executing this procedure.
  • The procedure uses a table called ##RESULT and a sequence called table_sequence, so don't use these names in your database or update the script.
  • The procedures were tested for SQL Server version:
    • Microsoft SQL Server 2014 - 12.0.2000.8 (Intel X86)
    • Feb 20 2014 19:20:46
    • Copyright (c) Microsoft Corporation
    • Standard Edition on Windows NT 6.1 <X86> (Build 7601: Service Pack 1)
  • The stored procedure should be compatible with SQL Server 2012 version and above.


Last Updated: 2017-12-13


get scripts

next tip button



About the author
MSSQLTips author Eli Leiba Eli Leiba is a senior application DBA, a teacher and a senior database consultant with 19 years of RDBMS experience.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Thursday, December 14, 2017 - 10:44:37 PM - Henry Stinson Back To Top

Clever  idea adding column to the temp table using IDENTITY int.   --- where default is (1,1)

I would think also one could use a CTE that selects just the primary key along with ROW_NUMBER( ) OVER (PARTITION BY  .. ORDER BY ..) As [RowNo]

and join that CTE into the SELECT statement where [RowNo] % @step = 0.

Of course, this might not be feasible if you just need to do it in a stored proc that accepts any sql query, but the CTE might just save loading up a lot of data into a temp table (or memory) if it is only the primary key that the CTE gets, and then the following SQL query gets a lot less data.

 


Thursday, December 14, 2017 - 2:46:52 AM - Eli Leiba Back To Top

 Yes

your note is correct -  input query should not have identity column

 because I am adding an extra IDENTITY column


Thursday, December 14, 2017 - 2:45:31 AM - Eli Leiba Back To Top

 Using the row_number() function can do the same action.

I wanted to keep the procedure simple so I have used an extra identity column.

 


Wednesday, December 13, 2017 - 8:26:47 AM - etsav Back To Top

 I made another proc, based on your, but with some specific differences (use of ROW_NUMBER(), ORDER BY compulsory, flag to obtain nth rows or <>nth rows).  Pls let me know if it can be of some interest for readers (in that case I can post it).

 

 


Wednesday, December 13, 2017 - 5:34:50 AM - etsav Back To Top

 I suspect that ORDER BY in input query is not respected during adding your column table_sequence.

 

Ettore.


Wednesday, December 13, 2017 - 5:04:01 AM - etsav Back To Top

I think it should be pointed out another limitation:  input query should not have identity column.

Error:

Msg 2744, Level 16, State 2, Procedure usp_PickEveryNthRow, Line 25

Multiple identity columns specified for table '##RESULT'. Only one identity column per table is allowed.

Ettore

 


Wednesday, December 13, 2017 - 3:47:40 AM - R Evans Back To Top

Would this perform better than using a CTE with the window function ROW_NUMBER()?

e.g.

;WITH cte AS (

SELECT productname,unitprice, ROW_NUMBER() OVER (ORDER BY unitprice) table_sequence

FROM products

WHERE categoryid = 1

)

SELECT productname,unitprice,table_sequence

FROM CTE

WHERE table_sequence % @step = 0;


Learn more about SQL Server tools