SQL Server Convert Cursor to Set Based

Overview

The T-SQL programming language is a declarative, set based language. What does this mean? You issue a query for the results you want and the database engine sweats the details on how best to satisfy the request. Developers that are generally unfamiliar with declarative queries usually turn towards looping mechanisms, such as cursors, which retrieve data one row at a time resulting in query inefficiencies.

Explanation

Cursor logic is one of the common performance killers I come across when troubleshooting database issues. If a cursor is working with few rows, it may not ever manifest itself as an issue. Invariably, you will end up working with large sets of data and, at this point, you will likely see cursor logic begin to cause grief. I once had to troubleshoot a conversion script for a customer that worked with data from 3 related tables. The script processed the first table with a cursor, looked for related entries in a 2nd table which was also looped through with a cursor, followed by the 3rd table, also accessed using a cursor. Total number of rows to process was roughly 120,000. Total execution time was three days (I only know because I had a morbid curiosity to find out, so I let the script run to completion). I re-wrote the script querying the tables as a set (basically INNER JOINing the 3 tables) and I was able to cut the conversion down to about 1 minute. Needless to say, the customer was happy.

Let’s examine a simple example. The following stored procedure is used to get some information about an order in AdventureWorks. It uses a set based query to retrieve the results.

set statistics time on
go
 
create procedure GetOrderData @SalesOrderID int
as
begin
   set nocount on
 
   select OH.OrderDate, OH.DueDate, OH.ShipDate, D.ProductID, P.Name
   from Sales.SalesOrderHeader as OH
   inner join Sales.SalesOrderDetail as D on D.SalesOrderID = OH.SalesOrderID
   inner join Production.Product as P on P.ProductID = D.ProductID
   where OH.SalesOrderID = @SalesOrderID
end
go
 
exec GetOrderData @SalesOrderID = 51721
go

The resulting execution is extremely efficient returning all 72 rows for the resultset in about 5 milliseconds.

SQL Server Execution Times: CPU time = 0 ms,  elapsed time = 5 ms.

Now let’s re-write this procedure to retrieve the same results using a cursor to spin through the sales order detail rows and examine the difference.

set statistics time on
go
 
alter procedure GetOrderData @SalesOrderID int
as
begin
   set nocount on
 
   declare @OrderDate datetime
   declare @DueDate datetime
   declare @ShipDate datetime
   declare @ProductID int
   declare @Name nvarchar(50)
 
   select @OrderDate = OrderDate,
      @DueDate = DueDate,
      @ShipDate = ShipDate
   from Sales.SalesOrderHeader
   where SalesOrderID = @SalesOrderID
 
   declare DetailCursor
   cursor for
   (
      select ProductID
      from Sales.SalesOrderDetail
      where SalesOrderID = @SalesOrderID
   )
 
   open DetailCursor
   fetch DetailCursor into @ProductID
   while @@FETCH_STATUS = 0
   begin
      select @Name = Name
      from Production.Product
      where ProductID = @ProductID
 
      select @OrderDate as OrderDate,
         @DueDate as DueDate,
         @ShipDate as ShipDate,
         @ProductID as ProductID,
         @Name as Name
 
      fetch next from DetailCursor into @ProductID
   end
 
   close DetailCursor;
  deallocate DetailCursor;
end
go
 
exec GetOrderData @SalesOrderID = 51721
go

The cursor approach in this stored procedure takes longer to produce the same query result.

SQL Server Execution Times: CPU time = 15 ms,  elapsed time = 1886 ms.

While this is not an extreme case, you may find the cursor approach start to exponentially degrade as the number of rows to process grows. I generally avoid cursors in production code except under specialized circumstances.

Additional Information

Leave a Reply

Your email address will not be published. Required fields are marked *