SQL Server Convert Cursor to Set Based
By: Armando Prato
The T-SQL programming language is a declarative, set based language. What does this mean? You issue a query for the results you want and the database engine sweats the details on how best to satisfy the request. Developers that are generally unfamiliar with declarative queries usually turn towards looping mechanisms, such as cursors, which retrieve data one row at a time resulting in query inefficiencies.
Cursor logic is one of the common performance killers I come across when troubleshooting database issues. If a cursor is working with few rows, it may not ever manifest itself as an issue. Invariably, you will end up working with large sets of data and, at this point, you will likely see cursor logic begin to cause grief. I once had to troubleshoot a conversion script for a customer that worked with data from 3 related tables. The script processed the first table with a cursor, looked for related entries in a 2nd table which was also looped through with a cursor, followed by the 3rd table, also accessed using a cursor. Total number of rows to process was roughly 120,000. Total execution time was three days (I only know because I had a morbid curiosity to find out, so I let the script run to completion). I re-wrote the script querying the tables as a set (basically INNER JOINing the 3 tables) and I was able to cut the conversion down to about 1 minute. Needless to say, the customer was happy.
Let's examine a simple example. The following stored procedure is used to get some information about an order in AdventureWorks. It uses a set based query to retrieve the results
set statistics time on go create procedure GetOrderData @SalesOrderID int as begin set nocount on select OH.OrderDate, OH.DueDate, OH.ShipDate, D.ProductID, P.Name from Sales.SalesOrderHeader as OH inner join Sales.SalesOrderDetail as D on D.SalesOrderID = OH.SalesOrderID inner join Production.Product as P on P.ProductID = D.ProductID where OH.SalesOrderID = @SalesOrderID end go exec GetOrderData @SalesOrderID = 51721 go
The resulting execution is extremely efficient returning all 72 rows for the resultset in about 5 milliseconds.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 5 ms.
Now let's re-write this procedure to retrieve the same results using a cursor to spin through the sales order detail rows and examine the difference.
set statistics time on go alter procedure GetOrderData @SalesOrderID int as begin set nocount on declare @OrderDate datetime declare @DueDate datetime declare @ShipDate datetime declare @ProductID int declare @Name nvarchar(50) select @OrderDate = OrderDate, @DueDate = DueDate, @ShipDate = ShipDate from Sales.SalesOrderHeader where SalesOrderID = @SalesOrderID declare DetailCursor cursor for ( select ProductID from Sales.SalesOrderDetail where SalesOrderID = @SalesOrderID ) open DetailCursor fetch DetailCursor into @ProductID while @@FETCH_STATUS = 0 begin select @Name = Name from Production.Product where ProductID = @ProductID select @OrderDate as OrderDate, @DueDate as DueDate, @ShipDate as ShipDate, @ProductID as ProductID, @Name as Name fetch next from DetailCursor into @ProductID end close DetailCursor; deallocate DetailCursor; end go exec GetOrderData @SalesOrderID = 51721 go
The cursor approach in this stored procedure takes longer to produce the same query result
SQL Server Execution Times: CPU time = 15 ms, elapsed time = 1886 ms.
While this is not an extreme case, you may find the cursor approach start to exponentially degrade as the number of rows to process grows. I generally avoid cursors in production code except under specialized circumstances.
Last Update: 9/10/2011