Learn more about SQL Server tools



solving sql server problems for millions of dbas and developers since 2006 attend our next webcast













































   Got a SQL tip?
            We want to know!

Lengthy SQL Server Queries Will Consume Your CPU

MSSQLTips author Tal Olier By:   |   Read Comments (2)   |   Related Tips: More > Query Optimization

Problem
As far as data management and RDBMS rules are concerned, relational databases are the most adoptable way for maintaining data with a reasonable level of concurrency and while supporting data management activities such as backups, mass purges, changing data structures, etc.

One problem is the difference in programming languages in traditional applications.  The SQL (Structured Query Language) language is a declarative language that became the "Data Language" used for describing "what I need" and "where to fetch it from" in most organizations.  OOP (Object Oriented Programming) languages became the most common practice among developers widely adopted by R&D (Research and Development) organizations around the world.  So how do we bridge the gap?

Solution
Those two trends created a need for a "bridge" that will cover the gap by translating request from object oriented programming languages to SQL.   In most cases the DAL (Data Access Layer) is used to describe the mechanism (mainly proprietary) built to manage all this "data bridging task" in a centralized way.

Since database vendors (Microsoft, Oracle, IBM, etc.) offer a very large set of proprietary commands in their specific flavor of in SQL, translation in the DAL has to support many options.  What ends up happening is the implementation details sometimes lose many of the performance optimizations built into the engine.  This has been causing many of these DAL's to be implemented in a very straight forward way, that being break down the request to smaller pieces translating each to it's equivalent SQL statements and building the "SELECT... FROM... WHERE..." clause that will figuratively do the job.

Over the years I have been observing some DAL's design and implementation and from my experience a "machine written SQL statement" sometimes tend to be very long text statements.  Just the other day the question popped up "What is the maximum text length of queries should we aim for?", so I checked the Maximum Capacity Specifications for SQL Server 2005 and found that the length of a string containing SQL statements (i.e. batch size) for both 32 and 64 bit systems is defined as 65,536 * Network packet size. The default network packet size is 4096, so the SQL statement text is limited for 256 MB.

I suspected that long text queries (with much less than 256 MB) will create some challenge for the server's CPU.  So I thought this should be tested and published in this tip. So for the scope of this tip we are trying to address these items:

  • Proving that long text queries will consume your CPU
  • Give some sense regarding the actual penalty expected on a medium sized server
    • Dual core CPU with 2GB of RAM and 4 x 10,000 RPM disks

Test Table Characteristics

For my test I am going to create a table (called t1000) with 200,000 records. This table has many different data types because I think it is a reasonable representation of many production tables work with on a regular basis. The characteristics of this table include:

  • A single integer column as a primary key (clustered by default).
  • A varchar column.
  • A char column simulating additional 1 KB of data.
  • Five integer columns that will be used for creating the long text query's in the WHERE clause.

Script: Create Test Table

create table t1000 (
c1 int not null constraint test_pk primary key,
c2 varchar(10) not null,
c3 char(1000),
c4 int not null,
c5 int not null,
c6 int not null,
c7 int not null,
c8 int not null
)
go

Script: Populate Test Table

set nocount on

declare @i as int

set @i = 0

while @i<200000

begin

set @i = @i + 1

insert into t1000 (c1, c2, c3, c4, c5, c6, c7, c8)
values (@i,
cast (@i as varchar (10)),
'...simulating additional 1k data...',
@i, @i, @i, @i, @i)

end

set nocount off
go

Script: Creating Test Queries

Since I am planning to test some very long queries, I will produce them in an automatic way. My first thought was to just print some long text to the screen and then paste it to a new SQL Server Management Studio's query window. What I found out was that long queries (hundreds of KB) are a bit heavy for the Management Studio (especially when word wrap is on) so I turned to the next best thing - files.

Writing text files can be done in various programming languages but since we deal with SQL Server I'll demonstrate a T-SQL way I learned from Reading and Writing Files in SQL Server using T-SQL - a great article by Phil Factor.  I will use the following stored procedure, originally published by Phil here with a few tweaks for my needs.

create PROCEDURE spWriteStringToFile
(@String Varchar(max), --8000 in SQL Server 2000
@Path VARCHAR(255),
@Filename VARCHAR(100)
)
AS

DECLARE @objFileSystem int
,@objTextStream int,
@objErrorObject int,
@strErrorMessage Varchar(1000),
@Command varchar(1000),
@hr int,
@fileAndPath varchar(80)

set nocount on

select @strErrorMessage='opening the File System Object'

EXECUTE @hr = sp_OACreate 'Scripting.FileSystemObject' , @objFileSystem OUT

Select @FileAndPath=@path+'\'+@filename

if @HR=0 Select @objErrorObject=@objFileSystem , @strErrorMessage='Creating file "'+@FileAndPath+'"'

if @HR=0 execute @hr = sp_OAMethod @objFileSystem , 'CreateTextFile'

, @objTextStream OUT, @FileAndPath,2,True

if @HR=0 Select @objErrorObject=@objTextStream,

@strErrorMessage='writing to the file "'+@FileAndPath+'"'

if @HR=0 execute @hr = sp_OAMethod @objTextStream, 'Write', Null, @String

if @HR=0 Select @objErrorObject=@objTextStream, @strErrorMessage='closing the file "'+@FileAndPath+'"'

if @HR=0 execute @hr = sp_OAMethod @objTextStream, 'Close'

if @hr<>0

begin

Declare

@Source varchar(255),

@Description Varchar(255),

@Helpfile Varchar(255),

@HelpID int

EXECUTE sp_OAGetErrorInfo @objErrorObject,

@source output,@Description output,@Helpfile output,@HelpID output

Select @strErrorMessage='Error whilst '

+coalesce(@strErrorMessage,'doing something')

+', '+coalesce(@Description,'')

raiserror (@strErrorMessage,16,1)

end

EXECUTE sp_OADestroy @objTextStream

EXECUTE sp_OADestroy @objTextStream
GO

Script: Enable OLE Automation

Since the stored procedure above uses OLE Automation you will need to enable it on your SQL Server because this option is turned off by default due to security concerns.  It is still fun playing with it :). Turning on the Ole Automation is done with the following command:

EXEC sp_configure 'Ole Automation Procedures', 1

RECONFIGURE WITH OVERRIDE

GO

In case your server does not have Ole Automation enabled, trying to run the above stored procedure will produce the following error:

Msg 15281, Level 16, State 1, Procedure sp_OACreate, Line 1

SQL Server blocked access to procedure 'sys.sp_OACreate' of component 'Ole Automation Procedures' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ole Automation Procedures' by using sp_configure. For more information about enabling 'Ole Automation Procedures', see "Surface Area Configuration" in SQL Server Books Online.

Script: What SQL statement should I use for my test?

I want to write a simple query that will return only one row but will have a very long WHERE clause.  The following is an example query:

select top 1 c1
from t1000
where c1 > 0
or (c5 = 1)
or (c6 = 2)
or (c7 = 3)
or (c8 = 4)
or (c4 = 5)
or (c5 = 6)
or (c6 = 7)
or (c7 = 8)
or (c8 = 9)
or (c4 = 10)
GO

All that is left is to create a T-SQL block that will create a query similar to the one above:

/* 01 */ set nocount on

/* 02 */ declare @i as int

/* 03 */ declare @sql_stmt as varchar(max)

/* 04 */ declare @num_of_ors as int

/* 05 */ set @num_of_ors = 5000

/* 06 */ set @i = 0

/* 07 */ set @sql_stmt = 'select top 1 c1 from t1000 where c1 > 0'

/* 08 */ while @i<@num_of_ors

/* 09 */ begin

/* 10 */ set @i = @i + 1

/* 11 */ set @sql_stmt = @sql_stmt +

/* 12 */ ' or (c'

/* 13 */ +

/* 14 */ cast

/* 15 */ (

/* 16 */ @i%5+4

/* 17 */ as varchar(10)

/* 18 */ )

/* 19 */ +

/* 20 */ ' = '

/* 21 */ +

/* 22 */ cast

/* 23 */ (

/* 24 */ @i as varchar(10)

/* 25 */ )

/* 26 */ +

/* 27 */ ')'

/* 28 */ end

/* 29 */ set nocount off

/* 30 */ execute spWriteStringToFile @sql_stmt, 'c:\temp\', 'query.sql'

/* 31 */ print 'Done.'

/* 32 */ go

Here is an explanation of the code:

  • Lines 01 to 06 are initiating variables, line 4 holds the number of "OR"s that the where clause will include.
  • Line 07 is the beginning of the SQL statement that is going to be produced.
  • Lines 08 - 28 holds a while loop that creates the where clause.  Line 30 is using the previously created stored procedure (spWriteStringToFile) to write the query to a file.

For each file I will also add the code which will use the correct database, the dbcc commands to flush the cache and SET STATISTICS to show the IO, CPU and execution plan of the query:

use total_long_text_queries
go

dbcc dropcleanbuffers

dbcc freeproccache
go

SET STATISTICS IO ON
go

SET STATISTICS TIME ON
go

SET STATISTICS PROFILE ON
go

Running the query above with the @num_of_ors set to 5000, 10000, 15000 and 20000 produces the following files:

Running The Queries

As I mentioned earlier loading such long queries can be a bit heavy for Management Studio.  So during my testing I ran the queries with the Sqlcmd Utility (new and much enhanced version of osql) which is perfect for running queries entered at the command line or from a parameter file.

Running the test queries with sqlcmd utility is very easy.  For testing purposes I used the command below to connect to the SQL Server instance, read from an input file and write output to an file.

sqlcmd -S <server>\<instance> -i c:\temp\query<XX>000.txt -o c:\temp\queryresults<XX>000.txt

Analyzing the Results

Now that we have run the queries from above, lets take a look at the result:

Based on the analysis, I would like to bring attention on two specific portions of the results:

  • First, review the section outlining how much I/O was performed by the query
    • As an example in the queryresults20000.txt file:
      • Table 't1000'. Scan count 1, logical reads 3, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0
  • Second, review the section outlining how much time was wasted on parsing the query
    • As an example in the queryresults20000.txt file:
      • SQL Server parse and compile time: CPU time = 83829 ms, elapsed time = 83893 ms

These results show the following trends:

  • Running the query required almost no I/O resources.
  • Most of the time was "parse and compile time".
  • "parse and compile time" is mostly composed of CPU time.
  • The CPU time consumed raises as the query's text gets longer.

I have summarized the results into this Excel file; though the bottom line is included in the following table and graph:

Size of Query (KB) CPU Time (ms) Elapsed Time (ms)
145 5053 5053
292 19875 19944
448 45625 45657
604 83829 83893

The graphic below outlines the size of the query and the CPU time in milliseconds for the three tests conducted separated by a comma.  Based on this information, I believe we have found out that lengthy SQL Server queries will consume your CPU.

Next Steps

  • When designing a DAL (Data Access Layer) or using one in your applications, you should take the following items into consideration:
    • Long SQL statements require a large amount of CPU to processes regardless of the actual record amount being fetched.
    • DAL architecture should address functionality challenges in addition to performance challenges including the length of the SQL query statement created. A good rule of thumb can be to make sure that all queries larger than 50KB are thoroughly tested to make sure they are not creating a performance issue.
    • DAL implementers might want to consider including a fuse for limiting the size of queries.
  • You might want to get more familiar with sqlcmd utility as this can be a key tool for testing, maintenance and general usage.


Last Update: 5/16/2008


About the author
MSSQLTips author Tal Olier
Tal Olier is a database expert currently working for HP holding various positions in IT and R&D departments.

View all my tips


print tip Print  
Become a paid author





join MSSQLTips for free SQL Server tips     



Learn more about SQL Server tools
Post a comment or let the author know this tip helped you.

       All comments are reviewed, so stay on subject or we may delete your comment.

*Name   *Email Notify for updates



       Note: your email address is not published. Required fields are marked with an asterisk (*)


Get free SQL tips:

*Enter Code refresh code     



Saturday, May 17, 2008 - 1:08:48 PM - tal.olier Read The Tip

[quote user="TroyK"]

Hi Tal;

Interesting experiment, but I think the effect you're seeing is more a function of the number of connectives you have in the predicate than the actual length of the query string in bytes.

Try rerunning your tests, but instead of increasing the query size by adding "OR"s, make your predicate like this:

SELECT c1
FROM t1000
WHERE c1 = 5555
OR
'<ABCD Short String>'
=
'<WXYZ Short String>'

Then, vary the length of your query by increasing the length of the two strings after the OR, keeping them distinct so that they don't inform the final result.

TroyK

[/quote] 

 

 

Hello Troy,

First, Yes you are correct,

Second , as I wrote in my article, this is a true story I found myself in a design review meeting when the question popped up: "What is the maximum text length of queries should we aim for?". You can assume that in the practical world where an automatic DAL (Data Access Layer) writes your query for you the lengthy query are results of a lot of conditions (poor design)  in the criteria clause and not one that is actually long.

 

Anyway, just to prove your point for you I wrote the following script that created the test you asked for:

use otal_long_text_queries

go

 

drop table t1001

go

 

create table t1001

(

      c1 int not null,

      c2 varchar(max)

)

go

 

alter table t1001 add constraint t1001_pk primary key (c1)

go

 

set nocount on

declare @i as int

declare @txt as varchar(max)

declare @sql_stmt as varchar(max)

set @i = 0

set @txt = ''

while @i<1000

begin

      set @i = @i + 1

      set @txt = @txt + '1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890'

end

insert into t1001 (c1, c2) values (1, @txt)

set @sql_stmt = 'select c1 from t1001 where c2 = '''+@txt+''''

execute spWriteStringToFile @sql_stmt, 'c:\temp\', 'query.sql'

set nocount off

go

 

select len(c2)/1024 KB from t1001

go

 

Then I ran the script via:

sqlcmd -S <server> -i c:\temp\query.sql -o c:\temp\query.txt

 

The result was less than 1 second of running for more 2000KB query file. So yes you are correct.

 

In case you are interested with actual results flies, drop me an e-mail.

 

--Tal Olier (tal.olier@gmail.com).

 

 


Friday, May 16, 2008 - 2:22:26 PM - TroyK Read The Tip

Hi Tal;

Interesting experiment, but I think the effect you're seeing is more a function of the number of connectives you have in the predicate than the actual length of the query string in bytes.

Try rerunning your tests, but instead of increasing the query size by adding "OR"s, make your predicate like this:

SELECT c1
FROM t1000
WHERE c1 = 5555
OR
'<ABCD Short String>'
=
'<WXYZ Short String>'

Then, vary the length of your query by increasing the length of the two strings after the OR, keeping them distinct so that they don't inform the final result.

TroyK




 
Sponsor Information







Copyright (c) 2006-2014 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | free t-shirt | user groups | community | events | first timer?
Some names and products listed are the registered trademarks of their respective owners.