![]() |
|

Identify and resolve SQL Server problems before they happen
|
|
By: Tal Olier | Read Comments (2) | Related Tips: More > Query Optimization |
Problem
As far as data management and RDBMS rules are concerned, relational databases are the most adoptable way for maintaining data with a reasonable level of concurrency and while supporting data management activities such as backups, mass purges, changing data structures, etc.
One problem is the difference in programming languages in traditional applications. The SQL (Structured Query Language) language is a declarative language that became the "Data Language" used for describing "what I need" and "where to fetch it from" in most organizations. OOP (Object Oriented Programming) languages became the most common practice among developers widely adopted by R&D (Research and Development) organizations around the world. So how do we bridge the gap?
Solution
Those two trends created a need for a "bridge" that will cover the gap by translating request from object oriented programming languages to SQL. In most cases the DAL (Data Access Layer) is used to describe the mechanism (mainly proprietary) built to manage all this "data bridging task" in a centralized way.
Since database vendors (Microsoft, Oracle, IBM, etc.) offer a very large set of proprietary commands in their specific flavor of in SQL, translation in the DAL has to support many options. What ends up happening is the implementation details sometimes lose many of the performance optimizations built into the engine. This has been causing many of these DAL's to be implemented in a very straight forward way, that being break down the request to smaller pieces translating each to it's equivalent SQL statements and building the "SELECT... FROM... WHERE..." clause that will figuratively do the job.
Over the years I have been observing some DAL's design and implementation and from my experience a "machine written SQL statement" sometimes tend to be very long text statements. Just the other day the question popped up "What is the maximum text length of queries should we aim for?", so I checked the Maximum Capacity Specifications for SQL Server 2005 and found that the length of a string containing SQL statements (i.e. batch size) for both 32 and 64 bit systems is defined as 65,536 * Network packet size. The default network packet size is 4096, so the SQL statement text is limited for 256 MB.
I suspected that long text queries (with much less than 256 MB) will create some challenge for the server's CPU. So I thought this should be tested and published in this tip. So for the scope of this tip we are trying to address these items:
Test Table Characteristics
For my test I am going to create a table (called t1000) with 200,000 records. This table has many different data types because I think it is a reasonable representation of many production tables work with on a regular basis. The characteristics of this table include:
Script: Create Test Table
| create table t1000 ( c1 int not null constraint test_pk primary key, c2 varchar(10) not null, c3 char(1000), c4 int not null, c5 int not null, c6 int not null, c7 int not null, c8 int not null ) go |
Script: Populate Test Table
| set nocount on declare @i as int set @i = 0 while @i<200000 begin set @i = @i + 1 insert into t1000 (c1, c2, c3, c4, c5, c6, c7, c8) values (@i, cast (@i as varchar (10)), '...simulating additional 1k data...', @i, @i, @i, @i, @i) end set nocount off go |
Script: Creating Test Queries
Since I am planning to test some very long queries, I will produce them in an automatic way. My first thought was to just print some long text to the screen and then paste it to a new SQL Server Management Studio's query window. What I found out was that long queries (hundreds of KB) are a bit heavy for the Management Studio (especially when word wrap is on) so I turned to the next best thing - files.
Writing text files can be done in various programming languages but since we deal with SQL Server I'll demonstrate a T-SQL way I learned from Reading and Writing Files in SQL Server using T-SQL - a great article by Phil Factor. I will use the following stored procedure, originally published by Phil here with a few tweaks for my needs.
| create PROCEDURE spWriteStringToFile (@String Varchar(max), --8000 in SQL Server 2000 @Path VARCHAR(255), @Filename VARCHAR(100) ) AS DECLARE @objFileSystem int ,@objTextStream int, @objErrorObject int, @strErrorMessage Varchar(1000), @Command varchar(1000), @hr int, @fileAndPath varchar(80) set nocount on select @strErrorMessage='opening the File System Object' EXECUTE @hr = sp_OACreate 'Scripting.FileSystemObject' , @objFileSystem OUT Select @FileAndPath=@path+'\'+@filename if @HR=0 Select @objErrorObject=@objFileSystem , @strErrorMessage='Creating file "'+@FileAndPath+'"' if @HR=0 execute @hr = sp_OAMethod @objFileSystem , 'CreateTextFile' , @objTextStream OUT, @FileAndPath,2,True if @HR=0 Select @objErrorObject=@objTextStream, @strErrorMessage='writing to the file "'+@FileAndPath+'"' if @HR=0 execute @hr = sp_OAMethod @objTextStream, 'Write', Null, @String if @HR=0 Select @objErrorObject=@objTextStream, @strErrorMessage='closing the file "'+@FileAndPath+'"' if @HR=0 execute @hr = sp_OAMethod @objTextStream, 'Close' if @hr<>0 begin Declare @Source varchar(255), @Description Varchar(255), @Helpfile Varchar(255), @HelpID int EXECUTE sp_OAGetErrorInfo @objErrorObject, @source output,@Description output,@Helpfile output,@HelpID output Select @strErrorMessage='Error whilst ' +coalesce(@strErrorMessage,'doing something') +', '+coalesce(@Description,'') raiserror (@strErrorMessage,16,1) end EXECUTE sp_OADestroy @objTextStream EXECUTE sp_OADestroy @objTextStream GO |
Script: Enable OLE Automation
Since the stored procedure above uses OLE Automation you will need to enable it on your SQL Server because this option is turned off by default due to security concerns. It is still fun playing with it :). Turning on the Ole Automation is done with the following command:
| EXEC sp_configure 'Ole Automation Procedures', 1 RECONFIGURE WITH OVERRIDE GO |
In case your server does not have Ole Automation enabled, trying to run the above stored procedure will produce the following error:
| Msg 15281, Level 16, State 1, Procedure sp_OACreate, Line 1 SQL Server blocked access to procedure 'sys.sp_OACreate' of component 'Ole Automation Procedures' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ole Automation Procedures' by using sp_configure. For more information about enabling 'Ole Automation Procedures', see "Surface Area Configuration" in SQL Server Books Online. |
Script: What SQL statement should I use for my test?
I want to write a simple query that will return only one row but will have a very long WHERE clause. The following is an example query:
| select top 1 c1 from t1000 where c1 > 0 or (c5 = 1) or (c6 = 2) or (c7 = 3) or (c8 = 4) or (c4 = 5) or (c5 = 6) or (c6 = 7) or (c7 = 8) or (c8 = 9) or (c4 = 10) GO |
All that is left is to create a T-SQL block that will create a query similar to the one above:
| /* 01 */ set nocount on /* 02 */ declare @i as int /* 03 */ declare @sql_stmt as varchar(max) /* 04 */ declare @num_of_ors as int /* 05 */ set @num_of_ors = 5000 /* 06 */ set @i = 0 /* 07 */ set @sql_stmt = 'select top 1 c1 from t1000 where c1 > 0' /* 08 */ while @i<@num_of_ors /* 09 */ begin /* 10 */ set @i = @i + 1 /* 11 */ set @sql_stmt = @sql_stmt + /* 12 */ ' or (c' /* 13 */ + /* 14 */ cast /* 15 */ ( /* 16 */ @i%5+4 /* 17 */ as varchar(10) /* 18 */ ) /* 19 */ + /* 20 */ ' = ' /* 21 */ + /* 22 */ cast /* 23 */ ( /* 24 */ @i as varchar(10) /* 25 */ ) /* 26 */ + /* 27 */ ')' /* 28 */ end /* 29 */ set nocount off /* 30 */ execute spWriteStringToFile @sql_stmt, 'c:\temp\', 'query.sql' /* 31 */ print 'Done.' /* 32 */ go |
Here is an explanation of the code:
For each file I will also add the code which will use the correct database, the dbcc commands to flush the cache and SET STATISTICS to show the IO, CPU and execution plan of the query:
| use total_long_text_queries go dbcc dropcleanbuffers dbcc freeproccache go SET STATISTICS IO ON go SET STATISTICS TIME ON go SET STATISTICS PROFILE ON go |
Running the query above with the @num_of_ors set to 5000, 10000, 15000 and 20000 produces the following files:
Running The Queries
As I mentioned earlier loading such long queries can be a bit heavy for Management Studio. So during my testing I ran the queries with the Sqlcmd Utility (new and much enhanced version of osql) which is perfect for running queries entered at the command line or from a parameter file.
Running the test queries with sqlcmd utility is very easy. For testing purposes I used the command below to connect to the SQL Server instance, read from an input file and write output to an file.
| sqlcmd -S <server>\<instance> -i c:\temp\query<XX>000.txt -o c:\temp\queryresults<XX>000.txt |
Analyzing the Results
Now that we have run the queries from above, lets take a look at the result:
Based on the analysis, I would like to bring attention on two specific portions of the results:
These results show the following trends:
I have summarized the results into this Excel file; though the bottom line is included in the following table and graph:
| Size of Query (KB) | CPU Time (ms) | Elapsed Time (ms) |
| 145 | 5053 | 5053 |
| 292 | 19875 | 19944 |
| 448 | 45625 | 45657 |
| 604 | 83829 | 83893 |
The graphic below outlines the size of the query and the CPU time in milliseconds for the three tests conducted separated by a comma. Based on this information, I believe we have found out that lengthy SQL Server queries will consume your CPU.

Next Steps
| Friday, May 16, 2008 - 2:22:26 PM - TroyK | Read The Tip |
|
Hi Tal; Interesting experiment, but I think the effect you're seeing is more a function of the number of connectives you have in the predicate than the actual length of the query string in bytes. Try rerunning your tests, but instead of increasing the query size by adding "OR"s, make your predicate like this: SELECT c1 Then, vary the length of your query by increasing the length of the two strings after the OR, keeping them distinct so that they don't inform the final result. TroyK |
|
| Saturday, May 17, 2008 - 1:08:48 PM - tal.olier | Read The Tip |
|
[quote user="TroyK"] Hi Tal; Interesting experiment, but I think the effect you're seeing is more a function of the number of connectives you have in the predicate than the actual length of the query string in bytes. Try rerunning your tests, but instead of increasing the query size by adding "OR"s, make your predicate like this: SELECT c1 Then, vary the length of your query by increasing the length of the two strings after the OR, keeping them distinct so that they don't inform the final result. TroyK [/quote]
Hello Troy, First, Yes you are correct, Second , as I wrote in my article, this is a true story – I found myself in a design review meeting when the question popped up: "What is the maximum text length of queries should we aim for?". You can assume that in the practical world where an automatic DAL (Data Access Layer) writes your query for you the lengthy query are results of a lot of conditions (poor design) in the criteria clause and not one that is actually long.
Anyway, just to prove your point for you I wrote the following script that created the test you asked for: use otal_long_text_queries go
drop table t1001 go
create table t1001 ( c1 int not null, c2 varchar(max) ) go
alter table t1001 add constraint t1001_pk primary key (c1) go
set nocount on declare @i as int declare @txt as varchar(max) declare @sql_stmt as varchar(max) set @i = 0 set @txt = '' while @i<1000 begin set @i = @i + 1 set @txt = @txt + '1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890' end insert into t1001 (c1, c2) values (1, @txt) set @sql_stmt = 'select c1 from t1001 where c2 = '''+@txt+'''' execute spWriteStringToFile @sql_stmt, 'c:\temp\', 'query.sql' set nocount off go
select len(c2)/1024 KB from t1001 go
Then I ran the script via: sqlcmd -S <server> -i c:\temp\query.sql -o c:\temp\query.txt
The result was less than 1 second of running for more 2000KB query file. So yes you are correct.
In case you are interested with actual results flies, drop me an e-mail.
--Tal Olier (tal.olier@gmail.com).
|
|
|
privacy | disclaimer | copyright | advertise | about authors | contribute | feedback | giveaways | user groups Some names and products listed are the registered trademarks of their respective owners. Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com |