Learn more about SQL Server tools

mssqltips logo
giveaway
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Convert Implicit and the related performance issues with SQL Server


By:   |   Read Comments (17)   |   Related Tips: More > Performance Tuning

Attend a SQL Server Conference for FREE >> click to learn more


Problem
I was a running a routine query using an equal operator on the only column of the primary key for a table and I noticed that the performance was terrible.  These queries should have been flying because all I was doing was retrieving one row of data which should have been doing an index seek.  When I looked at the query plan it was doing a scan instead.  This tip shows you what I found and how to resolve the problem.

Solution
The problem I was facing can be seen by doing a similar query in the SQL Server 2005 AdventureWorks database on the HumanResources.Employee table.  To help us understand what SQL Server is doing when we run these queries, let's ask for IO statistics and also use the SSMS menu command Query\Include Actual Execution Plan. 

To use the AdventureWorks database and turn on IO statistics, start with this query:

use AdventureWorks
go
SET STATISTICS IO ON
go

Here's a query on the Employee table which is similar to the ones that caused me so much difficulty:

SELECT EmployeeID, NationalIDNumber, LoginID
FROM HumanResources.Employee
WHERE NationalIDNumber = 112457891
go

It seems pretty innocuous.  The HumanResources.Employee table has an index that begins with NationalIDNumber so executing the query should just be a matter of seeking to the location of 112457891 and then doing a lookup to get the table row.  But the statistics and the query plan show otherwise.  Here are the messages:

 EmployeeID NationalIDNumber LoginID
----------- ---------------- ----------------------
          4 112457891        adventure-works\rob0
(1 row(s) affected)
Table 'Employee'. Scan count 1, logical reads 6, 
physical reads 0, read-ahead reads 0, 
lob logical reads 0, lob physical reads 0, 
lob read-ahead reads 0.

(1 row(s) affected)

The statistics show 1 scan and that's the problem.  Adventureworks.HumanResources.Employee only has 291 rows, so this probably ran really fast and doesn't seem like a problem.  The table that I was working on had millions of rows and the table scan was a killer taking several seconds for each query.. 

Since the NationalIDNumber column is at the start of an index and the only column for this index, why was there a scan and not a seek?  The query plan below tells us why.  Here's the overall plan where you can see the index scan:

 The tool tip for the index scan gives the details that make all the difference, as shown below.

The red arrow points to the problem.  The function call CONVERT_IMPLICIT(int, [AdventureWorks].[HumanResources].[Employee].[NationalIDNumber, 0) is modifying the NationalIDNumber column before it is compared to the integer constant 1124579811 which we are passing into this query.  If you look at the table definition you'll see that NationalIDNumber is declared to be nvarchar(15).   Once there is a function, even an implicit one as we see here, used on the column SQL Server can't use the index on NationalDNumber and it falls back on a scan. 

Changing the query to compare to a string constant and the problem goes away:

SELECT EmployeeID, NationalIDNumber, LoginID
FROM HumanResources.Employee
WHERE NationalIDNumber = '112457891'
go

The messages now shows that there are zero scans, which is what we want.  In this case the difference in logical reads is only 2, that's because the Employee table is so small.  Working with a million row table, the difference in logical reads grows into the thousands.

 EmployeeID NationalIDNumber LoginID
----------- ---------------- ----------------------
          4 112457891        adventure-works\rob0
(1 row(s) affected)

Table 'Employee'. Scan count 0, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 

(1 row(s) affected)

The query plan shows the seek and the key lookup which is what we would have expected to occur.

This problem is common where a character string has been used to store a numeric key.  SQL Server will dutifully perform the implicit conversion and return the correct result, but at the cost of poor performance by doing a scan instead of a seek and using the index correctly.

Next Steps

  • Always be alert for implicit conversions, particularly when there are character strings storing numeric keys. 
  • I've even seen this problem when varchar columns are compared to nvarchar columns. 
  • Fixing the problem is straight forward, just make sure you are performing the comparison on like data types.


Last Update:


signup button

next tip button



About the author
MSSQLTips author Andy Novick Andy Novick is a SQL Server Developer in the Boston area with 25 years of database and application development experience.

View all my tips
Related Resources





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


SQL tips:

*Enter Code refresh code     



Monday, May 11, 2009 - 9:41:24 AM - mardukes Back To Top

Much to the contrary, you missed the point.

The point is not left side / right side but implicit conversion.  This is a technical topic and being "technically correct" is the requirement.  In terms of "proper practices" you've under-complicated the query and are doing a disservice to yourself and those that follow you in supporting the system.


Monday, May 11, 2009 - 9:12:57 AM - edcarden Back To Top

[quote user="mardukes"]

...and t.PubDate BETWEEN CONVERT(datetime, '19910101', 112) AND CONVERT(datetime, '19920101', 112)

Note that I changed it to Jan 1, 1992 because what you wrote would miss anything on the last day of 1991 -- time of 00:00:000000.

Now anyone maintaining this code knows the stored data is a datetime and not a string.

[/quote]

 While you are technically correct with the date range, you missed the point.  But if you want to go that route then..

I have re-rewritten your query because it will include any data with a date in January 1st, 1992 and the query should return only that data that is within 1991.

SELECT T.title, T.PubDate, T.ytd_sales
FROM dbo.TITLES T
WHERE 1 = 1
  AND T.PubDate >= '1991-01-01'
  AND T.PubDate < '1992-01-01'

Now that we have that out of the way..

 While the explicit conversion does make it very clear what the query is looking for, comparison of a date & time value verses a string, it isn't necessary to do this because the column being searched (PubDate) is of DATETIME and so the query engine knows the value listed is a date and not a string.  You can perform the explcit conversion if you want to and there are times most certainly when a query should do this instead of using implict conversion logic but when working with datetime data types your just over complicating your query by doing this. 


Thursday, April 30, 2009 - 2:24:21 PM - Micah Stockton Back To Top

 Wow, you sure know your sql.


Thursday, April 30, 2009 - 2:11:52 PM - mardukes Back To Top

...and t.PubDate BETWEEN CONVERT(datetime, '19910101', 112) AND CONVERT(datetime, '19920101', 112)

Note that I changed it to Jan 1, 1992 because what you wrote would miss anything on the last day of 1991 -- time of 00:00:000000.

Now anyone maintaining this code knows the stored data is a datetime and not a string.


Thursday, April 30, 2009 - 11:17:26 AM - edcarden Back To Top

[quote user="mardukes"]

The analogy being the Microsoft magic.  Smart quotes "does you a favor" and ends up hurting; implicit converts "does you a favor" and ends up hurting.  Likewise, a scripter shouldn't use Word nor allow the mystery of not EXPLICITLY casting/converting values -- let alone not KNOW that it's happening.

And, yes, I do blame the tool -- TSQL is not some amatuer or web scripting tool.  It should error out for mismatched data types.  Nothing drives me more nuts than having people believe that "4/29/2009" is a date.  (FYI, it's a string!)  A misbelief fostered by implicit conversion.

[/quote]

Mardukes -

I'm curious as to how you think a date should be displayed/written/shown within a T-SQL statement if not within a set of quotes?  How would you re-write the below so that the date does not include quotes?

SELECT T.title, T.PubDate, T.ytd_sales
FROM dbo.TITLES T
WHERE 1 = 1
  AND T.PubDate Between '1991-01-01' AND '1991-12-31'


Wednesday, April 29, 2009 - 3:15:44 PM - ray.herring Back To Top

Well, we can agree to disagree then. "Smart Quotes" are a feature in a Word Processor for document processing purposes.  If someone chooses to use the chisel as a drill then that is not the chisel's fault.  Implicit conversion is and always has been part of T-SQL.

Your position seems to be that comparing strings of different lengths or BigInts to Ints or Floats to Ints or Varchar to Char should yield an exception since Implicit Conversion is used in all of these cases.  That extention of strongly typed data rules would probably break more code than requiring "Option Strict" in every VB program :)

Implicit data conversions, coersions, etc. have been and will continue to be, part of the specifications for many programming environments.The strongly typed languages like Pascal quickly lost following, as strongly typed as C++, etc are there are execptions and folks still manange to create interesting bugs.  Free for all languages like Perl, C, and Basic have issues also.

So far no one has found the right way.  By which I mean that programmers can and will abuse any feature, cross any line, ignore any warning, blah blah.  I no longer argue with them about how "efficient" their coding practices are, particularly associated with database access.

 


Wednesday, April 29, 2009 - 10:39:20 AM - mardukes Back To Top

The analogy being the Microsoft magic.  Smart quotes "does you a favor" and ends up hurting; implicit converts "does you a favor" and ends up hurting.  Likewise, a scripter shouldn't use Word nor allow the mystery of not EXPLICITLY casting/converting values -- let alone not KNOW that it's happening.

And, yes, I do blame the tool -- TSQL is not some amatuer or web scripting tool.  It should error out for mismatched data types.  Nothing drives me more nuts than having people believe that "4/29/2009" is a date.  (FYI, it's a string!)  A misbelief fostered by implicit conversion.


Wednesday, April 29, 2009 - 10:09:04 AM - ray.herring Back To Top

MarDukes, You imply you are using MSWord as your SQL editor.  I am sure you are joking aren't you?


Wednesday, April 29, 2009 - 9:50:52 AM - imSQrLy Back To Top

 I understand now about what side of the =. That is why i do not have the issue in my tests. Thanks for the clarification.


Wednesday, April 29, 2009 - 9:27:21 AM - edcarden Back To Top

[quote user="imSQrLy"]

 I was testing this out and it doesnt seem to be true for the reverse. Meaning if your where clause is a string but the field is an int, it seems the implicit conversion does not cause a scan as it is doing a seek on my test data base. I wouldn't recommend writing queries like that, but this caught my eye as i had to write bad queries in a application i support as the application is 3rd party and i could only work within the codeset we have. Anyway my queries are implicitly converted but still doing seeks...YAY!

[/quote]

 The performance drop doesn't occurs for the revrse because the Implcit conversion is then being performed only once and on an item that is not neing searched for or locate don an indexed.  If the item on the right side of the equals where a numeric value in a table (I.e.e TABLEA.VARCol1 = TABLEB.NUMCol2) then you;d see the same kind of thing. 

 The m,ain SQL DB we use in our accounting software packge is plagued with query's doing implict conversions because of improperly trained/skilled develpers employeed by the vendor who produces this acounting suite we use.  Luckily the implict conversions are most often the case that the value being searched is properly typed (as  a NUMERIC value) and the search value is listed as a string like this:

 TABLEA.NUMCol1 = '12345'

 Why they do this, even though they are not experienced in T-SQL or DB Logic/Set Logic in general is beyound me but I see it happening a lot when I trace the apps activity via Profiler.


Tuesday, April 28, 2009 - 9:38:51 AM - aprato Back To Top

I've run across this also when issuing prepared statements.  For instance, the Microsoft JDBC driver will implicitly convert String data types to NVARCHAR. If you prepare a statement and the indexed column is defined as VARCHAR, you'll get a table scan.


Tuesday, April 28, 2009 - 8:23:58 AM - ray.herring Back To Top

The issue is really in two parts.  First, the data type mismatch and second the QueryEngine not using an index in the query.

The datatype mismatch is simply a programmer error and it is very common.  Seems unfair to blame a tool for a programmer mistake.  We have dealt with data type mismatch issues since day one in this business.  Various languages adopt differnt rules.  For example Pascal is strongly typed and forces the programmer to explicitly convert.  On the other hand VBScript pretty much converts anything to anything.  BOL clearly outlines T-SQL rules for data type implicit conversions.  Obviously, if the tool is going to do an Implicit datatype conversion then a function call is going to be involved.

And that leads to the second issue.  When a function appears on the left side of the compare in Where clause, the Query Engine cannot use indexes in the plan.  This also well documented behavior.  In general, all terms on the left side of the compares in a where clause should be direct table(column) references that are not wrapped in function calls (eg. DateDiff, Convert, SubString, ...).

 


Tuesday, April 28, 2009 - 5:35:11 AM - imSQrLy Back To Top

 I was testing this out and it doesnt seem to be true for the reverse. Meaning if your where clause is a string but the field is an int, it seems the implicit conversion does not cause a scan as it is doing a seek on my test data base. I wouldn't recommend writing queries like that, but this caught my eye as i had to write bad queries in a application i support as the application is 3rd party and i could only work within the codeset we have. Anyway my queries are implicitly converted but still doing seeks...YAY!


Monday, April 27, 2009 - 1:22:22 PM - mardukes Back To Top

You had to go to the query plan to check for data type mismatch.  Shouldn't this article be about how insidiously harmful Microsoft magical converts are (even worse than weak data typing!)  Or how your job is to use accurate, proper code?

I got another one for you.  If you're having problems moving your SQL from Microsoft Word to SQL Server, check to see if smart quotes is on.  (They aren't the same ascii or unicode characters.)


Monday, April 27, 2009 - 12:15:31 PM - grobido Back To Top

To turn on the Graphical Plan you can use Ctrl-M in a query window.

Then in the Graphical Plan you can right click and select "Save Execution Plan As..." to save an XML representation of the query plan as a ".sqlplan" file.

The images in the tip are just using an image capture program.


Monday, April 27, 2009 - 10:07:05 AM - ray.herring Back To Top

Nice tip.  I am working on something similar right now and your tip gives me something else to look for :)

BTW, How do you capture the Show Plan graphical output from the SSMS GUI?  I have been trying to do that.


Monday, April 27, 2009 - 7:29:44 AM - ThomasLL Back To Top

We had the same exact problem. It was a field that was varchar(5) compared to a numeric.

WHERE var5 = 9999 

The lookup was run 100s times a minute and after updating the SP, the CPU usage went down 10-15%

 ThomasLL


Learn more about SQL Server tools