By: Jeremy Kadlec
Jeremy is the CTO @ Edgewood Solutions, co-founder of MSSQLTips.com and SQL Server MVP since 2009.
Problem
Determining if two rows or expressions are equal can be a difficult and resource-intensive process. This can be the case with UPDATE statements where the update is conditional on whether all of the columns for a specific row are equal. To address this need in the SQL Server environment, the CHECKSUM, CHECKSUM_AGG and BINARY_CHECKSUM functions are available in SQL Server 2005 to natively compute a hash value for an expression, row or table for comparison or other application needs. In this tip we will focus on the common questions related to the CHECKSUM code and provide an example to begin to leverage the CHECKSUM commands in your T-SQL code.
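As a rough illustration of the three scopes mentioned above (expression, row and table), here is a minimal sketch. The table variable @Demo and its columns are hypothetical and exist only for this example; no particular output values are implied.

```sql
-- Hypothetical demo data; names and values are illustrative only.
DECLARE @Demo TABLE (ID INT, Name VARCHAR(20), Qty INT);
INSERT INTO @Demo (ID, Name, Qty) VALUES (1, 'Widget', 10);
INSERT INTO @Demo (ID, Name, Qty) VALUES (2, 'Gadget', 25);

-- Expression level: a checksum over a single value.
SELECT CHECKSUM('Widget') AS ExpressionChecksum;

-- Row level: one checksum per row, computed over all columns.
SELECT ID,
       CHECKSUM(*)        AS RowChecksum,
       BINARY_CHECKSUM(*) AS RowBinaryChecksum
FROM @Demo;

-- Table level: aggregate the row checksums into a single value.
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) AS TableChecksum
FROM @Demo;
```

All three functions return an INT, so the results are cheap to store and compare.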
Solution
What is the purpose of using the CHECKSUM functionality?
The CHECKSUM is intended to build a hash index based on an expression or column list.
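A hash index of this kind could be sketched as follows, assuming a hypothetical table dbo.ProductDemo with a long string column. The table, column and index names are made up for illustration; the pattern is a computed CHECKSUM column with a nonclustered index on it.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE dbo.ProductDemo
(
    ProductID    INT IDENTITY(1,1) PRIMARY KEY,
    ProductName  NVARCHAR(200) NOT NULL,
    -- Computed column holding the checksum of the wide string column.
    NameChecksum AS CHECKSUM(ProductName)
);

-- Index the narrow INT checksum instead of the wide string.
CREATE NONCLUSTERED INDEX IX_ProductDemo_NameChecksum
    ON dbo.ProductDemo (NameChecksum);

INSERT INTO dbo.ProductDemo (ProductName) VALUES (N'Bearing Ball');
INSERT INTO dbo.ProductDemo (ProductName) VALUES (N'Chain Ring');

-- Search on the checksum, then re-check the real value,
-- because two different names can share the same checksum.
SELECT ProductID, ProductName
FROM dbo.ProductDemo
WHERE NameChecksum = CHECKSUM(N'Bearing Ball')
  AND ProductName  = N'Bearing Ball';
```

The second predicate in the WHERE clause is what keeps the lookup correct when two values hash to the same integer.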
When would I use the CHECKSUM function?
One example of using a CHECKSUM is to store a checksum value for the entire row in a column for later comparison. This would be helpful in a situation where all of the columns in a row need to be compared in order to perform an UPDATE. Without a CHECKSUM you would need to do the following:
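A column-by-column comparison of this kind might look like the sketch below. It is not the tip's original sample code; the temp tables #CustomerTarget and #CustomerSource and their columns are hypothetical, and the columns are declared NOT NULL to keep the comparison simple.

```sql
-- Hypothetical tables for illustration only.
IF OBJECT_ID('tempdb..#CustomerTarget') IS NOT NULL DROP TABLE #CustomerTarget;
IF OBJECT_ID('tempdb..#CustomerSource') IS NOT NULL DROP TABLE #CustomerSource;

CREATE TABLE #CustomerTarget
(
    CustomerID INT PRIMARY KEY,
    FirstName  VARCHAR(50) NOT NULL,
    LastName   VARCHAR(50) NOT NULL,
    City       VARCHAR(50) NOT NULL
);
CREATE TABLE #CustomerSource
(
    CustomerID INT PRIMARY KEY,
    FirstName  VARCHAR(50) NOT NULL,
    LastName   VARCHAR(50) NOT NULL,
    City       VARCHAR(50) NOT NULL
);

INSERT INTO #CustomerTarget VALUES (1, 'Ann', 'Lee', 'Boston');
INSERT INTO #CustomerTarget VALUES (2, 'Bob', 'Ray', 'Denver');
INSERT INTO #CustomerSource VALUES (1, 'Ann', 'Lee', 'Boston');  -- unchanged
INSERT INTO #CustomerSource VALUES (2, 'Bob', 'Ray', 'Austin');  -- City changed

-- Every column has to be listed and compared individually.
UPDATE t
SET FirstName = s.FirstName,
    LastName  = s.LastName,
    City      = s.City
FROM #CustomerTarget AS t
INNER JOIN #CustomerSource AS s
    ON s.CustomerID = t.CustomerID
WHERE t.FirstName <> s.FirstName
   OR t.LastName  <> s.LastName
   OR t.City      <> s.City;
```

With nullable columns each predicate would also need explicit NULL handling, which makes the WHERE clause grow further as columns are added.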

Compare the UPDATE code from the first example to this one using the CHECKSUM function.
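A CHECKSUM-based version could be sketched as follows. Again this is an illustration rather than the tip's original code: the temp tables, the columns and the RowChecksum column are all hypothetical names.

```sql
-- Hypothetical tables for illustration only.
IF OBJECT_ID('tempdb..#CustTarget') IS NOT NULL DROP TABLE #CustTarget;
IF OBJECT_ID('tempdb..#CustSource') IS NOT NULL DROP TABLE #CustSource;

CREATE TABLE #CustTarget
(
    CustomerID  INT PRIMARY KEY,
    FirstName   VARCHAR(50) NOT NULL,
    LastName    VARCHAR(50) NOT NULL,
    City        VARCHAR(50) NOT NULL,
    RowChecksum INT NULL          -- stored checksum of the data columns
);
CREATE TABLE #CustSource
(
    CustomerID INT PRIMARY KEY,
    FirstName  VARCHAR(50) NOT NULL,
    LastName   VARCHAR(50) NOT NULL,
    City       VARCHAR(50) NOT NULL
);

INSERT INTO #CustTarget (CustomerID, FirstName, LastName, City) VALUES (1, 'Ann', 'Lee', 'Boston');
INSERT INTO #CustTarget (CustomerID, FirstName, LastName, City) VALUES (2, 'Bob', 'Ray', 'Denver');
INSERT INTO #CustSource VALUES (1, 'Ann', 'Lee', 'Boston');  -- unchanged
INSERT INTO #CustSource VALUES (2, 'Bob', 'Ray', 'Austin');  -- City changed

-- Build the stored checksum from the same column list used later.
UPDATE #CustTarget
SET RowChecksum = CHECKSUM(FirstName, LastName, City);

-- One comparison replaces the per-column predicates.
UPDATE t
SET FirstName   = s.FirstName,
    LastName    = s.LastName,
    City        = s.City,
    RowChecksum = CHECKSUM(s.FirstName, s.LastName, s.City)
FROM #CustTarget AS t
INNER JOIN #CustSource AS s
    ON s.CustomerID = t.CustomerID
WHERE t.RowChecksum <> CHECKSUM(s.FirstName, s.LastName, s.City);
```

The column list and its order must be identical wherever the checksum is computed. Because different values can occasionally produce the same checksum, a changed row could be skipped by this comparison.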

In order for this query to be successful, it is necessary to build the CHECKSUM value ahead of time when inserting the data in order to perform the comparison in subsequent code. So if you're performing very few entire-row (or just about every column in the row) comparisons, then ad-hoc comparisons may be optimal. However, if a significant number of comparisons are made with a large number of columns, then this option should be researched further and tested for performance improvements over the individual comparisons outlined in the first set of code.
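One possible way to have the value built ahead of time is a persisted computed column, so the checksum is maintained by SQL Server on every INSERT and UPDATE. This is a sketch of one option, not necessarily the approach used in the tip; the table dbo.CustomerDemo and its columns are hypothetical.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE dbo.CustomerDemo
(
    CustomerID  INT PRIMARY KEY,
    FirstName   VARCHAR(50) NOT NULL,
    LastName    VARCHAR(50) NOT NULL,
    City        VARCHAR(50) NOT NULL,
    -- Computed once at write time and stored with the row.
    RowChecksum AS CHECKSUM(FirstName, LastName, City) PERSISTED
);

INSERT INTO dbo.CustomerDemo (CustomerID, FirstName, LastName, City)
VALUES (1, 'Ann', 'Lee', 'Boston');

-- The stored value is available without recomputing it at query time.
SELECT CustomerID, RowChecksum
FROM dbo.CustomerDemo;
```

The trade-off is a small cost on every write in exchange for cheaper comparisons later, which is why testing against the ad-hoc approach is worthwhile.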
What are some of the caveats with using any of the CHECKSUM functions?
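One caveat, noted in Books Online and in the reader comments on this tip, is that different inputs can produce the same checksum. The sketch below shows two related behaviors, assuming a case-insensitive collation on a hypothetical table variable: CHECKSUM follows the collation's comparison rules, so values that differ only by case should produce the same checksum, while BINARY_CHECKSUM works on the binary value. HASHBYTES is included only as a comparison point for cases where accidental matches are not acceptable.

```sql
-- Hypothetical table variable; the collation is chosen for illustration.
DECLARE @CaseDemo TABLE
(
    Val VARCHAR(20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
);
INSERT INTO @CaseDemo (Val) VALUES ('SQL Server');
INSERT INTO @CaseDemo (Val) VALUES ('sql server');

-- Compare the three functions side by side for values that differ only by case.
SELECT Val,
       CHECKSUM(Val)          AS ChecksumValue,        -- collation-aware
       BINARY_CHECKSUM(Val)   AS BinaryChecksumValue,  -- based on the binary value
       HASHBYTES('SHA1', Val) AS Sha1Hash              -- wider hash, far fewer collisions
FROM @CaseDemo;
```

Both checksum functions return a 32-bit integer, so collisions become more likely as the number of rows grows. A matching checksum is best treated as "probably equal" and confirmed against the real column values where correctness matters.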
Next Steps
Tuesday, March 11, 2008 - 11:02:52 AM - papachec

I have been using the checksum function successfully for quite some time. Just recently I encountered 2 instances where a different list of values produced identical checksums. I'd like to understand the calculation that is done by checksum 'under the hood' to know how this is possible and how safe it is to continue to use 'checksum'.

EXAMPLE: select checksum('51;52;56;2205;') produces 1726190963

I expected that I would get 4 different results because each of the 4 examples is a different list of values. But I get only 2 different results. Your comments and suggestions are welcomed.

Tuesday, March 11, 2008 - 6:57:26 PM - aprato

In the Remarks section of the SQL 2005 BOL it says this:

"CHECKSUM applied over any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and are equal when compared using the equals (=) operator. For this definition, null values of a specified type are considered to compare as equal. If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change."

Based on the last 2 sentences, I'm not sure of its reliability. I don't generally use it (in fact, I've never used it). Are you using this for checking data changes? If so, maybe a table flag would be a safer option?

Thursday, April 03, 2008 - 10:48:57 AM - glauco.basilio

I tried unsuccessfully to use checksum and binary_checksum to identify duplicated rows in my database. If you have a table with a large number of rows, you will see that both functions generate the same "hash" for rows with different data.

Friday, April 04, 2008 - 7:10:12 AM - admin

glauco.basilio,

Agreed that the hashes may be the same on two different rows. Can you include or exclude particular columns to see if the hash will still meet your business rules and be unique?

Thank you,