Understanding the SQL Server HASHBYTES hashing algorithms

By:   |   Comments (11)   |   Related: > Security


Problem

I'm trying to use the HASHBYTES() function and I see there is an assortment of hashing algorithms available. What is the difference between each one?

Solution

HASHBYTES(), as the name implies, is a function or algorithm that generates a hash from some input. If you're not familiar with what a hash is, here's a good working definition for a hash function:

hash function - A hash function takes in data and returns back a fixed length block of bits such that any change to the data should result in a different block.

HASHBYTES() is actually a function which provides access to several hashing algorithms. SQL Server 2005 and up have the following protocols (how you specify them in HASHBYTES is in parentheses):

  • MD 2 (MD2)
  • MD 4 (MD4)
  • MD 5 (MD5)
  • SHA-0 (SHA)
  • SHA-1 (SHA1)

SQL Server 2012 introduces these additional hashing algorithms:

  • SHA-2 256 bits AKA SHA-256 (SHA2_256)
  • SHA-2 512 bits AKA SHA-512 (SHA2_512)

With these options, which should you choose?

If you're looking for something just to quickly differentiate between two blocks of data, then it doesn't make a whole lot of difference.

If, however, you're dealing with hashes for passwords or anything related to security, then your choice can make a big difference.

Hashing Algorithms You Should Probably Avoid

Since a hash function will return a fixed length block, obviously, there are a finite set of possibilities for the output. Therefore, there's bound to be different inputs that will result in the same output. We call these situations collisions. What is of concern is if a collision can be engineered fairly quickly/cheaply. This is referred to as a collision attack. If a hashing algorithm is susceptible to this type of attack or other attacks with reasonable resources that either (a) allow you to create an identical hash with different input or (b) figure out the input from the hash, then those algorithms should be avoided. As of the writing of this tip, the following algorithms fall into that list:

  • MD 2 - developed in 1989 by Ronald Rivest, it generates a 128-bit hash value. It is susceptible to several attacks and is no longer considered cryptopgraphically secure.
  • MD 4 - Also developed by Rivest (1990), it generates a 128-bit hash value like MD 2. And like MD 2, it is susceptible to a couple of attacks and is no longer considered cryptographically secure.
  • MD 5 - Developed by Rivest in 1992, it also generates a 128-bit hash value. While we see it used often to "digitally fingerprint" files and documents, it is also considered "broken" and no longer cryptographically secure.
  • SHA-0 - SHA-0 was withdrawn shortly after being made public due to a "significant" flaw. It generates a 160-bit hash value. It was replaced by SHA-1.

Hashing Algorithms Which Are Good for Now

There is one algorithm in this group, and that's SHA-1 (SHA 1). While it is theoretically possible to generate collisions, in 2012 it's rather expensive to do so, as this Bruce Schneier post points out. In a few years, especially with cloud computing, it should be within the financial resources of organized crime to be able to break the algorithm. As a result, the US government had recommended that government agencies start phasing it out start back in 2010. However, it's still very much in use.

If you're not on SQL Server 2012, SHA-1 is the best choice to use.

Hashing Algorithms Good for the Foreseeable Future

The SHA-2 algorithm comes in several block sizes, of which SQL Server 2012 implements two:

  • SHA-2 256 bit block size (called SHA-256)
  • SHA-2 512 bit block size (called SHA-512)

Other than the block size, is there a real difference between the two? Yes, there is. The SHA-512 implementation utilizes 80 rounds, whereas the SHA-256 implementation only uses 64. A round is basically a repeated series of steps in the algorithm itself. Therefore, most attacks should have a harder time with SHA-512. That being said, SHA-256 is more widely implemented.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author K. Brian Kelley K. Brian Kelley is a SQL Server author and columnist focusing primarily on SQL Server security.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Monday, January 26, 2015 - 3:28:04 PM - Password hashing notes Back To Top (36061)
The big difference between SHA-512 and all the others is that SHA-512 uses 64-bit operations, which (as of early 2015) still reduce the advantage most attackers have using their GPU's compared to the defenders using CPU's, since most GPU's are not as good at 64-bit operations as they are at 32-bit operations. As far as password hashing goes, you should not use one round of any hashing algorithm for passwords. Instead, use a cryptographically sound iterative approach, which means BCrypt, SCrypt, or PBKDF2/RFC2898. These are best implementated at the application layer, but if you absolutely must do them in T-SQL, see some T-SQL implementations available at StackOverflow: http://stackoverflow.com/questions/7837547/is-there-a-sql-implementation-of-pbkdf2

Monday, October 27, 2014 - 12:20:54 PM - arif Back To Top (35081)

Thanks Brian.. Very straightforward and easy to understand

 

 


Wednesday, May 7, 2014 - 3:22:44 AM - Manikanth Back To Top (30640)

I am using SHA2_256 in SQL Server 2012
I get to see the lenght of the hashbytes return either 31 or 32 (using like below)

LEN(HASHBYTES('SHA2_256','Samplestring'))

when do i get 31 and when do i get 32 ?

Hash1 ; 0xD2F3C3D588292A28938E9FC7AC6F86AA9A2FCA270CE958A8CB7620B11295D320 (Length 31)

Hash2 : 0xA9A7529D790E43C103741F7EBE0FE3FC09F455A7E91AA92FD223D8B50E3D882B (Length 32)

 


Wednesday, October 30, 2013 - 12:31:24 PM - Key stretching Back To Top (27338)

Most importantly, if you're using hashing for password hasing, you should use key stretching, as using just one hash iteration is far too fast - and more importantly, is very likely to be much faster on an attacker's hardware than it is on your own (see empirical example, below), even if the attacker is a bored teenager with a single gaming video card.  PBKDF2 (also known as PKCS #5 v2.0 and RFC2898), BCRYPT, and SCRYPT are all examples of this, and sufficient numbers of itertaions need to be chosen. 

Per NIST SP800-131A, http://csrc.nist.gov/publications/nistpubs/800-131A/sp800-131A.pdf      SHA-1 is Disallowed for digital signature generation after 2010, Legacy-Use for digital signature verification after 2010, but remains Acceptable for non-digital signature use with no end date given.  Therefore, if NIST special publications matter to you, you need to pay attention to what your use case is.

Note that SHA-512 uses 64-bit words vs. the 32-bit words fo SHA-256, and thus at this time, an attacker computing SHA-512 on graphics cards does not have as much of an advantage as computing SHA-256 on said graphics cards. 

As an empircal example, oclHashcat-lite with a single Radeon 7970 v0.15 computes about 1.1 billion (with a B) SHA-256 tries per second, but only about 81 million (with an M) SHA-512 tries per second.  Note that the same system computes about 2.8 billion (with a B) SHA-1 tries per second, and 8.2 billion (with a B) MD5 tries per second.  As a contrast, oclHashcat-plus v0.15 on similar hardware computes WPA2 at 133 thousand (with a T) tries per second.  WPA2 is defined as PBKDF2(HMAC−SHA1, passphrase, ssid, 4096, 256).

Note that multiple graphics cards can be put in a single machine for roughly linear speed increases, and multiple machines can likewise be used for linear speed increases.

For password storage, see https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet


Tuesday, August 27, 2013 - 2:04:25 PM - mbourgon Back To Top (26495)

If you're not worried about attacks, which ones have the overall lowest likelihood of collisions?  I've recommended MD5 for a while because the collision chance is low - some of our devs use binary_checksum (No! Bad dev! No cookie!) to check to see if the values have changed, and MD5 seems to be a much better option. 

 

Thanks!


Monday, July 15, 2013 - 7:50:37 PM - TimothyAWiseman Back To Top (25844)

As always, a great post.  But since you mention storing passwords, it might be worth going a bit further.  For instance, it can help to salt as well as hash the passwords, and as ArsTechnica pointed out even that is probably not enough against a determined security expert.  Proper password storage is a difficult thing.


Sunday, July 14, 2013 - 1:20:15 PM - Sweet!!! Back To Top (25829)

Sweet!!!


Friday, July 12, 2013 - 1:37:30 PM - Ed - sqlscripter Back To Top (25819)

Great posting...


Thursday, July 11, 2013 - 5:37:15 PM - D Back To Top (25804)

nice one, thank you.


Thursday, July 11, 2013 - 10:15:41 AM - CodePro Back To Top (25798)

Excellent post, thanks.


Thursday, July 11, 2013 - 9:09:28 AM - K. Brian Kelley Back To Top (25796)

I missed a link reference. The definition is adapted from wikipedia's hash function definition. I've simplified it. If you want to see the original, it is here: http://en.wikipedia.org/wiki/Cryptographic_hash_function















get free sql tips
agree to terms