Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Scripts          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Storing passwords in SQL Server things to know to keep the data secure


By:   |   Updated: 2016-04-25   |   Comments (6)   |   Related: 1 | 2 | 3 | More > Security

Problem

This is the first in a planned mini-series about common design anti-patterns in SQL Server. What do we mean by 'anti-pattern'? An anti-pattern is 'a common response to a recurring problem that is usually ineffective and risks being highly counterproductive.' (Wikipedia, 2016). This article considers the security implications of storing passwords in the database, examining how this is commonly implemented; by trying basic attack methods on these implementations we show how even salted password hashes can be broken; and offer insights on how to strengthen the security of stored passwords to counter the risk of breach.

Solution

When developers need to create an access control system, they often rely on a data storage medium such as SQL Server to store username and password information. Other metadata might also be stored such as the number of failed logins, the last successful login date, and this authentication information may be linked to customer records in other tables.

This presents an opportunity for the hacker. Imagine that someone was able to fetch the username and password for a customer of an e-commerce site. This someone may be able to:

  • Place fake orders for the customer
  • Fetch card or bank account details from the customer profile
  • View order history
  • Fetch addresses and security questions for the user
  • Use the information above to perpetrate attacks on other sites

It is immensely important that passwords are stored securely in SQL Server to ensure that even if the tables containing this information were infiltrated, the passwords themselves remain secure. For the scenarios in the following text, we will assume that a hacker has already gained access to the authentication table, so we can focus on data protection.

We will start by examining a basic example of an authentication table. If you would like to follow along, please consult the 'preparation.sql' script attached to create this table.

SELECT * FROM dbo.Auth_Basic


Original data set

You will note that two critical columns are exposed here, the username and the password (in plaintext). This means there is no barrier to the hacker - as they have this information, they can simply select a victim and log in via the application interface. Some passwords are random and some are popular. Let's do a frequency analysis using a simple GROUP BY to illustrate this:

SELECT	passwd, COUNT(*) [occurrences] 
FROM	dbo.Auth_Basic 
GROUP	BY passwd 
ORDER	BY COUNT(*) DESC 

Password with the associated count

SQL Server supports cryptographic hash functions. These are functions that take a string input and obfuscate it to a specific length and format using a one-way cryptographic function. These can be very useful - for our example, we could amend all the passwords to use one of these functions. SHA-256 is (at the time of writing) considered cryptographically secure. This article is written to be compatible with SQL Server 2008 and older and using SHA-256 is tricky in versions prior to 2012, so let's use MD5 for backward-compatibility. As part of our security hardening effort, we will create a new authentication table with hashed passwords from the old table using the HASHBYTES() function:

SELECT	UserName, HASHBYTES('MD5', CAST(passwd AS NVARCHAR(4000))) [HashedPasswd], UserID 
INTO	dbo.Auth_Hashed
FROM	dbo.Auth_Basic

SELECT * FROM dbo.Auth_Hashed


User name, hashed password and user ID

Now, the application developer can amend the application code to encrypt user input with the SHA-256 algorithm and compare the output hashes only, returning a simple TRUE/FALSE if user input matches the stored password.

However, we still have a problem. If the hacker has access to this table, he cannot see the plaintext passwords, but he is certainly able to perform the same frequency analysis as we did on the Basic_Auth table. This would show him which hashes were most popular. He could then search publicly-available databases or websites to find the appropriate plaintext, enabling him to guess the passwords that would affect hundreds of accounts.

For example:

SELECT	HashedPasswd, COUNT(*) [occurrences] 
FROM	dbo.Auth_Hashed 
GROUP	BY HashedPasswd
ORDER	BY COUNT(*) DESC 


Number of occurrences for the hashed password

If I took the top hash by frequency, 0xC31AC605793F580B386C0FB53F1B9775 and remove the 0x to leave a hexadecimal hash string, I could plug this term into a website like pastebin.com to see if the hash has been cracked:

Convert the hexadecimal string to a clear text string

As we see, the plaintext equivalent is 12345. Now the hacker has access to 223 accounts with a simple query to determine the usernames using this password:

SELECT	UserName 
FROM	dbo.Auth_Hashed 
WHERE	HashedPasswd = 0xC31AC605793F580B386C0FB53F1B9775


User Names with a Hashed Password that matches 1 2 3 4 5

It is, of course possible to perform this attack in the opposite direction. Let's compute the hash for the password 'letmein' and see if it matches any of the password hashes in the table:

SELECT	UserName, HashedPasswd 
FROM	dbo.Auth_Hashed 
WHERE	HashedPasswd = HASHBYTES('MD5', CONVERT(NVARCHAR(4000),'letmein'))


User Name and Hashed Password with the value of letmein

To overcome this deficiency, it has become common practice to salt a password. What is salting? It refers to the practice of modifying a string in a deterministic way i.e. by concatenating another string to it, or inserting/replacing characters in the string according to rules. Salting a password is typically done in two ways - either a unique randomly-generated salt can be used alongside the username/password credentials, or a single salt can be used, defined in the application-side source code.

Let's take our original plaintext table and create another table, generating a random salt for each password along the way and hashing this, so we can examine if this method is secure. For simplicity, we'll generate a random integer in the range 0-99999 to act as the salt, although in reality salts can be much longer and more complex strings, and simply append this integer to the end of the string.

;WITH generateSalt AS (
	SELECT	UserName, passwd, CAST(ABS(CHECKSUM(NEWID())) % 100000 AS NVARCHAR(6)) [salt], UserID 
	FROM	dbo.Auth_Basic ) 
SELECT	UserName, HASHBYTES('MD5',CONVERT(NVARCHAR(4000),passwd + salt)) [HashedSaltedPwd], UserID 
INTO	dbo.Auth_Hashed_Salted 
FROM	generateSalt 

SELECT * FROM dbo.Auth_Hashed_Salted


Passwords with a random salt

This approach adds complexity for the hacker. Repeating his frequency analysis on this new table isn't going to work, since even those passwords that are identical will have different hashes. Let's try it now:

SELECT	HashedSaltedPwd, COUNT(*) 
FROM	dbo.Auth_Hashed_Salted 
GROUP	BY HashedSaltedPwd 
ORDER	BY COUNT(*) DESC 


Number of occurences of the same password has dropped with the salt

We can see we have just one collision, where an identical salt has been generated for an identical password. All other hashed, salted passwords are now unique. So what can the hacker do to counter this? One approach is still possible, but more time-consuming - the hacker knows the range of salts is 0-99999 from the table data, and can speculate that a simple method like appending it to the password might be used. Let's take ten common passwords and compute the hashes for them with all salts ranging from 0-99999 appended, then join this result set on to our hashed, salted table to see if any passwords can be identified. We can do all of this in SQL. Let's first generate the hash list. 10 passwords x 100,000 salt values = 1,000,000 hashes to check and put this information into the table #computedHashes:

IF EXISTS ( SELECT name FROM tempdb.sys.tables WHERE name LIKE ('%computedHashes%') )
	DROP TABLE #computedHashes 
GO
CREATE TABLE #computedHashes ( 
	ptPwd VARCHAR(50), 
	salt INT, 
	hsPwd VARBINARY(MAX) )

;WITH top10pwds AS ( 
	SELECT 'password' [pwd] UNION SELECT '123456' UNION SELECT '12345678' UNION SELECT '1234' UNION SELECT 'qwerty' 
	UNION SELECT '12345' UNION SELECT 'dragon' UNION SELECT 'letmein' UNION SELECT 'abc123' UNION SELECT 'admin' ), 
-- on-the-fly numbers table
salts AS ( 
	SELECT TOP 100000 ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) [num] FROM sys.objects o1 CROSS JOIN sys.objects o2 CROSS JOIN sys.objects o3 )

INSERT INTO #computedHashes 
	SELECT	top10pwds.pwd, salts.num [salt], HASHBYTES('MD5',CONVERT(NVARCHAR(4000),pwd + CAST(salts.num AS VARCHAR(6)))) [hash]
	FROM	top10pwds		
	CROSS	JOIN salts 
	ORDER	BY top10pwds.pwd, salts.num 

Here's what the #computedHashes table contains:

SELECT TOP 1000 * FROM #computedHashes


Here is what the #computedHashes table contains

It is now trivial to join this information back to our dbo.Auth_Hashed_Salted table, selecting the username and ptPwd (plaintext password) where the calculated hash matches the hash we have stored:

SELECT	auth.Username, comp.ptPwd [Password], auth.UserID 
FROM	dbo.Auth_Hashed_Salted auth 
INNER	JOIN #computedHashes comp ON auth.HashedSaltedPwd = comp.hsPwd 


Joining back to get the original user name, password and user id

We can see we have broken 2,081 user passwords from our data table of 10,001 in this manner.

IMPORTANT! Even if we are not using integer-based salts, we can simply select all distinct salts from the table and apply each salt to each password we are testing. For this reason, storing the salt in the table may not be a great idea. If we do not know the cryptographic function, we can try a range. If we do not know how the salt has been applied, we can guess at some common techniques - i.e. applying the salt to the beginning or end of a password. Nor are potential passwords limited to 10 - within a click or two of the right Google search, it is possible to download millions of password samples.

This is computationally expensive, but only the result of the Cartesian products of all of these factors:

Check operations = Number of passwords to check * number of different cryptographic schemes * number of distinct salts known * number of different salt application techniques

Instead, some developers choose to store the salt in the application-side code. This has one major benefit and one major drawback - the benefit is that the source code (or server-side code) would have to be compromised to access the salt value, which would probably present many other, more urgent problems if that occurred, and can be difficult to do. The drawback is that using this approach, typically a single salt value is used for all passwords. If this value is compromised, all passwords could be at risk of compromise.

Other theoretical attack vectors include packet sniffing. Is the application-side code using SSL to communicate with the database? If not, why not? SQL Server supports key/certificate-based authentication with client connections. Without this, a successful login attempt including username, user-entered password and salt could be sniffed en-route.

So what lessons can we learn from this demonstration?

  • Strong passwords make dictionary-like attacks less likely. Make sure your users adhere to password policies to create strong passwords. Enforce this at the client-side if necessary.
  • Consider the role of table-level security in SQL Server. Ensure your users only have the access they need.
  • Consider options like row-level security where available to you.
  • Consider storing your username and password information in separate tables or databases.
  • Consider the use of SSL for database traffic encryption during user authentication.
  • Never store plaintext passwords in the database.
  • Consider storing your salts separately from your passwords or obfuscate these salts through a hidden, reversible methodology (i.e. an application-side mathematical function).
Next Steps


Last Updated: 2016-04-25


get scripts

next tip button



About the author
MSSQLTips author Derek Colley Derek Colley is a UK-based DBA and BI Developer with more than a decade of experience working with SQL Server, Oracle and MySQL.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Tuesday, April 26, 2016 - 9:35:12 AM - Marc Jellinek Back To Top

Thanks for pointing out the anti-patterns related to storing usernames and passwords.  My question is:  how does one do this properly? 


Monday, April 25, 2016 - 5:04:15 PM - Justin J Stark Back To Top

Use BCrypt and do the hashing in server code and not in the database. In .NET, use BCrypt.NET (available in NuGet as BCrypt-Official) which is fool proof and gives you a nice 60 character string (which includes the algorithm, cost value, salt, and hash) to store in the database.


Monday, April 25, 2016 - 1:51:00 PM - Secure Password Storage Back To Top

For anyone actually intending to store passwords in ANY database, please start by reading

https://security.stackexchange.com/questions/211/how-to-securely-hash-passwords

The summary of which is: Use PBKDF2, BCrypt, or SCrypt, with a sufficiently large iteration count/work factor.  For PBKDF2, don't ask for more bytes of output than the native size of the hash function used, and "large" is low hundreds or high tens of thousands of iterations.

Your salt should be in the range of 12-24 bytes and be generated by a cryptographically random source; note that the code above is using only 4 bytes from a non-crypto source, both of which are significant flaws.

For any application, you're best off generating the salt and hashing in the app, before it gets to the database.

For SQL Server, if for some bizarre reason you can't use the application to generate a cryptographically random salt of 12-24 hash the password, try:

crypt_gen_random to generate, perhaps, BINARY(16) bytes of random salt
https://msdn.microsoft.com/en-us/library/cc627408.aspx

and then if you must, use a native SQL Server implementation of PBKDF2-HMAC-SHA-512 at:
https://stackoverflow.com/questions/7837547/is-there-a-sql-implementation-of-pbkdf2
which is also at
https://github.com/Anti-weakpasswords/PBKDF2-MSSQL-Custom-A

As a warning, SQL Server is an awful platform for hashing; try doing so in your native platform.  For .NET, there's a .NET implementation of PBKDF2-HMAC-SHA-512 at:
https://github.com/Anti-weakpasswords/PBKDF2-CSharp-DeriveBytes-custom-A




As always, in the app, you take whatever the user enters when they log in, retrieve their salt, and then hash the entry with the retrieved salt to see if you get the same result.


Monday, April 25, 2016 - 12:46:52 PM - Justin J Stark Back To Top

I have a problem with this article. DON'T USE MD5 TO HAS PASSWORDS. Storing your salt in the database is perfectly cool if you combine it with bcrypt. Just use a unique salt per password. The point of salt is to protect against rainbow tables. With bcrypt, an attacker cannot generate a rainbow table for every salt in your table because your bcrypt settings should be set to make it too slow to be plausible (and bcrypt is not parallelizable and therefore not GPU optimizable). The article is saying we should keep our salt out of the database for invalid reasons. The real problem here is the hashing algorithm, not where we store the salt. Storing your salt in the database is a good method as long as you use bcrypt and a unique salt per password. This article is steering developers in the wrong direction.

Also, it's better to do you salting and hashing in server-side code rather than at the database. This way you can use better algorithms like bcrypt which your database might not support and it will be easier to replace if a better algorithm comes around.

“Salting the soup” is a good slide deck on this:

 

http://www.martinstoeckli.ch/hash/en/hash_salt.php


Monday, April 25, 2016 - 12:18:43 PM - David Rogers Back To Top

Nice description and exposition! I am not new to this, but I had never seen the idea of applying an application-level function to the salt as an additional level of securty.

@Stephen Roughley Thanks for the reference!

David


Monday, April 25, 2016 - 9:15:35 AM - Stephen Roughley Back To Top

Nice intro with very helpful examples but really could do with at least discussing the current standard of key stretching (running the hash routine through [at least thousands of] iterations) as this makes the stored hashes much more difficult to calculate. Could you mention bcrypt PBKDF2 so that your readers can continue reading up on this?

I know this may be tricky to pull off in SQL Server alone but it is still an essential part of a secure hashing subsystem: https://crackstation.net/hashing-security.htm#properhashing

Thanks,

 


Learn more about SQL Server tools