Storing passwords in SQL Server – things to know to keep the data secure
This is the first in a planned mini-series about common design anti-patterns in SQL Server. What do we mean by 'anti-pattern'? An anti-pattern is 'a common response to a recurring problem that is usually ineffective and risks being highly counterproductive.' (Wikipedia, 2016). This article considers the security implications of storing passwords in the database, examining how this is commonly implemented; by trying basic attack methods on these implementations we show how even salted password hashes can be broken; and offer insights on how to strengthen the security of stored passwords to counter the risk of breach.
When developers need to create an access control system, they often rely on a data storage medium such as SQL Server to store username and password information. Other metadata might also be stored such as the number of failed logins, the last successful login date, and this authentication information may be linked to customer records in other tables.
This presents an opportunity for the hacker. Imagine that someone was able to fetch the username and password for a customer of an e-commerce site. This someone may be able to:
- Place fake orders for the customer
- Fetch card or bank account details from the customer profile
- View order history
- Fetch addresses and security questions for the user
- Use the information above to perpetrate attacks on other sites
It is immensely important that passwords are stored securely in SQL Server to ensure that even if the tables containing this information were infiltrated, the passwords themselves remain secure. For the scenarios in the following text, we will assume that a hacker has already gained access to the authentication table, so we can focus on data protection.
We will start by examining a basic example of an authentication table. If you would like to follow along, please consult the 'preparation.sql' script attached to create this table.
SELECT * FROM dbo.Auth_Basic
You will note that two critical columns are exposed here, the username and the password (in plaintext). This means there is no barrier to the hacker - as they have this information, they can simply select a victim and log in via the application interface. Some passwords are random and some are popular. Let's do a frequency analysis using a simple GROUP BY to illustrate this:
SELECT passwd, COUNT(*) [occurrences] FROM dbo.Auth_Basic GROUP BY passwd ORDER BY COUNT(*) DESC
SQL Server supports cryptographic hash functions. These are functions that take a string input and obfuscate it to a specific length and format using a one-way cryptographic function. These can be very useful - for our example, we could amend all the passwords to use one of these functions. SHA-256 is (at the time of writing) considered cryptographically secure. This article is written to be compatible with SQL Server 2008 and older and using SHA-256 is tricky in versions prior to 2012, so let's use MD5 for backward-compatibility. As part of our security hardening effort, we will create a new authentication table with hashed passwords from the old table using the HASHBYTES() function:
SELECT UserName, HASHBYTES('MD5', CAST(passwd AS NVARCHAR(4000))) [HashedPasswd], UserID INTO dbo.Auth_Hashed FROM dbo.Auth_Basic SELECT * FROM dbo.Auth_Hashed
Now, the application developer can amend the application code to encrypt user input with the SHA-256 algorithm and compare the output hashes only, returning a simple TRUE/FALSE if user input matches the stored password.
However, we still have a problem. If the hacker has access to this table, he cannot see the plaintext passwords, but he is certainly able to perform the same frequency analysis as we did on the Basic_Auth table. This would show him which hashes were most popular. He could then search publicly-available databases or websites to find the appropriate plaintext, enabling him to guess the passwords that would affect hundreds of accounts.
SELECT HashedPasswd, COUNT(*) [occurrences] FROM dbo.Auth_Hashed GROUP BY HashedPasswd ORDER BY COUNT(*) DESC
If I took the top hash by frequency, 0xC31AC605793F580B386C0FB53F1B9775 and remove the 0x to leave a hexadecimal hash string, I could plug this term into a website like pastebin.com to see if the hash has been cracked:
As we see, the plaintext equivalent is 12345. Now the hacker has access to 223 accounts with a simple query to determine the usernames using this password:
SELECT UserName FROM dbo.Auth_Hashed WHERE HashedPasswd = 0xC31AC605793F580B386C0FB53F1B9775
It is, of course possible to perform this attack in the opposite direction. Let's compute the hash for the password 'letmein' and see if it matches any of the password hashes in the table:
SELECT UserName, HashedPasswd FROM dbo.Auth_Hashed WHERE HashedPasswd = HASHBYTES('MD5', CONVERT(NVARCHAR(4000),'letmein'))
To overcome this deficiency, it has become common practice to salt a password. What is salting? It refers to the practice of modifying a string in a deterministic way i.e. by concatenating another string to it, or inserting/replacing characters in the string according to rules. Salting a password is typically done in two ways - either a unique randomly-generated salt can be used alongside the username/password credentials, or a single salt can be used, defined in the application-side source code.
Let's take our original plaintext table and create another table, generating a random salt for each password along the way and hashing this, so we can examine if this method is secure. For simplicity, we'll generate a random integer in the range 0-99999 to act as the salt, although in reality salts can be much longer and more complex strings, and simply append this integer to the end of the string.
;WITH generateSalt AS ( SELECT UserName, passwd, CAST(ABS(CHECKSUM(NEWID())) % 100000 AS NVARCHAR(6)) [salt], UserID FROM dbo.Auth_Basic ) SELECT UserName, HASHBYTES('MD5',CONVERT(NVARCHAR(4000),passwd + salt)) [HashedSaltedPwd], UserID INTO dbo.Auth_Hashed_Salted FROM generateSalt SELECT * FROM dbo.Auth_Hashed_Salted
This approach adds complexity for the hacker. Repeating his frequency analysis on this new table isn't going to work, since even those passwords that are identical will have different hashes. Let's try it now:
SELECT HashedSaltedPwd, COUNT(*) FROM dbo.Auth_Hashed_Salted GROUP BY HashedSaltedPwd ORDER BY COUNT(*) DESC
We can see we have just one collision, where an identical salt has been generated for an identical password. All other hashed, salted passwords are now unique. So what can the hacker do to counter this? One approach is still possible, but more time-consuming - the hacker knows the range of salts is 0-99999 from the table data, and can speculate that a simple method like appending it to the password might be used. Let's take ten common passwords and compute the hashes for them with all salts ranging from 0-99999 appended, then join this result set on to our hashed, salted table to see if any passwords can be identified. We can do all of this in SQL. Let's first generate the hash list. 10 passwords x 100,000 salt values = 1,000,000 hashes to check and put this information into the table #computedHashes:
IF EXISTS ( SELECT name FROM tempdb.sys.tables WHERE name LIKE ('%computedHashes%') ) DROP TABLE #computedHashes GO CREATE TABLE #computedHashes ( ptPwd VARCHAR(50), salt INT, hsPwd VARBINARY(MAX) ) ;WITH top10pwds AS ( SELECT 'password' [pwd] UNION SELECT '123456' UNION SELECT '12345678' UNION SELECT '1234' UNION SELECT 'qwerty' UNION SELECT '12345' UNION SELECT 'dragon' UNION SELECT 'letmein' UNION SELECT 'abc123' UNION SELECT 'admin' ), -- on-the-fly numbers table salts AS ( SELECT TOP 100000 ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) [num] FROM sys.objects o1 CROSS JOIN sys.objects o2 CROSS JOIN sys.objects o3 ) INSERT INTO #computedHashes SELECT top10pwds.pwd, salts.num [salt], HASHBYTES('MD5',CONVERT(NVARCHAR(4000),pwd + CAST(salts.num AS VARCHAR(6)))) [hash] FROM top10pwds CROSS JOIN salts ORDER BY top10pwds.pwd, salts.num
Here's what the #computedHashes table contains:
SELECT TOP 1000 * FROM #computedHashes
It is now trivial to join this information back to our dbo.Auth_Hashed_Salted table, selecting the username and ptPwd (plaintext password) where the calculated hash matches the hash we have stored:
SELECT auth.Username, comp.ptPwd [Password], auth.UserID FROM dbo.Auth_Hashed_Salted auth INNER JOIN #computedHashes comp ON auth.HashedSaltedPwd = comp.hsPwd
We can see we have broken 2,081 user passwords from our data table of 10,001 in this manner.
IMPORTANT! Even if we are not using integer-based salts, we can simply select all distinct salts from the table and apply each salt to each password we are testing. For this reason, storing the salt in the table may not be a great idea. If we do not know the cryptographic function, we can try a range. If we do not know how the salt has been applied, we can guess at some common techniques - i.e. applying the salt to the beginning or end of a password. Nor are potential passwords limited to 10 - within a click or two of the right Google search, it is possible to download millions of password samples.
This is computationally expensive, but only the result of the Cartesian products of all of these factors:
Check operations = Number of passwords to check * number of different cryptographic schemes * number of distinct salts known * number of different salt application techniques
Instead, some developers choose to store the salt in the application-side code. This has one major benefit and one major drawback - the benefit is that the source code (or server-side code) would have to be compromised to access the salt value, which would probably present many other, more urgent problems if that occurred, and can be difficult to do. The drawback is that using this approach, typically a single salt value is used for all passwords. If this value is compromised, all passwords could be at risk of compromise.
Other theoretical attack vectors include packet sniffing. Is the application-side code using SSL to communicate with the database? If not, why not? SQL Server supports key/certificate-based authentication with client connections. Without this, a successful login attempt including username, user-entered password and salt could be sniffed en-route.
So what lessons can we learn from this demonstration?
- Strong passwords make dictionary-like attacks less likely. Make sure your users adhere to password policies to create strong passwords. Enforce this at the client-side if necessary.
- Consider the role of table-level security in SQL Server. Ensure your users only have the access they need.
- Consider options like row-level security where available to you.
- Consider storing your username and password information in separate tables or databases.
- Consider the use of SSL for database traffic encryption during user authentication.
- Never store plaintext passwords in the database.
- Consider storing your salts separately from your passwords or obfuscate these salts through a hidden, reversible methodology (i.e. an application-side mathematical function).
- Karwin, B. (2010) SQL Antipatterns, Publisher: Pragmatic Programmers, ISBN 978-1-93435-655-5
- Sergey Gigoyan: Storing passwords in a secure way in a SQL Server database
- Manvendra Singh: Identify blank and weak passwords for SQL Server logins
- Brent Shaub: SQL Server stored procedure to generate random passwords
Last Updated: 2016-04-25
About the author
View all my tips