Storing passwords in SQL Server – things to know to keep the data secure

Problem

This is the first in a planned mini-series about common design anti-patterns in SQL Server. What do we mean by ‘anti-pattern’? An anti-pattern is ‘a common response to a recurring problem that is usually ineffective and risks being highly counterproductive.’ (Wikipedia, 2016). This article considers the security implications of storing passwords in the database, examining how this is commonly implemented; by trying basic attack methods on these implementations we show how even salted password hashes can be broken; and offer insights on how to strengthen the security of stored passwords to counter the risk of breach.

Solution

When developers need to create an access control system, they often rely on a data storage medium such as SQL Server to store username and password information. Other metadata might also be stored such as the number of failed logins, the last successful login date, and this authentication information may be linked to customer records in other tables.

This presents an opportunity for the hacker. Imagine that someone was able to fetch the username and password for a customer of an e-commerce site. This someone may be able to:

Place fake orders for the customer
Fetch card or bank account details from the customer profile
View order history
Fetch addresses and security questions for the user
Use the information above to perpetrate attacks on other sites

It is immensely important that passwords are stored securely in SQL Server to ensure that even if the tables containing this information were infiltrated, the passwords themselves remain secure. For the scenarios in the following text, we will assume that a hacker has already gained access to the authentication table, so we can focus on data protection.

We will start by examining a basic example of an authentication table. If you would like to follow along, please consult the ‘preparation.sql‘ script attached to create this table.

SELECT * FROM dbo.Auth_Basic

You will note that two critical columns are exposed here, the username and the password (in plaintext). This means there is no barrier to the hacker – as they have this information, they can simply select a victim and log in via the application interface. Some passwords are random and some are popular. Let’s do a frequency analysis using a simple GROUP BY to illustrate this:

SELECTpasswd, COUNT(*) [occurrences] 
FROMdbo.Auth_Basic 
GROUPBY passwd 
ORDERBY COUNT(*) DESC

SQL Server supports cryptographic hash functions. These are functions that take a string input and obfuscate it to a specific length and format using a one-way cryptographic function. These can be very useful – for our example, we could amend all the passwords to use one of these functions. SHA-256 is (at the time of writing) considered cryptographically secure. This article is written to be compatible with SQL Server 2008 and older and using SHA-256 is tricky in versions prior to 2012, so let’s use MD5 for backward-compatibility. As part of our security hardening effort, we will create a new authentication table with hashed passwords from the old table using the HASHBYTES() function:

SELECTUserName, HASHBYTES('MD5', CAST(passwd AS NVARCHAR(4000))) [HashedPasswd], UserID 
INTOdbo.Auth_Hashed
FROMdbo.Auth_Basic
SELECT * FROM dbo.Auth_Hashed

Now, the application developer can amend the application code to encrypt user input with the SHA-256 algorithm and compare the output hashes only, returning a simple TRUE/FALSE if user input matches the stored password.

However, we still have a problem. If the hacker has access to this table, he cannot see the plaintext passwords, but he is certainly able to perform the same frequency analysis as we did on the Basic_Auth table. This would show him which hashes were most popular. He could then search publicly-available databases or websites to find the appropriate plaintext, enabling him to guess the passwords that would affect hundreds of accounts.

For example:

SELECTHashedPasswd, COUNT(*) [occurrences] 
FROMdbo.Auth_Hashed 
GROUPBY HashedPasswd
ORDERBY COUNT(*) DESC

Number of occurrences for the hashed password

If I took the top hash by frequency, 0xC31AC605793F580B386C0FB53F1B9775 and remove the 0x to leave a hexadecimal hash string, I could plug this term into a website like pastebin.com to see if the hash has been cracked:

Convert the hexadecimal string to a clear text string

As we see, the plaintext equivalent is 12345. Now the hacker has access to 223 accounts with a simple query to determine the usernames using this password:

SELECTUserName 
FROMdbo.Auth_Hashed 
WHEREHashedPasswd = 0xC31AC605793F580B386C0FB53F1B9775

User Names with a Hashed Password that matches 1 2 3 4 5

It is, of course possible to perform this attack in the opposite direction. Let’s compute the hash for the password ‘letmein’ and see if it matches any of the password hashes in the table:

SELECTUserName, HashedPasswd 
FROMdbo.Auth_Hashed 
WHEREHashedPasswd = HASHBYTES('MD5', CONVERT(NVARCHAR(4000),'letmein'))

User Name and Hashed Password with the value of letmein

To overcome this deficiency, it has become common practice to salt a password. What is salting? It refers to the practice of modifying a string in a deterministic way i.e. by concatenating another string to it, or inserting/replacing characters in the string according to rules. Salting a password is typically done in two ways – either a unique randomly-generated salt can be used alongside the username/password credentials, or a single salt can be used, defined in the application-side source code.

Let’s take our original plaintext table and create another table, generating a random salt for each password along the way and hashing this, so we can examine if this method is secure. For simplicity, we’ll generate a random integer in the range 0-99999 to act as the salt, although in reality salts can be much longer and more complex strings, and simply append this integer to the end of the string.

;WITH generateSalt AS (
SELECTUserName, passwd, CAST(ABS(CHECKSUM(NEWID())) % 100000 AS NVARCHAR(6)) [salt], UserID 
FROMdbo.Auth_Basic ) 
SELECTUserName, HASHBYTES('MD5',CONVERT(NVARCHAR(4000),passwd + salt)) [HashedSaltedPwd], UserID 
INTOdbo.Auth_Hashed_Salted 
FROMgenerateSalt 
SELECT * FROM dbo.Auth_Hashed_Salted

This approach adds complexity for the hacker. Repeating his frequency analysis on this new table isn’t going to work, since even those passwords that are identical will have different hashes. Let’s try it now:

SELECTHashedSaltedPwd, COUNT(*) 
FROMdbo.Auth_Hashed_Salted 
GROUPBY HashedSaltedPwd 
ORDERBY COUNT(*) DESC

Number of occurences of the same password has dropped with the salt

We can see we have just one collision, where an identical salt has been generated for an identical password. All other hashed, salted passwords are now unique. So what can the hacker do to counter this? One approach is still possible, but more time-consuming – the hacker knows the range of salts is 0-99999 from the table data, and can speculate that a simple method like appending it to the password might be used. Let’s take ten common passwords and compute the hashes for them with all salts ranging from 0-99999 appended, then join this result set on to our hashed, salted table to see if any passwords can be identified. We can do all of this in SQL. Let’s first generate the hash list. 10 passwords x 100,000 salt values = 1,000,000 hashes to check and put this information into the table #computedHashes:

IF EXISTS ( SELECT name FROM tempdb.sys.tables WHERE name LIKE ('%computedHashes%') )
DROP TABLE #computedHashes 
GO
CREATE TABLE #computedHashes ( 
ptPwd VARCHAR(50), 
salt INT, 
hsPwd VARBINARY(MAX) )
;WITH top10pwds AS ( 
SELECT 'password' [pwd] UNION SELECT '123456' UNION SELECT '12345678' UNION SELECT '1234' UNION SELECT 'qwerty' 
UNION SELECT '12345' UNION SELECT 'dragon' UNION SELECT 'letmein' UNION SELECT 'abc123' UNION SELECT 'admin' ), 
-- on-the-fly numbers table
salts AS ( 
SELECT TOP 100000 ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) [num] FROM sys.objects o1 CROSS JOIN sys.objects o2 CROSS JOIN sys.objects o3 )
INSERT INTO #computedHashes 
SELECTtop10pwds.pwd, salts.num [salt], HASHBYTES('MD5',CONVERT(NVARCHAR(4000),pwd + CAST(salts.num AS VARCHAR(6)))) [hash]
FROMtop10pwds
CROSSJOIN salts 
ORDERBY top10pwds.pwd, salts.num

Here’s what the #computedHashes table contains:

SELECT TOP 1000 * FROM #computedHashes

Here is what the #computedHashes table contains

It is now trivial to join this information back to our dbo.Auth_Hashed_Salted table, selecting the username and ptPwd (plaintext password) where the calculated hash matches the hash we have stored:

SELECTauth.Username, comp.ptPwd [Password], auth.UserID 
FROMdbo.Auth_Hashed_Salted auth 
INNERJOIN #computedHashes comp ON auth.HashedSaltedPwd = comp.hsPwd

Joining back to get the original user name, password and user id

We can see we have broken 2,081 user passwords from our data table of 10,001 in this manner.

IMPORTANT! Even if we are not using integer-based salts, we can simply select all distinct salts from the table and apply each salt to each password we are testing. For this reason, storing the salt in the table may not be a great idea. If we do not know the cryptographic function, we can try a range. If we do not know how the salt has been applied, we can guess at some common techniques – i.e. applying the salt to the beginning or end of a password. Nor are potential passwords limited to 10 – within a click or two of the right Google search, it is possible to download millions of password samples.

This is computationally expensive, but only the result of the Cartesian products of all of these factors:

Check operations = Number of passwords to check * number of different cryptographic schemes * number of distinct salts known * number of different salt application techniques

Instead, some developers choose to store the salt in the application-side code. This has one major benefit and one major drawback – the benefit is that the source code (or server-side code) would have to be compromised to access the salt value, which would probably present many other, more urgent problems if that occurred, and can be difficult to do. The drawback is that using this approach, typically a single salt value is used for all passwords. If this value is compromised, all passwords could be at risk of compromise.

Other theoretical attack vectors include packet sniffing. Is the application-side code using SSL to communicate with the database? If not, why not? SQL Server supports key/certificate-based authentication with client connections. Without this, a successful login attempt including username, user-entered password and salt could be sniffed en-route.

So what lessons can we learn from this demonstration?

Strong passwords make dictionary-like attacks less likely. Make sure your users adhere to password policies to create strong passwords. Enforce this at the client-side if necessary.
Consider the role of table-level security in SQL Server. Ensure your users only have the access they need.
Consider options like row-level security where available to you.
Consider storing your username and password information in separate tables or databases.
Consider the use of SSL for database traffic encryption during user authentication.
Never store plaintext passwords in the database.
Consider storing your salts separately from your passwords or obfuscate these salts through a hidden, reversible methodology (i.e. an application-side mathematical function).

Next Steps

Karwin, B. (2010) SQL Antipatterns, Publisher: Pragmatic Programmers, ISBN 978-1-93435-655-5
Sergey Gigoyan: Storing passwords in a secure way in a SQL Server database
Manvendra Singh: Identify blank and weak passwords for SQL Server logins
Brent Shaub: SQL Server stored procedure to generate random passwords

Derek Colley

Dr. Derek Colley is the lead technical consultant for Optimal Computing, a boutique data consultancy based in the UK NW helping clients troubleshoot, configure, plan and deploy their data platforms. He specializes in data integration and database performance optimization, and is a PhD candidate with Staffordshire University, looking into methods of integrating AL query selection algorithms with the cost-based query optimizer. In his spare time, he enjoys building robots, going outdoors, and spending time with his four children.

MSSQLTips Awards: Trendsetter (25+ tips)

To make the database password more secure in the event of a database hack, we can create several columns with the email address and password (the email address should also be encrypted in my opinion), e.g.: columns from email_1 to email_7 and columns from password_1 to password_7. The password can be stored in one of the email columns and the email address in one of the password columns. In the remaining six we can store an encrypted random string of characters. Secondly, when hashing, we can use various hashing functions, e.g. our password = $password,

$password = hash(‘sha384’, $password);
$password = hash(‘sha256’, $password);
$password = hash(‘sha512’, $password);
$password = hash(‘guest’, $password);
$h1 = hash(‘md5’, $password);
$h2 = hash(‘md5’, $h1);
$h3 = hash(‘md5’, $h2);
$h4 = hash(‘md5’, $h3);
$hash_db = $h1 . $h2 . $h3 . $h4;

We store $hash_db in the database. The hash is 128 characters long, it looks like the sha512 algorithm is used.

2 Comments

Joel
November 15, 2023 / 11:49 pm Reply
To make the database password more secure in the event of a database hack, we can create several columns with the email address and password (the email address should also be encrypted in my opinion), e.g.: columns from email_1 to email_7 and columns from password_1 to password_7. The password can be stored in one of the email columns and the email address in one of the password columns. In the remaining six we can store an encrypted random string of characters. Secondly, when hashing, we can use various hashing functions, e.g. our password = $password,
$password = hash(‘sha384’, $password);
$password = hash(‘sha256’, $password);
$password = hash(‘sha512’, $password);
$password = hash(‘guest’, $password);
$h1 = hash(‘md5’, $password);
$h2 = hash(‘md5’, $h1);
$h3 = hash(‘md5’, $h2);
$h4 = hash(‘md5’, $h3);
$hash_db = $h1 . $h2 . $h3 . $h4;
We store $hash_db in the database. The hash is 128 characters long, it looks like the sha512 algorithm is used.
Mahesh
January 13, 2022 / 9:13 pm Reply
This. Was nice explanation, thanks

2 Comments

Leave a ReplyCancel Reply