SQL Server Dynamic Data Masking in Practice

By:   |   Updated: 2024-01-08   |   Comments (1)   |   Related: > Security


Problem

Our organization needs to protect sensitive SQL Server data for specific users in our applications. How can we universally guarantee data protection whether the data is accessed in one of our business applications or an IT team member queries the data from SSMS?

Solution

Microsoft SQL Server Dynamic Data Masking (DDM) is a security feature that hides sensitive data when queried without changing the underlying data with any coding changes. Microsoft SQL Server Dynamic Data Masking includes five types of data masking functions at a column level:

  • Default
  • Email
  • Random
  • Custom String
  • Datetime

SQL Server Dynamic Data Masking Demonstration

Let's start with creating a test environment:

CREATE DATABASE TestDB
GO
 
USE TestDB
GO
 
CREATE TABLE Employee
(  
   ID INT IDENTITY (1,1) NOT NULL PRIMARY KEY CLUSTERED,
   FirstName VARCHAR(50),
   LastName VARCHAR(50),
   Email VARCHAR(100) MASKED WITH (FUNCTION = 'email()') NULL,
   WorkPhoneNumber VARCHAR(50) MASKED WITH (FUNCTION = 'partial(1,"XXXXXXX",0)'),
   ServicePeriodInYears INT MASKED WITH (FUNCTION = 'random(100, 300)'),
   BirthDate DATE MASKED WITH (FUNCTION = 'datetime("Y")') NULL,
   Salary MONEY MASKED WITH (FUNCTION = 'default()')
)
GO
 
-- Insert random data into the Employee table
INSERT INTO Employee (FirstName, LastName, Email, WorkPhoneNumber, ServicePeriodInYears, BirthDate, Salary)
VALUES
    ('John', 'Doe', '[email protected]', '1234567', 5, '1990-05-15', 55000.00),
    ('Jane', 'Smith', '[email protected]', '7654321', 8, '1985-11-22', 65000.00),
    ('Michael', 'Johnson', '[email protected]', '2345678', 3, '1995-07-10', 48000.00),
    ('Aram', 'Melikyan', '[email protected]', '8765432', 6, '1993-02-18', 60000.00),
    ('Sos', 'Sargsyan', '[email protected]', '3456789', 10, '1980-08-30', 75000.00),
    ('Ann', 'Petrosyan', '[email protected]', '1234567', 5, '1990-05-15', 55000.00);
 
GO

The code above creates a test database – TestDB - and a sample table – Employee - and inserts demo data. As we can guess from the table definition, we applied masking rules for five columns. We specially chose five to illustrate all five masking functions.

So, let's assume employees' emails, work phone numbers, service years in the company, birthdates, and salaries should be hidden for some users. Now, let's query the table using the same user-creator of the table:

SELECT *
FROM Employee

As we can see, nothing is hidden, and we see all values:

All values show

This is because we ran the code under the user who has no restrictions in viewing sensitive masked data.

Next, we will create a test user who has only SELECT permissions on the table and will run the same SELECT statement under that user:

--TestUser
CREATE USER TestUser WITHOUT LOGIN
 
GRANT SELECT ON Employee TO TestUser
  
EXECUTE AS USER = 'TestUser'
 
SELECT *
FROM Employee
 
REVERT -- reverts back to logged in user

Let's explore the result:

Values are not original and have different formats

We can see that the values in the masked columns do not contain the original values and have different formats.

Let's review all masked columns from the left to the right:

  • We can see that the values in the Email column have [email protected] format instead of containing the actual email addresses. We used the email() masking function on the Email column to hide emails this way.
  • Next, for the WorkPhoneNumber column, we used the Custom String masking function and partially masked it, leaving only the first number visible. We can adjust the visibility as well as the padding symbol by providing the corresponding parameters-prefix, padding, and suffix.
    • MASKED WITH (FUNCTION = 'partial(prefix,[padding],suffix)')
  • For the ServicePeriodInYears column, we can see the strange, unrealistic numbers (>100) for the service period. The reason for this is that we used the random masking function for this column to generate random numbers from 100 to 300 instead of the real values.
    • MASKED WITH (FUNCTION = 'random(100, 300)')
  • Looking at the BirthDate column, we notice that 2000 is shown for all employees' birth year. This is the result of the date masking function applied to that column. It is possible to mask other date parts based on the function parameters.
    • MASKED WITH (FUNCTION = 'datetime("Y")'
  • Finally, we can see that 0.00 appears as a value in the Salary column for all employees. This is because the default masking uses 0 for numeric datatypes. It uses other default values for strings, dates, and binary types.

Suppose we want to make the Salary data visible to the accountant. In the code below, we grant the corresponding unmask permission on the Salary column to the imaginary accountant's user-TestAccountant:

--Accountant
CREATE USER TestAccountant WITHOUT LOGIN
 
GRANT SELECT ON Employee TO TestAccountant
 
GRANT UNMASK ON Employee(Salary) TO TestAccountant
  
EXECUTE AS USER = 'TestAccountant'
 
SELECT *
FROM Employee
 
REVERT

Thus, we can see that for the accountant, employees' salaries are visible, but the remaining masked columns are still masked as expected:

Only shows values for TestAccountant

Maybe we want to make the salaries, service periods, work phone numbers, and emails visible to the managers:

--Manager
CREATE USER TestManager WITHOUT LOGIN
 
GRANT SELECT ON Employee TO TestManager
 
GRANT UNMASK ON Employee(Salary, ServicePeriodInYears, WorkPhoneNumber, Email) TO TestManager
  
EXECUTE AS USER = 'TestManager'
 
SELECT *
FROM Employee
 
REVERT

The result shows that all the columns mentioned above are visible to the TestManager user, and only the BirthDate data appears in the masked format:

Only shows values for TestManager

Hence, we explored how dynamic data masking works in practice with these simple examples.

Conclusion

To protect sensitive data and control who can see it, we use Dynamic Data Masking (DDM) in SQL Server. DDM is a straightforward way to limit access to confidential information for users without special permissions. It doesn't require complex coding, making it user-friendly. It keeps sensitive data hidden in query results without changing the actual database, providing a practical solution for data protection in various applications.

Next Steps

For additional information, please follow the links below:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Sergey Gigoyan Sergey Gigoyan is a database professional with more than 10 years of experience, with a focus on database design, development, performance tuning, optimization, high availability, BI and DW design.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2024-01-08

Comments For This Article




Monday, January 8, 2024 - 9:16:53 PM - John Back To Top (91844)
I see an error when running the create table script.

> Invalid data masking function in column 'BirthDate'

I'm on W11, SQL2022 Dev Edition.














get free sql tips
agree to terms