New Command in SQL Server 2019 ADD SENSITIVITY CLASSIFICATION


By:   |   Updated: 2018-10-12   |   Comments (3)   |   Related: More > SQL Server 2019


Problem

Microsoft added a task in SQL Server Management Studio 17.5 called "Classify Data." The purpose is to help identify columns that are potentially sensitive in nature, and that may need to be protected in order to remain compliant under various regulations, including SOX, HIPAA, PCI, and GDPR. It does a pretty good job of identifying vulnerable columns and helping you classify them depending on the risk and the type of exposure they present. But aside from showing you the results of this classification, this feature does not raise any additional visibility to these sensitive columns, never mind any suspicious access to them.

Solution

For background on the functionality introduced in SQL Server 2017 and SSMS 17.5, imagine we have this table:

CREATE TABLE dbo.Contractors
(
  ContractorID  int,
  FirstName     sysname,
  LastName      sysname,
  SSN           char(9),
  Email         varchar(320),
  PasswordHash  varbinary(256),
  HourlyRate    decimal(6,2)
);

When I run the wizard, I get the following recommendations:

Data sensitivity classification recommendations in SSMS 17.5­

You might not agree with all of the classifications, so you can change them, and even can add your own, specifying various information types and sensitivity levels. These are:

When you select the columns you want to classify, click Accept, and then Save, it adds extended properties around those columns. The report in SSMS uses those extended properties, and displays them as follows:

SQL Data Classification Report in SSMS 17.5

You could of course write your own app that consumes them, and maybe there are already tools out there that do, but nothing really happens here inside of SQL Server, except these extended properties exist and they surface the relevant columns on the report.

SQL Server SENSITIVITY CLASSIFICATION

In SQL Server 2019, there is a lot more automation built into the system, allowing you to easily add sensitive data labels to columns, which will get pulled into audits by default. Let's take the same table above, and add our own sensitivity classifications similar to the ones we added above with extended properties, using the ADD SENSITIVITY CLASSIFICATION command (already available in Azure SQL Database):

ADD SENSITIVITY CLASSIFICATION TO
    dbo.Contractors.FirstName, 
    dbo.Contractors.LastName
WITH (LABEL = 'Confidential - GDPR', INFORMATION_TYPE = 'Contact Info');

ADD SENSITIVITY CLASSIFICATION TO dbo.Contractors.SSN
WITH (LABEL = 'Highly Confidential', INFORMATION_TYPE = 'National ID');

ADD SENSITIVITY CLASSIFICATION TO
    dbo.Contractors.email, 
    dbo.Contractors.PasswordHash
WITH (LABEL = 'Confidential', INFORMATION_TYPE = 'Credentials');

ADD SENSITIVITY CLASSIFICATION TO dbo.Contractors.HourlyRate
WITH (LABEL = 'Highly Confidential', INFORMATION_TYPE = 'Financial');

These don't create extended properties, but rather we can see these in sys.sensitivity_classifications:

If we are creating an audit, we don't have to do anything special to pick up these classifications, except be auditing the table (in other words, existing audits will simply start inheriting these classifications as they are added). So, if we have a server audit:

USE master;
GO

CREATE SERVER AUDIT GDPRAudit TO FILE (FILEPATH = 'C:\temp\Audit\');
GO  

ALTER SERVER AUDIT GDPRAudit WITH (STATE = ON);
GO			

Then a database audit that is monitoring read activity on the table:

USE HR;
GO

CREATE DATABASE AUDIT SPECIFICATION AuditContractors
 FOR SERVER AUDIT GDPRAudit
 ADD (SELECT ON dbo.Contractors BY dbo) WITH (STATE = ON);

With the audit enabled, if we run a couple of queries:

SELECT * FROM dbo.Contractors;
SELECT FirstName, LastName, HourlyRate FROM dbo.Contractors;

We can observe access to specific types of information this way, with the new column data_sensitivity_information that is included in the audit:

SELECT
  session_server_principal_name,
  event_time,
 [host_name],
  [object] = [database_name] + '.' + [schema_name] + '.' + [object_name],
  [statement],
  data_sensitivity_information = CONVERT(xml, data_sensitivity_information)
FROM sys.fn_get_audit_file ('c:\temp\Audit\GDPRAudit_*.sqlaudit', default, default)
WHERE action_id = 'SL'; -- SELECT

Results:

Results of querying the audit log

You can click on any XML column value to see these results:

-- from the first row, with SELECT *:
<sensitivity_attributes>
  <sensitivity_attribute label="Confidential - GDPR" information_type="Contact Info" />
  <sensitivity_attribute label="Highly Confidential" information_type="National ID" />
  <sensitivity_attribute label="Confidential" information_type="Credentials" />
  <sensitivity_attribute label="Highly Confidential" information_type="Financial" />
</sensitivity_attributes>
 
-- from the second row, with specific columns:
<sensitivity_attributes>
  <sensitivity_attribute label="Confidential - GDPR" information_type="Contact Info" />
  <sensitivity_attribute label="Highly Confidential" information_type="Financial" />
</sensitivity_attributes>

It is interesting to note that FirstName and LastName combined to yield only one element in the XML. This means that whether you have one or 50 columns with a specific label and information type combination, it won’t add any unnecessary data to the audit. Anyway, for those of you that are XML-savvy, you can likely see how easy it would be to extract that information as nodes and then run queries against it, making it very easy to identify all the accesses to specific labels or label/type combinations.

I feel like there might be a market for a conversion tool that would translate these data classifications you may have already invested in (or built tools that consume) from extended properties to sensitivity classifications, or vice-versa. I'm not going to handle synchronization here, but I'll offer up a start, taking the extended properties and building dynamic statements to migrate them:

DECLARE @sql nvarchar(max) = N'';

SELECT @sql += N'ADD SENSITIVITY CLASSIFICATION TO ' 
  + QUOTENAME(s.name) + '.' + QUOTENAME(o.name) + '.' + QUOTENAME(c.name)
  + ' WITH (LABEL = ''' 
  + REPLACE(CONVERT(nvarchar(256), l.value), '''', '''''') 
  + ''', INFORMATION_TYPE = ''' 
  + REPLACE(CONVERT(nvarchar(256), t.value), '''', '''''') 
  + ''');' + CHAR(13) + CHAR(10)
FROM sys.extended_properties AS t
INNER JOIN sys.extended_properties AS l
ON t.class = l.class AND t.major_id = l.major_id AND t.minor_id = l.minor_id
INNER JOIN sys.objects AS o
ON t.major_id = o.[object_id]
INNER JOIN sys.columns AS c
ON t.major_id = c.[object_id]
AND t.minor_id = c.column_id
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
WHERE t.name = N'sys_information_type_name'
  AND l.name = N'sys_sensitivity_label_name';

PRINT @sql;

That's just a quick exercise. To keep things relatively consistent between the SSMS report and the audit (and any other tool that consumes either or both), and potentially between different versions of SQL Server in your environment, you'll also want to work out the opposite: how to convert sensitivity classifications to sp_addextendedproperty calls.

Summary

There is still going to be some subjectivity here; only you can truly know how sensitive each entity in your data model might be. But SQL Server continues to introduce innovate features that help you protect your sensitive data and remain compliant with various standards and regulations.

Next Steps

Read on for related tips and other resources:



Last Updated: 2018-10-12


get scripts

next tip button



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

View all my tips
Related Resources





Comments For This Article




Monday, September 14, 2020 - 8:41:43 AM - Donna Back To Top (86469)
Aaron,

thank you for an informative and useful article.

One question, if I may?

Do you happen to know where the rules for identifying Information Type and Recommended Label are stored? And if the base ruleset can be edited?

I need to apply different classifications e.g. Staff Ethnic Origin is OFFICIAL - MEDIUM and DPA SENSITIVE.

Cheers,
Donna Kelly

Tuesday, May 12, 2020 - 11:47:31 AM - Rob Back To Top (85636)

In the script to move extended_properties to SENSITIVITY CLASSIFICATION is a small mistake missing dots between names

it is 

SELECT @sql += N'ADD SENSITIVITY CLASSIFICATION TO '  + QUOTENAME(s.name) + QUOTENAME(o.name) + QUOTENAME(c.name)

and should be 

SELECT @sql += N'ADD SENSITIVITY CLASSIFICATION TO '  + QUOTENAME(s.name) +'.'+ QUOTENAME(o.name) +'.'+QUOTENAME(c.name)

Tuesday, April 02, 2019 - 8:14:38 PM - Joe McBrat Back To Top (79449)

Good article. Seems MS has a long way to go on this. I would like to customize my own categories CJIS, NFIRS and a few more I have to worry about and report on. The Alerting aspects is nice also. I am not using Azure, yet. So I will have to wait for some of these features. But this helps get me going.  



download





Recommended Reading

Overview of SQL Server 2019 Features

Resumable Online Index Create in SQL Server 2019

Accelerated Database Recovery in SQL Server 2019

Impact of UTF-8 support in SQL Server 2019

Memory-Optimized TempDB Metadata in SQL Server 2019








get free sql tips
agree to terms


Learn more about SQL Server tools