Identifying PII Data to Lock Down in SQL Server (Part 1 of 2)
We're pretty confident that we have locked down and encrypted our financial data, but a lot of our customer's PII (Personally Identifiable Information) data is still held in unencrypted form. This data is able to be selected directly by read only business users on many of our downstream reporting, datawarehouse and standby servers. The rise of identity theft makes protecting this data imperative. DBAs are the custodians of this information and must protect it like we protect our own personal information. Recent publicity over the theft of Sony PSN data underscores both the economic and ethical importance of protecting personal data.
In Part 1 of this series I will provide a quick way to identify potential PII information in your databases. In Part 2, I will present a method to lock down access to your PII data while still retaining user access to other non PII fields.
Personally Identifiable Information is defined as information that can be used to distinguish or trace a person's identity. Examples include social security number, medical records, or information that, when combined or used with other identifying information, is linked or linkable to a specific individual.
If we can identify all the columns in our databases that hold PII data then we can plan a suitable lockdown strategy.
Get the PII columns
The simple script below will iterate through each database and use wildcards to search for column names that could be PII data. We are specifically concerned with first name, last name, addresses, gender, SSN, email and other common PII sounding fields.
The script below can be edited to suit your environment and the nature of the PII data that you retain.
-- script examines all columns in all DBs for PII sounding names EXEC sp_msforeachDB 'USE [?] SELECT db_name() as [DB], object_name(o.object_id) AS [Object], c.name AS [Column] FROM sys.columns c join sys.objects o ON c.object_id = o.object_id WHERE o.type IN (''U'',''V'') /* tables and views */ AND (c.name like ''%mail%'' OR c.name like ''%first%name%'' OR c.name like ''%last%name%'' OR c.name like ''%birth%'' OR c.name like ''%sex%'' OR c.name like ''%address%'' OR c.name like ''%phone%'' OR c.name like ''%social%'' OR c.name like ''%ssn%'' OR c.name like ''%gender%'') AND db_name() NOT IN (''msdb'',''tempdb'',''master'')'
Here is sample output from the AdventureWorksDW2008 database.
As can be seen from the results we have identified potential columns which contain PII data and should be locked down from users running SELECT queries.
- Run this script on all your SQL Servers and review the results. You may be surprised at the number of places where PII data is stored and needs to be "locked down".
- Watch out for "Identifying and locking down access to PII Data - Part 2"
- Review these other great tips on securing SQL Server data:
About the author
View all my tips