Problem
As privacy regulations tighten like General Data Protection Regulation (GDPR), Health Insurance Portability and Accountability Act (HIPAA), Payment Card Industry Data Security Standards (PCI DSS) organizations and more, there is an increased focus to protect sensitive information within databases. Tokenization is an option to adhere to those regulations. Let’s see how to implement SQL tokenization in SQL Server.
Solution
Tokenization is a powerful mechanism often overlooked and unlike encryption, which mathematically scrambles data. It replaces sensitive information with meaningless placeholders called tokens, storing the original data securely elsewhere. Tokenization complements other security techniques and helps organizations meet compliance while reducing data exposure. A well-designed vault and strict controls make it a reliable data protection strategy.
Terms and definitions
Tokenization
Tokenization is a data protection technique that substitutes sensitive data with a non-sensitive substitute value. The values called tokens carry no exploitable data. The mapping between the token and the original data is stored in a secure, access-controlled token vault. SQL Tokenization in SQL Server is useful for storing payment data, anonymizing personal health information, protecting personal identifiers like names, phone numbers, SSNs, etc. and securing audit trails.
Anonymization
Anonymization is the technique to transform personal data into anonymous information in such a way that a data subject can no longer be identified directly or indirectly.
Encryption
Encryption is a technique to encode data using an algorithm and key. Using encryption is flexible but exposes the original data if the key is compromised. Encryption implementations require encryption and decryption when data is accessed, which uses additional CPU and memory resources.
Personally Identifiable Information (PII)
Personally Identifiable Information (PII) is any data that can directly or indirectly identify a person or a household.
Sensitive Personal Information
Sensitive Personal Information is any information that can determine personal preferences, racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, genetic data, health-related data, and data concerning a person´s sexual orientation.
Benefits of tokenization:
- Data Security – Removes all sensitive data elements from databases reducing the vulnerability for data breaches.
- Digital Payments – A tokenization system can substitute a payment token for credit card account or a primary account number reducing fraud.
- Natural Language Processing – Tokenization is a preprocessing technique that processes text in linguistic units transforming into smaller tokens that algorithms can understand.
Tokenization process:
- Identification: Identify the sensitive data that needs tokenization.
- Generation: Generates a random token.
- Mapping: The generated token and the original data are mapped to each other in a secure token database called a token vault which is encrypted and store separately from the main database. It also can be saved in a third-party tokenization provider.
- Replacement: The sensitive data is replaced by the token.
- Storage: The original data is securely stored in the token vault, accessible only through the tokenization system.
Types of tokens:
- Irreversible: Data cannot be converted back to their original data values.
- Reversible: Through detokenization the data can be converted back into their original values.
- Format-preserving: Consistent format as the replaced data.
Tokenization approaches:
- Vault-base: A database holds the token and its original sensitive value. The vault must be highly protected, each tokenization requires a database lookup, and should be optimized. It is used in highly regulated environments.
- Deterministic: An algorithm is used to always produce the same token for the same input. It is less secure than a randomized value due to frequency analysis and dictionary attacks. It is used for tokens that must be searchable or joinable.
- Randomized: Produces a different token each time and requires secure lookup for detokenization.
- Custom Functions: Defined by the tokenization logic tailored to your needs. It can have higher development cost, harder to audit or maintain, and risk of weak security, if poorly implemented.
Recommendations when designing a tokenized database:
- Data Scope: Identify what data needs to be tokenized, which data elements are regulated, where the data will be stored, who will access the data and why.
- Strategy: Choose your tokenization method as outlined above.
- Vault: Choose a method to encrypt the real values. Setup the vault in a different database, and if needed for auditing purposes, log each token generation event.
- Functions: Plan how to tokenize, detokenize and a purge process to remove all mappings. If needed, define how to retokenize the data which is a process to rotate or change the token mappings.
- Security: Build a process to encrypt the real values using always encrypted or transparent data encryption (TDE) and role-based access. Plan the access control of the vault enabling auditing and follow the best practices for security.
- Performance: Tune the detokenization process to improve response time.
- Regulatory: Ensure the tokenization process meets the necessary compliance regulations that apply to your business.
- Backup: Create a backup strategy for the token vault and its encryption keys. Test to validate the token mapping can be restored reliably.
- Integration: Tokenize the data during entry and detokenize the data only when explicitly authorized.
- Validation: Verify token consistency, permission controls, query correctness, encryption in backups, and simulate data breach scenarios to understand the potential data exposure.
SQL Server Setup
Encryption by certificate
The first step to do is to prepare our database to handle encryption. We will store the original value using Transparent Data Encryption (TDE). I chose to encrypt by certificate.
-- MSSQLTips (TSQL)
USE [MSSQLTips]; -- change to your DB name
GO
--Create database master key which will be used to encrypt the certificate private key
IF NOT EXISTS
(SELECT 1 FROM sys.symmetric_keys WHERE name = '##MS_DatabaseMasterKey##')
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MSSQLTips@2025';
-- Create the certificate which will be used to encrypt the symmetric key
IF NOT EXISTS
(SELECT 1 FROM sys.certificates WHERE [name] = 'MSSQLTipsCert')
CREATE CERTIFICATE MSSQLTipsCert
WITH SUBJECT = 'Certificate for Transparent Data Encryption';
-- Create the symmetric key
IF NOT EXISTS
(SELECT 1 FROM sys.symmetric_keys WHERE [name] = 'MSSQLTipsSymKey')
CREATE SYMMETRIC KEY MSSQLTipsSymKey
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE MSSQLTipsCert;
GO
Testing to validate it is working:
-- MSSQLTips (TSQL)
OPEN SYMMETRIC KEY MSSQLTipsSymKey DECRYPTION BY CERTIFICATE MSSQLTipsCert;
-- Encryption
DECLARE @Encrypted AS varbinary(MAX) = EncryptByKey(Key_GUID('MSSQLTipsSymKey'), 'MSSQLTips');
-- Decryption
SELECT @Encrypted AS vEncrypted, CONVERT(varchar(MAX), DecryptByKey(@Encrypted)) AS vDecrypted;
CLOSE SYMMETRIC KEY MSSQLTipsSymKey
GO
Verify the results:

Tables
Next, we will create the table token vault. Ideally, this table should be created in a separated database to improve logical isolation with different database-level access controls. For this tip, I will simplify the process and create it in the same database. Installing this table in a separate server gives a stronger network level isolation, which can be accessed using secure linked servers or APIs. Also, there is an option to install a third-party tokenization service where tokens are exchanged via secure HTTPS or gRPC.
-- MSSQLTips (TSQL)
CREATE TABLE [dbo].[TokenVault](
[TokenId] [uniqueidentifier] NOT NULL DEFAULT (newsequentialid()),
[TokenValue] [varbinary](max) NULL,
[CreatedAt] [datetime2](7) DEFAULT (sysutcdatetime()),
CONSTRAINT [PK_TokenVaultEncrypted] PRIMARY KEY CLUSTERED
(
[TokenId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
Now I will create a table that will hold Patient data for a fictious Healthcare system:
-- MSSQLTips (TSQL)
CREATE TABLE [dbo].[TokenPatients](
[PatientId] [int] IDENTITY(1,1) NOT NULL,
[FullName] [uniqueidentifier] NULL,
[SSN] [uniqueidentifier] NULL, -- Social security number
[MRN] [uniqueidentifier] NULL, -- Medical Record Number
[DOB] [uniqueidentifier] NULL, -- Date of Birth
[IsInactive] [bit] DEFAULT ((0)),
PRIMARY KEY CLUSTERED
(
[PatientId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
User Defined Function
This function will retrieve individual encrypted values based on the TokenId.
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20250703
-- Description: Token value from Token Vault
-- =============================================
CREATE OR ALTER FUNCTION [dbo].[ufnTokenGetValue]
(@TokenId uniqueidentifier)
RETURNS varbinary(MAX)
AS
BEGIN
DECLARE @TokenValue varbinary(MAX);
SELECT @TokenValue = [TokenValue]
FROM [dbo].[TokenVault]
WHERE [TokenId] = @TokenId;
RETURN @TokenValue;
END
GO
Stored Procedures
This stored procedure will handle the sensitive information, encryption and tokenization process. I will use the following rule for the value of the PatientId:
- If PatientID = zero it is a new record to insert
- If PatientID is greater than zero update an existing record
- If PatientID is negative deactivate the record, since deletions are not recommended
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20250703
-- Description: Token Patients
-- =============================================
CREATE OR ALTER PROCEDURE [dbo].[uspTokenPatients]
(@PatientId int
,@FullName nvarchar(120)
,@SSN nvarchar(50)
,@MRN nvarchar(50)
,@DOB nvarchar(50))
WITH EXECUTE AS CALLER
AS
BEGIN
SET NOCOUNT ON;
IF @FullName IS NULL OR LEN(@FullName) = 0
RETURN;
BEGIN TRY
-- Data encryption to be stored at the Token Vault
OPEN SYMMETRIC KEY MSSQLTipsSymKey DECRYPTION BY CERTIFICATE MSSQLTipsCert;
DECLARE @FullNameBin varbinary(MAX) = EncryptByKey(Key_GUID('MSSQLTipsSymKey'), @FullName);
DECLARE @SSNBin varbinary(MAX) = EncryptByKey(Key_GUID('MSSQLTipsSymKey'), @SSN);
DECLARE @MRNBin varbinary(MAX) = EncryptByKey(Key_GUID('MSSQLTipsSymKey'), @MRN);
DECLARE @DOBBin varbinary(MAX) = EncryptByKey(Key_GUID('MSSQLTipsSymKey'), @DOB);
CLOSE SYMMETRIC KEY MSSQLTipsSymKey
-- Update existing patient
IF @PatientId > 0 BEGIN
DECLARE @FullNameId uniqueidentifier
,@SSNId uniqueidentifier
,@MRNId uniqueidentifier
,@DOBId uniqueidentifier;
SELECT @FullNameId = [FullName]
,@SSNId = [SSN]
,@MRNId = [MRN]
,@DOBId = [DOB]
FROM [dbo].[TokenPatients]
WHERE [PatientId] = @PatientId;
IF [dbo].[ufnTokenGetValue] (@FullNameId) <> @FullNameBin
UPDATE [dbo].[TokenVault]
SET [TokenValue] = @FullNameBin
WHERE [TokenId] = @FullNameId;
IF [dbo].[ufnTokenGetValue] (@SSNId) <> @SSNBin
UPDATE [dbo].[TokenVault]
SET [TokenValue] = @SSNBin
WHERE [TokenId] = @SSNId;
IF [dbo].[ufnTokenGetValue] (@MRNId) <> @MRNBin
UPDATE [dbo].[TokenVault]
SET [TokenValue] = @MRNBin
WHERE [TokenId] = @MRNId;
IF [dbo].[ufnTokenGetValue] (@DOBId) <> @DOBBin
UPDATE [dbo].[TokenVault]
SET [TokenValue] = @DOBBin
WHERE [TokenId] = @DOBId;
END
-- Inserting a new patient
IF @PatientId = 0 BEGIN
DECLARE @OutputTable
TABLE (Token uniqueidentifier);
INSERT INTO [dbo].[TokenVault]
([TokenValue])
OUTPUT INSERTED.TokenId
INTO @OutputTable
VALUES (@FullNameBin);
INSERT INTO [dbo].[TokenVault]
([TokenValue])
OUTPUT INSERTED.TokenId
INTO @OutputTable
VALUES (@SSNBin);
INSERT INTO [dbo].[TokenVault]
([TokenValue])
OUTPUT INSERTED.TokenId
INTO @OutputTable
VALUES (@MRNBin);
INSERT INTO [dbo].[TokenVault]
([TokenValue])
OUTPUT INSERTED.TokenId
INTO @OutputTable
VALUES (@DOBBin);
SELECT @FullNameId = (SELECT Token
FROM @OutputTable
ORDER BY (SELECT NULL) OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY)
,@SSNId = (SELECT Token
FROM @OutputTable
ORDER BY (SELECT NULL) OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY)
,@MRNId = (SELECT Token
FROM @OutputTable
ORDER BY (SELECT NULL) OFFSET 2 ROWS FETCH NEXT 1 ROWS ONLY)
,@DOBId = (SELECT Token
FROM @OutputTable
ORDER BY (SELECT NULL) OFFSET 3 ROWS FETCH NEXT 1 ROWS ONLY);
INSERT INTO [dbo].[TokenPatients]
([FullName]
,[SSN]
,[MRN]
,[DOB])
VALUES (@FullNameId
,@SSNId
,@MRNId
,@DOBId);
END
-- Delete patient
IF @PatientId < 0 BEGIN
UPDATE [dbo].[TokenPatients]
SET [IsInactive] = ~[IsInactive]
WHERE [PatientId] = ABS(@PatientId);
END
END TRY
BEGIN CATCH
IF @@TRANCOUNT > 0
BEGIN
ROLLBACK TRANSACTION;
END
-- Print error information.
PRINT 'Error: ' + CONVERT(varchar(50), ERROR_NUMBER()) +
', Severity: ' + CONVERT(varchar(5), ERROR_SEVERITY()) +
', State: ' + CONVERT(varchar(5), ERROR_STATE()) +
', Procedure: ' + ISNULL(ERROR_PROCEDURE(), '-') +
', Line: ' + CONVERT(varchar(5), ERROR_LINE());
PRINT ERROR_MESSAGE();
END CATCH;
END
GO
This stored procedure will return the real values for a Patient:
-- =============================================
-- Author: SCP - MSSQLTips
-- Create date: 20250703
-- Description: Patient real values
-- =============================================
CREATE OR ALTER PROCEDURE [dbo].[uspTokenPatientChk]
(@PatientId int)
WITH EXECUTE AS CALLER
AS
BEGIN
SET NOCOUNT ON;
OPEN SYMMETRIC KEY MSSQLTipsSymKey DECRYPTION BY CERTIFICATE MSSQLTipsCert;
SELECT [PatientId]
,CONVERT(nvarchar(MAX), DecryptByKey([dbo].[ufnTokenGetValue] ([FullName]))) AS [FullName]
,CONVERT(nvarchar(MAX), DecryptByKey([dbo].[ufnTokenGetValue] ([SSN]))) AS [SSN]
,CONVERT(nvarchar(MAX), DecryptByKey([dbo].[ufnTokenGetValue] ([MRN]))) AS [MRN]
,CONVERT(nvarchar(MAX), DecryptByKey([dbo].[ufnTokenGetValue] ([DOB]))) AS [DOB]
,[IsInactive]
FROM [dbo].[TokenPatients]
WHERE [PatientId] = @PatientId;
CLOSE SYMMETRIC KEY MSSQLTipsSymKey
END
GO
Testing the solution
Insert data for a new patient:
-- MSSQLTips (TSQL)
EXEC [dbo].[uspTokenPatients]
@PatientId = 0,
@FullName = N'George Washington',
@SSN = N'123-45-1111',
@MRN = N'55VT32',
@DOB = N'1732-02-22';
GO
SELECT * FROM [dbo].[TokenPatients]
GO
SELECT * FROM [dbo].[TokenVault]
GO
Resulting in:

Checking the values:
-- MSSQLTips (TSQL)
EXEC [dbo].[uspTokenPatientChk]
@PatientId = 1
GO
Resulting in:

Changing the SSN value for a Patient:
-- MSSQLTips (TSQL)
EXEC [dbo].[uspTokenPatients]
@PatientId = 1, -- Type the Patient Id number to be updated
@FullName = N'George Washington',
@SSN = N'123-45-6789', -- Change from 123-45-1111
@MRN = N'55VT32',
@DOB = N'1732-02-22';
GO
EXEC [dbo].[uspTokenPatientChk]
@PatientId = 1
GO
Resulting in:

About the tables
Token Patients
-- MSSQLTips (TSQL)
SELECT * FROM [dbo].[TokenPatients]
GO
This table contains mapping values and is not a concern if the table is compromised because it shows only references to the real values stored in the secure vault.

Token Vault
-- MSSQLTips (TSQL)
SELECT * FROM [dbo].[TokenVault]
GO
This tables contains a reference for an encrypted value and if stored in another database or location, it does not have the encryption certificate to decrypt it, and it is secure if compromised.

Next Steps
- Read these other articles related to Encryption in SQL Server
- https://en.wikipedia.org/wiki/Tokenization_(data_security)