SQL Server Integration Services 2016 Lookups With Always Encrypted Data
I have a table with data encrypted by the Always Encrypted feature of SQL Server 2016. I need to use this table as a lookup table in Integration Services (SSIS), but I can't make this work. How can I solve this issue?
SQL Server 2016 Preview
At the time of writing, SQL Server 2016 is still in preview (currently CTP 3.2 has been released). This means functionality or features of the Always Encrypted or Integration Services might change, disappear or be added in the final release.
For an introduction to Always Encrypted, check out the tip SQL Server 2016 Always Encrypted by Aaron Bertrand. Also, this tip uses the same test set-up as the tip Reading Always Encrypted Data with Integration Services 2016 and the tip Writing Always Encrypted Data with Integration Services 2016. I recommend you read those tips first, as they outline the various steps needed to interact with data encrypted by Always Encrypted. To summarize, you need:
- a connection to be made with the additional connection property column encryption setting=enabled
- the column master key, which was created in the database, to be imported into the local certificate store
We will use the same table, Sales.CustomerPII, of the AdventureWorks2016CTP3 sample database, which can be downloaded on Codeplex. However, this time we will read a sample line from this table, perform a lookup on the same table to retrieve sensitive data and then write the results to another table, which is also encrypted by Always Encrypted. Don't forget to download the sample scripts as well, as they contain the private key file needed to interact with the Always Encrypted data.
You can use the following script to create a copy of the Sales.CustomerPII table, albeit with less columns.
USE [AdventureWorks2016CTP3] GO CREATE TABLE [Sales].[CustomerPIICopy_Encrypted]( [CustomerID] [int] NOT NULL, [SSN] [nvarchar](11) COLLATE Latin1_General_BIN2 ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = [CEK_Auto1] ,ENCRYPTION_TYPE = Deterministic ,ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL, [CreditCardNumber] [nvarchar](25) COLLATE Latin1_General_BIN2 ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = [CEK_Auto1] ,ENCRYPTION_TYPE = Deterministic ,ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL );
The binary collations for the encrypted columns are a prerequisite of Always Encrypted. The following statement will be used to extract one single CustomerID from the original Sales.CustomerPII table.
SELECT TOP 1 [CustomerID] FROM [Sales].[CustomerPII];
Now we will try to use the Sales.CustomerPII table as a lookup table to retrieve the SSN and CreditCardNumber columns. Then the results will be written to the Sales.CustomerPIICopy_Encrypted table.
First attempt: using the Lookup with Full Cache
Let's try to use a Lookup component configured with a full cache. The following query is used to retrieve the lookup data:
SELECT [CustomerID] ,[SSN] ,[CreditCardNumber] FROM [Sales].[CustomerPII];
The Lookup columns are configured as follows:
As explained in the tip Writing Always Encrypted Data with Integration Services 2016, an ADO.NET destination needs to be used to write data to the encrypted table. The resulting data flow looks like this:
But, when we run the package, we get the following error:
The reason for this is that the Lookup component doesn't support ADO.NET connections, so we were forced to use an OLE DB connection. But OLE DB connections don't support Always Encrypted. The Lookup component runs successfully, but it returns encrypted data (using a byte array) instead of the actual textual data. When we try to write this data to the destination, a conversion error is thrown.
Fortunately, the Lookup component doesn't only support OLE DB connections, but also cache connection managers.
Second attempt: using a cache connection manager
The concept of cache connection managers is explained in the tip Lookup and Cache Transforms in SQL Server Integration Services. First we need to retrieve the lookup data and write it to a cache transform. The data flow looks like this:
An ADO.NET source needs to be used to retrieve the decrypted data from SQL Server. This means that the data is vulnerable inside the SSIS data flow though. The cache connection manager has the following configuration:
The cache connection manager doesn't use a file cache. The cache transform has the following column mappings:
In the lookup component, a cache connection is now chosen:
In the connection pane, you simple need to choose the connection manager created earlier. The column mappings stay the same. When we now run the data flow, the cache is first populated with decrypted data. Then the data is read from the source and matched against this cache. The lookup columns are now fetched in their decrypted state and written to the destination by the ADO.NET destination component. While writing to the destination table, the columns are encrypted again.
We can verify in the destination table that the data arrived correctly.
Using a Lookup transformation in SSIS like you normally would is not possible when working with Always Encrypted data. This is because the Lookup component doesn't support ADO.NET connections, which are the only type of connections supporting Always Encrypted. However, using a work around with cache connections, we can still use lookups with encrypted data.
- For more information on Always Encrypted, check out the following tips:
- For more information on using cache connections in SSIS, please refer to the tip Lookup and Cache Transforms in SQL Server Integration Services.
- Don't forget to download the sample databases and the sample scripts from Microsoft to get you started.
- For more SQL Server 2016 tips, you can use this overview.
About the author
View all my tips