Using Azure Storage and Azure SQL Database for SSIS Development

By: Ray Barley   |   Related: > Azure Integration Services


Problem

I am starting a new data warehousing project for a new client where I will use SQL Server Integration Services (SSIS) for the Extract, Transform and Load (ETL) operations. The client wants to create a hybrid environment for development where I will only install SQL Server Data Tools (SSDT) and SQL Server Management Studio (SSMS) on my laptop.  The goal is to leverage Azure for file storage and the SQL Server databases. Since this is a development environment, I will use Azure SQL Database for the databases.  Ultimately my client wants to transition this ETL application to Azure Data Factory at some point in the future.  Can you walk me through the steps to accomplish this?

Solution

The following diagram shows the solution to be built in this tip:

Data architecture workflow demonstrated in this article

The main points are:

  • An On-Premises SSIS package extracts data from the Backoffice Web Application's Azure SQL Database and writes pipe-delimited files to Azure Blob Storage on a daily basis for the various types of transactions that are processed and of interest to the data warehouse
  • An On-Premises SSIS package reads the pipe-delimited files from Azure Blob Storage and inserts the records into staging tables in the Azure SQL Database

Note that for demonstration purposes, I'm creating two SSIS packages to walk through the scenario where the Backoffice Web Application and the ETL process are loosely integrated.  Certainly, I could create a single SSIS package that extracts data from the Backoffice Web Application's database and inserts that data into the staging database.  I want to demonstrate extracting data from an Azure SQL Database and writing to Azure Blob Storage as well as reading data from Azure Blob Storage and inserting into an Azure SQL Database.

Prerequisites

There are a few things that we need to take care of before we walk through building the on-premises SSIS packages:

  • Create Azure Subscription
  • Create Azure Storage Account
  • Create Azure SQL Database
  • Install SQL Server Data Tools (SSDT)
  • Install Azure Feature Pack for SQL Server Integration Services (SSIS)
  • Install SQL Server Management Studio (SSMS)

I will provide a brief description of the setup steps for each of these assuming that you are starting from scratch.

Create Azure Subscription

Go to https://azure.microsoft.com/ and set up an account.  You can start with a free trial subscription or sign up for a pay-as-you-go plan.  After you have an Azure Subscription, go to https://portal.azure.com/ and sign in.  You will see the following on the home page (just the relevant portion of the page is shown):

Azure Subscription

Create Azure Storage Account

Click on the Storage accounts icon on the Azure Home Page shown above to create an Azure Storage Account.  Follow the steps in Create a storage account to get your storage account created.  Make a note of the Storage account name and Container name; you will need them later.

Create Azure SQL Database

Click on the SQL databases icon on the Azure Home Page shown above to create an Azure SQL Database.  Follow the steps in Quickstart: Create a single database in Azure SQL Database using the Azure portal to get your Azure SQL Database created.  Make a note of the Server name, Database name, Server admin name and password when you create a new server and database; you will need them later. 

After you create your Azure SQL Database, make sure to Create a server-level firewall rule so you can access the database outside of the Azure portal.
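If you prefer a script over the portal, a server-level firewall rule can also be created with T-SQL by connecting to the logical server's master database as the server admin. This is a minimal sketch; the rule name and IP range below are placeholders for your own values:

-- Run in the master database of your Azure SQL logical server as the server admin.
-- The rule name and IP addresses are placeholders; use your own public IP address.
EXECUTE sp_set_firewall_rule
    @name = N'DevLaptop',
    @start_ip_address = '203.0.113.10',
    @end_ip_address = '203.0.113.10';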

Note that when you create a new Azure SQL Database, you have the following options for the Source:

  • Blank database
  • AdventureWorksLT
  • Backup

For this tip we need two databases: choose AdventureWorksLT as the Source for the application database and Blank database as the Source for the staging database.

Install SQL Server Data Tools (SSDT)

SSDT is a free, stand-alone version of Visual Studio that you can use to develop Analysis Services, SQL Server Integration Services and SQL Server Reporting Services projects.  Follow this link to download and install SQL Server Data Tools (SSDT) for Visual Studio.

Install Azure Feature Pack for SSIS

Download and install the Azure Feature Pack for SSIS which includes a handful of components for working with Azure from SSIS.

Install SQL Server Management Studio (SSMS)

SSMS is the tool we use to work with a SQL Server database.  For this tip I am using version 14.0.17289.0.  Follow this link to download SQL Server Management Studio.  I like to use the current general availability version.  Note that if you have an older version of SSMS installed, it may not have the features for working with Azure SQL Databases.

Get Azure Values

Before we start creating SSIS packages, we need to gather all of the parameter values from our Azure Storage Account and Azure SQL Databases that we will need for this tip. This section assumes these items have been created.

Azure Storage Account Values

Follow these steps to get the Azure Storage Account values we need:

  1. Navigate to the Azure home page
  2. Click on Storage accounts
  3. Select your storage account
  4. Click on Access Keys (under Settings) and copy the key value under Key1
  5. Click on Overview (top left)
  6. Click on Blobs (under Services)
  7. Select your Container name

The following are the parameter values for this tip based on the above steps (sensitive values shown as asterisks):

Parameter              Value
Storage Account Name   backofficestaging
Access Key             ***********
Container Name         subscriptions

Azure SQL Database Values

I use two Azure SQL Databases in this tip: an application database and a staging database.  Follow these steps to get the Azure SQL Database parameter values that we need:

  1. Navigate to the Azure home page
  2. Click on SQL databases
  3. Select the database server and database name

The following are the parameter values for this tip based on the above steps (sensitive values shown as asterisks):

Parameter                    Value
Application Database Server  ***********
Application Database Name    AdventureWorksLT
Application User Name        ***********
Application Password         ***********
Staging Database Server      ***********
Staging Database Name        staging
Staging User Name            ***********
Staging Password             ***********

Note that the User Name and Password were specified when you created the server.

Create SSIS Packages

Now we are ready to create the SSIS packages.  Open SSDT and create a new SSIS project.  Add new SSIS packages named EXTRACT_SUBSCRIPTIONS and STAGE_SUBSCRIPTIONS.

I will walk through the following steps to build the SSIS packages:

  1. Create the Azure Storage Connection Manager
  2. Create an OLE DB Connection Manager for an Azure SQL Database
  3. Connect to Azure SQL Databases using SSMS
  4. Create the SUBSCRIPTIONS table in the staging Azure SQL Database
  5. Create the EXTRACT_SUBSCRIPTIONS SSIS package
  6. Create the STAGE_SUBSCRIPTIONS SSIS package

Create the Azure Storage Connection Manager

We need to create an Azure Storage Connection Manager in order to save a delimited file to Azure Blob Storage and open a delimited file from Azure Blob Storage.  Follow these steps to create the Azure Storage Connection Manager:

  1. Click on the Control Flow tab in an SSIS package
  2. Open an SSIS package or add one to your SSIS project
  3. Right-click in the Connection Managers area of the SSIS package designer
  4. Select New Connection from the popup menu
  5. Select Azure Storage as the Connection Type
  6. Click Add

The following form will be displayed (fill in the Storage account name and Account key from the Azure Storage Account Values section above):

Azure Storage Connection Manager Editor

Click Test Connection to make sure you can connect to your Azure Storage Account.  Click OK to save the Azure Storage Connection Manager.  Finally, right-click on the connection manager and select Convert to Project Connection.  This connection manager will be used in both of the SSIS packages we will create.

Create an OLE DB Connection Manager for an Azure SQL Database

There really isn't much difference between creating an OLE DB connection manager for an Azure SQL Database and one for an on-premises SQL Server database. Follow these steps to create an OLE DB connection manager for an Azure SQL Database:

  1. Click on the Control Flow tab in an SSIS package
  2. Right-click in the Connection Managers area of the SSIS package designer
  3. Select New OLE DB Connection from the popup menu
  4. Select New from the OLE DB Connection Manager form

The following form will be displayed (fill in the YOURXXX with your values from the Azure SQL Database Values section above):

SSIS Connection Manager

Note that the only difference for an Azure SQL Database connection versus an on-premises SQL Server database is that the server name has “.database.windows.net” appended to it.

Click Test Connection to verify that your connection manager works.  Click OK to complete the Connection Manager form.  Click OK again on the OLE DB Connection Manager form.  Right-click on the Connection Manager and select Rename; change to AdventureWorksLT.  Right-click on the Connection Manager and select Convert to Project Connection so that the connection manager will be available for use in any SSIS package in the project.

Repeat the above steps to create the staging connection manager.

Connect to Azure SQL Databases Using SSMS

Open SSMS, click Connect in the Object Explorer, and select Database Engine from the popup menu.  Fill in the Connect to Server form replacing YOURXXX with the values from the Azure SQL Database Values section:

Connect to SQL Server in Management Studio

Click the Options button and fill in the Connect to database name as shown below:

SSMS Connection Properties

Click Connect and you will see the staging database in the Object Explorer. 

Create the SUBSCRIPTIONS table in the staging Azure SQL Database

Connect to the staging Azure SQL Database using the steps in the previous section.  Click on the staging database in the Object Explorer and click New Query on the toolbar; enter and execute the following script to create the SUBSCRIPTIONS table; you can copy and paste from the download:

Create Table dbo.Subscriptions
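The script itself ships with the tip's download and is not reproduced here. As a stand-in, the following is a minimal sketch of what a staging table for this exercise might look like, assuming the extract pulls customer and order columns from the AdventureWorksLT SalesLT schema; the column list is hypothetical, so match it to whatever query you use in the EXTRACT_SUBSCRIPTIONS package:

-- Minimal staging table sketch; the real column list comes with the tip's download.
-- Columns are kept as text because the pipe-delimited blob delivers string data;
-- conversions to proper types can happen downstream of the staging layer.
CREATE TABLE dbo.SUBSCRIPTIONS
(
    CustomerID    NVARCHAR(20)   NOT NULL,
    CompanyName   NVARCHAR(128)  NULL,
    SalesOrderID  NVARCHAR(20)   NOT NULL,
    OrderDate     NVARCHAR(30)   NOT NULL,
    ProductName   NVARCHAR(60)   NOT NULL,
    OrderQty      NVARCHAR(10)   NOT NULL,
    LineTotal     NVARCHAR(40)   NOT NULL
);

Keeping every staging column as text is a common design choice when the source is a flat file; it keeps the load simple and pushes type conversion into the downstream transformation step.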

Create the EXTRACT_SUBSCRIPTIONS SSIS package

Open the EXTRACT_SUBSCRIPTIONS package in your SSIS project (or add it if you haven't already).  This package is used to extract data from the Backoffice Web Application's Azure SQL Database and write pipe-delimited files to Azure Blob Storage. 

The Data Flow for the package is shown below:

Create the EXTRACT_SUBSCRIPTIONS SSIS package

The following are the main points on the above Data Flow:

  • Subscriptions is an OLE DB Source that executes the query below on the AdventureWorksLT Azure SQL Database
  • The query results are written to Azure Blob storage using the Azure Blob Destination component which comes from the Azure Feature Pack for SSIS

The following query is used to extract data from the AdventureWorksLT Azure SQL Database (you can copy from the download code and paste into the OLE DB Source):

Extract data from the AdventureWorksLT Azure SQL Database
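The actual query is also part of the download. The sketch below is only an illustration of the kind of query you could run against the SalesLT sample schema to produce columns matching the staging table created earlier; treat it as an assumption rather than the author's original query:

-- Hypothetical extract query against the AdventureWorksLT SalesLT sample schema.
SELECT
    c.CustomerID,
    c.CompanyName,
    soh.SalesOrderID,
    soh.OrderDate,
    p.Name AS ProductName,
    sod.OrderQty,
    sod.LineTotal
FROM SalesLT.Customer AS c
    INNER JOIN SalesLT.SalesOrderHeader AS soh ON soh.CustomerID = c.CustomerID
    INNER JOIN SalesLT.SalesOrderDetail AS sod ON sod.SalesOrderID = soh.SalesOrderID
    INNER JOIN SalesLT.Product AS p ON p.ProductID = sod.ProductID;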

Since this is just a demonstration of using Azure Blob Storage in an ETL process, the above query does not have any filtering criteria to only grab the rows that have changed since the last time the query was run.
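If you did want an incremental extract, one common pattern is to filter on the ModifiedDate column that the SalesLT tables carry and supply the last successful run time as a parameter. A minimal sketch, with a hard-coded value standing in for a package variable:

-- Hypothetical incremental filter: only orders changed since the last run.
-- In the package, @LastRunDate would come from a package variable or parameter.
DECLARE @LastRunDate DATETIME = '2008-06-01';

SELECT soh.SalesOrderID, soh.OrderDate, soh.ModifiedDate
FROM SalesLT.SalesOrderHeader AS soh
WHERE soh.ModifiedDate > @LastRunDate;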

The following shows the configuration of the Azure Blob Destination component:

 Azure Blob Destination Editor

The Azure Blob Destination component is creating a pipe-delimited file that is saved to Azure Blob Storage. The following are the main points on the configuration of the Azure Blob Destination component:

  • The Azure storage connection manager was configured above (Create the Azure Storage Connection Manager section)
  • The Blob container name comes from the Azure Storage Account Values section
  • The Blob name is the value I chose; it consists of a folder name and a file name
  • The file is text and will have column names in the first row

I found one limitation with the Azure Blob Destination component; it does not provide an Expressions property.  Ideally, I might like the Blob container name and Blob name properties to be set at runtime via an expression.  On the other hand, it's a bit less complicated to always look for a specific file in a specific container.

When you run this SSIS package, you can navigate to the Azure home page and drill down into your storage account and see the file that was created in Azure Blob Storage; e.g.:

Review the file that was created in Azure Blob Storage

Create the STAGE_SUBSCRIPTIONS SSIS package

Open the STAGE_SUBSCRIPTIONS package in your SSIS project (or add it if you haven't already).  This package is used to read a pipe-delimited file from Azure Blob Storage and insert every record into the SUBSCRIPTIONS table in the staging Azure SQL Database.

The following is the Control Flow for the package:

Create the STAGE_SUBSCRIPTIONS SSIS package 

The following are the main points on the Control Flow:

  • TRUNCATE SUBSCRIPTIONS is an Execute SQL task that truncates the SUBSCRIPTIONS table in the staging Azure SQL Database (the statement is sketched after this list)
  • Load SUBSCRIPTIONS from Azure Blob Storage is a Data Flow detailed below
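The Execute SQL task only needs a single statement run against the staging connection manager; assuming the table created earlier, it would be:

-- Clear out the staging table before each load.
TRUNCATE TABLE dbo.SUBSCRIPTIONS;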

The following is the Load SUBSCRIPTIONS from Azure Blob Storage Data Flow:

Load SUBSCRIPTIONS from Azure Blob Storage Data Flow

The following are the main points on the above Data Flow:

  • The Azure Blob Source component reads a pipe-delimited file from Azure Storage; it comes from the Azure Feature Pack for SSIS
  • SUBSCRIPTIONS is an OLE DB Destination which inserts the rows in the pipeline into the SUBSCRIPTIONS table in the staging Azure SQL Database

The configuration of the Azure Blob Source component is shown below:

Azure Blob Source Editor

The following are the main points on the configuration of the Azure Blob Source component:

  • The Azure storage connection manager was configured above (Create the Azure Storage Connection Manager section)
  • The Blob container name comes from the Azure Storage Account Values section
  • The Blob name is the value I chose; it consists of a folder name and a file name
  • The blob is a pipe-delimited text file with column names in the first row

The Azure Blob Source component has the same limitation as the Azure Blob Destination component; you can't use Expressions to set any values at runtime.

After you run this SSIS package, you can query the SUBSCRIPTIONS table in the staging Azure SQL Database.  In my case I see 417 rows.
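A quick way to verify the load from SSMS is a simple count against the staging table; the number you see will depend on the query used in the extract package:

-- Confirm the rows landed in the staging table.
SELECT COUNT(*) AS SubscriptionRowCount
FROM dbo.SUBSCRIPTIONS;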

Conclusion

My goal for this tip was to demonstrate how you can leverage Azure Storage and Azure SQL Database in your development environment.  As with any demo, there are always shortcuts taken; e.g. I didn't address security that you would need in a production environment.  However, I'm happy with the outcome and I plan to leverage Azure for my development efforts going forward.

While the sample code for this tip is available (see the link in the Next Steps section below), the SSIS project and packages have the Protection Level set to EncryptSensitiveWithUserKey. This is to keep the parameters of my Azure subscription private.  When you open the project, you will get errors and warnings about not being able to decrypt things.  If you try to run the SSIS packages, you will experience a significant delay before you get error messages saying the packages cannot execute.  Your best bet is to create the packages from scratch.

Next Steps





About the author
Ray Barley is a Principal Architect at IT Resource Partners and a MSSQLTips.com BI Expert.

This author pledges the content of this article is based on professional experience and not AI generated.




Comments For This Article




Tuesday, June 21, 2022 - 2:22:17 PM - Sudheer Back To Top (90179)
The Expressions property to change the blob container and blob names is present in the properties of the Data Flow Task.
Click on the Data Flow Task and press F4. In the Expressions tab you should be able to set expressions for the blob name and container.

Sunday, November 17, 2019 - 6:40:14 AM - Raymond Barley Back To Top (83114)

The out-of-the-box SSIS components generally do not handle reading data where you have not already defined the columns or fields. You have to write .NET code to do that. I'll give you the high-level answer of how I would do this. I would use Biml (see the tutorial on MSSQLTips.com: https://www.mssqltips.com/sqlservertutorial/9089/introduction-to-the-biml-language/) to generate an SSIS package that reads a blob or any other delimited file. Biml allows you to embed .NET code that generates the SSIS package.

Let's say I wanted to read a blob that's a delimited file into a SQL Server table. Assuming I copied the blob to local storage, here would be the steps that I would implement with Biml:

- Open the file

- Parse the first line to get the column names

- Create a SQL Server table with the column names

- Generate a BCP format file for the table

- Import the blob into the SQL Server table using BCP

I do not have an example of this that I can share easily. The process I've described actually took a couple of weeks to fully implement so that I could load any flat file with any delimiter into a SQL Server table.


Thursday, November 14, 2019 - 2:39:45 PM - nitin Back To Top (83094)

Hi Ray,

Thanks a lot!

The article helped a lot.

I am able to process multiple files from Azure Blob Storage to Azure SQL Database using SSIS.

However, do you have any idea how we can make the Azure Blob Source dynamic? I mean, if the files in the Azure Blob Source have different numbers of columns (for example, the first csv file abc_1.csv has 10 columns and another file abc_2.csv has 11 columns), how can we make the Azure Blob Source process both files using SSIS? (Maybe a script, but the script would need to be able to read the file from Azure Blob Storage, right?)

I have been stuck on this issue for quite a long time and any help would be great.

Regards,

Nitin


Thursday, November 14, 2019 - 7:24:19 AM - Ray Barley Back To Top (83091)

In this tip I used the Azure Blob Source to read a single blob. There is a Foreach Azure Blob Enumerator component that will retrieve a list of blobs. For each blob name it retrieves, you can read the blob with the Azure Blob Source.


Wednesday, November 13, 2019 - 1:59:17 PM - nitin Back To Top (83082)

Hi,

I am trying to load data from an Azure Blob source with multiple csv files into an Azure SQL Database using SSIS, but I am unable to do it.

Can you suggest on this how to do it.

Thanks,

Nitin


Monday, September 16, 2019 - 11:02:21 AM - David Stein Back To Top (82470)

Great article Ray. Have you tried using the Compression option? To me, this is the whole point of using Azure Blob Storage in this scenario. I can create files using the Azure Blob Destination with the supported Compression types without issue. However, when I try creating the Azure Blob Source with a compressed file created with the Destination, Visual Studio (2017) just hangs. It will never return with the metadata so I cannot create a package. 

I've tried all of the compression types, multiple different file settings, etc... but it simply will not work. This happens on multiple machines, including a brand new laptop with fresh installs. Do you have any experience with this issue? 


Thursday, February 28, 2019 - 9:29:54 PM - Ray Barley Back To Top (79168)

Koen

I tried Azure File Services and I ran into a strange situation.  It worked fine when accessing from File Explorer but not from an SSIS package.  I've used blob storage before, so I used it rather than figure out my issue with Azure File Services.  In addition, I like its simplicity.


Thursday, February 28, 2019 - 3:58:40 AM - Koen Verbeeck Back To Top (79149)

Hey Ray,

any specific reason why you chose blob storage in Azure over Azure File Services? With Azure File Services you can use UNC paths, just like on-premises, and you can use the regular file connections, etc.

Regards,

Koen














