SSIS Transformation Tasks Overview

By:   |   Comments (1)   |   Related: More > Integration Services Development


Problem

SQL Server Integration Services is a powerful tool to create ETL solutions.  When developing a solution to import data from other sources there are often similar tasks that need to be performed such as: removing duplicates, inserting the same data into multiple destinations and only inserting new records.  In this tip we will cover these different scenarios to give you an idea on how you can use some of the SSIS features.

Solution

We will cover these three tasks in our SSIS package:

  • Import the source data into multiple destinations
  • Remove duplicate records
  • Only insert new records

So let's get started.

Import Source Data into Multiple Destinations with SSIS

First, we want to insert our source data into multiple destinations. For example, when you have a CSV file with 40 columns and your database structure has 3 tables that hold that type of data. In this case, we can use the Multicast Task.

This task is very simple, it is put right after the data source and connected to multiple destinations.  We can then specify the different destination tables and map the columns you need for each table.

SSIS data flow task

Remove Duplicates from Source Data in SSIS

Next, we want to remove duplicates from the source.  We will cover two ways to do this.

  • One way is to sort the data using the Sort Task.  To do this, you select the column or columns you want to sort and then check the option "Remove rows with duplicate sort values" as shown below.
  • Another option is to group the data using the Aggregate Transformation Task. In this case, the columns in the aggregation will be available for other tasks.

Below you can see both how to remove duplicates using the Sort Task:

SSIS sort transformation task

Below you can see both how to remove duplicates using the Aggregate Transformation Task:

SSIS aggregate transformation task

The difference between these is that the Aggregate Transformation Task has to include all fields that should be used to group the data and these fields will then be available for tasks that come after this task. The Sort Task sorts the data based on the selected columns and based on these values determines which records are duplicates.

Insert Only New Records with SSIS

The next task I will demonstrate is the Lookup Task, which is used to join columns to the data flow by looking up columns in a table. You then have the ability to send the rows that don't match as one output and rows that do match as another output.

Below the entire data flow is shown.

ssis package flow

In the Lookup Transformation Task, we configure what happens with rows that don't have matching entries. Also, we configure the connection specifying a data source to use for the lookup.

lookup transformation task in SSIS

For the columns page, we specify how columns will be matched for the Lookup.

SSIS lookup transformation task

Running the SSIS Package

Here we are running the package for the first time and inserting 3 records.  If we run it again these records won't be inserted again since they already exist in the table from the first run, based on how we setup this SSIS package.

entire ssis package

Conclusion

SQL Server Integration Services provide tasks to transform and validate data during the load process and transformations to insert data into your destination. Rather than create a stored procedure with T-SQL to validate or change data, is good to know about the different SSIS tasks and how they can be used. 

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Douglas Correa Douglas Correa is a database professional, focused on tuning, high-availability and infrastructure.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Thursday, June 22, 2017 - 12:49:12 PM - Louis Back To Top (57940)

 A nice article for people (such as myself) who have understand what the individual SSIS components do, and this shows how they work together in the bigger picture.

 















get free sql tips
agree to terms