Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

Free SQL Server Webcast > Building Really Fast SQL Server VMs
 

SSIS Transformation Tasks Overview


By:   |   Read Comments (1)   |   Related Tips: More > Integration Services Development

Problem

SQL Server Integration Services is a powerful tool to create ETL solutions.  When developing a solution to import data from other sources there are often similar tasks that need to be performed such as: removing duplicates, inserting the same data into multiple destinations and only inserting new records.  In this tip we will cover these different scenarios to give you an idea on how you can use some of the SSIS features.

Solution

We will cover these three tasks in our SSIS package:

  • Import the source data into multiple destinations
  • Remove duplicate records
  • Only insert new records

So let's get started.

Import Source Data into Multiple Destinations with SSIS

First, we want to insert our source data into multiple destinations. For example, when you have a CSV file with 40 columns and your database structure has 3 tables that hold that type of data. In this case, we can use the Multicast Task.

This task is very simple, it is put right after the data source and connected to multiple destinations.  We can then specify the different destination tables and map the columns you need for each table.

SSIS data flow task

Remove Duplicates from Source Data in SSIS

Next, we want to remove duplicates from the source.  We will cover two ways to do this.

  • One way is to sort the data using the Sort Task.  To do this, you select the column or columns you want to sort and then check the option "Remove rows with duplicate sort values" as shown below.
  • Another option is to group the data using the Aggregate Transformation Task. In this case, the columns in the aggregation will be available for other tasks.

Below you can see both how to remove duplicates using the Sort Task:

SSIS sort transformation task

Below you can see both how to remove duplicates using the Aggregate Transformation Task:

SSIS aggregate transformation task

The difference between these is that the Aggregate Transformation Task has to include all fields that should be used to group the data and these fields will then be available for tasks that come after this task. The Sort Task sorts the data based on the selected columns and based on these values determines which records are duplicates.

Insert Only New Records with SSIS

The next task I will demonstrate is the Lookup Task, which is used to join columns to the data flow by looking up columns in a table. You then have the ability to send the rows that don't match as one output and rows that do match as another output.

Below the entire data flow is shown.

ssis package flow

In the Lookup Transformation Task, we configure what happens with rows that don't have matching entries. Also, we configure the connection specifying a data source to use for the lookup.

lookup transformation task in SSIS

For the columns page, we specify how columns will be matched for the Lookup.

SSIS lookup transformation task

Running the SSIS Package

Here we are running the package for the first time and inserting 3 records.  If we run it again these records won't be inserted again since they already exist in the table from the first run, based on how we setup this SSIS package.

entire ssis package

Conclusion

SQL Server Integration Services provide tasks to transform and validate data during the load process and transformations to insert data into your destination. Rather than create a stored procedure with T-SQL to validate or change data, is good to know about the different SSIS tasks and how they can be used. 

Next Steps


Last Update:


next webcast button


next tip button



About the author
MSSQLTips author Douglas Correa Douglas Correa is a database professional, focused on tuning, high-availability and infrastructure.

View all my tips





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Thursday, June 22, 2017 - 12:49:12 PM - Louis Back To Top

 A nice article for people (such as myself) who have understand what the individual SSIS components do, and this shows how they work together in the bigger picture.

 


Learn more about SQL Server tools