In this section, we will introduce the Integration Services Data Flow. It’s one of the more important features of SSIS and one of the reasons SSIS is considered one of the fastest ETL tools. We’ll give also an overview of the more important transformations you can do in the data flow.
The data flow is a special task of the control flow. It needs a canvas of its own, so there’s an extra tab for the data flow, right next to the control flow.
The data flow is a construct where you can read data from various sources into the memory of the machine that is executing the SSIS package. While the data is in memory, you can perform different kinds of transformations. Because it’s in memory, these are very fast. After the transformations, the data is written to one or more destinations (a flat file, an Excel file, a database, etc.). In most cases, not all data is read into the memory at once - although this is possible if you use certain kind of transformations – but the data is read into buffers. Once a buffer is filled by the source component, it is passed on to the next transformation which does it logic on the buffer. Then the buffer is passed to the following transformation and so on until it is written to the destination. You can imagine the data flow is like a pipeline, with data flowing through.
To create a data flow, you can drag it from the SSIS toolbox to the control flow canvas.
Another option is to simply go to the data flow tab, where you will be greeted with the following message:
Clicking the link will create a new data flow task for you. You end up with an empty canvas, just like in the Control Flow.
As you can see in the screenshot above, the SSIS toolbox will change once you go to the data flow canvas. All the tasks are now replaced with transformations, sources and destinations for the data flow. At the top, you also have a dropdown box that lets you easily switch between multiple data flows if you have any.
These are some of the most common sources and destinations for the data flow:
There are other types of sources and destinations available. You can take a look at the documentation to learn more. You also have the possibility to use a .NET script component to make your own source or destination. This is similar to a .NET script task; you can use C# or VB, but now there are special methods and classes included to handle the buffers of the data flow.
Some of the more common transformations:
In the next two sections of the tutorial, we’ll configure a source, some transformations and a destination.