By: Koen Verbeeck
Overview
In the last chapter of this tutorial we’ll look at a couple of performance optimizations you can implement in your SSIS packages. After all, you want to move data around as quickly as possible.
SSIS Control Flow Performance Optimizations
There is typically not much you can do in the control flow to improve performance. There are two key points however:
- Think about if you want to perform tasks in SSIS or if you can do them somewhere else. For example, sorting data will be faster in SQL Server T-SQL code than in SSIS.
- Perform tasks in parallel if possible, but don’t overdo it. Going into parallel will surely improve performance, but this is heavily influenced by available memory and the number of processors. There is a certain overhead to parallelism. If there’s too much parallelism, the system will go slower instead of faster. Carefully test to find the optimum balance.
SSIS Data Flow Performance Optimizations
Most performance issues are related to the data flow. As with the control flow, think if SSIS or transformations in SQL will be faster. Try to visualize the data flow as a pipeline with data flowing through. You want to maximize the flow rate to get data to the destination as quickly as possible. There are some important properties you can set to influence the memory buffers.
- DefaultBufferMaxRows: This is by default set to 10,000, which was made a good setting in 2005. Today we have more power in our machines, so you can set this to a higher number. Too large a buffer will take a while to fill and will make the destination sit idle, so you need to find a good optimum. 50,000 rows is a good start.
- DefaultBufferSize: This is by default 10MB which is again quite small. You can bump this up to 50MB or even more.
The actual buffer size will be determined by which of the two properties is reached first. You can set AutoAdjustBufferSize to True to make sure that the specified number of rows in DefaultBufferMaxRows is always met.
SSIS Reading Data Performance Optimizations
Some guidelines:
- Don’t use the dropdown box to select the source table. Write a SQL statement and include filtering, grouping and sorting in the SQL code.
- Only select columns you actually need.
- Keep the data types of the columns small. The more rows you can fit in a single memory buffer, the better.
SSIS Transforming Data Performance Optimizations
Here we have some best practices as well:
- Don’t use blocking transformations (e.g. sort and aggregate component). They read all data in memory before even sending one single row to the output. Asynchronous transformations are to be avoided as well since they modify the memory buffer. You can find a good overview in this blog post.
- Avoid the Slowly Changing Dimension Wizard. It uses the OLE DB Command, which executes SQL statements row-by-row, which is slow and results in excessive logging.
- Don’t use the OLE DB Command, as stated in the previous point.
SSIS Writing Data Performance Optimizations
Writing data is typically the slowest part of the process. Here are some tips to optimize the process:
- The OLE DB Destination is the fastest adaptor for SQL Server at the moment. If you use the Fast Load option of course.
- Make sure you use a table lock (which is enabled by default).
- To speed up inserts, you can disable constraints and drop and recreate indexes.
- Try to go for minimally-logged inserts, which means the destination needs to be in Simple recovery mode. It’s a broad topic, but you can find more info in the Data Loading Performance Guide.
- Make sure the database files have an appropriate size. Time is lost when the database has to be constantly growing its files.
Additional Information
- For more related tips check out the Integration Services Tips on this page.