SQL Integration Services SSIS Troubleshooting Best Practices

By: Arshad Ali | Comments (3) | Related: 1 | 2 | 3 | 4 | > Integration Services Best Practices

Problem

In the previous tips (SQL Server Integration Services (SSIS) - Best Practices - Part 1 and Part 2) of this series I briefly talked about SSIS and a few of the best practices to consider when designing SSIS packages. Continuing on the same rhythm I am going to discuss some more best practices for SSIS package design, how you can design high performing packages with parallelism, troubleshooting performance problems etc.

Solution

In this tip my recommendations are around how you can achieve high performance with achieving a higher degree of parallelism, how you can identify the cause of poorly performing packages, how distributed transaction work within SSIS and finally what you can do to restart a package execution from the last point of failure.

As you can see this tip starts at best practice #10. See these other two tips (Part 1 and Part 2) for best practices 1-9.

Best Practice #10 - Better performance with parallel execution

SSIS has been designed to achieve high performance by running the executables of the package and data flow tasks in parallel. This parallel execution of the SSIS package executables and data flow tasks can be controlled by two properties provided by SSIS as discussed below.

MaxConcurrentExecutables - It's the property of the SSIS package and specifies the number of executables (different tasks inside the package) that can run in parallel within a package or in other words, the number of threads SSIS runtime engine can create to execute the executables of the package in parallel. As I discussed in Best Practice #6,the SSIS runtime time engine which executes the package and every task defined in it (except data flow task) in the defined workflow. So as long as you have sequential workflow of the package (one task after another, precedence defined with precedence constraints between tasks) this property would not make any difference. But if you have your package workflow with parallel tasks, this property will make a difference. Its default value is -1, which means total number of available processors + 2, also if you have hyper-threading enabled then it is total number of logical processors + 2.
EngineThreads - As I said, MaxConcurrentExecutables is a property of the package and used by SSIS runtime engine for parallel execution of package executables, likewise data flow tasks have the EngineThreads property which is used by the data flow pipeline engine and has a default value of 5 in SSIS 2005 and 10 in SSIS 2008. This property specifies the number of source threads (does data pull from source) and worker thread (does transformation and upload into the destination) that can be created by data flow pipeline engine to manage the flow of data and data transformation inside a data flow task, it means if the EngineThreads has value 5 then up to 5 source threads and also up to 5 worker threads can be created. Please note, this property is just a suggestion to the data flow pipeline engine, the pipeline engine may create less or more threads if required.

Example - Let's consider you have a package with 5 data flow tasks in parallel and MaxConcurrentExecutables property has value 3. When you start the package execution, the runtime will start executing 3 data flow tasks of the package in parallel, the moment any of the executing data flow task completes, the execution of the next waiting data flow task will start and so on. Now what happens inside the data flow task is controlled by EngineThreads property. As I described, how the work of a data flow task is broken down into one or more execution trees in Best Practice #6 (you can see how many execution trees are created for a data flow task by turning on logging for PipelineExecutionTrees event of the data flow task), the data flow pipeline engine might create source and worker threads as many as you have set the value of EngineThreads to execute one or more execution trees in parallel. Here also if you have set EngineThreads property to 5 and also your data flow task is broken down into 5 execution trees, it does not mean all execution trees will run in parallel, to summarize this, sequence matters here as well.

Be very careful while changing these properties, do thorough testing before final implementation. Because these properties, if properly configured within the constraints of available system resources, improves the performance by achieving parallelism; on the other hand if it is poorly configured then it will hurt performance because of too much context switching from one thread to another. So as a general rule of thumb do not create and run more threads in parallel than the number of available processors.

Best Practice #11 - When to use events logging and when to avoid...

Logging (or tracing the execution) is a great way of diagnosing the problem occurring during runtime, it helps a lot when your code does not work as expected. Nowadays, almost every programming language provides a logging mechanism to identify the root cause of unexpected problems or runtime failures and SSIS is not an exception. SSIS allows you to enable logging, a powerful, flexible and extremely helpful feature, for your packages and its executables. Not only that, it also allows you to choose different events of a package and components of the packages to log as well as the location where the log information is to be written (text files, SQL Server, SQL Server Profiler, Windows Events etc).

By now you would be convinced that logging saves you from several hours of frustration that you might get while finding out the causes of problem if you are not using logging, but the story doesn't end here. Its true, it helps you in identifying the problem and its root causes, but at the same time it's an overhead for SSIS and ultimately hits performance as well, especially if you are using logging excessively. So the recommendation here is to enable logging if required, you can dynamically set the value of the LoggingMode property (of a package and its executables) to enable or disable logging without modifying the package. Also you should choose to log for only those executables which you suspect to have problems and further you should only log those events which are absolutely required for troubleshooting.

Best Practice #12 - Monitoring the SSIS Performance with Performance Counters

In Best Practice #11, I discussed how you can turn on event logging for your package and its component and analyze the performance related problems. Apart from that, SSIS also introduced (was not available with DTS) system performance counters to monitor the performance of your SSIS runtime and data flow pipeline engines. For example, SSIS Package Instance counter indicates the number of SSIS packages running on the system; Rows read and Rows written counters indicate the total number of rows coming from the source and total number of rows provided to destination; Buffers in use and Buffer memory counters indicate the total number buffers created and amount of memory used by them; Buffer spooling is a very important counter and tells about number of buffers (which are not currently in use) written to the disk when physical memory runs low; BLOB bytes read, BLOB bytes written and BLOB files in use counters give detail about the BLOB data transfer and tells about number of BLOB bytes read, written and total number of files that the data flow engine currently is using for spooling BLOB data etc. An exhaustive list of all the SSIS performance counters can be found here.

If you upgrade Windows Server 2003, on which you you have installed SQL Server and SSIS, to Windows Server 2008, SSIS performance counters will disappear, this happens because the upgrade process removes the SSIS performance counters from the system. To restore them back, refer to this KB article.

Best Practice #13 - Distributed Transaction in SSIS and its impact...

SSIS allows you to group two more executables to execute within a single transaction by using distributed transaction (however you need to start Distributed Transaction Coordinator windows service). Though at first glance it sounds cool, but it might have blocking issues especially if you have a task in the middle which takes longer to execute. For example, let's consider a scenario (though it is a very vague example but will explain the scenario), you have a group of data flow tasks, a Web service task and then again a data flow task in sequence. First data flow task pulls data from source to staging and completes in minutes, web service task pulls data from a web service to staging and takes hours to complete and the last data flow task merges these data and uploads into a final table. Now if you execute all these three task in single transaction, the resource blocked by first data flow task will be blocked until the end even though its not being used anymore while the web service task is executing.

So recommendation here is, even though SSIS provides distributed transaction support on a group of executables but you should use only when it's absolutely necessary. Even if you are using it, try to keep time taking tasks outside the group and set the IsolationLevel property prudently for the group transaction. Wherever possible you should avoid it and use some other alternatives which suits your particular scenario.

Best Practice #14 - How Checkpoint features helps in package restarting

SSIS has a cool new feature called Checkpoint, which allows your package to start from the last point of failure on next execution. By enabling this feature you can save a lot of time for successfully executed tasks and start the package execution from the task which failed in last execution. You can enable this feature for your package by setting three properties (CheckpointFileName, CheckpointUsage and SaveCheckpoints ) of the package. Apart from that you need to set FailPackageOnFailure property to TRUE for all tasks which you want to be considered in restarting, what I mean here is if you set this property then only on failure of that task, the package fails, the information is captured in the checkpoint file and on subsequent execution, the execution starts from that task.

So how does it work? When you enable checkpoint for a package, execution status is written in the checkpoint file (name and location of this file is specified with CheckpointFileName property). On subsequent executions, the runtime engine refers to the checkpoint file to see last execution status before starting the package execution, if it finds failure in last execution it knows where the last execution failed and starts the execution from that task only. So if you delete this file before the subsequent execution, the package execution will start from the beginning even though the last execution failed as it has no way to identify this.

By enabling this feature, you can save lots of time (as data pull or transformation on huge data volume takes much time) during subsequent execution by skipping the tasks which executed successfully in the last run and start from the task which failed. One very important point to note here, you can enable a task to participate in checkpoint including data flow task but it does not apply inside the data flow task. In other words, at data flow task level only you can enable it, you cannot set checkpoint inside the data flow task, for example at a transformation level inside it. Let's consider a scenario, you have a data flow task for which you have set FailPackageOnFailure property to TRUE to participate in checkpoint. Now inside the data flow task, you have five transformations in sequence and execution fails at 5th (earlier 4 transformations have completed successfully). On subsequent execution the execution will start from the data flow task and the first 4 transformations will run again before coming to 5th one.

Note

The above recommendations have been done on the basis of experience gained working with DTS and SSIS for the last couple of years. But as noted before there are other factors which impact the performance, one of the them is infrastructure and network. So you must do thorough testing before putting these changes into your production environment.

Next Steps

Review SQL Server Integration Services (SSIS) - Best Practices - Part 1 tip
Review SQL Server Integration Services (SSIS) - Best Practices - Part 2 tip
Review Sending email from SQL Server Integration Services (SSIS) tip
Review Sending HTML formatted email in SQL Server using the SSIS Script Task tip

About the author

Arshad Ali is a SQL and BI Developer focusing on Data Warehousing projects for Microsoft.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips