We have a number of SSIS packages that routinely fail for various reasons such as a particular file is not found, an external FTP server is unavailable, etc. In most cases these error conditions are just a temporary situation and we can simply rerun the package at a later time and it will be successful. The issue, however, is that we do not want to rerun the tasks in the package that have have already completed successfully. Is there a way that we can restart an SSIS package at the point of failure and skip any tasks that were successfully completed in the previous execution of the package?
SSIS provides a Checkpoint capability which allows a package to restart at the point of failure. The Checkpoint implementation writes pertinent information to an XML file (i.e. the Checkpoint file) while the package is executing to record tasks that are completed successfully and the values of package variables so that the package's "state" can be restored to what it was when the package failed. When the package completes successfully, the Checkpoint file is removed; the next time the package runs it starts executing from the beginning since there will be no Checkpoint file found. When a package fails, the Checkpoint file remains on disk and can be used the next time the package is executed to restore the values of package variables and restart at the point of failure.
The starting point for implementing Checkpoints in a package is with the SSIS package properties. You will find these properties in the Properties window under the Checkpoints heading:
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
After setting the Checkpoint SSIS package properties, you need to set these properties under the Execution heading at the individual task level:
FailPackageOnFailure - Choose from these options: True or False (default). True indicates that the SSIS package fails if this task fails; this implements the restart at the point of failure behavior when the SSIS package property SaveCheckpoints is True and CheckpointFileUsage is IfExists.
FailParentOnFailure - Choose from these options: True or False (default). Select True when the task is inside of a container task such as the Sequence container; set FailPackageOnFailure for the task to False; set FailPackageOnFailure for the container to True.
Keep in mind that both the SSIS package Checkpoint properties and the individual task properties need to be set appropriately (as described above) in order to implement the restart at the point of failure behavior.
Before wrapping up the discussion on Checkpoints, let's differentiate the restart from the point of failure behavior with that of a database transaction. The typical behavior in a database transaction where we have multiple T-SQL commands is that either they all succeed or none of them succeed (i.e. on failure any previous commands are rolled back). The Checkpoint behavior, essentially, is that each command (i.e. task in the SSIS package) is committed upon completion. If a failure occurs the previous commands are not rolled back since they have already been committed upon completion.
Let's wrap up this discussion with a simple example to demonstrate the restart at the point of failure behavior of Checkpoints. We have an SSIS package with Checkpoint processing setup to restart at the point of failure as described above. The package has two Execute SQL tasks where the first will succeed and the second will fail. We will see the following output when running the package in BIDS:
Task 1 is green; it executed successfully. Task 2 is red; it failed. If we run the package a second time we will see the following output:
Notice that Task 1 is neither green nor red; in fact it was not executed. The package began execution with Task 2; Task 1 was skipped because it ran successfully the last time the package was run. The first run ended when Task 2 failed. The second run demonstrates the restart at the point of failure behavior.
SSIS does not persist the value of Object variables in the Checkpoint file.
When you are running an SSIS package that uses Checkpoints, remember that when you rerun the package after a failure, the values of package variables will be restored to what they were when the package failed. If you make any changes to package configuration values the package will not pickup these changes in a restart after failure. Where the failure is caused by an erroneous package configuration value, correct the value and remove the Checkpoint file before you rerun the package.
For a Data Flow task you set the FailPackageOnFailure or FailParentOnFailure properties to True as discussed above. However, there is no restart capability for the tasks inside of the Data Flow; in other words you can restart the package at the Data Flow task but you cannot restart within the Data Flow task.
Keep the Checkpoint capability in mind and implement it in your packages where appropriate.
When using Checkpoints make sure that all of the appropriate properties are set as described above. You really have to include failures in your testing to make sure that you have Checkpoints setup correctly.
Stay tuned for an upcoming tip on developing SSIS master packages that execute a number of child packages in a particular order. The Checkpoint restart at the point of failure behavior will be implemented in the master package.
Download a copy of the Checkpoint sample SSIS package
here to experiment with Checkpoints. Note that the package assumes you have a SQL Server instance running locally and the CheckpointFileName is hard-coded to c:\mssqltips\Checkpoint.xml. Change these as appropriate for your environment.
Last Update: 1/9/2008
About the author
Ray Barley is a Principal Architect at RDA Corporation and a MSSQLTips.com BI Expert.
If you have a billion records to load what you want is to load a batch of records in a transaction (say 10,000, 20,000, 50,000 records). If there is any failure within the batch you rollback the transaction, fix the problem and restart with the failed batch.
Take a look at this tip http://www.mssqltips.com/sqlservertip/1504/how-to-implement-batch-processing-in-sql-server-integration-services-ssis/ which walks through an example of how to implement batch processing in SSIS.
If you want to restart at the point of a single failed record (rather than a batch) you would have to have each insert in it's own transaction. With a billion records this would be very inefficient.
I think I would say it like this - if a task's parent is the package (i.e. it is not inside of a container like a foreach loop, sequence container, etc.) then set FailPackageOnFailure to True. If a task is inside a container then set FailPackageOnFailure to true for the container and set FailParentOnFailure to true for the tasks inside the container. the idea is when tasks are inside of a container and one fails you want to restart the entire container.
"FailParentOnFailure - Choose from these options: True or False (default). Select True when the task is inside of a container task such as the Sequence container; set FailPackageOnFailure for the task to False; set FailPackageOnFailure for the container to True."
I tried to paraphrase, but it seems doing so will make it less clearer.
The Checkpoint feature allows your package to resume at the point of failure. So if you had a Data Flow task that failed, when you rerun your package you would begin with the failed Data Flow task, bypassing any steps prior. However, the Checkpoint feature doesn't record the fact that a portion of a task has completed successfully; it only records the fact that a task has completed successfully. You can still use checkpoints but you need a little something else as well.
The way I would do this is to implement a batching type of scenario. Use a FOR LOOP. The FOR LOOP executes until there are no more rows to process. Inside of the FOR LOOP you have a Data Flow task and an Execute SQL Task. The Data Flow task selects the next batch of rows from your source table to process and flags them as in process. You can call a stored procedure to update a batch of rows as in process then select those rows, for instance using an OLE DB Source in the Data Flow. Also inside the Data Flow you insert those rows into your destination table using an OLE DB Destination. Use the Execute SQL task to update the source table rows from inprocess to processed. Each time the Data Flow task executes within the FOR LOOP, it grabs the next batch of rows to be processed.
It was a nice overview of SSIS checkpoint funcationality.
We are trying to implement the following scenario.
We have total 1000 records to be inserted into staging area.
We want to implement a logic in SSIS 2005 which will allow us to commit after evrey 50 records inserted. That way if the package fails, then
instead of insterting all the records from the scratch, we will need to insert only records after the point of failure. Means all the records which got inserted before the point of failure are commited and insertion process starts from the record which did not get inserted. Basically we would like to know how to implement logic in the ssis task so that checkpoint after every 50 records are executed.