Problem I'm trying to build an SSIS package where the entire package is encapsulated in a transaction. In addition there is a table that needs to remain locked for the duration of the SSIS package execution. Can you provide an example of how to do this?
Solution The transaction handling that is built into SSIS can easily support your requirements. Before we get into the specifics of implementing this in SSIS, let's discuss transaction isolation levels and transactions in SSIS at a high level, then walk through an example of using transactions in an SSIS package to solve your problem.
Transaction Isolation Levels
The transaction isolation level determines the duration that locks are held. We'll use SQL Server as an example. The following transaction isolation levels are available in SQL Server:
READ UNCOMMITTED - reads do not acquire share locks and they don't wait on locks. This is often referred to as a dirty read because you can read modified data that hasn't been committed yet and it could get rolled back after you read it.
READ COMMITTED - reads acquire share locks and wait on any data modified by a transaction in process. This is the SQL Server default.
REPEATABLE READ - same as READ COMMITTED but in addition share locks are retained on rows read for the duration of the transaction. In other words any row that is read cannot be modified by another connection until the transaction commits or rolls back.
SERIALIZABLE - same as REPEATABLE READ but in addition no other connection can insert rows if the new rows would appear in a SELECT statement already issued. In other words if you issue a select statement in a transaction using the SERIALIZABLE isolation level you will get the same exact result set if you issue the select statement again within the same transaction.
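To make the lock behavior concrete, here is a minimal sketch you can run in two SQL Server Management Studio query windows (the table and column names are illustrative, not from the sample package):

```sql
-- Window 1: REPEATABLE READ retains share locks on rows read
-- until the transaction commits or rolls back
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT * FROM dbo.SomeTable WHERE id = 1;

-- Window 2: blocked until window 1 completes, because window 1's
-- share lock conflicts with the exclusive lock the update needs
UPDATE dbo.SomeTable SET col = 'x' WHERE id = 1;

-- Window 1: releases the locks and unblocks window 2
COMMIT TRANSACTION;
```

Under READ COMMITTED, window 1 would release its share locks as soon as the SELECT finished and the UPDATE would not block.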
SQL Server 2005 added two new options:
A variation on READ COMMITTED where you set READ_COMMITTED_SNAPSHOT ON at the database level and any transaction that uses the READ COMMITTED isolation level will not acquire share locks and will not wait on any locks. Rather, you will get the committed version of all rows at the time the SELECT statement begins.
A new isolation level called SNAPSHOT where you set ALLOW_SNAPSHOT_ISOLATION ON at the database level and any transaction that explicitly sets the transaction isolation level to snapshot will not acquire share locks and will not wait on any locks. Rather, you will get the committed version of all rows at the time the transaction begins.
Both of the above SQL Server 2005 enhancements are made possible by maintaining committed versions of rows in tempdb (referred to as the version store). When a read encounters a row that has been modified and not yet committed, it retrieves the appropriate latest committed row from the version store. The maintenance and traversing of the version store is performed by SQL Server automatically; there are no code changes required.
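Both options are enabled per database with ALTER DATABASE; a sketch follows (YourDatabase and dbo.SomeTable are placeholder names, and note that switching READ_COMMITTED_SNAPSHOT on requires no other active connections in the database):

```sql
-- Variation on READ COMMITTED: readers get the committed version
-- of rows as of the start of each statement, with no share locks
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- Allow sessions to explicitly request the SNAPSHOT isolation level
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A session opting in to SNAPSHOT isolation: reads come from the
-- version store as of the start of the transaction
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM dbo.SomeTable;
COMMIT TRANSACTION;
```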
Transactions in SSIS
Transaction support is built into SSIS. The TransactionOption property exists at the package level, at the container level (e.g. For Loop, Foreach Loop, Sequence, etc.), and on just about any Control Flow task (e.g. Execute SQL task, Data Flow task, etc.). TransactionOption can be set to one of the following:
Required - if a transaction exists, join it; else start a new one
Supported - if a transaction exists, join it (this is the default)
NotSupported - do not join an existing transaction
The built-in transaction support in SSIS makes use of the Distributed Transaction Coordinator (MSDTC) service which must be running. MSDTC also allows you to perform distributed transactions; e.g. updating a SQL Server database and an Oracle database in the same transaction. If you execute an SSIS package that utilizes the built-in transaction support and MSDTC is not running, you will get an error message like the following:
Error: 0xC001401A at Transaction: The SSIS Runtime has failed
to start the distributed transaction due to error 0x8004D01B
"The Transaction Manager is not available.". The DTC transaction
failed to start. This could occur because the MSDTC Service is not running.
Note that SSIS package elements also have an IsolationLevel property with a default of Serializable. As discussed above in the section on Transaction Isolation Levels, this setting impacts the duration of locks as well as whether share locks are acquired.
SSIS Package Example
Let's take a look at a sample SSIS package that we will use to demonstrate how to implement transactions at the package level and lock a table for the duration of the package's execution:
The Test Initialization sequence container is used to create a test environment. Two tables are created (TranQueue and TranQueueHistory) and a row is inserted into TranQueue. This allows us to simulate a process where the SSIS package processes a group of rows inside of a transaction. The TransactionOption setting for the Test Initialization sequence container is NotSupported since it only exists to create the test environment; i.e. we don't need transaction support here, which would roll back any successful steps in the event of a failure.
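The actual DDL ships in the downloadable package; a minimal sketch of what the Test Initialization container might create looks like this (the column names and types are assumptions based on the results shown later in the tip):

```sql
-- Queue of rows to be processed by the package
CREATE TABLE dbo.TranQueue (
    id        INT IDENTITY(1,1) PRIMARY KEY,
    [message] VARCHAR(50) NOT NULL
);

-- History of processed rows; the date column defaults to the
-- time the row is written
CREATE TABLE dbo.TranQueueHistory (
    [message]      VARCHAR(50) NOT NULL,
    processed_date DATETIME NOT NULL DEFAULT GETDATE()
);

-- Seed a row so the package has something to process
INSERT INTO dbo.TranQueue ([message]) VALUES ('Test Message');
```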
The Process sequence container has its TransactionOption set to Supported; since the package setting for TransactionOption is set to Required, a transaction is created at the package level and the container will join that transaction. Process TranQueue is an Execute SQL task that executes the following SQL command to simulate processing a group of rows in the TranQueue table:
DELETE TOP(10) dbo.TranQueue
OUTPUT DELETED.[message] INTO dbo.TranQueueHistory ([message])
FROM dbo.TranQueue WITH (TABLOCKX)
The main points about this SQL command are:
It deletes up to ten rows from the TranQueue table to simulate pulling them out for processing
It uses the OUTPUT clause to insert the message column of each deleted row into the TranQueueHistory table to simulate processing has completed and history is being updated
It uses the TABLOCKX table hint to lock the TranQueue table
The Placeholder for Breakpoint Execute SQL task does not execute a command; it's there so we can set a breakpoint and run some queries while the package is running and the transaction is open (discussed below). The Simulate Failure Execute SQL task is executed if the package variable v_SimulateFailure = 1; it does a SELECT 1/0 to generate an error (i.e. a divide by zero) which will cause a rollback on the package transaction.
The above example is intentionally short just for demonstration purposes. You can certainly have multiple tasks in the Process sequence container, all of which would participate in the transaction, and either all succeed or none succeed (i.e. roll back on failure).
You can download the project containing the sample SSIS package here. The package is hard-coded to use a local database named mssqltips; create it if it doesn't exist. Open the project using SQL Server Business Intelligence Development Studio (BIDS) and double click on the package Transaction.dtsx. Follow these steps to see the transaction handling in an SSIS package:
Make sure the value of the variable v_SimulateFailure = 1; this will demonstrate the rollback
Make sure there is a breakpoint on the Placeholder for Breakpoint Execute SQL task
Execute the package; your screen should look like this (stopping at the breakpoint):
Open a new query window in SQL Server Management Studio, connect to the mssqltips database and execute the command below. You should see a single row result set; e.g. Test Message, 2008-09-08 14:22:31.043 (your date and time will be different of course). The NOLOCK hint ignores locks; the row you see is not committed yet.
SELECT * FROM dbo.TranQueueHistory WITH (NOLOCK)
Open another new query window in SQL Server Management Studio, connect to the mssqltips database and execute the command below. You will be blocked waiting for the transaction executing in the SSIS package to either roll back or commit, since we added the TABLOCKX hint which keeps the TranQueue table locked for the duration of the transaction. Alternatively you could issue an INSERT INTO the dbo.TranQueue table and you will see that it also is blocked until the transaction either commits or rolls back.
SELECT * FROM dbo.TranQueue
Click Continue in BIDS (or click Debug on the top-level menu, then Continue) and you will see the package fail. Execute the SELECT statement above on the TranQueueHistory table again and you will see no rows; the SELECT statement above on the TranQueue table will complete, showing a single row. Thus the error caused the transaction to roll back: the deleted row(s) in the TranQueue table are restored and the inserted row(s) in the TranQueueHistory table are not committed (i.e. they disappear).
You can change the value of the v_SimulateFailure variable to 0 and run the package and queries above again to validate that the transaction commit works as we expect.
Download the sample SSIS package here to experiment with transaction handling in SSIS.
Check the Books Online topic "Incorporating Transactions in Packages" for additional details on handling transactions in SSIS.
Last Update: 9/17/2008
About the author
Ray Barley is a Principal Architect at RDA Corporation and a MSSQLTips.com BI Expert.
The package setting for TransactionOption is Required; this means that the package must run within a transaction. The idea of the Required setting at the package level is that everything in the package runs in a single transaction: if any part of the package fails, everything previously done is rolled back; if everything in the package succeeds, the transaction is committed. If you execute the package using DTEXEC or DTEXECUI, the Required setting causes the package to create a transaction and either commit or roll back as appropriate. If the package (call it the child package) is called from another package (i.e. a parent package) and the Execute Package task in the parent package is executing within a transaction, then the Required setting causes the child package to execute under the parent's transaction; if anything in the child fails, it is rolled back, and whatever steps have already completed in the parent package are rolled back as well. If the parent package's Execute Package task is not executing within a transaction, the child package creates its own transaction. To sum up the Required setting: you always execute within a transaction; if your caller is already executing one, you join that same transaction, else you create a new one.
The Supported option means that if your caller is executing a transaction, you join that transaction; if you fail, the rollback applies to your completed work and to the work completed in the caller's transaction. If your caller is not executing in a transaction, then you don't execute within a transaction. The difference between Required and Supported, then, is that Required always executes within a transaction (either the caller's transaction or a new one), while Supported only executes in a transaction if the caller is executing one.
NotSupported means you don't want to participate in a transaction at all.
Do you know if the above technique works when reading from and then writing to Teradata tables? I have a package that reads from a SQL Server database and writes to some Teradata tables using the .NET OLE DB provider for Teradata. Everything works correctly, but if one of the tasks in the sequence container fails it does not roll back the inserts. We would like to implement this feature, but I'm not sure if it will work when writing to Teradata.
Probably the easiest way is to go to the Object Explorer in SQL Server Management Studio, right click on the database that has the data you want to export, select Tasks, then Export Data, and go through the wizard. This will create an SSIS package for you that you can save and run immediately or later.
Wednesday, July 25, 2012 - 8:43:51 AM - G Bryant McClellan
One thing to keep in mind... the default IsolationLevel for all SSIS tasks is Serializable. Be sure you understand the potential impact and only use the setting when required. If ReadCommitted is sufficient, use it. You will be on better terms with your DBAs :-)
I have a problem with ETL - SSIS...I hope you can help me.
My ETL loads data into a database in SQL Server 2000 (one machine in the domain) from a flat file. I work with an OLE DB connection and Windows Authentication. It worked correctly, but when I changed the TransactionOption on the Data Flow component from Supported to Required, it failed. I checked the "Distributed Transaction Coordinator (MSDTC)" service and it is turned on. After that, I did another test against the local SQL Server instance on my computer and it worked correctly. I read in this forum that the resolution is probably the protection level, but that also did not work: http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/e9ee5277-dd9e-4e95-a8f5-318d5a8440c3.
After that I tried many suggestions, such as setting "RetainSameConnection = true" and changing the configuration of the service per http://msdn.microsoft.com/en-us/library/aa561924(v=bts.20).aspx (section "Set the appropriate MSDTC Security Configuration options on Windows Server 2003 SP1, Windows XP SP2, Windows Server 2008, and Windows Vista").
Finally I tested my package against another SQL Server instance (2008) on the network, but nothing worked.
The error says:
[DailyAccountingSnapshot] Error: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "MyConexion" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed.
[SSIS.Pipeline] Error: component "DailyAccountingSnapshot" (16) failed the pre-execute phase and returned error code 0xC020801C.
Wednesday, December 12, 2012 - 10:53:05 AM - Ray Barley
You need the built-in transaction handling in SSIS if you want multiple steps in your package to participate in the same transaction, and you want step(s) that completed successfully to roll back if a later step in the same transaction fails.
A simpler alternative may be to not use the built-in transaction handling in SSIS and instead use the SSIS checkpoint feature. The idea with checkpoints is that each step in your package commits to the database if successful; if not, you fix the problem and restart the package at the point of failure. You don't use rollback.
You can look at a tip on checkpoints here: http://www.mssqltips.com/sqlservertip/1408/integration-services-checkpoints-to-restart-package-from-failure/
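For reference, checkpoints are turned on with a handful of package-level properties plus one task-level property; a sketch of the settings involved (the file path is just an example):

```
Package-level properties:
  SaveCheckpoints      = True
  CheckpointUsage      = IfExists
  CheckpointFileName   = C:\Temp\MyPackage_Checkpoint.xml

On each task that should be a restart point:
  FailPackageOnFailure = True
```

With these settings, a failed run records the completed tasks in the checkpoint file, and the next execution skips ahead to the task that failed.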
I used the built-in transaction handling as this post describes, but it worked only on my local instance of SQL Server; when I tried another instance on the network, it failed (SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER).
I did some checking; built-in transaction handling requires MSDTC to be running on each machine; i.e. if you run the package on machine A and connect to a database on machine B, MSDTC must be running on both. Take a look at this thread: http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/da076e51-8149-4948-add1-6192d8966ead/ (answer by Brian Knight)
But I had already tested this (running the service on each machine) and it did not work. I also tested the steps at http://msdn.microsoft.com/en-us/library/aa561924.aspx (Set the appropriate MSDTC Security Configuration options on Windows Server 2003 SP1, Windows XP SP2, Windows Server 2008, and Windows Vista)...
Here's an example that sheds some light on the question about truncate asked in the previous post; I use 2 query windows in SQL Server Management Studio and execute the statements as shown below. I use explicit transactions and 2 scenarios.
Scenario 1:
Window 1: begin transaction; truncate table dbo.tablename
Window 2: select * from tablename
Result: window 2 is blocked by the open transaction in window 1 since the transaction isn't committed yet. If window 1 does a rollback, window 2 is unblocked, the select statement executes, and a result set is returned; if window 1 does a commit, window 2 is unblocked, the select statement executes, and no rows are returned since the truncate has committed.
Scenario 2:
Window 1: set transaction isolation level serializable; begin transaction; select * from tablename
Window 2: begin transaction; truncate table dbo.tablename
Result: window 2 is blocked until window 1 does a commit or rollback. Once the window 1 transaction completes (either commit or rollback), the window 2 statement executes and the table is truncated; window 2 then needs to do a commit to complete the truncate transaction.
I have a package that copies a bunch of tables from a SQL Server 2008 R2 source to a SQL Server 2008 R2 destination. If the load fails there's no harm; I can truncate all the tables and reload, so I do not require any transaction support. My main requirement is to limit the size of the transaction log. The package TransactionOption is at the default "Supported" and IsolationLevel is Serializable (also the default); same settings for each of the data flow tasks in the package. I want the package to commit each data flow task when completed. What can I do to limit the transaction log usage by this package?
If you want to limit the size of the transaction log, you can look into setting the database recovery model to SIMPLE or BULK_LOGGED. Maybe you are using the default recovery model of FULL, which allows for point-in-time recovery at the expense of a lot more logging. If it will work for you, BULK_LOGGED is probably the best choice to reduce logging. Check http://msdn.microsoft.com/en-us/library/ms190422.aspx to see if you can enable the things necessary for minimal logging.
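As a sketch (YourDatabase is a placeholder name; verify the change against your backup strategy first, since BULK_LOGGED limits point-in-time recovery):

```sql
-- Check the current recovery model
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'YourDatabase';

-- Switch to BULK_LOGGED so qualifying bulk operations are minimally logged
ALTER DATABASE YourDatabase SET RECOVERY BULK_LOGGED;
```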