By: Arshad Ali | Comments (7) | Related: > Integration Services Configuration Options
Problem
During SSIS package development we can add a data viewer to the data path of the data flow task to analyze the data passing through the data path or the pipeline, but once the package is deployed (or when not using Business Intelligence Development Studio) there was no built-in support for data tapping to analyze data for troubleshooting. I have heard we can do this now with SQL Server 2012, can you explain how it works?
Solution
Starting with SQL Server 2012, we can deploy our package to integration services catalog (to learn more about this click here) which is the recommended way now and then we can control almost every aspect of package maintenance and execution with stored procedures available in the catalog database. When you want to troubleshoot a failing package in production, sometimes you might need to analyze the data passing through data paths of the data flow components; in order to do so you can use [catalog].[add_data_tap] or [catalog].[add_data_tap_by_guid] stored procedures. To use it, you first need to create an execution instance of the package (using [catalog].[create_execution] stored procedure), then add a data tap by specifying PackagePath of the data flow task and IdentificationString of the data path inside that particular specified data flow task to tap the data (using [catalog].[add_data_tap] or [catalog].[add_data_tap_by_guid] stored procedure) and finally execute the package (using [catalog].[start_execution] stored procedure).
Getting started...
I have a very simple package, as shown below, deployed to integration services catalog with one execute SQL task to truncate the target table and then a data flow task to load data into the target table.
Now I want to add a data tap to the data flow task; in order to do so first I need to capture the value of the PackagePath property of the data flow task (in my case its "\Package\Load Person") for which I want to tap the data, as shown below:
Next I need to capture the value of the IdentificationString property of the data path (in my case its "Paths[OLE DB Source.OLE DB Source Output]") inside the selected data flow task for data tapping, as shown below:
Once I am done collecting this information, I can now specify these values as parameters for the stored procedures discussed above:
DECLARE @execution_id bigint
--Create an execution instance in the Integration Services catalog --Data tap(s) are valid for the specified execution instance only EXEC [SSISDB].[catalog].[create_execution] @package_name=N'Person.dtsx', -- Package name that is to be executed @execution_id=@execution_id OUTPUT, @folder_name=N'AdventureWorks', -- Folder that contains the package that is to be executed @project_name=N'AdventureWorks', -- Project that contains the package that is to be executed @use32bitruntime=False, @reference_id=Null
--Add a data tap for specified execution instance --You need to specify the parameters to indicate specific in the data flow in the package where data tap will be added. EXEC [SSISDB].[catalog].[add_data_tap] @execution_id = @execution_id, @task_package_path = '\Package\Load Person', -- Package path for the data flow task you need to tap @dataflow_path_id_string = 'Paths[OLE DB Source.OLE DB Source Output]', -- Identification string for the data flow path you need to tap @data_filename = 'Person.csv', --Name of the file that stores the tapped data. @max_rows = 100000 -- Number of rows that are captured during the data tap. If not specified, all rows are captured
--Start an execution instance in the Integration Services catalog --for the created execution instance EXEC [SSISDB].[catalog].[start_execution] @execution_id GO
If you notice I have specified the name of data file as Person.csv. The data files will be created in <installation drive>:\Program Files\Microsoft SQL Server\110\DTS\DataDumps location. When I executed the above T-SQL statements it executed successfully and returned immediately letting the package run in the background (by the integration services catalog) and tapping data into the above location in the specified file. If you run the above T-SQL statements multiple times, you will see multiple data files getting created and subsequent files are getting appended with a number at the end of the file. If you notice in the image below, the data file sizes are different and that is because I ran the above T-SQL statement multiple times each time specifying a different maximum number of rows to tap.
When you execute the above T-SQL statements and get a success message, but you are not seeing the data files be patient it might take some time for the files to appear based on the complexity and location of the data path in the package that you are tapping. Also you will notice increasing sizes of the data files, this happens as data is written to the data files and also based on the maximum number of rows you want to tap.
Notes:
- You can use [catalog].[add_data_tap_by_guid] instead of [catalog].[add_data_tap] and provide GUID of the data flow task of the data flow component.
- Adding a data tap is applicable for the execution instance only in which it has been added and does not impact other executions.
- As you might have guessed already, adding a data tap impacts the performance of the package execution and hence you should only add it when you are doing troubleshooting.
- Try to limit the number of rows to be tapped, if the number of rows is large it might impact performance badly and consume lots of storage to store the large data files.
- Data files generated are accessible to the administrator (who started the execution that contains the package with the data tap) of the machine on which the stored procedure is run.
- You can use the above discussed T-SQL commands to tap the data without changing your package.
- As said before, it does have performance impact and generates lots of IO for storing data in data files, so be very careful when using it in production and do it only when there is an absolute need for troubleshooting.
Next Steps
- Review SSIS Package Deployment Model in SQL Server 2012 tip
- Review SQL Server 2012 SSIS Enhancements tip
- Review my all previous tips
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips