Getting File Properties using SQL Server Integration Services Add-In
You are developing a SQL Server Integration Services Package (SSIS) that loads the contents of a text file. The package is meant to run every hour and has to process the file only if its contents have changed. In this tip I will show you how to check the file properties in SSIS to solve this problem.
A large amount of Extract-Transform-Load (ETL) processes involves text files, mostly because they are the most efficient way to move data across different platforms. Thatís the case for large companies who have multiple database solutions internally. When you are creating an ETL process to schedule to load a flat file you have to add extra logic to the package depending on how the flat file is sent to you.
Usually the files are left in a separate folder so you can process them and leave them in a backup folder or just delete them. This way, at certain times a new file will be queued and ready to be picked up by using a For Each File loop in your SQL Server Integration Services (SSIS) package.
Another way less used is to have a text file with a fixed name on which the data is appended to the end of the file. If the file already exists when you load the data then you have to truncate the file.
In both cases itís useful to get information about the file you are loading, like the modification date or size. For example, when you intend to load a flat file with a fixed record size you can find out if the file has errors before loading it by dividing its size in bytes by the record length, if the result is not an integer then the file has bogus data.
In order to get file information, the first thing that may come to your mind is to use a Script Task, but there is a free add in to Integration Services from CodePlex called the File Properties Task (https://filepropertiestask.codeplex.com/) that requires less effort to achieve the same goal. This component comes with versions for Integration Services 2005 and 2008, but you can download the source code and adapt it to newer versions.
File Properties Task Add-In Installation
After downloading the custom task component you have to run the installer on a development machine and at the server that will execute the package. The installer doesnít require any more information than the installation path.
File Properties Task Add In Usage
Before starting to use this component take a look at the Integration Services Toolbox for the component named File Properties Task. If you don't see it then right click on the Toolbox and select the "Toolbox Items" option from the contextual menu.
A pop up window will open, then go to the "SSIS Control Flow Items" task and browse for a component named "File Properties Task" and check the box before its name and click the OK button to close the window.
File Properties Task - General Tab
Now drag the File Properties Task component in the package designer and double click on it. A configuration window will show with three tabs.
On the General tab for the File Properties Task we will setup the file we want the properties for. We can configure the file name as a static value or specify a variable that will hold the name of the file at runtime. Also we can set up how this component will behave if the file is not found and set up a wait time for the file to appear.
File Properties Task - Properties Tab
This Properties tab of the File Properties Task allows us to store file properties into user defined variables. For example you can store the file size into an Int64 variable and the creation date into a DateTime variable. Something to consider is that the component is strict with the data types of the variables and only shows the variables of the right type in the combo boxes. But this component is not limited to read file properties, it also can set the value of properties like creation date, access date and modified date.
File Properties Task - Attributes Tab
On this Attributes tab of the File Properties Task you can select if you want to store file attributes like readonly, hidden, compressed and so on into a user variable or set the value for those attributes from a variable.
In order to show the usage of this component letís assume that we have to load a text file only if its size is greater than zero. I will use a text file containing the data of the Products table from the Northwind database which you can download from here https://northwinddatabase.codeplex.com/. Then you will need to export the contents of the Products table to a text file.
Letís create a sample database.
USE [master] GO CREATE DATABASE [TestDB] CONTAINMENT = NONE ON PRIMARY ( NAME = N'TestDB_file1', FILENAME = N'E:\MSSQL\TestDB_1.mdf', SIZE = 128MB , MAXSIZE = UNLIMITED, FILEGROWTH = 64MB) LOG ON ( NAME = N'TestDB_log_file1', FILENAME = N'E:\MSSQL\TestDB_1.ldf', SIZE = 8MB, MAXSIZE = 2048GB, FILEGROWTH = 8MB) GO ALTER DATABASE TestDB SET RECOVERY SIMPLE GO
This script will create a table with the same structure of the Northwindís Products table.
USE [TestDB] GO CREATE TABLE [dbo].[Products]( [ProductID] [int] NOT NULL, [ProductName] [nvarchar](40) NOT NULL, [SupplierID] [int] NULL, [CategoryID] [int] NULL, [QuantityPerUnit] [nvarchar](20) NULL, [UnitPrice] [money] NULL CONSTRAINT [DF_Products_UnitPrice] DEFAULT (0), [UnitsInStock] [smallint] NULL CONSTRAINT [DF_Products_UnitsInStock] DEFAULT (0), [UnitsOnOrder] [smallint] NULL CONSTRAINT [DF_Products_UnitsOnOrder] DEFAULT (0), [ReorderLevel] [smallint] NULL CONSTRAINT [DF_Products_ReorderLevel] DEFAULT (0), [Discontinued] [bit] NOT NULL CONSTRAINT [DF_Products_Discontinued] DEFAULT (0), CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ([ProductID] ASC) ) ON [PRIMARY]
On the package editor add a File Properties Task and a Data Flow Task component.
Then create an Int64 variable to store the file size and configure the File Properties Task according to the previous paragraphs.
Connect the File Properties Task and the Data Flow components, and then double click the green arrow to open the Precedence Constraint Editor. When it shows set the Evaluation operation combo box to Expression and Constraint. On the Expression textbox test if the file size is greater than zero.
The next image shows what you will see on the execution log when you execute the package. Notice that this component writes in the log the values read and where they were stored.
- In this tip you can learn how to Loop through Flat Files in SQL Server Integration Services.
- This one will show you how to Dynamically Name Text Files in SQL Server Integration Services.
- If you are new to the use of precedence constraints this tip will help: Dynamic SQL Server Integration Services SSIS workflow based on variables.
- Check out Integration Services Control Flow Transformations Tips Category.
About the author
View all my tips