Developing a File Watcher Task for SQL Server Integration Services

By:   |   Comments (9)   |   Related: More > Integration Services Development


Problem

SQL Server Integration Services (SSIS) does not have a default task to poll a folder to see if a specific file exists. There are third party controls that can assist us with this task, but not all organizations allow the use of third party controls in SSIS. In this tip, I will show how you can build an SSIS process to poll for a file.

Solution

Polling for a file is a common task used in ETL solutions. Polling means an application continuously checks for a change in the status of an external process by looking for a file whether it is an empty file or the file has actual data.  Most ETL solutions need to poll a specific directory for a file before moving to the next step, but unfortunately SSIS does not have a default task to achieve this.

There are various scheduling tools available in the market to poll a file or a directory and once the file is available the scheduling tool can execute a SSIS package. However these scheduling tools are very expensive and not preferred by all organizations.

Another approach is to use third party controls for SSIS. However use of third party controls is not always permitted in organizations. In addition, the installation of these components requires manual intervention to update the SSIS Packages and activate the license.

Hence I have come up with an alternative solution.

Solution Overview

This solution will use only the built-in tasks available in SSIS. This solution will use a For Loop and a Script Task component to poll for a file. Until it finds the file in the given folder, the For Loop¯ will run again and again. The Script Task has been placed inside the For Loop and for every iteration of the For Loop the Script Task will be executed.

The script task will check for the files in the given folder. If found, the script task will return the name of the file in a message box.

In addition the script task will reset a flag for the For Loop, so the For Loop¯ terminates on the next iteration.

A delay can be introduced in the script task after every iteration, so the polling can wait for a reasonable period of time before checking again.

Detailed Solution

So that you can understand the basics and you will be able to change the solution as per your requirements, I have created two user variables "FolderLocation" and "IsFileExists". These variables and their datatypes can be seen in the below picture.

SSIS Variables

Package Structure

This sample package has a For Loop¯ which contains a Script Task. The below picture highlights the package at very high level.

SSIS Package Overview

The below picture highlights the "For Loop"¯ configuration.

SSIS For Loop Configuration

We have to configure the Script Task to read the variable "FolderLocation" and to configure the variable "IsFileExists" to read/write. The below picture highlights the configuration in detail.

SSIS Script Task Configuration

Now let's write some basic code inside the script task. The below picture represents the actual code for the script task. As you can see this code simply pops up a message box.

SSIS code to pop-up a box

Warning

As we haven't set the package to terminate, it will run an infinite number of times. When you execute the package, it is expected that the message box will appear again and again. You may need to terminate the execution to stop the package. The below image represents the package execution in an infinite loop.

SSIS Package Execution with an Infinite Loop

Terminate the package after single iteration

The package can be terminated by setting the flag "IsFileExists"¯ to true as shown in the below image.

SSIS code to terminate the package after single iteration

When this runs, the package will complete in the first iteration as per the image below.

SSIS package will complete in the first iteration

Let's modify the script task to poll a file in the given folder. To use appropriate ".Net assemblies"¯, we need to refer the library "system.io" in the namespaces section. This is highlighted in the image below.

The code will look for a file in the given directory. If found, it will be return the name of the file and the package will terminate. The actual code can be referenced in the below picture

SSIS Package Code for File Watch

Now let's create a file in the folder location as per the image below.

Sample Data File to Watch For

A breakpoint can be set on the code to debug and investigate. During the execution the break point will be hit and the value of the variable can be validated by placing the cursor over the variable. The below picture represents the debug procedure.

Debug and Modify Code

After successful execution, it will display the message box as shown below. (Note you can copy the code further on in the tip.)

Message box that the file exists

Enhancing the Solution

It is possible that the data file may be delivered a little later than the expected, so the package should wait for some time before the next iteration. This can be done by introducing a delay in the script task. We need a new variable to store the value of the delay in milliseconds. The variable has been created and a value of 1000 has been provided. This can be seen in the picture below.

SSIS Variables for Delay Timer In MS

As this variable will be read by the script task, this needs to be referenced in the script task as shown below in the ReadOnlyVariables section.

variable will be read by the script task, this needs to be referenced in the script task as shown below in the ReadOnlyVariables section

The below represents the modified code to introduce a delay using "Thread.Sleep"¯.

public void Main()
{
   string DataFilesLocation; 
   string[] DataFiles; 
   Int32 DelayTimer; 

   DelayTimer = (Int32)Dts.Variables["User::DelayTimerInMs"].Value;

   DataFilesLocation = Dts.Variables["User::FolderLocation"].Value.ToString(); 
   DataFiles = Directory.GetFiles(DataFilesLocation); 
   if (DataFiles.Length > 0) 
   {
      MessageBox.Show("Data File Name: " + DataFiles[0].ToString()); 
      Dts.Variables.["User::IsFileExists"].Value = true; 
   }
   else 
      Thread.Sleep(DelayTimer);
 
   Dts.TaskResult = (int)ScriptResults.Success; 
}

Finalized code

The below image represents waiting after an iteration.

SSIS Package waiting after an iteration

Once a file has been found in the folder, the script task identifies the file and displays the name in the message box.

Once a file has been found in the folder, the script task identifies the file and displays the name in the message box

Scope For Improvement

This solution can be extended further to stop the package execution by terminating the "For Loop"¯ after a set amount of time (say after an hour or so). This can be easily done by finding the difference between the current time and the start time of the package. The start time of the package can be derived using the system variable "StartTime".

Summary

It is always recommended to terminate a package after a period of time instead of having it run forever. This can be managed as mentioned in the scope for improvement section. In addition, a SQL Server Agent Job can be scheduled to run the package again after a specific time interval.

Next Steps
  • Read more about the SSIS File Watcher here.
  • Learn more about script task here.
  • Learn about GetFiles function here.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Nat Sundar Nat Sundar is working as an independent SQL BI consultant in the UK with a Bachelors Degree in Engineering.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Thursday, April 14, 2022 - 11:08:07 AM - vinay Back To Top (90005)
public void Main()
{
string DataFilesLocation;
string[] DataFiles;
Int32 DelayTimer;

DelayTimer = (Int32)Dts.Variables["User::DelayTimerInMs"].Value;

DataFilesLocation = Dts.Variables["User::FolderLocation"].Value.ToString();
DataFiles = Directory.GetFiles(DataFilesLocation);
if (DataFiles.Length > 0)
{
MessageBox.Show("Data File Name: " + DataFiles[0].ToString());
Dts.Variables["User::IsFileExists"].Value = true;
}
else
{ //Thread.Sleep(DelayTimer);
MessageBox.Show("else statement");

Thread.Sleep(DelayTimer);
}

Dts.TaskResult = (int)ScriptResults.Success;

}

Wednesday, May 26, 2021 - 8:53:32 AM - Scott Back To Top (88744)
What if the file is a large file and you need to ensure the file is not still being copied to the destination?

Tuesday, April 28, 2020 - 9:59:07 PM - David Back To Top (85495)

Good job.  I have not used SSIS in a few years and just getting back into it.  Your way of teaching makes it easy.

Thanks,

David


Tuesday, April 14, 2020 - 2:51:54 AM - sahil Back To Top (85359)

hi please explain the scope of improvement because i am unable to set time limit for execution of the loop please elaborate how can i set time for the loop


Wednesday, January 29, 2020 - 2:11:44 AM - rag Back To Top (84026)

Does the script works for C#2015. Can u update above full script task code


Thursday, December 20, 2018 - 3:57:54 AM - Naoufal Back To Top (78531)

 Interesting article,

what third party control do you recommend to do the same task ?


Friday, February 3, 2017 - 8:08:05 PM - Slava Back To Top (45870)

 

There is another way to notify the package that a file has come into folder - FileSystemWatcher. It is a bit better from point of view of performance as Thread.Sleep is not optimal thing.


Thursday, February 2, 2017 - 11:25:38 AM - Luther Back To Top (45810)

Instead of setting a pause in the script task, why not just schedule the watcher package to run every x minutes instead?

Seems you could even table drive the process with list of folders to watch, file patterns, and whether to launch another package, job, or just send email to process the file. That way, you can watch for multiple files, with multiple options on how to handle the file when it shows up.

 


Thursday, February 2, 2017 - 10:49:44 AM - Jason Back To Top (45808)

Nice article Nat.

Is there a cost when running this as a job if you replace the script with pure SSIS components?

 

For instance a for each container inside the infinite loop container that checks for a file using any type of wildcard. This would then place the file name in a variable which could be passed on to another task or a table? You could use another 3rd party tool like Pragmatic Works file property task in the free version to find out if the file is in uses as well. A SQL query task in line can be used to add a time delay for the next run until a file is actually found.















get free sql tips
agree to terms