Learn more about SQL Server tools

 

Tutorials          DBA          Dev          BI          Career          Categories          Events          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Events

DBA    Dev    BI    Categories

 

Search for Multiple File Extensions with the SSIS Foreach File Enumerator


By:   |   Read Comments   |   Related Tips: More > Integration Services Control Flow Transformations

Problem

I have a requirement to search for multiple file extensions in a folder. However, in SQL Server Integration Services (SSIS) there is a limitation that the "Foreach File Enumerator" can filter only one type of file. Are there any workarounds to search for multiple file extensions?

Solution

Unfortunately it's a limitation of SSIS and the "Foreach File Enumerator" supports only one type of file.

default ForEachLoop Configuration

However this limitation can be overcome with a simple script task and a few lines of C# or VB.NET code. In this tip, I will walk through the solution in detail.

Solution Overview

I developed a simple package to explain the solution in detail. This package has a script task and this script task will loop thru the given folder for multiple file extensions. In addition, the script task will build an array to list the identified files. The details of the files will be stored as an object variable.

A "Foreach Loop" container has been created with "variable Enumerator". This will help us loop thru each lookup file. To showcase the functionality, I have simply added a script task to print the name of the file. So when you run this package, it is expected to search for files with many extensions and print the full path and name.

Solution Details

The below picture represents the package structure at a very high level.

Package Overview

The below variables have been defined in the package:

  • DataFolder - Location of the folder where the SSIS package will look for files
  • FileFilters - Multiple file extensions or patterns separated by a pipe symbol
  • FileList - Internal variable used by the SSIS package to store the list of files (object variable)
  • FileName - Internal variable used by the SSIS package to store the value of each file name
Package Variables

SSIS Script Task

The script task will search for the files in the data folder based on the file filters. The variables "DataFolder" and "FileFilters" have been defined as ReadOnly variables for the script task. The script task will search for files and store the list of files in the object variable "FileList". Hence that variable "FileList" has been defined as a ReadWrite variable.

The below picture represents the variable configurations for the script task.

Script Variables Configuration

Code Inside the SSIS Script Task

It is mandatory to refer to the below namespaces. This will help us to access the directory and file related functions with ease.

using System.Collections;
using System.IO;

The below represents the actual code inside the script task. The code has been written using C#. The same functionality can be achieved by using VB.NET.

// TODO: Add your code here 

string DataFolder, FileFilters;

DataFolder = Dts.Variables["User::DataFolder"].Value.ToString(); 
FileFilters = Dts.Variables["User::FileFilters"].Value.ToString(); 

// ArrayList will store multiple file names 
ArrayList LookupFiles = new ArrayList(); 

// Delimited file filteres will be extracted and stored as a individual item in a string array 
string[] strFileFilters = FileFilters.Split('|');

// for each filter find matching file names 
foreach (string FileFilter in strFileFilters) 
{
   // Adding found files in the array list for each file filter 
   LookupFiles.AddRange(Directory.GetFiles(DataFolder, FileFilter, System.IO.Searchoption.AllDirectories)); 
}

//copy the values of multiple files to the object variable. 
Dts.Variables["User::FileList"].value = LookupFiles; 

Dts.TaskResult = (int)ScriptResults.Success; 

Here is a screenshot of the code as well.

Script Task Code

In the first section of the code we are storing the values of the DataFolder and the FileFilters to local variables. As the fileFilter has multiple extensions, all the individual items have to be stored as an array using the split method. From this array the individual items can be looped thru using a foreach loop. Now the individual files can be searched using the Directory.GetFiles method. This method accepts a folder location, file pattern and an option as parameters. This method returns the available files as a string array. This string array can be added to an ArrayList. Finally the array list can be stored in the SSIS object variable. In our example the array list will be stored in the FileList object variable.

Foreach From Variable Enumerator

Now a "Foreach Container" has been used to extract the name of the individual file from the FileList object variable. The enumerator has been set as "Foreach From Variable Enumerator". The object variable FileList has been set in the Enumerator configuration. The "FileName" variable has been set in the variable mappings section. This foreach container will loop thru each item in the FileList object variable and assign the individual file name to the FileName variable.

The below pictures represent the configurations for the "Foreach Loop" container.

ForEach From Variable Enumerator

ForEach From Variable Mapping

Script Task Inside the Foreach Container

A script task has been placed inside the foreach container to print the name of the individual file. The variable FileName will be passed to the script task. The script task will be able to print the filename with the full path using the messagebox.show function.

The below picture represents the configurations for the script task (Inside the Foreach Loop).

Script Task Configuration

The below picture represents the code inside the script task to print the file name.

ScriptTask Code

Execution Results

Now the package can be executed to see the results. The below pictures confirm the solution has correctly identified the files with csv and txt extensions in the given folder.

ExecutionResult For First File

ExecutionResult For Second Variable

Summary

I recommend trying this option in your development environment with a reasonable number of data files. If satisfactory, then deploy and test the solution in a test environment.

Next Steps
  • Read more about the Script Task here
  • Other SSIS Tips can be found here


Last Update:






About the author
MSSQLTips author Nat Sundar Nat Sundar is working as an independent SQL BI consultant in the UK with Bachelor Degree in Engineering.

View all my tips


 









Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


Get free SQL tips:

*Enter Code refresh code     



Learn more about SQL Server tools