An FTP (File Transfer Protocol) server is often used for data exchange in data integration scenarios. SSIS includes an FTP task to download data files from, and upload data files to, an FTP location, but how does this work and how can we configure it for data file downloads and uploads?
When a data integration system or ETL process can't connect directly to the data source system, an FTP server can be used as a temporary storage location for data exchange. Source systems can upload data to an FTP server and then the data integration system can download the data from the FTP server and import the data into the database.
SSIS, being one of the leading ETL tools on the market, includes an FTP task for working with an FTP server. The FTP task can download data files from an FTP server to the local machine, and other tasks can then be used to import the data from the locally downloaded files into the database or data warehouse. Besides downloading data files, the FTP task supports a number of other operations, as listed below:
Send files
Used for uploading files from the local machine to the FTP server
Receive files
Used for downloading files from the FTP server to the local machine
Create local directory
Used to create a folder on the local machine
Create remote directory
Used to create a folder on the FTP server
Remove local directory
Used to delete a folder on the local machine
Remove remote directory
Used to delete a folder on the FTP server
Delete local files
Used to delete a file on the local machine
Delete remote files
Used to delete a file on the FTP server
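For readers who want to see what these eight operations amount to outside of SSIS, here is a minimal sketch using Python's standard ftplib and os modules. It is not how the FTP task is implemented; the function names are hypothetical and simply mirror the operation names above, and each function expects an already connected ftplib.FTP object where a server is involved.

```python
import os
from ftplib import FTP


def send_files(ftp: FTP, local_path: str, remote_name: str) -> None:
    """Send files: upload one local file to the FTP server."""
    with open(local_path, "rb") as fh:
        ftp.storbinary(f"STOR {remote_name}", fh)


def receive_files(ftp: FTP, remote_name: str, local_path: str) -> None:
    """Receive files: download one remote file to the local machine."""
    with open(local_path, "wb") as fh:
        ftp.retrbinary(f"RETR {remote_name}", fh.write)


def create_local_directory(path: str) -> None:
    """Create local directory."""
    os.makedirs(path, exist_ok=True)


def create_remote_directory(ftp: FTP, path: str) -> None:
    """Create remote directory."""
    ftp.mkd(path)


def remove_local_directory(path: str) -> None:
    """Remove local directory (os.rmdir only removes an empty folder)."""
    os.rmdir(path)


def remove_remote_directory(ftp: FTP, path: str) -> None:
    """Remove remote directory."""
    ftp.rmd(path)


def delete_local_files(path: str) -> None:
    """Delete local files (one file per call in this sketch)."""
    os.remove(path)


def delete_remote_files(ftp: FTP, path: str) -> None:
    """Delete remote files (one file per call in this sketch)."""
    ftp.delete(path)
```

Each function performs exactly one operation, so a job that creates a folder, downloads into it and then cleans up the server needs several calls in sequence.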
Connecting to an FTP Server Using Windows Explorer
There are different ways to connect to an FTP Server for data verification and one of the easiest ways is to use Windows Explorer. Just enter the FTP Server name prefixed with the FTP protocol (ftp://) in the address bar as shown below and hit <Enter>.
This will bring up a screen as shown below. Here we need to specify the credentials to connect to the FTP Server or choose to log on anonymously if the FTP Server allows anonymous access.
Next click on the Log On button to log on to the specified FTP Server with the specified credentials. Based on your permissions, it will display folders and files similar to what's shown below:
Using the FTP Task in an SSIS Package
In order to connect to an FTP Server, we first need to create an FTP connection manager, which encapsulates the information needed to connect to the FTP Server. The FTP task uses that information at run time to connect to the server. To create an FTP connection manager, create a new package, right click on the Connection Managers pane at the bottom and then select FTP as the connection manager type from the dialog box as shown below:
Clicking on the Add button in the above dialog box will bring up the FTP Connection Manager Editor, where we specify the information used to connect to the FTP Server: for example, the FTP Server name, port and credentials.
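To make the role of the connection manager concrete, here is a rough Python analogue: a small settings object that holds the connection details, plus a function that uses them at run time. This is only a sketch under assumptions. The class name FtpSettings, the connect helper, the passive and timeout fields and their default values are all hypothetical and are not taken from the SSIS editor.

```python
from dataclasses import dataclass
from ftplib import FTP


@dataclass
class FtpSettings:
    """Connection details, kept separate from the code that uses them."""
    server: str
    port: int = 21              # standard FTP control port
    user: str = "anonymous"     # anonymous logon, if the server allows it
    password: str = ""
    passive: bool = True        # assumed default for this sketch
    timeout: float = 60.0       # seconds; assumed default for this sketch


def connect(settings: FtpSettings) -> FTP:
    """Open a connection and log on using the stored settings."""
    ftp = FTP()
    ftp.connect(settings.server, settings.port, timeout=settings.timeout)
    ftp.login(settings.user, settings.password)
    ftp.set_pasv(settings.passive)
    return ftp
```

A call such as connect(FtpSettings(server="ftp.example.com")) would attempt an anonymous logon; the host name here is a placeholder. Keeping the settings in one object means every operation reuses the same server, port and credentials.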
Now you can drag an FTP Task from the SSIS Toolbox to the control flow, which will look similar to this:
Double click on the FTP task to change its properties in the Editor. The editor has 3 pages. On the General page we need to select the FTP connection manager that we created above for connecting to the FTP Server and then specify an appropriate name and description for the component as shown below:
On the FTP Transfer page, we first need to select the operation that we want this FTP task to perform (the different operations are described at the beginning of this tip). Based on the selected operation, the Local Parameters and Remote Parameters options will appear or disappear. For example, when we select "Receive files" as the operation type then both the Local Parameters and Remote Parameters options will appear, but if we select the "Create remote directory" option then only the Remote Parameters option will appear.
For this example I want to download a specific file from the FTP Server, so I will choose "Receive files" as the operation type and then specify the remote parameters, like the location and name of the file to be downloaded, and the local parameters, like the local machine folder where the downloaded file will be saved. Here, I can either specify hard-coded values for these parameters or the values can come from SSIS package variables, making this process more dynamic.
As you might have noticed, we can select only one operation type for each FTP task and if we want to perform multiple operations we need to use multiple FTP tasks, one for each operation. Now if I execute the package we can see the file from the FTP Server getting downloaded and written to the folder that we specified. After the file has been downloaded I can then use a data flow task to load data from the locally downloaded file to the database.
Downloading Multiple Files using a Wild Card Character
In the above example, I simply downloaded one file from the FTP Server to the local machine. But what if I want to download or upload all the text files or all the XML files in one go? In that scenario, we can use a wild card character. For example, as you can see below, I want to download all text files from the specified FTP Server location to the local machine and hence I specified "*.txt" for the file name.
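The same wild card idea can be sketched in Python by listing the remote folder and filtering the names against the pattern on the client side. This illustrates the technique only and says nothing about how the FTP task matches files internally. The function name receive_matching is hypothetical, and matching is done case-insensitively here as an assumption.

```python
import fnmatch
import posixpath
from ftplib import FTP
from pathlib import Path
from typing import List


def receive_matching(ftp: FTP, remote_dir: str, pattern: str, local_dir: str) -> List[Path]:
    """Download every file in remote_dir whose name matches pattern."""
    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for entry in ftp.nlst(remote_dir):
        # Servers may return bare names or full paths; keep only the name.
        name = posixpath.basename(entry)
        # Case-insensitive match (an assumption made for this sketch).
        if not fnmatch.fnmatchcase(name.lower(), pattern.lower()):
            continue
        local_path = target / name
        with open(local_path, "wb") as fh:
            ftp.retrbinary(f"RETR {posixpath.join(remote_dir, name)}", fh.write)
        downloaded.append(local_path)
    return downloaded
```

Calling receive_matching(ftp, "/outbound", "*.txt", "C:/data/inbox") would fetch only the text files; both paths are placeholders. The listing may also contain subfolders, so a pattern that matches a folder name would need extra handling.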
With "*.txt" specified as the file name, the task will download all the text files from the FTP Server to the local machine. To load each of these files to the database I can use a ForEach Loop container in SSIS as shown below.
We can use the ForEach Loop Container with a ForEach File Enumerator and process all the data files from the locally downloaded folder; in other words, process each data file one at a time in a loop.
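The looping pattern described above, one file at a time from the download folder, looks roughly like this in Python. It is a sketch only: process_each_file and the handler argument are hypothetical names, and the handler stands in for whatever load step (the data flow task in SSIS) runs per file.

```python
from pathlib import Path
from typing import Callable, List


def process_each_file(folder: str, pattern: str, handler: Callable[[Path], None]) -> List[str]:
    """Call handler once per matching file and return the names processed."""
    processed = []
    # Sorted so the files are handled in a predictable order.
    for path in sorted(Path(folder).glob(pattern)):
        if path.is_file():
            handler(path)
            processed.append(path.name)
    return processed
```

For example, process_each_file("C:/data/inbox", "*.txt", load_into_staging) would hand each downloaded text file to a load_into_staging function that you supply; both the folder and that function name are placeholders.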
The FTP Task in SSIS does not support SFTP (SSH File Transfer Protocol, often called Secure FTP), but there is a task available on CodePlex which can be used for this scenario.
Is it possible to use dynamic criteria with wild card characters when receiving files from an FTP folder, such as a date-based expression that picks the last-day-of-every-month file from a list of files with date stamps?
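A plain wild card cannot express "last day of the month" on its own, so one possible approach is to list the file names first and then filter them by the date stamp. The Python sketch below shows only that filtering logic. It assumes, purely for illustration, that each file name contains an eight-digit YYYYMMDD stamp (for example sales_20240131.txt); the function names are hypothetical.

```python
import calendar
import re
from datetime import date, datetime
from typing import List, Optional

# Assumed naming convention: an eight-digit YYYYMMDD stamp in the file name.
_STAMP = re.compile(r"(\d{8})")


def stamp_date(name: str) -> Optional[date]:
    """Return the date embedded in a file name, or None if there is none."""
    match = _STAMP.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that do not form a valid date


def is_month_end(day: date) -> bool:
    """True when day is the last calendar day of its month."""
    return day.day == calendar.monthrange(day.year, day.month)[1]


def month_end_files(names: List[str]) -> List[str]:
    """Keep only the names whose date stamp is a month-end date."""
    selected = []
    for name in names:
        stamp = stamp_date(name)
        if stamp is not None and is_month_end(stamp):
            selected.append(name)
    return selected
```

The resulting list could then drive one download per selected file, instead of a single wild card that fetches everything.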