Get List of Files from an FTP Server in SQL Server Integration Services
We use the FTP task in SSIS to process a number of files from an FTP server. We would like to implement a step in our SSIS packages that would retrieve the list of files that are available on the FTP server before we try to process them. The FTP task doesn't have an operation that will retrieve the list of files. Can you provide an example of how to do this?
All of the code for this tip can be downloaded here.
The FTP task provides the following operations as shown below:
As you have noted, there is no operation to retrieve a list of files. You can use the Script task to do this. Here is a sample SSIS package that we will review to get the list of files from an FTP server:
The main points about the above SSIS package are:
- Get Ftp File Listing is a Script task that will retrieve the list of files available on the FTP server. It will return the list of available files as an XML document.
- Stage Ftp File Listing is an Execute SQL task which will call a stored procedure to insert the list of files from the XML document into a table.
SSIS Package Setup
Before digging in to the package tasks shown above, let's discuss the package setup. The following variables are defined in the SSIS package:
The variables are used as follows:
- FtpFileListXML will be populated with the list of files available on the FTP server by the Get Ftp File Listing Script task
- FtpFileType is used to specify the type of file
- FtpServer is the URL of the Ftp server; e.g. ftp.yourservername.com
- FtpWorkingDirectory is the directory or folder on the Ftp server to get the list of files from
The FtpFileType, FtpServer, and FtpWorkingDirectory variables will be set on the command line when we run the package. This will allow our package to retrieve the list of files from any or our ftp servers.
The package uses the following Connection Managers:
- OLE DB Connection called Staging which is the SQL Server database where the list of files from the FTP server will be written
- FTP Connection which defines the FTP Server; the ServerName property is set to the FtpServer package variable
Get Ftp File Listing
Let's take a look at the Script task that is used to get the list of files from the FTP server. The Script page of the editor is shown below:
The FtpWorkingDirectory variable allows the package to specify the folder on the FTP server to query for the list of files. The script code will populate the FtpFileListXML variable with the list of files available on the FTP server as an XML document.
Now let's review the actual code in the Script task:
The main points about the above code are as follows:
- Step 1: declare a StringBuilder variable to build up the XML document containing the list of available files on the FTP server
- Step 2: setup the FtpClientConnection object using an FTP Connection Manager defined in the package; the FtpClientConnection class is provided in the Microsoft.SqlServer.Dts.Runtime namespace; the ftpFileNames array will be populated with the list of files; the ftpFolderNames array will be populated with the list of subfolders
- Step 3: connect to the FTP server and get the list of files and subfolders from the folder specified by the FtpWorkingDirectory variable
- Step 4: iterate through the ftpFileNames array and build an XML document with the list of file names
- Step 5: set the FtpFileListXML package variable to the XML document containing the list of file names
Stage Ftp File Listing
Let's take a look at the Execute SQL task that inserts the list of files available on the FTP server into a table. The create table script is as follows:
CREATE TABLE dbo.stg_FtpFileList ( FileType nvarchar(50) NOT NULL, FileName nvarchar(50) NOT NULL )
The following stored procedure is used to shred the XML document created in the Get Ftp File Listing task and insert the available files into the table above:
CREATE PROCEDURE [dbo].[usp_PutFtpFileList] @fileType nvarchar(50) , @xml xml AS BEGIN SET NOCOUNT ON; DELETE FROM [dbo].[stg_FtpFileList] WHERE [FileType] = @fileType INSERT INTO [dbo].[stg_FtpFileList] ( [FileType] ,[FileName] ) SELECT @fileType ,doc.col.value('@name', 'nvarchar(50)') filename FROM @xml.nodes('//file') doc(col) END
The above stored procedure takes advantage of new XML capabilities provided in SQL Server 2005; take a look at our previous tip Replacing OPENXML with the XML nodes() Function in SQL Server 2005 for the details.
Running the Package
To run the package use DTEXEC and specify values for the FtpFileType, FtpServer, and FtpWorkingDirectory variables on the command line; e.g. here is a batch file that you could use:
DTEXEC /FILE GetFtpFileListing.dtsx [variable assignments go here]
The entire DTEXEC command must be specified on a single line; replace [variable assignments go here] with the following:
After running the package you can query the stg_FtpFileList table to get the list of files available in a particular folder on an FTP server; e.g.:
- While SSIS provides quite a few useful tasks, occasionally you need something more. The Script task is likely a good alternative as it allows you to execute any VB .NET code inside of your SSIS package.
- Take a look at the sample code here to experiment on your own.
- The FtpClientConnection class used in the above example has other properties and methods that you may find useful; you can get the details here.
- The FTP task also has other properties that you may need to set such as, user name, password, etc.
About the author
View all my tips