By: Koen Verbeeck | Last Updated: 2018-10-24 | Comments (2) | Integration Services Development
I want to transfer files from an FTP server to my local server. The server is an SFTP server however, so I can’t use the Integration Services FTP task. How can I resolve this?
In this tip, we’ll use the free tool PSFTP (from the PuTTy software family) in combination with SQL Server Integration Services (SSIS) to download a file from an SFTP server.
- This tip is an update from the tip Using SFTP with SQL Server Integration Services. There were a couple of questions about the use of PSFTP in the comments and this tip will try to answer those.
- There are plenty of other tools which also can handle SFTP (or FTPS) file transfers, such as WinSCP. This tip focuses solely on PSFTP, but the method used is very much similar when you would use other tools.
- The tip uses basic authentication with username and password. Complex set-up with public/private keys is out of scope.
- You don’t really need SSIS to transfer files with SFTP, you could use PSFTP (or other tools) from batch scripts just as easily. But typically, such a file transfer is part of a larger ETL process orchestrated by SSIS. For example, after downloading the file it could be ingested and written to a database, after which the file is archived. All of this can also be done in the same SSIS work flow.
- The example package in this tip is created with SSDT 15.7.3 for Visual Studio 2017. The project settings are configured to target SQL Server 2017, but the provided solution is valid for at least SSIS 2012 and up (probably SSIS 2005 and 2008 as well, but this hasn’t been tested).
SFTP File Transfer with SSIS
When you want to transfer a file with SFTP, you need an SFTP server of course. The SolarWinds SFTP server is free to use and easy to set up. In the settings, we configure the root directory of the SFTP server, which is a folder on your hard drive:
In the TCP/IP settings, you can optionally change the port:
In the Users tab, we add a user called MSSQLTIPS and we specify a password:
Click OK to exit the settings. Don’t forget to actually start the SFTP server. You might need administrative permissions to do this. You can also start the service from the Windows Services Manager (services.msc).
As the last step of the test set-up, we place a random file in the root directory. This file will be “downloaded” by SSIS and PSFTP:
Creating the SSIS Package
Add a new package to your SSIS project. On the control flow, add an Execute Process Task. This task will call the PSFTP executable.
Open the editor of the task. Configure it as follows:
- RequireFullName: this configures if the task fails if the executable cannot be found at the specified path. If the executable is added to the PATH environment variable, you can set this to False.
- Executable: the name of the executable you want to execute, in our case psftp. If you don’t have psftp added to the PATH environment variable, you can either copy psftp.exe to the working directory, or you can specify the full path to the executable: C:\Program Files\PuTTY\psftp.exe.
- Arguments: the arguments that you want to pass to psftp. First, we specify
the user and the server name, using the syntax
[email protected]. Then we specify the password with the switch -pw. Then the switch -be is used, which means continue on error when using a batch file. We also use the -batch switch to avoid any interactive prompts. Finally, we specify a batch file containing all the SFTP commands we want to execute on the SFTP server, using the -b switch. Typically, you’ll want to use an expression to pass the value to the arguments, so you can use parameters to make them more dynamic.
- WorkingDirectory: this is the location where we want to put the downloaded file.
- WindowStyle: this needs to be set to Hidden. We don’t want any interactive windows.
A full list of all the arguments you can pass to psftp (including how to specify an alternative port) can be found here. The contents of the batch file SFTPCommands.bat are the following:
cd MSSQLTIPS get BimlBasics.pptx
With the command cd, we specify the directory on the SFTP server we want to browse to. With the get command, we indicate which file we want to download to our working directory.
The Execute Process Task will give a warning when the psftp is not located in the working directory:
Let’s test the package. After running it, you should see the file in the working directory:
You can also verify on the SFTP server itself:
Deploying the Package to a Server
Normally you’re not going to run the package manually, but rather schedule it on a server. In this tip, we’ll use SQL Server Agent to schedule the package. There’s a bit of prep work that needs to be done:
- Make sure psftp is also downloaded and installed on the server. Optionally add it to the PATH environment variable there as well.
- You can either use the SQL Server Agent account to execute the SSIS package, but best practice is to use a proxy account. The tip Running a SSIS Package from SQL Server Agent Using a Proxy Account explains how you can set this up. Give the proxy account read/write permissions on the working directory and execute permissions on PSFTP.
- Use the same folder structure for your working directory, or parameterize your set-up.
- Adjust any firewall settings if necessary.
Deploy the SSIS package/project to the SSIS catalog on the server. Create a SQL Server Agent job and use an SSIS job step to schedule the package:
When writing this tip, the job failed when executing the package. The logging in the SSIS catalog didn’t have anything useful to say:
To start debugging a scenario like this, we can write the output of the PSFTP process to a log file, instead of relying on the SSIS logging. We need to slightly modify the Execute Process Task, as explained in this MSDN blog post. To redirect the output, we use the cmd executable, which calls the PSFTP executable with the /C switch:
At the end of the arguments, we specify log.txt as log file. When executing the package in SSDT, we get the following result:
Let’s deploy the package to the server and run the job again. As expected, the job fails, but the log is empty as well!
There’s also no activity logged on the SFTP server, so something goes wrong before PSFTP contacts the server. Let’s try something else. Instead of using an SSIS package in a job, let’s call PSFTP directly from a Operating system (CmdExec) job step:
This job will also fail, but when we check the error messages in the job history, we see something interesting:
So, this is the reason the SSIS package fails on the server: the connection is abandoned because server’s host key isn’t cached in the registry. If you would run PSFTP interactively, it would ask you if you would like to cache it and then it would move on. However, because we are running in batch mode, there is no interactivity so the connection is immediately abandoned. The solution to our problem would be to turn off batch mode, but how can we respond with “y” on the question if we want to cache they key, when the package is run unattended on a server?
The answer lies into redirection. Just as we redirected the output to a log file, we can redirect the contents of a file into the batch process. All we need is a simple flat file containing the character y.
Now we need to adjust the Execute Process Task so we can “answer” the cache question:
Don’t forget to remove the -batch switch. Deploy the package to the server and run the job again. This time the file transfer will succeed.
- If PSFTP gives too many issues, you can also try another tool, like WinSCP.
- Also check out the original tip, Using SFTP with SQL Server Integration Services, which has additional troubleshooting steps, such as setting Internet Options.
- You can find more SSIS development tips in this overview.
Last Updated: 2018-10-24
About the author
View all my tips