By: Koen Verbeeck | Comments (2) | Related: > Azure Data Factory
Problem
In part 1 of this tip, we demonstrated how to setup an Azure Automation account and how to create a PowerShell runbook to automatically start or stop an Azure-SSIS Integration Runtime. In this second part, we'll look into further automation of this solution by calling the runbook from an Azure Data Factory pipeline through webhooks.
Solution
If you haven't already, please refer to part 1 of this tip, where we configure the Azure Automation account and the PowerShell runbook, since we build further upon those artifacts in this tip. We assume though an Azure Data Factory (ADF) service has already been created, as well as an Azure-SSIS IR. The tip Configure an Azure SQL Server Integration Services Integration Runtime provides more information on how to set those up.
Starting and Stopping the Integration Runtime in an ADF Pipeline.
Creating the Webhooks
Runbooks can be started through a web interface using an HTTP request, the so-called webhook. By creating a webhook, we can start our runbook by using doing such an HTTP request from an ADF pipeline. We're going to create two webhooks: one to start the IR, and one to shut it down. Go to the runbook we created in part 1, and click on Webhook.
In the newly opened blade, choose to create a new webhook.
Specify a name and an expiration date. If this webhook is for production use, you might want to choose a date far into the future. Copy and paste the URL and save it somewhere, because once the webhook is created you cannot retrieve it anymore!
Once it is created, go to the parameters and run settings menu option:
In the parameter pane, fill in all the required parameters of the runbook:
When everything is specified, go back to the webhook blade and click Create to finish.
You can repeat the same process to create a webhook to stop the Azure-SSIS IR. Specify STOP for the command parameter and don't forget to save the URL! You can find an overview of all available webhooks for a runbook when you go to Webhooks (under Resources) in the left menu.
Creating the Azure Data Factory Pipeline
We're going to create an ADF pipeline in which we're going to call the webhooks we just created. In ADF, click on the ellipsis next to Pipelines and choose Add Pipeline.
In the newly created pipeline, drag a Web activity onto the canvas.
In the General tab, give the activity an appropriate name and leave the default settings.
In the Settings tab, copy paste the URL from the webhook configured to start the Azure-SSIS IR.
Choose POST as http method. We don't use the body to pass anything into the webhook, so you're free to specify what you want. Drag another web activity onto the canvas. This one will be used to stop the IR once all tasks have been performed in the pipeline. Copy paste the URL from the other webhook:
Of course, you're not creating an ADF pipeline to start the IR runtime just to shut it immediately down again. Most likely you'll want to run some SSIS packages in between. The problem with the Web activity is that it is asynchronous: it doesn't wait until the IR is actually started. Rather, the next activity is immediately started right after the HTTP POST message is sent to the webhook. This means we need to implement a mechanism that checks if the IR has actually been started or not (which might take up to 20-30 minutes). Add a Stored Procedure activity to the canvas and connect it to the Web activity starting the IR:
This Stored Procedure activity needs to connect to the SQL Server instance hosting the SSISDB database, so you need to create a linked service in the connections tab:
You can find an example of creating a linked service in the tip Connect to On-premises Data in Azure Data Factory with the Self-hosted Integration Runtime - Part 1.
In the SQL Account tab, choose the linked service you just created:
In the Stored Procedure tab, we're going to execute a custom SQL script with the sp_executesql stored procedure. Check the Edit checkbox, so you can enter the stored procedure name manually, instead of searching for it in the database.
Click on New to add a parameter. Specify stmt as name, string as type and the following script as value:
-- Wait until Azure-SSIS IR is started WHILE NOT EXISTS ( SELECT 1 FROM [SSISDB].[catalog].[worker_agents] WHERE IsEnabled = 1 AND LastOnlineTime > DATEADD(MINUTE, -1, SYSDATETIMEOFFSET()) ) BEGIN WAITFOR DELAY '00:00:30'; END WAITFOR DELAY '00:05:00'; -- wait an extra 5 minutes because the Azure-SSIS isn't somehow always ready
The final configuration looks like this:
The script checks an SSIS catalog view for the number of working agents (these are the nodes of the Azure-SSIS IR) are started in the last minute. The script will be looping for quite some time, until the IR is started. I noticed however that even though the worker agents were found, the IR itself wasn't somehow quite ready yet. Starting an SSIS package still resulted in an error. That's why I added an extra wait time of 5 minutes at the end, just to make sure the IR is up and ready.
After this activity, you can add other activities that will run your SSIS packages. These can either be other Stored Procedures activities or Execute SSIS Packages activities. The entire flow then looks something like this pipeline:
You can find more information about running SSIS packages in the Azure-SSIS IR in these tips:
- Executing Integration Services Packages in the Azure-SSIS Integration Runtime
- Parallel package execution in Azure-SSIS Runtime
All that's left is to configure a trigger on the pipeline to schedule the execution of the SSIS packages.
Conclusion
In part 1 and part 2 of this tip we showed how you automate the startup and shutdown of an Azure-SSIS Integration Runtime cluster. The solution uses a combination of Azure Runbooks, webhooks and Azure Data Factory pipelines. Keep in mind that starting the Azure-SSIS IR on demand adds a wait time of up to 30 minutes to your ETL window.
Next Steps
- Make sure you've read the first part of the tip to learn how to set-up the Azure Automation account and the PowerShell runbook.
- More tips about Azure Data Factory and the Azure-SSIS IR:
- Connect to On-premises Data in Azure Data Factory with the Self-hosted Integration Runtime - Part 1
- Executing Integration Services Packages in the Azure-SSIS Integration Runtime
- Parallel package execution in Azure-SSIS Runtime
- Configure an Azure SQL Server Integration Services Integration Runtime
- Customized Setup for the Azure-SSIS Integration Runtime
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips