Deploy Azure Data Factory CI/CD Changes with Azure DevOps


By: Ron L'Esteve   |   Updated: 2020-11-10


Problem

In my previous article, Using Azure DevOps CI CD to Deploy Azure Data Factory Environments, I demonstrated how to deploy a dev Azure Data Factory environment to a different Azure Data Factory environment. That demo used GitHub for the source repo along with the preview 'Deploy Azure Data Factory' CICD release pipeline task. While this task can start and stop ADF triggers without PowerShell scripts, and can create and deploy a new Data Factory environment, I ran into issues when changing ADF configuration and connection properties from DEV to QA, probably because the task is still in preview. That said, what alternative methods do I have for deploying an Azure Data Factory to other environments while accounting for start/stop triggers and changing connection properties from DEV to QA during the CI/CD deployment process?

Solution

Deploying an Azure Data Factory from one environment to another using the best practices of the Azure DevOps CICD process presents a number of complexities. In this article, we will cover how to use PowerShell scripts along with the Azure Resource Group Deployment task to start and stop ADF triggers and to change ADF environment connection properties through an end-to-end deployment of Azure Data Factory CI/CD changes with Azure DevOps.

Pre-Requisites

This demonstration assumes that a few pre-requisites have been completed. Below is the list of pre-requisites that must be completed before starting the end-to-end process. Additionally, read and complete the steps outlined in my previous article, Using Azure DevOps CI CD to Deploy Azure Data Factory Environments.

1) Create the following resources in both a DEV and a QA Resource Group (Note that the QA Data Factory must be created as a prerequisite, which differs slightly from my previous article, where the 'Publish ADF Task' creates the Data Factory as well. For the purposes of this demo, we will create the QA Data Factory manually. Blending the 'Publish ADF Task' with this process is worth exploring.):

DEV Resource Group

devRG Dev Resource group resource list

QA Resource Group

qaRG Qa Resource group resource list

2) Create the following Secrets in both the DEV and QA Key Vault Accounts (Note that the secret names are the same in the DEV and QA accounts for deployment purposes, but the secret values are unique to each environment):

a) ls-adls2-akv: This will contain the Access Key for both the DEV and QA ADLS2 Accounts, which can be found in the Access Keys section of the respective DEV and QA Data Lake Storage Accounts.

b) ls-sql-akv: This will contain the Admin Password of the DEV and QA Azure SQL Servers and Databases for authentication purposes.

DEV Key Vault Secrets

DevKVSecrets Dev Key Vault secrets list

QA Key Vault Secrets

QaKVSecrets Qa Key Vault secrets list

3) Remember to grant both the DEV and QA Data Factories access to both the DEV and QA Key Vaults by adding Key Vault Access policies.

DEV Key Vault Access Policies

DevAccessPolicies ADF added to access policies for dev KV

QA Key Vault Access Policies

QaAccessPolicies ADF added to access policies for Qa KV

4) Create the following Linked Services in the DEV Data Factory for testing purposes:

DEV Data Factory Linked Services

DevLinkedServices List of Linked services in dev

Below is the JSON definition for the Azure Data Lake Storage Linked Service connection:

{
    "name": "LS_AzureDataLakeStorage",
    "properties": {
        "annotations": [],
        "type": "AzureBlobFS",
        "typeProperties": {
            "url": "https://adls2dev.dfs.core.windows.net",
            "accountKey": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AzureKeyVault",
                    "type": "LinkedServiceReference"
                },
                "secretName": "ls-adls2-akv",
                "secretVersion": ""
            }
        }
    }
}

Below is the JSON definition for the Azure Key Vault Linked Service connection:

{
    "name": "LS_AzureKeyVault",
    "properties": {
        "annotations": [],
        "type": "AzureKeyVault",
        "typeProperties": {
            "baseUrl": "https://rl-kv-001-dev.vault.azure.net/"
        }
    }
}

Below is the JSON definition for the Azure SQL Database Linked Service connection:

{
    "name": "LS_AzureSqlDatabase",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "annotations": [],
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Integrated Security=False;Encrypt=True;Connection Timeout=30;Data Source=rl-sqlserver-001-dev.database.windows.net;Initial Catalog=rl-sql-001-dev;User ID=devadmin",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AzureKeyVault",
                    "type": "LinkedServiceReference"
                },
                "secretName": "ls-sql-akv"
            }
        }
    }
}
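Since every credential in these linked services is a Key Vault reference, it can help to list the secret names each definition expects, so the same names can be confirmed in the QA vault before deploying. The following is a minimal Python sketch and not part of the original pipeline; the helper name key_vault_secret_names is hypothetical, and the sample is a trimmed copy of the LS_AzureSqlDatabase definition above.

```python
import json
from typing import List


def key_vault_secret_names(linked_service: dict) -> List[str]:
    """Return the secretName of every AzureKeyVaultSecret reference found."""
    found = []

    def walk(node):
        # Recurse through nested dicts and lists looking for secret references.
        if isinstance(node, dict):
            if node.get("type") == "AzureKeyVaultSecret" and "secretName" in node:
                found.append(node["secretName"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(linked_service)
    return found


# Trimmed copy of the LS_AzureSqlDatabase definition shown above.
sql_linked_service = json.loads("""
{
    "name": "LS_AzureSqlDatabase",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=rl-sqlserver-001-dev.database.windows.net",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_AzureKeyVault",
                    "type": "LinkedServiceReference"
                },
                "secretName": "ls-sql-akv"
            }
        }
    }
}
""")

print(key_vault_secret_names(sql_linked_service))  # prints ['ls-sql-akv']
```

Running the same helper over the Data Lake definition would report ls-adls2-akv, which matches the two secrets created in step 2.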

5) Create a trigger in the DEV Data Factory for testing purposes.

DevADFTriggger Trigger1 added to Dev ADF

6) Publish the DEV Data Factory to commit the changes to the adf_publish branch and to prepare for the DevOps CICD process. (Note that in this method, we will be using the adf_publish branch rather than the master branch for the CICD deployment process. For more on creating a CICD deployment from the master branch instead, see the 'Publish ADF Task'.)

7) Verify that the Data Factory additions and changes have been published to the adf_publish branch in the GitHub Repo.

adf_publishRepo publish the adf changes to adf_publish branch in GitHub Repo

8) Finally, add the following Sample pre- and post-deployment script from Microsoft's article: Continuous integration and delivery in Azure Data Factory. Note that the script can be found toward the bottom of the page. Add the file within the same folder as the dev ADF resources and name it cicd.ps1

AddPrePostDeploymentScript Add the cicd.ps1 pre and post deployment script.

Create the CI DevOps Build Pipeline

Now that we have completed all of the necessary pre-requisites, we are ready to begin creating the continuous integration build pipeline in Azure DevOps.

Let's begin by creating a new Build Pipeline.

CreateBuild Create the Build Pipeline

Next, select the 'Use the classic editor' link to avoid having to create the pipeline with YAML code.

ClassicEditor Use the classic editor to create pipeline

Next, select GitHub since that is where the code repo is saved. Also select the repo name along with the adf_publish branch.

SelectBuildSource Select GitHub repo and branch for the build pipeline.

Next, start with an Empty job template.

EmptyJobTemplate Select an empty job template

Click the + icon on the Agent job to add a new task. Add the Publish build artifacts task.

PublishBuildArtifacts Add Publish Build Artifacts Task

Configure the Publish Artifact task; click save and queue.

ConfigurePublishBuildArtifacts Configure Publish Build Artifacts Task and save/run.

Verify the Run pipeline details and click Run.

RunBuildPipeline Run the build pipeline

Once the job completes running successfully, verify that the artifacts have been published.

VerifyArtifacts Verify the build artifacts

As expected, the folder appears to contain all the artifacts that will be needed in the release pipeline.

Published Artifacts List of published artifacts from the build pipeline.
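The visual check above can also be scripted. The following is a rough Python sketch, assuming the artifact folder has been downloaded locally and that the release needs the three files referenced in this article (the two ARM template files and cicd.ps1); the function name missing_artifacts is hypothetical, and the demo uses a temporary folder rather than a real build artifact.

```python
import tempfile
from pathlib import Path
from typing import List

# Files the release pipeline in this article reads from the published artifact.
REQUIRED_FILES = (
    "ARMTemplateForFactory.json",
    "ARMTemplateParametersForFactory.json",
    "cicd.ps1",
)


def missing_artifacts(folder: str) -> List[str]:
    """Return the required file names that are not present in the folder."""
    root = Path(folder)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demo: a temporary folder that holds the two templates but not the script.
with tempfile.TemporaryDirectory() as tmp:
    for name in REQUIRED_FILES[:2]:
        (Path(tmp) / name).write_text("{}")
    demo_missing = missing_artifacts(tmp)

print(demo_missing)  # prints ['cicd.ps1']
```

An empty list means the folder holds everything the release tasks reference.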

Create the CD DevOps Release Pipeline

Now that the CI Build pipeline has been successfully completed and verified, it's time to create a new CD release pipeline.

Click New release pipeline.

NewReleasePipeline Create a new release pipeline.

Click add an artifact.

AddArtifacttoRelease Add artifact to release pipeline.

Select Build as the source type, select the build pipeline, complete the required details, and click Add.

selectBuildSource Select the build source from the build pipeline published artifacts.

Next, add a stage.

AddNewStage Add a new stage

Start with an Empty job template.

SelectReleaseTemplate Select an empty job for the release pipeline template.

Next click the link to add a task.

AddReleaseTasks Add the release tasks.

Begin by adding an Azure PowerShell script task. This will be used to stop the Data Factory triggers.

AddPrePSScriptTask Add a pre Azure PS Script task

Also add an ARM template deployment task. This will be used to deploy the Data Factory artifacts and parameters to the desired environment.

AddARMTemplateDeployTask Add an ARM deploy Script task

Finally, also add another Azure PowerShell task. This will be used to re-start the Data Factory triggers.

AddPostPSScriptTask Add a post Azure PS Script task

Ensure that all three tasks are organized in the following order prior to configuring them.

ReleaseTaskList List of tasks in the release pipeline.

Azure PowerShell: Stop Triggers

Begin configuring the Azure PowerShell script to stop the Data Factory triggers in the QA environment.

Ensure that the script path is pointing to the cicd.ps1 file that we added to the GitHub Repo in the pre-requisite step.

Select Task Version 4* and select the latest installed version of Azure PowerShell.

ConfigureStopTriggerTask Configure the stop triggers task

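Also, add the following pre-deployment arguments to the script arguments of the task. The -ResourceGroupName and -DataFactoryName values identify the factory whose triggers the script stops, so when releasing to QA, substitute the names of your QA resource group and QA Data Factory for the DEV names shown here.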

Pre-deployment script

-armTemplate "$(System.DefaultWorkingDirectory)/_ADF-Demo-Project-CI1/dev/ARMTemplateForFactory.json" -ResourceGroupName rl-rg-001-dev  -DataFactoryName rl-adf-001-dev -predeployment $true -deleteDeployment $false
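The -armTemplate argument matters because, as far as I understand Microsoft's sample script, it reads the trigger definitions from the template to decide which triggers to handle. As a rough illustration only, the Python sketch below lists the triggers found in a template. It assumes the usual layout of an exported ADF ARM template (trigger resources of type Microsoft.DataFactory/factories/triggers with a runtimeState property); the function name and the sample template are hypothetical.

```python
from typing import Dict, List

TRIGGER_TYPE = "Microsoft.DataFactory/factories/triggers"


def triggers_in_template(template: Dict) -> List[Dict[str, str]]:
    """List the name and runtime state of each trigger resource in the template."""
    triggers = []
    for resource in template.get("resources", []):
        if resource.get("type") != TRIGGER_TYPE:
            continue
        # Names look like "[concat(parameters('factoryName'), '/Trigger1')]";
        # keep only the part after the last slash and drop the closing "')]".
        short_name = resource["name"].rsplit("/", 1)[-1].rstrip("')] ")
        state = resource.get("properties", {}).get("runtimeState", "Unknown")
        triggers.append({"name": short_name, "runtimeState": state})
    return triggers


# Hypothetical, heavily trimmed stand-in for ARMTemplateForFactory.json.
sample_template = {
    "resources": [
        {
            "name": "[concat(parameters('factoryName'), '/LS_AzureKeyVault')]",
            "type": "Microsoft.DataFactory/factories/linkedServices",
            "properties": {"type": "AzureKeyVault"},
        },
        {
            "name": "[concat(parameters('factoryName'), '/Trigger1')]",
            "type": TRIGGER_TYPE,
            "properties": {"runtimeState": "Started", "type": "ScheduleTrigger"},
        },
    ]
}

print(triggers_in_template(sample_template))
# prints [{'name': 'Trigger1', 'runtimeState': 'Started'}]
```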

ARM Template Deployment

This release pipeline task will incrementally update the QA resource group with the template and template parameters of the DEV Data Factory that the build pipeline published to the DevOps Artifacts. These artifacts were generated when the DEV Data Factory was published and committed to the adf_publish branch of the GitHub Repo.

Ensure that you select the appropriate ARMTemplateForFactory.json and ARMTemplateParametersForFactory.json files.

ConfigureARMTemplateTask Configure the ARM Template Deployment task

Once all the other Azure details fields are completed, choose the Override template parameters by clicking the … icon.

OverrideTemplateParameters Change the override template params.

Change the override template parameters to the QA resources and connection properties and click OK.

OverrideParams List of the Override parameters.
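To decide which parameters need a QA override, one option is to scan ARMTemplateParametersForFactory.json for values that still mention DEV. The Python sketch below is only an illustration: the parameter names in the sample, the "dev" marker, and the naive dev-to-qa substitution are all assumptions made for this example, so take the real names and QA values from your own parameters file and resources. The output follows the -name "value" form used in the Override template parameters box.

```python
from typing import Dict


def env_specific_parameters(parameters_file: Dict, marker: str = "dev") -> Dict[str, str]:
    """Return the string parameters whose value mentions the source environment."""
    hits = {}
    for name, entry in parameters_file.get("parameters", {}).items():
        value = entry.get("value")
        if isinstance(value, str) and marker in value.lower():
            hits[name] = value
    return hits


def build_override_string(overrides: Dict[str, str]) -> str:
    """Format overrides as: -name "value" -name2 "value2"."""
    return " ".join(f'-{name} "{value}"' for name, value in overrides.items())


# Hypothetical stand-in for ARMTemplateParametersForFactory.json.
sample_parameters = {
    "parameters": {
        "factoryName": {"value": "rl-adf-001-dev"},
        "LS_AzureKeyVault_properties_typeProperties_baseUrl": {
            "value": "https://rl-kv-001-dev.vault.azure.net/"
        },
        "LS_AzureDataLakeStorage_properties_typeProperties_url": {
            "value": "https://adls2dev.dfs.core.windows.net"
        },
    }
}

dev_values = env_specific_parameters(sample_parameters)

# Assumption: QA resources follow the DEV naming with "dev" swapped for "qa".
qa_overrides = {name: value.replace("dev", "qa") for name, value in dev_values.items()}

print(build_override_string(qa_overrides))
```

Secure string parameters, such as a connection string, may appear without a value in the parameters file, so they would not be caught by this scan and still need to be reviewed by hand.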

Azure PowerShell: Start Triggers

Finally, let's configure the Azure PowerShell script to start the Data Factory triggers in the QA environment.

Ensure that the script path is pointing to the cicd.ps1 file that we added to the GitHub Repo in the pre-requisite step.

Select Task Version 4* and select the latest installed version of Azure PowerShell.

ConfigureStartTriggerTask Configure the start triggers task

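Also, add the following post-deployment arguments to the script arguments of the task. As with the stop task, -ResourceGroupName and -DataFactoryName identify the factory the script acts on, so when releasing to QA, use your QA names in place of the DEV names shown here.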

Post deployment script

-armTemplate "$(System.DefaultWorkingDirectory)/_ADF-Demo-Project-CI1/dev/ARMTemplateForFactory.json" -ResourceGroupName rl-rg-001-dev  -DataFactoryName rl-adf-001-dev -predeployment $false -deleteDeployment $true
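The pre- and post-deployment argument strings differ only in their two boolean flags, which are mirror images of each other. The short Python sketch below makes that relationship explicit by generating both strings from one function. The function name is hypothetical, and the QA resource group and factory names are placeholders invented for this example.

```python
def cicd_arguments(arm_template: str, resource_group: str, factory: str,
                   predeployment: bool) -> str:
    """Build the script arguments for the cicd.ps1 stop or start task."""

    def ps_bool(flag: bool) -> str:
        # PowerShell booleans are written as $true and $false.
        return "$true" if flag else "$false"

    return (
        f'-armTemplate "{arm_template}" '
        f"-ResourceGroupName {resource_group} "
        f"-DataFactoryName {factory} "
        f"-predeployment {ps_bool(predeployment)} "
        # deleteDeployment is the opposite of predeployment in both tasks.
        f"-deleteDeployment {ps_bool(not predeployment)}"
    )


TEMPLATE = "$(System.DefaultWorkingDirectory)/_ADF-Demo-Project-CI1/dev/ARMTemplateForFactory.json"

# Placeholder QA names, not taken from the article.
pre_args = cicd_arguments(TEMPLATE, "rl-rg-001-qa", "rl-adf-001-qa", predeployment=True)
post_args = cicd_arguments(TEMPLATE, "rl-rg-001-qa", "rl-adf-001-qa", predeployment=False)

print(pre_args)
print(post_args)
```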

Run the Release Pipeline

Now that we have added and configured the CD release pipeline tasks, it's time to run the release pipeline.

RunRelease Run the release pipeline after the config is complete.

As expected, the release has succeeded.

StageSucceeded Confirmation that the pipeline succeeded.

Upon navigating to the logs, we can verify that all steps in the release pipeline have successfully completed.

ReleasePipelineLog Log of the successfully completed tasks

Verify the Deployed Data Factory Resources

As a final check to ensure that the QA Data Factory changes have deployed correctly, let's navigate to the QA Data Factory and check the Linked Services connections.

As expected, we can see the three linked services connections.

ADFQaLinkedServices List of ADF QA Linked Services created from the pipeline run.

Additionally, there is one trigger that was added and started. This is the trigger that will be stopped and re-started during the incremental CICD DevOps process.

QATrigger QA Trigger created and started in the QA ADF.

Upon drilling in further, we can see that the connection string for the Data Lake has been changed and the connection was successfully tested.

LS_ADLS ADLS linked service connections pointing to QA connections.

The Key Vault connection was also changed to QA and successfully tested.

LS_KeyVault KeyVault linked service connections pointing to QA connections.

Lastly, the SQL Server and Database connections were also changed to QA and successfully tested.

LS_SQLDB SQL Server and DB linked service connections pointing to QA connections.
About the author
MSSQLTips author Ron L'Esteve Ron L'Esteve is a seasoned Data Architect who holds an MBA and MSF. Ron has over 15 years of consulting experience with Microsoft Business Intelligence, data engineering, emerging cloud and big data technologies.
