Azure Data Catalog Register Data Source - Part 2
As a Business Intelligence (BI) developer you just deployed a Data Model on Azure Analysis Services along with sourcing your Tabular Data Model through Azure SQL Database and your organization has now asked you to help them with data discovery assuming there are several data models to be deployed soon which is going to make it a bit difficult to understand what data lies in which model. This is particularly important when your organization has heavily invested in a business intelligence solution and is also looking forward to speedy discovery of its enterprise data assets apart from BI powered analysis and reporting.
The solution is to register your Azure SQL database and Analysis Services databases (data models) with Azure Data Catalog to make data understandable and discoverable.
Part 1 Overview
In the first part of this tip we discussed in detail the basic concepts of enterprise data assets and why it is so important for the data assets to be discoverable, understandable and consumable.
We also mentioned that by data assets, we typically mean data sources which get registered by an enterprise in the form of an Azure Data Catalog data source registration to maintain the information about their location, usage and resource person.
Data Users (Producers and Consumers)
Those who are interested in discovering and understanding an organization's data assets are called data users.
We grouped data users into the following two types:
- Data Producers (those who register the data assets to make them available)
- Data Consumers (those who utilize the available data assets)
Tribal Knowledge vs. Azure Data Catalog
We compared tribal knowledge with Azure Data Catalog in the first part of the tip.
According to Microsoft documentation, getting tribal knowledge about data assets so that they can be discovered, understood and consumed is a very challenging process both from the data consumer and data produce perspective.
The following challenges are being faced by first time data consumers who are keen to find out the required data asset (data source) to meet some business specification:
- Existence of the Data Source
- Location of the Data Source
- Intended use of the Data Source
- Locating Documentation
- Locating Data Expert
- Process to Access Data Source
Data Assets Documentation Demand for Data Producers
One of the biggest challenges is to keep data source documentation in sync with data source usage which requires consistent reviews and often left out is the absence of a sophisticated documentation management system.
About Enterprise Data Assets
We further mentioned in the first part that the data assets (data sources) are not restricted to analysis services databases only, in fact they belong to any database system which is part of the organization.
The data assets registered with Azure Data Catalog can be from any database system including Line-of-Business data sources, OLTP (Online Transaction Processing) data sources, OLAP (Online Analytical Processing) data sources and Business Intelligence/Analytics data sources.
It was also mentioned that the data source can be annotated by tags, description, documentation, adding data controller and access information in the Azure Data Catalog.
Azure Data Catalog in Database Lifecycle Management (DLM)
Finally, it is highly advisable to add Data Assets to the Azure Data Catalog after the deployment stage in the Database Lifecycle Management (DLM), so that every time a database deployment takes place it also gets registered with the Azure Data Catalog to instantly become beneficial (discoverable, understandable and consumable) for both data consumers and data producers.
For further details please refer to my previous tip.
Registering Data Asset (Data Source) with Azure Data Catalog
Let's move to the standard steps of registering a data asset with Azure Data Catalog to get practical understanding of the concepts discussed in part 1. We are going to register Azure SQL Database first with Azure Data Catalog followed by registering Azure Analysis Services database (in the next tip).
This tip assumes that the readers have basic knowledge of working with Azure SQL database and are also familiar with basic data warehouse concepts including analysis services tabular projects.
This tip also assumes the following requirements are already met if you are interested in following the steps:
- You have an active Azure subscription to create Azure SQL Database(s) and Azure Analysis Services in-memory database (Tabular Data Model).
- A Resource Group has already been setup for Azure SQL Database and Azure Analysis Services server
- Azure SQL database OfficeSuppliesSampleV4 has been created.
- (Please refer to my previous tip Microsoft Azure Analysis Services Web Development Tools to follow the detailed steps to create the desired Azure SQL database.)
- OfficeSuppliesSampleV4 Tabular Data Model has also been deployed to Azure
- (Please refer to my previous tip Microsoft Azure Analysis Services Web Development Tools - Part 2 to quickly create Azure Analysis Services database.)
Azure Data Catalog Sign in
After deploying the model to Azure Analysis Services we can get more value out of it by registering it with Azure Data Catalog by making it discoverable and understandable. Please make sure that you either own or co-own the Azure subscription before you start registering the data asset with Azure Data Catalog.
If you are already signed in to your Azure Portal then you can launch Azure Data Catalog without signing in again.
Name, Subscription and Location Setup
Go to https://www.azuredatacatalog.com/ and after signing in do the following:
- Enter Data Catalog Name as OfficeSuppliesDataCatalog
- Please check your selected subscription level in the drop-down list
- Select the suitable catalog location, we have selected West Europe in this walkthrough
Choosing Pricing Model
Next choose the pricing model FEE EDITION and leave the rest of the options as they are:
Click the Create Catalog button at the bottom right of the web page to start creating a catalog:
Creating Manual Entry to register Azure SQL database
Please select Create Manual Entry when you are prompted to either Launch Application or Create Manual Entry:
Type in the name of the Data Asset (data source) to be registered as OfficeSuppliesSampleV4 and friendly name as Office Supplies Sample Azure Database followed by selecting Source Type: SQL Server, Object Type: Database and Authentication as Protocol:
Next type in the Azure SQL database and server name and add yourself (subscription holder) as an expert followed by adding a tag Sample Database then click Create and View Portal:
Viewing Registered Data Asset (Azure SQL Database)
Successful registration of a data asset with Azure Data Catalog takes you to the next screen where your Data Asset is ready to be discovered.
Next click Home on the top right corner to see the list of all the registered data sources with Azure Data Catalog:
Congratulations, you have successfully registered a Data Asset (Azure SQL database) with Azure Data Catalog which is now ready to be discovered and consumed.
The first part of the article was based on conceptual understanding of Azure Data Catalog and this part is focused on registering the Azure SQL Database with the Azure Data Catalog, so please look out for the next part of this tip series where we are going to register a Tabular Data Model with Azure Data Catalog and explore some useful features for quick data asset discovery.
Meanwhile please try the following:
- Please try creating the sample database on-premises as well and register it with Azure Data Catalog
- Please try adding more information about your registered data assets such as tags, description
- Please try adding connection string information to your registered data assets
- Please try using "Launch Application" option when interacting with Azure Data Catalog and the register any on-premises sample SQL Database (if you would like to install Azure Data Catalog on your machine)
- Please trying enriching your registered data assets with as much related information as possible and see the ease of exploring and discovering them through Azure Data Catalog
About the author
Haroon Ashraf's interests are Database-Centric Architectures and his expertise includes development, testing, implementation and migration along with Database Life Cycle Management (DLM).
View all my tips
Article Last Updated: 2019-03-06