Cleaning, Validating and Enhancing the SQL Server Data Warehouse Contact Dimension
Melissa sponsored on-demand webinar
Contact data is acquired from a variety of sources. You may collect contacts at trade shows or from your web site. You may purchase lists of contacts. In most cases, cleaning, validating and standardizing the contact data is a challenge.
In a SQL Server Data Warehouse, the Contact dimension includes existing customers as well as prospects. The goal of the Contact dimension is to have clean, valid, and up-to-date data which can be used to communicate with contacts via email, mail, and phone, as well as perform analysis on the contacts based on demographics. Unfortunately, the built-in SSIS components do not provide the kind of data cleansing, validating and enhanced demographic data that you need. SSIS does provide the ability for you to create script components using .NET code to do these kinds of tasks. The Melissa SSIS components are plug and play; you simply drag and drop the components onto the Data Flow, configure the component properties, and you are ready to go. There is no coding required. The Melissa SSIS components provide a wealth of capabilities including:
- Properly parsing names and addresses
- Validating email addresses and phone numbers
- Detecting duplicate contacts
- Delivering enhanced contact information including latitude/longitude coordinates, and demographic data, which is not natively available in SSIS.
Since the Melissa components are available in SSIS, their capabilities are available to everyone who uses SSIS as their ETL tool.
In this session you will learn how to:
- Parse contact names and addresses into their respective individual columns
- Validate email addresses and get details such as a delivery confidence score
- Validate phone numbers and get details such as carrier
- Provide a single contact view using advanced deduping/matching capabilities and survivorship rules
- Perform fuzzy matching using single or multiple columns and selecting from a toolbox of algorithms to identify hard-to-spot similar and duplicate contacts
In this webcast we will demonstrate an ETL implementation for a Contact dimension in a typical data warehouse using Microsoft SQL Server Integration Services (SSIS) and the Melissa SSIS components. Join us and learn how to resolve common data quality challenges for SQL Server data warehouses.
Speaker - Ray Barley
Raymond Barley is a Business Intelligence Architect who has 15 years of experience in the design, development and implementation of ETL frameworks, data warehouses and business intelligence applications all using the SQL Server stack. He is a frequent contributor to MSSQLTips.com.