Using Data Compression in Master Data Services 2016

By:   |   Comments   |   Related: > Master Data Services


Problem

In the new release of Master Data Services (MDS) in SQL Server 2016, a couple of new features have been added in order to improve performance. In this tip, we'll introduce the capability to enable data compression on an entity.

Solution

SQL Server 2016 Preview

At the time of writing, SQL Server 2016 is still in preview (currently Release Candidate 3 (RC3) has been released). This means functionality or features of Master Data Services might change, disappear or be added in the final release.

Data Compression

SQL Server has the ability to compress the data inside a table or an index. MDS uses this feature in order to reduce the size of the data on disk. There are two different kinds of data compression in SQL Server: row and page compression. A detailed treatment of the differences between the two compression methods is out of scope for this tip. For more information, please refer to:

Master Data Services uses row compression. As said before, it will reduce the number of bytes necessary to store the data - which reduces I/O when reading data - but it comes with a cost for CPU. Compressing and decompressing data is CPU intensive. In previous SQL Server editions, data compression is an Enterprise-only feature. However, Master Data Services is available in Enterprise and BI editions of SQL Server. It remains to be seen if you can use data compression in MDS when using the BI edition.

In this tip, we'll create two entities: one with data compression enabled, and one with no compression. This will allow us to investigate the effects of this new feature in MDS. When you create a new entity using the browser, data compression is enabled by default.

Create a new entity in Master Data Services

However, if you create an entity using the Excel add-in, data compression is currently not enabled by default. There is no option to enable it in the dialog.

Excel add-in to Create an Entity in Master Data Services does not support compression

You can still edit the entity and enable data compression after it was created.

Edit the entity and enable data compression after it was created

The following attributes were added to the compressed entity:

The following attributes were added to the compressed entity

Another entity with the same attributes, but with data compression disabled, is created as well. We will use the data from the Customers dimension of the AdventureWorks sample data warehouse to populate both entities. With the following query, we insert the data into the MDS staging tables of the entities.

INSERT INTO MDS.[stg].[Customers_NotCompressed_Leaf]
	([ImportType]
	,[ImportStatus_ID]
	,[BatchTag]
	,[First Name]
	,[Last Name]
	,[Email]
	,[Occupation])
SELECT
	 [ImportType] = 1 -- insert only
	,[ImportStatus_ID] = 0 -- ready for staging
	,[BatchTag] = 'Upload Customers with No Compression'
	,[FirstName]
	,[LastName]
	,[EmailAddress]
	,[EnglishOccupation]
 FROM [AdventureworksDW2016CTP3].[dbo].[DimCustomer];
 GO 5 -- insert 5 times

Insert into staging

The data from the Customer dimension is inserted 5 times in order to create a data set that's big enough to see the effects of the data compression. Once the data is inserted, the batches have to be run from Integration Management.

Batches run from Integration Management

On my machine, the batches takes about 9 to 13 seconds to insert 92,420 each (the batch size was set to 5,000 rows in the MDS Configuration Manager).

Batches took about 9 to 13 seconds

Let's take a look at the data size of the entities:

Data size of the entities in Master Data Services

The compressed entity is almost 30MB in size (data + indexes), while the not compressed entity is almost 42MB. This is a difference of about 16%, which means 16% less I/O when all of the data has to be read from disk.

16% difference in data and indexes between the compressed and not compressed entities in Master Data Services

The savings probably could have been much bigger if page compression was used instead of row compression. For the moment row compression is the only option in MDS.

Let's add an index to see the effect. For more information about custom indexes in MDS, check out the tip Add a Custom Index in Master Data Services 2016.

Add index in Master Data Services

When we take a look at the compression information of the table using dynamic management views, we can see that the new index is compressed as well.

Check data and index compression

Adding an index to the entity without data compression results in an index that's not compressed. The custom indexes inherit the compression settings of the entity.

Conclusion

Master Data Services 2016 uses row compression by default to minimize the data footprint of the entities on disk. This compression is also applied to the indexes. More significant space savings could have been obtained though if page compression is used. Be aware that data compression saves on I/O and disk storage, but has a an impact on CPU usage.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Koen Verbeeck Koen Verbeeck is a seasoned business intelligence consultant at AE. He has over a decade of experience with the Microsoft Data Platform in numerous industries. He holds several certifications and is a prolific writer contributing content about SSIS, ADF, SSAS, SSRS, MDS, Power BI, Snowflake and Azure services. He has spoken at PASS, SQLBits, dataMinds Connect and delivers webinars on MSSQLTips.com. Koen has been awarded the Microsoft MVP data platform award for many years.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article

















get free sql tips
agree to terms