By: Koen Verbeeck | Comments | Related: > Master Data Services
Problem
In the new release of Master Data Services (MDS) in SQL Server 2016, a couple of new features have been added in order to improve performance. In this tip, we'll introduce the capability to enable data compression on an entity.
Solution
SQL Server 2016 Preview
At the time of writing, SQL Server 2016 is still in preview (currently Release Candidate 3 (RC3) has been released). This means functionality or features of Master Data Services might change, disappear or be added in the final release.
Data Compression
SQL Server has the ability to compress the data inside a table or an index. MDS uses this feature in order to reduce the size of the data on disk. There are two different kinds of data compression in SQL Server: row and page compression. A detailed treatment of the differences between the two compression methods is out of scope for this tip. For more information, please refer to:
- Implementing Data Compression in SQL Server 2008
- SQL Server Data Compression Storage Savings for all Tables
- Demonstrating the Effects of Using Data Compression in SQL Server
Master Data Services uses row compression. As said before, it will reduce the number of bytes necessary to store the data - which reduces I/O when reading data - but it comes with a cost for CPU. Compressing and decompressing data is CPU intensive. In previous SQL Server editions, data compression is an Enterprise-only feature. However, Master Data Services is available in Enterprise and BI editions of SQL Server. It remains to be seen if you can use data compression in MDS when using the BI edition.
In this tip, we'll create two entities: one with data compression enabled, and one with no compression. This will allow us to investigate the effects of this new feature in MDS. When you create a new entity using the browser, data compression is enabled by default.
However, if you create an entity using the Excel add-in, data compression is currently not enabled by default. There is no option to enable it in the dialog.
You can still edit the entity and enable data compression after it was created.
The following attributes were added to the compressed entity:
Another entity with the same attributes, but with data compression disabled, is created as well. We will use the data from the Customers dimension of the AdventureWorks sample data warehouse to populate both entities. With the following query, we insert the data into the MDS staging tables of the entities.
INSERT INTO MDS.[stg].[Customers_NotCompressed_Leaf] ([ImportType] ,[ImportStatus_ID] ,[BatchTag] ,[First Name] ,[Last Name] ,[Email] ,[Occupation]) SELECT [ImportType] = 1 -- insert only ,[ImportStatus_ID] = 0 -- ready for staging ,[BatchTag] = 'Upload Customers with No Compression' ,[FirstName] ,[LastName] ,[EmailAddress] ,[EnglishOccupation] FROM [AdventureworksDW2016CTP3].[dbo].[DimCustomer]; GO 5 -- insert 5 times
The data from the Customer dimension is inserted 5 times in order to create a data set that's big enough to see the effects of the data compression. Once the data is inserted, the batches have to be run from Integration Management.
On my machine, the batches takes about 9 to 13 seconds to insert 92,420 each (the batch size was set to 5,000 rows in the MDS Configuration Manager).
Let's take a look at the data size of the entities:
The compressed entity is almost 30MB in size (data + indexes), while the not compressed entity is almost 42MB. This is a difference of about 16%, which means 16% less I/O when all of the data has to be read from disk.
The savings probably could have been much bigger if page compression was used instead of row compression. For the moment row compression is the only option in MDS.
Let's add an index to see the effect. For more information about custom indexes in MDS, check out the tip Add a Custom Index in Master Data Services 2016.
When we take a look at the compression information of the table using dynamic management views, we can see that the new index is compressed as well.
Adding an index to the entity without data compression results in an index that's not compressed. The custom indexes inherit the compression settings of the entity.
Conclusion
Master Data Services 2016 uses row compression by default to minimize the data footprint of the entities on disk. This compression is also applied to the indexes. More significant space savings could have been obtained though if page compression is used. Be aware that data compression saves on I/O and disk storage, but has a an impact on CPU usage.
Next Steps
- For more information on data compression, please refer to the following tips:
- Implementing Data Compression in SQL Server 2008
- SQL Server Data Compression Storage Savings for all Tables
- Demonstrating the Effects of Using Data Compression in SQL Server
- You can find more Master Data Services tips in this overview.
- For more SQL Server 2016 tips, you can use this overview.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips