Learn more about SQL Server tools



solving sql server problems for millions of dbas and developers since 2006 join MSSQLTips for free SQL Server tips


























































   Got a SQL tip?
            We want to know!

Implementing Data Compression in SQL Server 2008

MSSQLTips author Edwin Sarmiento By:   |   Read Comments (6)   |   Related Tips: 1 | 2 | 3 | 4 | More > Compression

Problem
As we are looking through the new features in SQL Server 2008 we found a potentially interesting one called Data Compression which we can use to reduce the size of tables, indexes or a subset of their partitions.  Can you give us a detailed explanation of how we go about using this one?

Solution
Data Compression in SQL Server has been made available in SQL Server 2005 with Service Pack 2 where a new storage format for storing decimal and numeric data was introduced. The vardecimal storage format allows decimal and numeric data types to be stored as a variable-length column. This concept has been extended in SQL Server 2008 to all fixed-legth data types such as integer, char, and float data types. Data compression reduces the storage costs and increases query performance by reducing I/O and increasing buffer-hit rates.

SQL Server 2008 supports both row and page compression for both tables and indexes. Here's how these two types of data compression differ:

  • ROW Compression. This compression feature takes into account the variable data type structures defining a column. Take, for instance, a CHAR(100) column stored in a variable length storage format will only use up the amount of storage defined by the data. Storing "SQL Server 2008 " in the column will only require storing fifteen characters instead of the full 100 characters, thereby, a 85% savings on storage space. This is the extension of the vardecimal storage format made available in SQL Server 2005 Service Pack 2. Note, also, that this compression feature does not take any disk space for zero or null values.
  • PAGE Compression. This compression feature is a superset of ROW compression and takes into account the redundant data in one or more rows on a given page. It also uses prefix and dictionary compression. What this simply means is that for both page compression techniques, the storage engine cuts down on repeated data in the page. For example, if a table is partitioned using a column prefix, all data in a specific partition will have the same or similar prefix. Let's say the value of columns start with something like A1000Q-xxxx like some product codes, the storage engine store the A1000Q- once on the page and then refer to this value from all other occurrences of this value on the same page. This can also be said of a column with a defined DEFAULT constraint. Page compression only occurs when the page is full to optimize the performance.

While it may appear that data compression would reduce the size of your tables or indexes, it is best to first evaluate the estimated space savings in a table or index by using either the sp_estimate_data_compression_savings system stored procedure or the Data Compression Wizard. You also might want to check if the existing data is fragmented as you might be able to reduce the size of the index by rebuilding it instead of using compression.

Estimating storage space savings

We can use the sp_estimate_data_compression_savings system stored procedure or the Data Compression Wizard to find out the estimated storage space savings for atable of index. To use the sp_estimate_data_compression_savings system stored procedure on the AdventureWorks database:

USE AdventureWorks
GO
EXEC sp_estimate_data_compression_savings 'Sales''SalesOrderDetail', NULL, NULL, 'ROW' ;
GO 

where the:

  • first parameter is the schema name,
  • the second parameter is the object name,
  • the third parameter is the index id,
  • the fourth parameter is the partition id
  • and the last parameter is the compression type.

In the example above, I will be looking at ROW compression on the Sales.SalesOrderDetail table in the AdventureWorks database. Running the query will give you the results shown below.

Note the columns size_with_current_compression_setting (KB) and size_with_requested_compression_setting (KB) which will give you an estimate on the space savings you will achieve in the Sales.SalesOrderDetail table, assuming there is no fragmentation.

To use the Data Compression Wizard, right-click on the Sales.SalesOrderDetail table, select Storage and click Manage Compression.

This will launch the Data Compression Wizard. On the Welcome to Data Compression Wizard, click Next.

On the Select Compression Type page, click the drop-down menu on the Compression Type column to select your compression of choice. I'll choose ROW since this is the option used using the sp_estimate_data_compression_savings system stored procedure. Click the Calculate button to display a similar information provided by the stored procedure.

These two methods will give you an estimate of the storage space savings you will get by compressing your tables or indexes.

Enabling Compression on a Table

To enable compression on an existing non-partitioned table, use the ALTER TABLE command with the REBUILD option:

ALTER TABLE Sales.SalesOrderDetail
REBUILD 
WITH (DATA_COMPRESSION ROW);

The syntax above enables ROW compression on Sales.SalesOrderDetail table. You can also continue with the Data Compression Wizard to enable compression on a table. After clicking the Calculate button on the Select Compression Type page, click Next. On the Select an Output Option page, you can chose to either create a script for review or saving to a file, run immediately or schedule as a job. As a DBA, you would have to do this as part of your regular database maintenance so you might as well create a script for it and later on include in a database maintenance job. Click on the Next button to continue.

Review the Data Compression Summary page before clicking Finish.

You may want to create a script that iterates thru all of your tables to generate a summary report using the sp_estimate_data_compression_savings system stored procedure which will help you in analyzing which tables or indexes you can compress.

Next Steps

As anything comes at a cost, data compression will definitely cost you CPU resources. You should evaluate and test whether the benefits outweigh the cost.

  • Download an evaluation copy of SQL Server 2008 from this site
  • You can get the AdventureWorks database used in the sample here (click on AdventureWorksDB.msi).
  • Try out the examples above on your sample test databases to have an idea on the benefits you will gain from data compression
  • Check out Chad Boyd's blog entry on Data Compression in SQL Server 2008.


Last Update: 9/12/2008


About the author
MSSQLTips author Edwin Sarmiento
Edwin Sarmiento works as a SQL Server DBA for The Pythian Group in Ottawa and is a SQL Server MVP.

View all my tips


print tip Print  
Become a paid author





join MSSQLTips for free SQL Server tips     



Learn more about SQL Server tools
Comments and Feedback:
Friday, September 12, 2008 - 3:24:41 AM - @tif Read The Tip

 

Nice feature edwin. Please also tell that what are impacts of data compression on mirroring and replication.

Regards


Friday, September 12, 2008 - 11:32:35 AM - bass_player Read The Tip

There is a table on Books Online describing how compression affects replication.  With mirroring and log shipping, you have the option to do log compression so that the log backups being copied over to the standby server are already compressed.  I'll write another tip on log compression as it is used in SQL Server 2008 in the future


Tuesday, September 20, 2011 - 9:33:21 PM - Bismi Read The Tip

Can you tell us about restoring a compressed database to another server? any issues or anything we need to check? or any demerits of comrpessing data other than CPU usage?


Tuesday, September 20, 2011 - 9:59:20 PM - Bismi Read The Tip

Hi Again, From the books online, its saying "Compression does not affect backup and restore". But again, am really want to know that page level compression wil not make any issues in restoring the database into a differnt server.

thank you for your help.


Wednesday, September 21, 2011 - 9:29:45 AM - Greg Robidoux Read The Tip

Since this is an SQL Server Enterprise edition feature if you restore the database it has to be restored to either an Enterprise or Developer edition.


Thursday, September 22, 2011 - 9:55:08 AM - bass_player Read The Tip

There is a DMV that's been added to SQL Server 2008 to check for enterprise-only features in a database - sys.dm_db_persisted_sku_features

Make sure you check your databases for any enterprise-only feature on a regular basis

 

http://msdn.microsoft.com/en-us/library/cc280724%28v=SQL.100%29.aspx



Post a Comment or Question

Keep it clean and stay on the subject or we may delete your comment.
Your email address is not published. Required fields are marked with an asterisk (*)

*Name   *Email Notify for updates



Comments
Get free SQL tips:

*Enter Code refresh code


 
Sponsor Information







Copyright (c) 2006-2014 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | free t-shirt | user groups | community | events | first timer?
Some names and products listed are the registered trademarks of their respective owners.