Implementing Data Compression in SQL Server 2008

By:   |   Comments (7)   |   Related: 1 | 2 | 3 | 4 | > Compression


Problem

As we are looking through the new features in SQL Server 2008 we found a potentially interesting one called Data Compression which we can use to reduce the size of tables, indexes or a subset of their partitions.  Can you give us a detailed explanation of how we go about using this one?

Solution

Data Compression in SQL Server has been made available in SQL Server 2005 with Service Pack 2 where a new storage format for storing decimal and numeric data was introduced. The vardecimal storage format allows decimal and numeric data types to be stored as a variable-length column. This concept has been extended in SQL Server 2008 to all fixed-legth data types such as integer, char, and float data types. Data compression reduces the storage costs and increases query performance by reducing I/O and increasing buffer-hit rates.

SQL Server 2008 supports both row and page compression for both tables and indexes. Here's how these two types of data compression differ:

  • ROW Compression. This compression feature takes into account the variable data type structures defining a column. Take, for instance, a CHAR(100) column stored in a variable length storage format will only use up the amount of storage defined by the data. Storing "SQL Server 2008 " in the column will only require storing fifteen characters instead of the full 100 characters, thereby, a 85% savings on storage space. This is the extension of the vardecimal storage format made available in SQL Server 2005 Service Pack 2. Note, also, that this compression feature does not take any disk space for zero or null values.
  • PAGE Compression. This compression feature is a superset of ROW compression and takes into account the redundant data in one or more rows on a given page. It also uses prefix and dictionary compression. What this simply means is that for both page compression techniques, the storage engine cuts down on repeated data in the page. For example, if a table is partitioned using a column prefix, all data in a specific partition will have the same or similar prefix. Let's say the value of columns start with something like A1000Q-xxxx like some product codes, the storage engine store the A1000Q- once on the page and then refer to this value from all other occurrences of this value on the same page. This can also be said of a column with a defined DEFAULT constraint. Page compression only occurs when the page is full to optimize the performance.

While it may appear that data compression would reduce the size of your tables or indexes, it is best to first evaluate the estimated space savings in a table or index by using either the sp_estimate_data_compression_savings system stored procedure or the Data Compression Wizard. You also might want to check if the existing data is fragmented as you might be able to reduce the size of the index by rebuilding it instead of using compression.

Estimating storage space savings

We can use the sp_estimate_data_compression_savings system stored procedure or the Data Compression Wizard to find out the estimated storage space savings for atable of index. To use the sp_estimate_data_compression_savings system stored procedure on the AdventureWorks database:

USE AdventureWorks
GO
EXEC sp_estimate_data_compression_savings 'Sales''SalesOrderDetail', NULL, NULL, 'ROW' ;
GO 

where the:

  • first parameter is the schema name,
  • the second parameter is the object name,
  • the third parameter is the index id,
  • the fourth parameter is the partition id
  • and the last parameter is the compression type.

In the example above, I will be looking at ROW compression on the Sales.SalesOrderDetail table in the AdventureWorks database. Running the query will give you the results shown below.

1

Note the columns size_with_current_compression_setting (KB) and size_with_requested_compression_setting (KB) which will give you an estimate on the space savings you will achieve in the Sales.SalesOrderDetail table, assuming there is no fragmentation.

To use the Data Compression Wizard, right-click on the Sales.SalesOrderDetail table, select Storage and click Manage Compression.

2

This will launch the Data Compression Wizard. On the Welcome to Data Compression Wizard, click Next.

3

On the Select Compression Type page, click the drop-down menu on the Compression Type column to select your compression of choice. I'll choose ROW since this is the option used using the sp_estimate_data_compression_savings system stored procedure. Click the Calculate button to display a similar information provided by the stored procedure.

4

These two methods will give you an estimate of the storage space savings you will get by compressing your tables or indexes.

Enabling Compression on a Table

To enable compression on an existing non-partitioned table, use the ALTER TABLE command with the REBUILD option:

ALTER TABLE Sales.SalesOrderDetail
REBUILD 
WITH (DATA_COMPRESSION ROW);

The syntax above enables ROW compression on Sales.SalesOrderDetail table. You can also continue with the Data Compression Wizard to enable compression on a table. After clicking the Calculate button on the Select Compression Type page, click Next. On the Select an Output Option page, you can chose to either create a script for review or saving to a file, run immediately or schedule as a job. As a DBA, you would have to do this as part of your regular database maintenance so you might as well create a script for it and later on include in a database maintenance job. Click on the Next button to continue.

5

Review the Data Compression Summary page before clicking Finish.

6

You may want to create a script that iterates thru all of your tables to generate a summary report using the sp_estimate_data_compression_savings system stored procedure which will help you in analyzing which tables or indexes you can compress.

Next Steps

As anything comes at a cost, data compression will definitely cost you CPU resources. You should evaluate and test whether the benefits outweigh the cost.

  • Download an evaluation copy of SQL Server 2008 from this site
  • You can get the AdventureWorks database used in the sample here (click on AdventureWorksDB.msi).
  • Try out the examples above on your sample test databases to have an idea on the benefits you will gain from data compression
  • Check out Chad Boyd's blog entry on Data Compression in SQL Server 2008.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Edwin Sarmiento Edwin M Sarmiento is a Microsoft SQL Server MVP and Microsoft Certified Master from Ottawa, Canada specializing in high availability, disaster recovery and system infrastructures.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Monday, April 27, 2015 - 5:19:23 AM - mehdoini abdallah Back To Top (37039)

hello

thank you for this article which appears to me very interssant
which inqiéte me on this before implementing this solution on production bases is the cost in terms of CPU
how can I calculate the increase CPU resource to implement this solution

Thursday, September 22, 2011 - 9:55:08 AM - bass_player Back To Top (14722)

There is a DMV that's been added to SQL Server 2008 to check for enterprise-only features in a database - sys.dm_db_persisted_sku_features

Make sure you check your databases for any enterprise-only feature on a regular basis

 

http://msdn.microsoft.com/en-us/library/cc280724%28v=SQL.100%29.aspx


Wednesday, September 21, 2011 - 9:29:45 AM - Greg Robidoux Back To Top (14713)

Since this is an SQL Server Enterprise edition feature if you restore the database it has to be restored to either an Enterprise or Developer edition.


Tuesday, September 20, 2011 - 9:59:20 PM - Bismi Back To Top (14708)

Hi Again, From the books online, its saying "Compression does not affect backup and restore". But again, am really want to know that page level compression wil not make any issues in restoring the database into a differnt server.

thank you for your help.


Tuesday, September 20, 2011 - 9:33:21 PM - Bismi Back To Top (14707)

Can you tell us about restoring a compressed database to another server? any issues or anything we need to check? or any demerits of comrpessing data other than CPU usage?


Friday, September 12, 2008 - 11:32:35 AM - bass_player Back To Top (1807)

There is a table on Books Online describing how compression affects replication.  With mirroring and log shipping, you have the option to do log compression so that the log backups being copied over to the standby server are already compressed.  I'll write another tip on log compression as it is used in SQL Server 2008 in the future


Friday, September 12, 2008 - 3:24:41 AM - @tif Back To Top (1805)

 

Nice feature edwin. Please also tell that what are impacts of data compression on mirroring and replication.

Regards















get free sql tips
agree to terms