join the MSSQLTips community

Today's Site Sponsor


 

Find performance issues related to Analysis Services memory limits.
 


SQL Server 2008 Sparse Columns Identifying Columns For Conversion
Written By: Ken Simmons -- 12/4/2008 -- 2 comments -- printer friendly -- become a member



The level of compression is amazing.

        Win SQL Books  -----  SharePoint Tips  -----  Live Webcast - SQL Backup Mistakes  -----  Bookmark and Share        

Problem
SQL Server 2008 has introduced a new way to store data for columns that contain excessive NULL values called Sparse Columns. What this means is that when you declare a column as Sparse and any time a NULL value is entered in the column it will not use any space.  Is there a way to identify what columns would make a good candidate for this without having to analyze each column individually?

Solution
With the introduction of the new Sparse Column feature in SQL Server 2008, it is now possible to declare a column as Sparse and any time a NULL value is entered in the column it will not use any space. One of the tricks here is to figure out when to determine when a column should be defined as Sparse or not.

In the following example Address Lines 1 through 3 are required, Address Lines 4 and 5 are not required but often used, and Address Lines 6 through 8 are rarely used at all. When creating the table for Address Lines 6 through 8 we are using the SPARSE option, since we know this data will be rarely stored.

CREATE TABLE CustomerInfo
(CustomerID INT PRIMARY KEY
Address_Line1 VARCHAR(100) NOT NULL, 
Address_Line2 VARCHAR(100) NOT NULL, 
Address_Line3 VARCHAR(100) NOT NULL, 
Address_Line4 VARCHAR(100) NULL, 
Address_Line5 VARCHAR(100) NULL, 
Address_Line6 VARCHAR(100SPARSE NULL, 
Address_Line7 VARCHAR(100SPARSE NULL, 
Address_Line8 VARCHAR(100SPARSE NULL, 

So why not just declare all columns as Sparse Columns?

Sparse Columns require four extra bytes of storage for each non NULL fixed-length data type value in the table and zero bytes to store a NULL value; therefore it is very important to have the correct threshold per data type or you will end up using more space instead of gaining it. The fewer bytes a data type uses, the higher the percentage of NULL values are required to save space.

There is a table on MSDN that gives recommended percentages for using Sparse Columns. Take a look at this to help identify when you will get the benefits of Sparse Columns.

Using this table as a guideline, the following script will identify any columns that may qualify for the new Sparse Columns feature. By searching for columns in the database that have NULL values over a certain threshold, you can easily analyze the results and determine if the new feature may be applicable. The thresholds for the fixed-length columns are stored in a temp table and the data types that are dependant on precision and length will default to sixty percent.

Here is a sample of the output when run against the AdventureWorks database.  The NullPercent column should be compared to the ValuePercent column to determine if there is an advantage of using Sparse Columns.  As you can see in row one all rows for Sales.SalesOrderHeader for column Comment are NULL therefore the NullPercent is 100% and the ValuePercent is 60%, so this is a great candidate for using Sparse Columns. Note: the query above limits the output to only show which columns would gain the benefit of using Sparse Columns.

In the code above I commented out two lines, shown below, that can be used to limit the search to one table at a time. Just uncomment these and change the Schema and Table values for the table you want to analyze.

--AND INFORMATION_SCHEMA.COLUMNS.TABLE_SCHEMA = 'Person'
--AND INFORMATION_SCHEMA.COLUMNS.TABLE_NAME = 'Contact'

If you try to run the script on SQL 2000 you will have to change Varchar(MAX) to Varchar(8000) and will most likely have to search on a table by table basis.

Next Steps

Readers Who Read This Tip Also Read Comment or Ask Questions About This Tip Twitter This Tip!


Sponsor Information
Free SQL Server performance monitoring dashboard – Idera SQL check

Try SQL Object Level Recovery Native from Red Gate to save time and disk space. Download a free trial.

SQL Server Issues? Not sure where to turn for answers? Innovative SQL DBA consultants

Changing careers? Becoming a SQL Server Professional? Look no further...

Webcast - Top 10 SQL Server Backup Mistakes and How to Avoid Them

Make the most of MSSQLTips...Sign-up for the newsletter

Do you work on SharePoint too? Check this out...

Free whitepaper - Ten Things DBAs Need to Know About Storage


 

 



DB Nitro - SQL Nitro

SQL Nitro sits between SQL Server & its clients, optimizing the normally inefficient TDS protocol. Optimize TDS & compress the data up to 80%, reduce SQL bandwidth by 50%, & improve response times over 65%!

Download now!

More SQL Server Tools
SQL safe backup

SQL Compare

SQL comparison toolset

SQL secure

SQL Prompt




Copyright (c) 2006-2010 Edgewood Solutions, LLC All rights reserved
privacy statement | disclaimer | copyright | advertise | write for mssqltips | feedback | about
Some names and products listed are the registered trademarks of their respective owners.