By: Ben Snaidero
Overview
In this section we will look at columnstore indexes and how these indexes can improve performance of your queries.
What is a columnstore index?
A column store index is an index that was designed mainly for improving the query performance for workloads with very large amounts of data (eg. data warehouse fact tables). This type of index stores the index data in a column based format rather than row based as is done with traditional indexes.
Why use a columnstore index?
There are two main benefits of column store indexes. First they reduced storage costs. Column store indexes provide a very high level of compression, up to 10x, due to the fact that the data across columns is usually very similar and will compress quite well. Second is better performance. This benefit is multi-faceted. With a smaller index footprint, due to the compression, we reduce the amount of IO we have to perform. Also because of this small footprint we can fit more of the index into memory which helps to speed up processing. Finally queries often only query a few columns from the table. Since the data is stored in a column based format this also reduces the amount of IO that needs to be performed.
How to create a columnstore index?
Creating a column store index is done by using the CREATE COLUMNSTORE INDEX command and has many of the same options as a regular index.
The below TSQL create a simple column store index with all the defaults.
CREATE COLUMNSTORE INDEX IX_SalesOrderDetail_ProductIDOrderQty_ColumnStore ON Sales.SalesOrderDetail (ProductId,OrderQty);
Confirm Index Usage
Once the index is created we can write a basic aggregation query similar to what you would see in any data warehouse environment. This TSQL will simply sum the order quantity across each product.
SELECT ProductID,SUM(OrderQty) FROM Sales.SalesOrderDetail GROUP BY ProductId;
If we look at the EXPLAIN plan we can see that the entire query was satified by scanning the column store index. No table access was even required.