SQL Server XML Indexes
By: Ben Snaidero
In this section we look at XML indexes in SQL Server, why they exist and how they can be used.
What is an XML index?
An XML index is an index type that is specifically built to handle indexing XML type columns. There are two different types of XML indexes, primary and secondary.
A primary XML index stores and indexes all the parts of data in your chosen XML column. A primary XML index requires that there be a clustered primary key index on your table so the XML index can correlate rows from the XML index with rows in the table that contain the XML column. Also, note that there can only be one primary XML index on a table. Since primary XML indexes index the entire column, which makes them quite large in most cases, you can also add secondary XML indexes which further enhances performance by creating another index on top of the primary XML index.
There are three types of secondary XML indexes that can be created: PATH, VALUE, and PROPERTY. A PATH index is one that is created on the node path and can be used to speed up queries that use path expressions, most commonly the exist() method on XML columns in the WHERE clause. VALUE indexes are made up of keys of the node value and node path and are most useful when you are searching for a specific value but don't know the path. Finally, there is the PROPERTY index which is built on the base tables primary key along with the node path and node value. This type is useful when you know the primary key value of the base table and you are using the value() method on the XML column.
Why use an XML index?
The main reason for creating indexes on XML columns is performance. Since XML columns are stored using the BLOB datatype and can sometimes be quite large, shredding the XML column at runtime can be quite time consuming. Adding both primary and secondary indexes (although they do carry some overhead when creating/maintaining) to your XML column can reduce this time considerably.
How to create an XML index?
Creating the primary XML index follows a similar syntax that you would use to create regular index, all that needs to be added is the "PRIMARY" clause to the CREATE XML INDEX command as shown in the below example.
--create primary XML index CREATE PRIMARY XML INDEX PXML_Person_Demographics ON Person.Person (Demographics);
Creating secondary XML indexes also follows this pattern but there are two clauses that also need to be added. First, we need to define which primary XML index this is being create on using the "USING XML INDEX [#NAME#]" clause. Second, we need to define which type of secondary index this is going to be using the "FOR #TYPE#" clause. The below TSQL shows an example of each type secondary index.
-- create secondary XML indexes CREATE XML INDEX XMLPATH_Person_Demographics ON Person.Person (Demographics) USING XML INDEX PXML_Person_Demographics FOR PATH; CREATE XML INDEX XMLPROPERTY_Person_Demographics ON Person.Person (Demographics) USING XML INDEX PXML_Person_Demographics FOR PROPERTY; CREATE XML INDEX XMLVALUE_Person_Demographics ON Person.Person (Demographics) USING XML INDEX PXML_Person_Demographics FOR VALUE;
Confirm Index Usage
The following simple query using XQuery will get a count of people whose occupation is "Professional". Here is the TSQL.
;WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/IndividualSurvey' AS ns) SELECT COUNT(1) FROM Person.Person WHERE Demographics.exist('(/ns:IndividualSurvey/ns:Occupation[.="Professional"])')=1;
If we check the EXPLAIN plan for this query we can see that it is indeed using the secondary index.