In this section we will take a look at the full text index feature of SQL Server and how to create and use these indexes.
A full-text index is a special type of index that provides index access for full-text queries against character or binary column data. A full-text index breaks the column into tokens and these tokens make up the index data. Before you can create a full-text index you must create a FULL TEXT CATALOG and this catalog is where the full-text index data is stored. A full-text catalog can contain many indexes but a full-text index can only be part of one catalog. Also, only one full-text index can be created on each table so if you have more than one column you need to be indexed the columns have to be moved to separate tables. It is also important to note that full-text indexes are not updated right away as is the case with regular indexes. Populating full-text indexes can be resource intensive so there are more options that let you control when they are updated.
More details on each of these options can be found here.
While regular clustered and non-clustered indexes give us the ability to index most column datatypes they unfortunately are not supported for any of the large object (LOB) datatypes. Full-text indexes can however can be created on LOB datatype columns like TEXT, VARCHAR(MAX), IMAGE, VARBINARY(MAX) (it can also index CHAR and VARCHAR column types). Without this functionality any query that referenced a column defined with a LOB datatype would require a full scan.
Creating a full-text index does require some extra setup. As we mentioned in the opening full-text indexes require a full-text catalog for their storage. It's important to note that this catalog name must be unique across all databases on the server. An example of the TSQL that can be used to create a catalog is below.
CREATE FULLTEXT CATALOG fulltextCatalog AS DEFAULT;
Once the catalog is created we can create the full-text index. There are two extra options to take note of below. The first option is the "KEY INDEX". One requirement for creating full-text indexes is that the table has a unique key defined on it. This option associates the unique key with the full-text index. In order to get the best performance this unique key column should be an integer, usually it's the primary key. The second option, "STOPLIST", associates a stop list with the index. Any tokens that are part of the stop list are not populated in the index. Here is the TSQL to create the index.
DROP FULLTEXT INDEX ON Production.Document; CREATE FULLTEXT INDEX ON Production.Document(DocumentSummary) KEY INDEX PK_Document_DocumentNode WITH STOPLIST = SYSTEM;
This simple query listed below will check the document table for any summaries that contain the word "important" which should use the full-text index we just created.
SELECT * FROM Production.Document WHERE CONTAINS(DocumentSummary, 'important');
Looking at the EXPLAIN plan we can confirm it is performing a full-text search using the index and not scanning the entire table.