Create Index for Full Text Search using Azure Cognitive Service and Azure SQL Database
Microsoft Azure Cognitive Services provides a full-text search engine, persistent storage of search indexes, integrated AI used during indexing to extract more text and structure, and APIs and tools. How do you create indexes using the sample dataset in Azure Cognitive Service? In this tip we explore how to create a search index for the Azure Cognitive Service using the Azure portal.
The earlier tip, An Overview of Azure Cognitive Search Service, gave an overview of Azure Cognitive Search Service. It further deployed cognitive service and imported data from the existing Azure SQL Database to demonstrate the search functionality.
Azure Cognitive Search provides full-text search functionality for heterogeneous data. Its key strengths are below:
- Azure Data Integration
- It can integrate with artificial intelligence (AI) for converting unsearchable content into text searchable.
- It adds AI enrichment and knowledge mining functionality.
- It supports Linguistics and custom analyzers with the support of more than 56 languages.
- It has valuable features such as auto-complete, synonym, auto-correction, rich query language, relevance tuning, geo-search, and result composition.
- It offers Azure scalability, reliability, and world-class availability.
Before you get started, make sure to have the following items ready:
- Azure subscription. You can also use Azure free credits - Create an account for free.
- Deploy an Azure Cognitive Search service using the previous tip.
Azure Search Configurations
Once we deploy the Azure Cognitive Search service, it can import data for building indexes.
Click on Import Data and you can choose existing data sources from the drop-down list.
For this tip, we will use Azure SQL Database as a data source. I have already deployed Azure SQL Database, as shown below.
You can follow these existing tips for deploying Azure SQL DB. For the deployment, you should enable the following option – Allow Azure Services and resources to access this server.
Also, I used the exiting [AdventureWorksLT] as a sample database.
Now, go back to Azure Cognitive Service Import data wizard and choose Azure SQL Database. Click on the option – Choose an existing connection and select the database we want to use for data import and indexing. It automatically populates the connection string for your database.
The connection string does not specify the password for the SQL user. Therefore, enter the password and click on Test Connection. Once the connection is validated, it shows status – Connection Validated.
After a successful test connection is made, the next step is to provide a table or view. Your user will use this table to search for required data. I am using [SalesLT].[Customer] table in this example.
Click on Next: Add cognitive skills. It is an optional step and required if you need optical character recognition (OCR) of text in image files and text analysis for unstructured data.
You can select which AI enrichment to be included in the skill set. I am not selecting anything here, just keeping the default settings and clicking on the skip button.
Customize Target Index
In the Customize Target Index step of the Wizard, you can now decide to review or change the auto-generated index schema. The Wizard automatically identifies the data fields and their properties based on analyzing your data sampling and deriving information from its metadata.
You can also modify the key field for a document. The key field is used to identify each document in your index uniquely.
Although there are quite a few options to choose from when it comes to stemmers, Azure Cognitive Search uses an Apache Lucene analyzer named Standard (standard Lucene). The analyzed text is converted into lower case characters. Indexed documents, search terms, and analyzed text are all part of query processing.
You can refer to this resource for different types of Analyzers.
Create an Indexer
An indexer is a crawling robot that extracts searchable data and meta information from an external Azure data source.
When this last step is reached, you can specify the indexer's name and its frequency. The indexer's name can be chosen manually, but you can let the tool give the chosen name automatically if you prefer. The chosen frequency is usually set up according to how often you think of new posts or updates. Therefore, you need to make sure to check back here frequently after inserting or updating data to control whether this one has been indexed already (or not).
If you want to clear a record from your index if it is deleted from the database, you can turn on the "Track Deletions" checkbox. Once enabled, you will have to specify additional input – the column which identifies a record as deleted (i.e., the column used for soft deletes).
Once the index deployment is successful, you can click on Indexer to view the documents it scanned. Currently, it shows 0 documents in the following screenshots.
Refresh the console, and it gives information about the number of documents scanned and the number of errors\warnings successfully.
Once the index is available, you can use Postman to perform HTTP requests or write some .Net code that allows you to connect with the Azure Search service to retrieve search results.
On the Azure portal, you can use Search Explorer for searching data using the Azure Cognitive service. Azure Search Explorer is a tool that allows users to learn the syntax of Azure Search, how queries work, and use an explorer to identify when their search indexes need updating. As a Cognitive Search user, you can tell whether any new content has been indexed since your last refresh by using the explorer in conjunction with query previews.
Click on the Search Explorer, and it would open a search window, where you can choose the index to use for your search query. For instance, if you want to search on a book repository, select the relevant index and enter the text, you require to search, and it shows you relevant results.
In this tip, we use a basic query. You can refer to this resource for understanding query strings.
For example, in the query string, I specified "Robert" and clicked on search. It returns all data rows matching the specified query string.
This tip was designed to teach you how to create and modify a search index using Azure Cognitive Services. You learned how to create an index using the Import Data wizard and the Search Explorer in the Azure portal to query and filter a search index.
- Read more about Azure Cognitive Search in Microsoft documentation
- Explore SQL Server Azure Tips for existing tips on Azure.
About the author
View all my tips
Article Last Updated: 2021-12-30