An Overview of Azure Cognitive Search Service

By: Rajendra Gupta   |   Updated: 2021-12-28


Problem

Search engines make our lives easier. Google has proven to be a trustworthy option for people searching online because it takes them directly to the results they are looking for and typically returns them fast. We can search for anything we want, whenever we want, and trust that the best results will appear on the screen.

Imagine that you need to implement a mini search engine, similar to Google or Bing, for your own data. It would give users an interface for finding the essential information they need without fussing with settings and complicated input fields. That sounds fine, but imagine implementing such a search function from scratch. It would be a very complex and time-consuming task.

This tutorial will explore how Azure Cognitive Search can help you implement a search service for heterogeneous data.

Solution

Microsoft Azure Cognitive Search (formerly Azure Search) is a cloud search service delivered in the search-as-a-service model. It provides a rich search experience over your content (on-premises or in the cloud) for your applications. Azure Cognitive Search exposes its search functionality through REST APIs or the .NET SDK and hides the internal implementation details from developers.

The following diagram shows that Azure Cognitive Search sits between your content (un-indexed data) and the client application. The client application sends the search request to the search service and handles the response.

Azure Cognitive Search

The Azure Cognitive Search Service has the following parts.

  • Data Source: The data source can be Azure SQL Database, SQL Managed Instance, SQL Server on an Azure VM, Azure Cosmos DB, Azure Blob Storage, Azure Table Storage, SharePoint Online (preview), Azure Data Lake Storage Gen2, or any dataset composed of JSON documents.
  • Index: Azure Cognitive Search creates the index on the specified data. The index is a persistent store of documents that are used for filtered and full-text search. Internally, Azure processes the data into tokens and stores them in inverted indexes for faster scanning. The automated crawler process (Indexer) runs at predefined intervals and detects changes using SQL integrated change tracking or high water mark change detection for Azure SQL Database.
  • Querying: Once the index is populated, client applications can send query requests to the search service and receive responses. Search features include autocomplete, synonym matching, fuzzy matching, filtering, sorting, spell correction, and pattern matching. With AI enrichment, content extracted through Optical Character Recognition (OCR) and visual features such as facial detection, image interpretation, and image recognition can also be searched.
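
To make the idea of tokens and inverted indexes concrete, here is a minimal Python sketch. It is a toy illustration only, not how Azure implements its indexes: the tokenizer, the function names, and the two sample documents are all assumptions made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each token to the set of document keys that contain it."""
    index = defaultdict(set)
    for key, text in docs.items():
        for token in tokenize(text):
            index[token].add(key)
    return index


# Hypothetical documents, keyed by a string document key.
docs = {
    "1": "Free wifi and coffee in the lobby",
    "2": "Beach view rooms with free wifi",
}

index = build_inverted_index(docs)
print(sorted(index["wifi"]))    # keys of the documents containing "wifi"
print(sorted(index["coffee"]))  # keys of the documents containing "coffee"
```

A lookup in this structure goes straight from a term to the matching documents, which is why a search service does not need to scan every document for every query.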

Let us go ahead and implement Azure Cognitive Search for the data stored in Azure SQL Database.

Create an Azure Cognitive Search Service in the Portal

To create the Azure Cognitive Search Service, navigate to the Azure portal and search for the keyword - Cognitive search.

The Create Search Service requires the following inputs.

  • Subscription and resource group
  • Service Name: You need to provide a service name in the instance details section. The service name is used for all API calls in the following format – https://<ServiceName>.search.windows.net. The service name should be unique in the search.windows.net namespace.
  • Location: Azure Cognitive Search is available in most Azure regions. Check the Products available by region page to confirm that your preferred region meets your AI enrichment, business continuity, and disaster recovery requirements.
  • Pricing tier: Azure Cognitive Search offers Free, Basic, Standard, and Storage Optimized pricing tiers, each with different capabilities and limits. By default, it uses the Standard service tier. You can click on Change service tier and choose the required pricing tier by considering the indexes, indexers, storage, search units, partitions, and estimated search unit cost per month.
Azure Cognitive Search select price tier
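
Because the service name becomes part of every API URL, it can help to derive the endpoint in one place. The following Python sketch builds the URL in the format described above. The naming check is a simplified assumption for illustration (lowercase letters, digits, and dashes); the portal enforces the authoritative rules. The service name `my-search-demo` is hypothetical.

```python
import re

# Simplified naming check (an assumption for illustration only):
# lowercase letters, digits, and dashes, not starting or ending with a dash.
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9-]*[a-z0-9]")


def service_endpoint(service_name):
    """Return the base URL used for all API calls to the search service."""
    if not _NAME_PATTERN.fullmatch(service_name):
        raise ValueError(f"unexpected service name: {service_name!r}")
    return f"https://{service_name}.search.windows.net"


# "my-search-demo" is a hypothetical service name.
print(service_endpoint("my-search-demo"))
```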

For this tip, we use the free pricing tier as shown below.

Azure Cognitive Search instance details

Click on Review + Create to validate the settings and deploy the Azure Cognitive Search service.

Azure Cognitive Search validation

The following page shows the Azure Cognitive Search dashboard.

Azure Cognitive Search dashboard

Use Azure Portal for Creating an Azure Cognitive Search Index

To start building an index in Azure Cognitive Search, open the Azure portal and navigate to the search service dashboard page.

Azure Cognitive Search build full text search

Click Import data on the "Connect your data" bar to create and populate a search index. The Import data page requires a connection to an existing data source such as Azure SQL Database, Azure Cosmos DB, Azure Storage, or SharePoint.

Azure Cognitive Search import data

To help you learn the indexing concepts, Azure Cognitive Search provides a few sample data sets as well. Click Samples and choose the required data set. The data set type shows that samples are available for Azure SQL Database and Azure Cosmos DB.

  • realestate-us-sample
  • hotels-sample
Azure Cognitive Search connect to data

For this tip, let's select the hotels-sample data source and continue to the next page.

For the built-in sample, a default index schema is already defined. You can run queries against the target hotels-sample index to return search data.

The Import data wizard simplifies the importing process by condensing the steps into a basic import configuration. At a minimum, you'll need to specify a name and a fields collection; one field should be marked as the document key to identify each document uniquely. However, you can specify additional details (such as language analyzers or suggesters) if you want autocomplete functionality or suggested queries.

As shown below, the index uses the HotelID column as an index key.

Azure Cognitive Search build index

Each column has the following attributes, shown as checkboxes.

  • Retrievable: Determines whether the column appears in the search results. For example, to limit the columns returned in the results, clear the checkbox for the columns you want to exclude.
  • Key: The unique document identifier. Every index must have a key column, and it must be a string.
  • Filterable, Sortable, and Facetable: These attributes determine whether the column is used for filtering, sorting, or faceted navigation structure.
  • Searchable: Determines whether the column is included in full-text search. Usually, string columns are searchable, while numeric and Boolean columns are not.

By default, the Azure Cognitive Search Service sets the attributes as below.

  • String columns: Retrievable, Searchable
  • Numeric columns (integers): Retrievable, Filterable, Sortable, and Facetable.

You can change the column attributes as required. Let's go with the default attributes in the sample data set and move to the next page: Create an indexer.

Enter a suitable name for the indexer and define the schedule: once, hourly, daily, or custom. However, you cannot change the schedule for the sample data sets, or for existing data sources that do not track changes. The description field is optional.

Azure Cognitive Search create indexer
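
Outside the wizard, an indexer is described by a small JSON document. The Python sketch below assembles such a definition as a dictionary. The property names follow the REST API's indexer definition; the mapping from the wizard's "hourly" and "daily" choices to ISO 8601 durations, and the indexer and data source names, are assumptions for illustration.

```python
import json

# Assumed mapping from the wizard's schedule choices to ISO 8601 durations.
SCHEDULE_INTERVALS = {"hourly": "PT1H", "daily": "P1D"}


def indexer_definition(name, data_source, target_index, schedule=None):
    """Build an indexer definition; schedule=None means 'run once'."""
    body = {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": target_index,
    }
    if schedule is not None:
        body["schedule"] = {"interval": SCHEDULE_INTERVALS[schedule]}
    return body


# Hypothetical names, loosely modeled on the hotels sample.
definition = indexer_definition(
    "hotels-indexer", "hotels-sample", "hotels-sample-index", schedule="hourly"
)
print(json.dumps(definition, indent=2))
```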

Click Submit to configure and simultaneously run the indexer.

The wizard takes you to the indexer list, where you can review the indexers, the number of documents processed, and the status. You can also reach this list from the overview page by clicking the Indexers tab.

Azure Cognitive Search indexers

It might take a few minutes for the portal to update. Keep refreshing until the newly created indexer appears in the list with a status of "in progress" or "success", along with the number of documents indexed.

The service overview page provides a list of links. Click Indexes to see the index you created. The list shows the document count and storage size for each index.

Azure Cognitive Search indexes

Click on the index name and verify the fields with their attributes. Specific fields are greyed out, which means they cannot be modified or deleted.

Azure Cognitive Search index

You can also retrieve the index definition in JSON format with the Index Definition (JSON) option.

Azure Cognitive Search index
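
For orientation, a trimmed-down index definition might look like the Python dictionary below. This is a sketch, not the actual definition of the sample index: the index name, the three fields, and their attribute values are assumptions chosen to mirror the attributes discussed earlier. The helper checks the rule that the key must be a string field.

```python
# A trimmed, hypothetical index definition in the shape of the JSON view.
index_definition = {
    "name": "hotels-sample-index",
    "fields": [
        {"name": "HotelID", "type": "Edm.String", "key": True,
         "retrievable": True, "searchable": False},
        {"name": "Description", "type": "Edm.String",
         "retrievable": True, "searchable": True},
        {"name": "Rating", "type": "Edm.Double", "retrievable": True,
         "filterable": True, "sortable": True, "facetable": True},
    ],
}


def key_field(definition):
    """Return the name of the key field, checking the rules for keys."""
    keys = [f for f in definition["fields"] if f.get("key")]
    if len(keys) != 1:
        raise ValueError("an index needs exactly one key field")
    if keys[0]["type"] != "Edm.String":
        raise ValueError("the key field must be a string")
    return keys[0]["name"]


def fields_with(definition, attribute):
    """List the names of the fields that have the given attribute enabled."""
    return [f["name"] for f in definition["fields"] if f.get(attribute)]


print(key_field(index_definition))
print(fields_with(index_definition, "searchable"))
print(fields_with(index_definition, "facetable"))
```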

Query Using Search Explorer

Search explorer sends REST API requests to the service, and it supports both the simple query syntax and the full Lucene query syntax. You can launch search explorer in the following ways.

  • Launch search explorer from the Azure Cognitive Search service home page.
Azure Cognitive Search search explorer
  • Use the search explorer from the Index menu.
Azure Cognitive Search search explorer

Specify the query string and click Search. The results are returned as verbose JSON documents. You can enter keywords, as you would in Google or Bing, or a fully specified query expression. Let's explore a few sample queries in the search explorer.
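
The query strings typed into search explorer map onto the query parameters of a REST GET request. The Python sketch below only builds the request URL and headers; it does not send anything. The service name, index name, and API version are assumptions, and the key placeholder must be replaced with a real query key before use.

```python
from urllib.parse import quote

API_VERSION = "2020-06-30"  # assumed API version; adjust to your service


def search_url(service_name, index_name, explorer_query):
    """Turn a search explorer query string into a full REST GET URL."""
    base = f"https://{service_name}.search.windows.net/indexes/{index_name}/docs"
    # Keep the parameter separators readable; encode spaces and the rest.
    encoded = quote(explorer_query, safe="=&$")
    return f"{base}?api-version={API_VERSION}&{encoded}"


# Hypothetical service and index names.
url = search_url("my-search-demo", "hotels-sample-index", "search=coffee")
headers = {"api-key": "<your-query-key>"}  # placeholder, not a real key
print(url)
```

Any HTTP client can then issue the GET request with these headers and read the JSON response.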

Example Queries

String Query

The search parameter takes a keyword for full-text search. The following query returns the documents from the sample data set that contain "coffee" in any of their searchable fields.

Query: search=coffee

The query returns the matching documents (records), including every field marked as "retrievable" in the index.

Azure Cognitive Search sample query

Parameterized Query

The parameterized query returns the search result as per the specified conditions. To specify the parameters, use the following.

  • Use the & symbol to append search parameters. You can specify search parameters in any order in the query.
  • The query below uses $count=true to return the total number of documents that match the search. The result (value) appears at the top of the search results.
  • The $top parameter limits the number of documents returned, in rank order. For example, $top=1 in my sample query returns only the highest-ranked document.
Query: search=wifi&$count=true&$top=1

As shown below, the query matched 19 documents, and the highest-ranked document has a search score of 4.961525.

Azure Cognitive Search sample query
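
The count and the score can be read from the JSON response programmatically. The Python sketch below parses a shortened, hand-written response: the count and score are the values reported above, while the hotel name is a made-up placeholder and the real documents carry many more fields.

```python
import json

# Shortened, hand-written response; "Example Hotel" is a placeholder.
response_text = """
{
  "@odata.count": 19,
  "value": [
    {"@search.score": 4.961525, "HotelName": "Example Hotel"}
  ]
}
"""

result = json.loads(response_text)
total_matches = result["@odata.count"]   # present because $count=true
top_hit = result["value"][0]             # only one document because $top=1

print(f"{total_matches} matching documents")
print(f"top score: {top_hit['@search.score']}")
```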

Filter Data

The $filter parameter is used to specify the criteria for returning the results. For example, suppose we want to retrieve hotels whose rating is less than 4.

Query: search=wifi&$count=true&$filter=Rating lt 4

In the result set, we can verify that the search result includes documents satisfying the filter condition.

Azure Cognitive Search sample query
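
Filters use OData comparison operators such as lt, le, gt, ge, eq, and ne. As a sketch, the same query can also be expressed as a JSON request body, as used when posting a search request to the REST API. The helper below is hypothetical and handles numeric values only; string values would need to be quoted in the filter expression.

```python
import json

COMPARISON_OPERATORS = {"eq", "ne", "gt", "ge", "lt", "le"}


def numeric_comparison(field, operator, value):
    """Build a simple OData comparison for a numeric field."""
    if operator not in COMPARISON_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{field} {operator} {value}"


# Equivalent of: search=wifi&$count=true&$filter=Rating lt 4
body = {
    "search": "wifi",
    "count": True,
    "filter": numeric_comparison("Rating", "lt", 4),
}
print(json.dumps(body))
```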

Facet the Query

The facet parameter returns an aggregated count of documents for each facet value, which gives you a navigation structure made up of categories and counts.

Query: search=wifi&facet=Rating

The query returns facets for Rating based on the text search for wifi. To use a field as a facet, it must be marked as facetable in the index; to be included in the results, it also needs to be retrievable.

As shown below, the response groups the documents by rating and returns the count for each rating value at the top of the search result.

Azure Cognitive Search sample query
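
Conceptually, faceting is a group-and-count over the matching documents. The Python sketch below emulates that locally on a few made-up hits to show the shape of the result; the service does this aggregation itself, and the hotel names and ratings here are hypothetical.

```python
from collections import Counter

# Hypothetical documents that matched the search term.
hits = [
    {"HotelName": "Hotel A", "Rating": 4},
    {"HotelName": "Hotel B", "Rating": 3},
    {"HotelName": "Hotel C", "Rating": 4},
    {"HotelName": "Hotel D", "Rating": 5},
]

# Count how many matching documents fall under each rating value.
counts = Counter(doc["Rating"] for doc in hits)
facets = {
    "Rating": [
        {"value": value, "count": count}
        for value, count in sorted(counts.items())
    ]
}
print(facets)
```

An application would typically render each value and count pair as a clickable category, then apply a filter on the chosen value.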

Highlight Search Results

The search query returns all columns specified as retrievable in the index configuration. When the results contain many columns, it can be hard to tell where the search term matched. Therefore, you can use the highlight parameter to mark the matching text in a given field. The query output highlights the matching terms in that field to make them easier to spot.

Query: search=beaches&highlight=Description
Azure Cognitive Search sample query
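
In the JSON response, highlighted fragments are returned per document under @search.highlights, with the matching terms wrapped in em tags by default. The Python sketch below pulls the highlighted terms out of one hand-written hit; the description text is a made-up placeholder.

```python
import re

# Hand-written hit; the description fragment is a placeholder.
hit = {
    "@search.score": 1.0,
    "@search.highlights": {
        "Description": ["Steps from the <em>beaches</em> and the pier."]
    },
}


def highlighted_terms(document, field):
    """Return the terms wrapped in <em> tags for the given field."""
    fragments = document.get("@search.highlights", {}).get(field, [])
    return [
        term
        for fragment in fragments
        for term in re.findall(r"<em>(.*?)</em>", fragment)
    ]


print(highlighted_terms(hit, "Description"))
```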

Wrapping Up

In this first tip on Azure Cognitive Search, we created a search service, built an index from a sample data set, and explored several query types in search explorer. In the next tutorial, we will cover more details on getting results from Azure Cognitive Search. Stay tuned.

Next Steps
  • Refer to these tips related to Azure.
  • Read Microsoft documentation on the Azure Cognitive Search service.








About the author
Rajendra Gupta is a Consultant DBA with 14+ years of extensive experience in database administration including large critical OLAP, OLTP, Reporting and SharePoint databases.
