Real-Time Anomaly Detection Using Azure Stream Analytics


By:   |   Updated: 2020-01-14   |   Comments   |   Related: More > Azure

Problem

The need to process Big Data Real-Time streams is becoming an increasing need for many businesses. Customers in numerous industries are seeking to harness the power of real-time big data to unlock valuable insights. They are seeking an easy to use, flexible, reliable, and scalable solution to process and transform streams of real-time data for their IoT projects. Additionally, along with these big real time data streams come anomalies in the data. Customers in a variety of industries are interested in the concept of real time anomaly detection using machine learning algorithms and operators. How can we create and visualize a real time streaming anomaly detection service?

Solution

Azure Stream Analytics is an event-processing engine which allows examining high volumes of data streaming from devices, sensors, web sites, social media feeds, applications etc. It is easy to use and based on simple SQL query language. Additionally, it is a fully managed (Paas) offering on Azure that can run large-scale analytics jobs that are optimized for cost since users only pay for streaming units that are consumed.

Azure Stream Analytics now offers built in machine learning based anomaly detection capabilities to monitor temporary and persistent anomalies. This anomaly detection capability coupled with Power BI's real time streaming service makes for a powerful real-time anomaly detection service. In this article, I will demonstrate a practical example of how to create real time anomaly detection using Azure Stream Analytics for processing the stream and Power BI for visualizing the data.

Prerequisites

To begin implementing this solution, we will need a few resources to be created and running.

  1. Stream Analytics: Check out "What is Azure Stream Analytics?" for more detail.
  2. IoT Hub: Check out "What is Azure IoT Hub?" for more detail.
  3. Power BI Premium Service: This article, titled "What is Power BI Premium?" introduces key features in Power BI Premium.
  4. Device Details Simulator: The DeviceSimulator is used to simulate anomalies being sent from a device to IoT Hub. The schema uses temperature and a sensor ID. These events can then be consumed by an Azure Stream Analytics job configured to read from this IoT Hub. (Source)

Create an Azure Stream Analytics Job

Let's begin by creating a new Stream Analytics Job in the Azure Portal.

Create Stream analytics Step to create Stream Analytics

I will name my job and click create.

Stream Analytics Config Steps to configure stream analytics

Create an IoT Hub

While my Stream Analytics Job is spinning up, I will create an IoT Hub as well.

IoT Hub is a managed service that is hosted in the cloud and acts as a central message hub for bi-directional communication between IoT applications and the devices it manages. An IoT Hub is essentially an Event Hub with additional features including per-device identity, cloud to device messaging and several additional features that can be reviewed and considered in Microsoft's article Connecting IoT Devices to Azure: IoT Hub and Event Hubs.

Create IoT Hub Steps to create IoT Hub

In the Basics Tab, give your IoT Hub a Name and ensure that the Subscription, Resource Group and Region are configured correctly.

CongifIotHub Step 2 to create an IoT Hub

There is also an option to Size and Scale the IoT Hub as needed:

IotHub3 Step 3 to create an Iot Hub

After selecting the tier, Review the selections and then click Create to deploy the IoT Resource:

IotHub4 Step 4 to create an IoT Hub

Once the IoT resource is deployed, navigate to the IoT Hub and then click New from under the IoT Hub navigation menu:

IotHub4 Step 4 to create an IoT Hub & Add device

Then add a Device ID and click Save.

Create a Device Steps to create a device

Once the device is added, Click the device to open the device details:

DeviceID Verify that the device is created.

Next, copy the connection string (primary key) which will be used as the connection for the IoT Device.

Device Details Device Details are available here.

Download the Device Simulator

The DeviceSimulator is used to simulate anomalies being sent from a device to IoT Hub. The schema uses temperature and a sensor ID. These events can then be consumed by an Azure Stream Analytics job configured to read from this IoT Hub.

For more information on the DeviceSimulator and to download, visit Device Simulator on Github.

Once the DeviceSimulator is downloaded, open the visual studio solution file and run the simulator.

VS Device Simulator Run the device simulator.

Note the various setting available in the device simulator.

Review the readme.md GitHub file to understand the various configuration and settings available.

For example, Mock Mode is a method of experimenting with the simulator and various anomaly patterns without sending data to a live IoT Hub.

When we have the available IoT Namespace, device ID and device key, we will enter it in the IoT Hub Config section prior to running the simulator to ensure messages and data are sent to the IoT Device.

Device Simulator App Image of the device simulator App

Create a Stream Analytics Job

A Stream Analytics job comprised of an Input, a Query, and an Output will need to be defined and running in order to take the IoT Hub Device input, process it using the query and output it to a Power BI Real time streaming dashboard.

Add Stream Input

I'll start by adding an IoT Hub stream input.

Create Stream Analytics Job Steps to create a stream analytics Job input

Next, I will configure the input stream details.

IoTHubSource Config for the new IoT Hub input.

Lastly, I will see that the IoTHub Source Stream has been created.

IoTHubCreated Image showing IoTHub is created.

Add Stream Output

Similarly, I will also add an Output Stream.

PowerBISink Create the Power BI sink

Next, I will authorize Power BI to access the stream analytics job.

Authorize PowerBI Step to authorize power bi to access the stream analytics job

This will prompt me to enter my Azure Portal credential and then click next.

AzureLogin Step to login to azure.

I will then give my name my dataset and table along with setting the authentication mode to ‘User Token' since I am in dev/demo mode.

Also available is Managed Identity mode. For more information, check out Use Managed Identity to authenticate your Azure Stream Analytics job to Power BI

OutPut Details Step showing output details config

Lastly, I will see that the IoTPowerBIOutput Sink Stream has been created.

OutputCreated Image showing that the Power BI Output is created.

Write the Stream Analytics Query

My final piece to creating the Stream Analytics Job will be to write my SQL Query for my Anomaly Detection. In this scenario, I will be using the Spike and Dip Function.

Check out Anomaly detection in Azure Stream Analytics for more detail on machine learning based anomaly detection operators such as Spike and dip or Change Point.

Add Query Step to add and test SQL Query with anomaly detection operator
WITH AnomalyDetectionStep AS
(
    SELECT
        EVENTENQUEUEDUTCTIME AS time,
        CAST(temperature AS float) AS temp,
        AnomalyDetection_SpikeAndDip(CAST(temperature AS float), 95, 120, 'spikesanddips')
            OVER(LIMIT DURATION(second, 120)) AS SpikeAndDipScores
    FROM IoTHub
)
SELECT
    time,
    temp,
    CAST(GetRecordPropertyValue(SpikeAndDipScores, 'Score') AS float) AS
    SpikeAndDipScore,
    CAST(GetRecordPropertyValue(SpikeAndDipScores, 'IsAnomaly') AS bigint) AS
    IsSpikeAndDipAnomaly
INTO IoTPowerBIOutput
FROM AnomalyDetectionStep

Start the Stream Analytics Job

Now that I have configured all components of my job, I can Start the job.

StartASAjob Steps to start to ASA job
Start job2 Step 2 to start job
ASARunningMode Image displaying that the ASA job is running.

Create a Real-Time Power BI Dashboard

Once the job is running, I am ready to head over to Power BI to begin building my real time dashboard.

Create a Dataset

I'll start by navigating to my workspace containing my IoTPoweBIDataSet in the datasets section. Note that the Stream Analytics Job must be running and must have processed at least 1 event for the dataset to be created.

PowerBI DataSet Step to verify PBI Data set is created.

Create a Dashboard

Now that I have verified that my dataset was created, I will create a new Dashboard.

Create PBI Dashboard Steps to create a pbi dashboard.
Create PBI Dashboard2 Step 2 to create a pbi dashboard.

Add a Tile

I will then add a tile for my Real-time custom streaming data.

Add PBi Tile Step to add a PBI tile

I'll then select my dataset.

Streaming Title Add custom streaming tile.

I will add a Card visualization to track the count of spike and dip anomalies.

Streaming Tile2 Add custom streaming tile2

I will also add a line chart visualization to track the spike and dip scores.

Tile Details step to add tile details.


Run the Device Simulator

Now that I have finally configured all of my resource, I am ready to head back to my DeviceSimulator and configure the IoT Hub Details.

Once the details are configured, I will click ‘Update IoT Hub Config.

Device Simulator Config Steps to confit the device simulator

I will then select Anomaly settings for Normal Events and start the DeviceSimulator.

Start Device Simulator Steps to start the device simulator.

Monitor Real-Time Power BI Streaming

Once my device has been started, I can head back to my Power BI Dashboard to begin monitoring my stream.

As we can see, streaming details and now available on my Power BI Dashboard. Note also how the SpikeandDip Anomalies Card is still 0 since we are currently only streaming Normal events.

PBI Dashboard1 PBI Dashboard with normal events.

I will then head back to my simulator and change the Anomaly settings to Spike/Dip and will have it repeat every 2 seconds. As I can see, the anomalies have begun in the simulator.

SpikeDip Device Simulator Step to change anomaly to spike dip

When I head back to my real time Power Bi dashboard, I can see that the Spike and Dip anomalies have begun flowing into the dashboard and the SpikeandDip Anomalies Card is beginning to display the incoming anomalies.

PBI Dashboard2 PBI Dashboard with spike/dip events.

There are a few additional anomaly settings in the DeviceSimulator that could be explored as a next step.

Anomaly settings Additional Device simulator anomaly settings available.

Once I complete my tests, I will ensure that I head back to the Azure Portal and stop my Stream Analytics Job and Device Simulator as well.

Stop Streaming Stop the streaming job when finished.
Next Steps


Last Updated: 2020-01-14


get scripts

next tip button



About the author
MSSQLTips author Ron L'Esteve Ron L'Esteve is a seasoned Data Architect who holds an MBA and MSF. Ron has over 15 years of consulting experience with Microsoft Business Intelligence, data engineering, emerging cloud and big data technologies.

View all my tips
Related Resources




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.






download

























get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools