Power BI Histogram Example using DAX

By:   |   Comments (5)   |   Related: > Power BI


Problem

Companies often use a Bell Curve approach to measure performance of various aspects of the business, such as employee performance. A histogram is a statistical concept and according to Wikipedia it is defined as "a graphical distribution of the numerical data". A histogram is made of several bins and a bin can be considered a range of values or a benchmark.

As part of this process, we have to divide the entire range into multiple bins and the range should be unique and continuous. Our grades in high school (i.e. A, B, C, D, E, F) can be considered an individual bin. If a teacher plots the student marks across the grades (bins) in a bar chart it sometimes follows a bell shaped pattern with a mix of high grade, medium grade and low grade students which could be used for assessing students. The same could be applied at company level by plotting an employee performance metric across bins to understand and assess employees.

Solution

For today's tip we will use the self service BI tool, Power BI desktop to create and understand how to use histograms. Consider yourself as running a Telemarketing company (XYZ Telemarketing) and you have hired Telemarketers to promote and sell your company's product. You are in need to assess the Telemarketers performance and we will use a histogram approach.

Before proceeding, make sure you have Power BI desktop installed in your machine, if not please follow this tip (Getting Started with Power BI Designer) by Scott Murray and install the product.

Open the Power BI Desktop application. Our first step is to load the necessary data for this project. The data for today's task is available to us in the form of two CSV files. The first one is the SalesData.CSV which has the Sales made by Telemarketers and the second one is Benchmark.CSV which has the benchmark (bins) defined for the Sales metrics. Let's load the first file. Click the drop down available for the Get Data option and select the CSV option as shown below.

Get Sales Data

It will open a pop up window as shown below. At that point, navigate to the location of Sales Data CSV file and select it and click the Open option as shown below.

Select Sales Data

It will open a preview window like below which will give you a preview of the data available in the Sales Data CSV file. Next click the Load button to load the data into the project.

Load Preview Window

Once the data is loaded you will see all the fields of the loaded data available in the Fields Section of the Power BI Desktop as highlighted below.

Sales Fields Available

Our next step is to load the Benchmark data. Repeat the above steps to load the Benchmark CSV file.

Get Benchmark data

When the Benchmark data is loaded you will see the fields of the Benchmark data available in the fields section as highlighted below. Lets navigate to the data tab by clicking in the area of where the arrow is pointed in the below image.

Benchmak Field available

You can see the loaded data is available in the data tab. We now have two tables with raw data. Our next step is to create a performance metric from the Sales Data table and link it with the range of values in the Benchmark table, so that we can create a histogram. In this case the performance metric is the ratio of Orders_Taken to the Calls_Made by the Telemarketer and we will call it Success_Rate. Double click the Sales Data as shown below to open the contents and create a new column in Sales Data by clicking the New Column option.

Add New column

Type in the DAX formula as shown below in the calculation bar of the New Column. Once you type fields like Orders_Taken or Calls_Made it will auto populate the fields for you in the formula bar and then hit enter. Please note the expression in Power BI Desktop supports only the DAX language. You will see the Success_Rate column getting populated as shown below.

Add formula

Let's format the Success_Rate column as a Percentage. Click the drop down available in the Format button and select percentage as shown below.

Format Percentage

Now the Succes_Rate column is available as Percentage with a 2 scale, let's make it to 0 scale by highlighting the Success_Rate column and changing the scale value to 0 as shown below.

Format Scale

Double click the benchmark table in the fields section and you will see the ranges Lower_Level and Upper_Level available in decimals, repeat the above formatting steps and convert it into a percentage as shown below.

Format Benchmark

Format Benchmark

As you see from the formatted image above (on the bottom) we have the lower and upper range defined for every bin in the benchmark table and we are at the step to map these bins to every record in the Sales Data table by joining them. Double click the Sales Data from fields section and create a new column which we will use as bin and call it RankCategory column.

New Column for bin

We will write a DAX formula like below in the formula bar of the new column added. This formula uses the CALCULATE function and checks if the Success_Rate column from Sales Data is greater than the Lower_Level and Lesser than or equal to the Upper_Level of the Benchmark, if it is true it returns the Bin value minus the Category field from Benchmark to the RankCategory column of Sales Data.

RankCategory = VAR CatVar=[Success_Rate] RETURN CALCULATE (VALUES ( BenchMark[Category]), CatVar>BenchMark[Lower_Level],CatVar<=BenchMark[Upper_Level])

Once you have typed the formula you will see the RankCategory column gets populated like below. We now have the performance metric (Success_Rate) mapped to the bin (RankCategory) as per the bin range provided and our next step is to plot and see which Telemarketer falls under which bin. For that let's navigate to the Report Canvas tab as shown in the second arrow below.

Rank Category DAX fomula

Double click the Clustered Column Chart available in the Visualizations, it will open up an empty visual like below in the report canvas.

Create Bar chart

Drag the RankCategory colum from the fields section to the axis and the Telemarketer to the value. You will see the empty visual turn into a chart like below.

Plot Histogram

We have distributed the Telemarketers across the bin and formed a histogram, note the Histogram looks in a Bell curve shape. Let's format the visual to make it more engaging by following the steps below.

1. Click the Format option in the Visualization section.

2. Turn the Title On for X-axis it will add a title as RankCategory.

3. Format the X - axis fonts to black.

4. Turn on the data labels, this will show how many Telemarketers are there in each bin.

5. Format the data label color to black.

6. Increase the font size of data label to 12 pt.

Format Histogram

Scroll down the format option and follow the below steps to format the visual more appealing as shown below.

1. In the Title Section, add a title as Telemarketer Histogram using DAX.

2. Change the title color to black.

3. Align the title to center.

4. Increase the title text to 12pt.

5. In the background section change the background color to light gray.

Format Histogram 2

We now have a well formatted histogram which shows how many Telemarketers are in each bin, to know who they are we will add a multi row card visual by double clicking the multi row card visual from the visualization section and drag the Telemarketer to the Fields as shown below.

TeleMarketer Row Card

Format the Telemarketer visual like below.

TeleMarketer Format

Move the Telemarketer visual to the right of the Histogram visual and adjust their positions. Minimize the Visualization and Fields section, you will have a well formatted histogram which follows a bell shaped curve like below. As I mentioned in the beginning of this tip by looking at the Histogram we can see the Top Performers (90-95%, 95-100%), Mid Performers (75-80%, 80-85%, 85-90%) and the Low performers (65-70%, 70-75%).

Hisogram Final

If you want to know who belongs to a bin, click the bin like below and you will see the Telemarketers name on the right.

TeleMarketer Filter
Next Steps

Learn more about Power BI in this 3 hour training course.


Click here to start the Power BI course






sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Junaith Haja Junaith Haja is a Senior Business Intelligence Consultant with Browse Info Solutions, Inc and leads a Microsoft SQL Server and BI team.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Tuesday, June 28, 2022 - 5:11:57 PM - Merry Hong Mei Back To Top (90216)
Hi, May I have an idea how you get the bell curve histogram? I followed your steps but my chart becomes bar with high from left low to right, seems no matter how I tried..

Tuesday, September 22, 2020 - 11:09:58 PM - Naidu Back To Top (86521)
Very easily understood even for a newbie. Great articulation Junaith

Sunday, October 27, 2019 - 12:18:53 PM - Sethu Back To Top (82914)

Simple and Elegant example. I am giving up my excel workarounds for your logic. Thanks for sharing.


Friday, September 8, 2017 - 10:41:58 AM - Angela Trapp Back To Top (66051)

Thanks for this. It helped with the histogram that I needed to build.

Any idea if it's possible to implement drill-down with this solution? I've tried several things and nothing seems to work.

 

 


Monday, October 10, 2016 - 9:38:52 AM - Pieter du Preez Back To Top (43526)

 Thanks for this example. Much appreciated.

 

I could get the sample to work, but not the model on my own data (age)

 

I simply have a table with Age of Employees, and another table to create the bins, but the RankValue remains stubbornly empty despite anything I do

Employee    Age

-------------------

Employee1   48

Employee2   50

Employee3   30

Employee4   25

 

AgeBin  Lower   Upper

--------------------------

15-25    15        25

25-35    25        35

35-45    35        45

45-55    45        55

 

Formulae used:

RankCategory = VAR CatVar=[Age] RETURN CALCULATE (VALUES ( AgeBIN[AgeBin]), CatVar>AgeBIN[Lower],CatVar<=AgeBIN[Upper])

 

 















get free sql tips
agree to terms