Simple mathematical formulas using the M Language
By: John Miner | Updated: 2015-10-16 | Comments (4) | Related: > Power BI
Just recently, Microsoft released its latest Power BI desktop designer which can be downloaded from the following link. The designer is based upon Power Query to import data into the model. The x-velocity engine is used to save the data in a space saving format. The imported datasets can be related via relationships. The report visualizations can be categorized into different dashboard pages. This part of the designer is based upon Power View.
If you have worked with Power Query and Power view in the past, you will be right at home.
The Power BI desktop designer is a great tool; however, the hardest part of any Business Intelligence project is gathering and formatting the data into reportable datasets. How can we get a better understanding of the internal workings of Power BI?
The data section of the Power BI designer is where you import and manipulate data. Just like SQL Server Management Studio which obscures the TSQL generated by menu selections and actions, the Power BI designer is based upon the Power Query formula language informally know as "M".
Having a deeper understanding of the M language will help a BI developer in cases where the menus can not.
You are give a list of geometric shapes and measurements and the data is stored in a tab delimited format. Your task is to calculate the area for a given shape.
Before we can solve this problem, we need to build up our knowledge of the M language.
The M language is optimized for building queries that mash-up data. It is a case sensitive functional language similar to Microsoft F#. Both Power Query for Microsoft Excel and Power BI Desktop Designer can leverage this exciting language.
I definitely need to spend a little more time talking about functional languages. The basis of all these languages is Lambda calculus that provides the framework for describing functions and their evaluation. Some notable functional languages are LISP which was designed by John McCarthy at MIT in 1950's and ML which was designed by Robin Milner at the University of Edinburgh in the 1970's.
The F# language was developed by Microsoft and open source contributors. The F# 3.1 version released in November of 2013 is a cross platform language that can run on Linux, OS X and Windows. In short, the M language has some deep roots.
If we want to write a M language statement without using the menus, just choose the blank query option from the most common data sources. This menu pops up when you choose the home menu and select the get data button on the ribbon.
The most basic M language statement starts with the let reserved word. Each mathematical calculation builds upon the results of a previous one. Last but not least, the in reserved word tells the M language processor that we want to evaluate the expression.
The very first program that everyone learns at school is how to print "Hello World". I think that example does not apply well since Power BI works with datasets. Do not fret yet. The Table.FromRecords function is allows the developer to insert static data as sets of lists. I always use the US Presidents as a simple sample dataset. The M program and resulting output is shown below.
The M language has many different types of operators. Today, we are going to concentrate on comparison, numeric and logical operators. Once we have these tools in our belt, we will be able to tackle our geometry business problem.
The real power in the M language is the ability to pull data from almost anywhere and mash it up for analysis. As a society, many datasets are published on the internet. One popular statistical data set is the Iris multivariate data set introduced by Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems".
I will definitely not be talking about advanced statistics today. However, this set has 50 measurements of sepal and petal characteristics from three distinct species. In short, it is a small dataset that we can practice our formula's on.
Before we can start working with the M language operators, we need to pull the Iris dataset into our model. From the home menu, select the get data button form the ribbon. Right at the top is the world wide web choice.
The iris dataset is published on Wikipedia. Just enter the URL when prompted.
Many websites are secured by active directory credentials, basic credentials or anonymous. Just pick anonymous since Wikipedia is open to the public.
The Power BI designer is smart enough realize there is a table of data as well as the html document. We want to load the table. The resulting M language code uses the Web.Page and Web.Contents functions to pull in the data.
Choose the edit button since we want to start writing formulas using M language operators.
M Language Common Operators
Comparison or common operators can be applied to the following data types: null, logical, number, time, date, datetime, datetimezone, duration, text, or binary. See the table below for a detailed listing. I will be using the four measurements in the Iris data set to demonstrate their usage.
Greater than or equal
Less than or equal
Skip to the next section if you are experienced with comparison operators in another language.
A sample use of the greater than operator with an if then else statement.
A sample use of the less than operator with an if then else statement.
A sample use of the greater than or equal to operator with an if then else statement.
A sample use of the less than or equal to operator with an if then else statement.
The above four examples return a string depending upon the evaluation of the comparison (common) operator. If true, the first string is returned; otherwise, the second string is returned. We can mix it up by returning the logical outcome of the operator. Either true or false.
A sample use of the equal to operator returning a Boolean value.
A sample use of the not equal to operator returning a Boolean value.
The screen shot below shows the results of our six computed columns using common operators and the Iris dataset.
M Language Number Operators
Number operators work with numbers which can be integer, decimal or exponential in form. See the table below for a detailed listing. I will be using the four measurements in the Iris data set to demonstrate their usage.
Skip to the next section if you are experienced with numeric operators in another language.
A sample use of the sum operator using [sepal length] and [sepal width] as input.
A sample use of the difference operator using [sepal length] and [sepal width] as input.
A sample use of the product operator using [sepal length] and [sepal width] as input.
A sample use of the quotient operator using [sepal length] and [sepal width] as input.
A sample use of the unary negative operator using [sepal length] as input.
I have to complain loudly right now about the addition of this useless operator. The value of +2 and 2 are the same. Therefore, there is no need for this operator. I think every language designer includes the operator for mathematical completeness.
A sample use of the unary positive operator using [sepal length] as input.
The screen shot below shows the results of our six computed columns using numeric operators and the Iris dataset.
Just hang in there! This is the last set of operators to cover before our main event.
M Language Logical Operators
Logical operators allow you to test one or more operands and return a Boolean value. See the table below for a detailed listing. I will be using the four measurements in the Iris data set to demonstrate their usage.
Conditional logical OR
Conditional logical AND
Skip to the next section if you are experienced with logical operators in another language.
A sample use of the or operator using [sepal length] and [sepal width] as input. The expression returns true if one of the two conditions are meet.
A sample use of the and operator using [sepal length] and [sepal width] as input. The expression return true if both of the two conditions are meet.
A sample use of the not operator using [sepal length] and [sepal width] as input. Returns the opposite value of the sub-expression.
The screen shot below shows the results of our three computed columns using logical operators and the Iris dataset.
We have a tab delimited file of geometric shapes. For a given shape, we need to calculate the area using the formulas below. We will be apply some of the operators we learned today to solve this business problem.
Choose the home menu and select the get data button on the ribbon. Under the file item, we can see there is a text file option. Click this option to continue.
Browse to the location of the "area-of-basic-shapes.txt" file. Click the open option to continue.
The preview windows shows us a glimpse of our data set. Choose the edit button so that we can start editing our M language query.
We now can see our data set in the query editor. Choose to add a custom column. If you forgot how to do this task, refer to my previous tip.
The calculation for area has eight different outcomes given the nested if then else statement. We can see many of the operators discussed today are used in the formula.
Since this is the final query to solve our business problem, we want to change the M step variable names to be meaningful. It is interesting to note the Csv.Document function is used to parse the file loaded by the File.Contents function. The "#tab()" argument tells the function it is tab not csv delimited file. (click on image below to enlarge)
The screen shot below shows that we have the output we need.
The Power BI desktop designer is a FREE and powerful tool that will allow an end user to quickly import data into the model. The Power Query formula language informally known as M is based upon the F# functional language. If you are an expert with this language, you can start off with a blank query instead of using the menus to build up your query steps.
Operators are key to any functional language. Today, we reviewed the common (comparison), numeric and logical operators. We learned that Power BI can easily pull in data from web sites like Wikipedia.
Last but not least, we solved our business problem by loading a tab delimited file into the query designer. We used a complex if then else statement to return the correct calculation given a shapes name. This calculation used some of the operators we discussed today.
- Review these other M Language Tips
About the author
View all my tips
Article Last Updated: 2015-10-16