Introduction to Power Query for Excel
By: Koen Verbeeck | Updated: 2015-04-28 | Comments | Related: More > Microsoft Excel Integration
As part of the Power BI offering by Microsoft, a couple of Excel add-ins are available to help you with your self-service BI solution. One of those add-ins is Power Query which enables you to intuitively gather and transform data from various sources inside Excel. This tip is an introduction to Power Query.
Power Query is a free Excel add-in that is sort of a "self-service ETL tool". According to the Office Help pages:
"Microsoft Power Query for Excel enhances self-service business intelligence (BI) for Excel with an intuitive and consistent experience for discovering, combining, and refining data across a wide variety of sources including relational, structured and semi-structured, OData, Web, Hadoop, Azure Marketplace, and more. Power Query also provides you with the ability to search for public data from sources such as Wikipedia."
Power Query was introduced to the public as the preview called Data Explorer. In fact, there is already a tip published on this early preview: Introduction to Data Explorer for Excel. Power Query has a monthly release cycle – meaning each months there are new updates and bug fixes for the product – so a lot has changed since this preview. The basic functionality however is still the same:
- You can search for data. However with some configuration you can now also search on-premises data, next to online data.
- You can extract data from various sources. The list of possible sources has almost doubled.
- If you want more advanced calculations, you can use the M language.
Now it's also possible to merge queries, to parameterize queries, to create functions, to send your queries to the online Power BI site on Office 365 and to load data directly into the Excel Data Model (aka the Power Pivot model). It's also worth mentioning that Power Query will be by default included in Excel 2016, if you have the correct edition of course.
For a very basic introduction to the user interface of Power Query, the tip Introduction to Data Explorer for Excel is still valid, although some screenshots might be out of date. For the remainder of this tip, I'll show two simple use cases where Power Query really is straight forward and easy to use in contrast with SQL Server Integration Services for example.
Reading Data from Excel in Power Query
First of all we're going to read data from an Excel file using Power Query. To start, we need to select the From Excel option in the From File menu.
In the dialog we need to pick the Excel file we want to import. Power Query will parse the Excel workbook and present you with a navigator, where you can choose the worksheets to import.
When hovering over a sheet, you get a preview of the data.
To edit the query - to filter, sort or transform the data – you simply click the Edit button, which will take you to the Power Query editor. In the editor you have an Office-like ribbon with a lot of options to manipulate the data.
The Home tab:
The Transform tab:
The Add Column tab:
The View tab:
The first transformation is using the first row as a header. This can be done by clicking the Use First Row As Headers button at either the home tab or the transform tab.
The Continent column contains group headers for the different countries. There are however a lot of NULL values. There are quickly resolved with the Fill functionality. Right-click the column, choose Fill and then Down.
The NULL values are now automatically replaced by their corresponding continent.
Now the query can be loaded to the Excel worksheet.
Aside from the data being imported into the Excel sheet, there is now also a "query object" added to the Excel workbook. A query is a refreshable object and you can edit it any time.
Imagine doing the Fill transformation in Integration Services. Since none of the standard components can do this, you'd have to resort to custom .NET logic in a script component.
Unpivoting Data with Power Query
In the next use case we're going to transform data from a table with multiple columns into a name-value pair table. This process is called unpivoting data. Integration Services has a transformation that can do this as well, but its editor is a bit confusing. As you will see, in Power Query it's just a matter of a few clicks.
As in the previous section, we need to read an Excel file into Power Query, select a sheet in the navigator and promote the first row as headers. The result looks like this:
All we have to do to unpivot his data is selecting the columns: Cow, Horse, Pig and Dog and clicking Unpivot Columns in the transformation tab.
And our name-value pair table is finished!
In this tip we quickly introduced Power Query as an easy-to-use, but very powerful data collector and enhancement for Excel. With two use cases we demonstrated how intuitive Power Query is and how it can apply transformations with almost no effort.
- If you want to explore the two use cases yourself, you can download both the Excel files here.
- In the tip Introduction to Data Explorer for Excel the preview of Power Query – Data Explorer – is introduced. It is worthwhile checking this tip out, as some of the functionality and the user interface is the same. In this tip data from Facebook is extracted.
- If you want to know more about Power Query and the M language, check out the book Power Query for Power BI and Excel by SQL Server MVP Chris Webb. Since Power Query is frequently updated, some parts may be already a bit dated. Most likely the user interface has changed the most, but most of the M language is still unchanged.
- If you want to know how to unpivot data using SSIS: How To Use the Unpivot Data Flow Transform in SQL Server Integration Services SSIS.
About the author
View all my tips
Article Last Updated: 2015-04-28