Power BI Datasets

By:   |   Updated: 2019-09-23   |   Comments (5)   |   Related: More > Power BI

Problem

I want to create a reporting solution for my company using Power BI. I've been learning the tools, but I wonder what the difference is between a dataset and a workbook. When would I use which solution?

Solution

In this tip, we'll give an overview of the dataset and workbook objects of Power BI. It's not the goal to list and explain every possible feature, but rather give an introduction and explain their usage. You can follow along if you have a free, a Pro or a Premium Power BI subscription, but some features might not be available in all subscription models. The screenshots in this tip are made with Power BI Desktop August 2019 edition. You can download the latest version here. Since Power BI (PBI) Desktop is continuously evolving, it's possible the screenshots differ from the PBI Desktop version you're currently using.

Power BI Datasets

In PBI Desktop, there are 3 main ways to retrieve the data upon which you will build your visualizations:

  • Live. Here you connect to a server which holds all the data. No data is transferred, but the metadata of the model is imported into PBI Desktop. When you create visualizations, a query is sent to the server where it will be executed. The results are then returned to Desktop and visualized. Live connections are typically used with SQL Server Analysis Services (SSAS) models, be it Tabular or Multidimensional. In this case, PBI Desktop acts like any other thin client, such as Excel or Reporting Services (SSRS). It's not possible to make fundamental changes to the model, although you can add new measures to the model which are then available in that .pbix file.
  • DirectQuery. Similar to a Live connection, but here you can make more changes to the model. Like in Live, the data stays at the server and queries are executed on the server. But it's for example possible to create relationships in the PBI Desktop model.
  • Import. Here the data is imported in the PBI Desktop file (.pbix) using Power Query queries. The data is highly compressed so it's possible to load millions of records into a file on your machine. Behind the scenes, a model is created which is very similar to an SSAS Tabular model. This mode is the most flexible, since you can combine data from all possible sources. All data however needs to be imported into your model which can cause long refresh times.

If you have Power BI Premium, you can create a hybrid of DirectQuery and Import.

In short, a Power BI dataset is the model created in PBI Desktop that holds all of your data. For Live and DirectQuery you connect to the data, for Import you import the data. For the remainder of the tip, we'll use the Import mode.

Creating a Power BI Dataset

A dataset is linked one on one with the .pbix file where it was created. You can create a new dataset by clicking on Get Data when you start PBI Desktop.

get data 1

Or you can select a source from the dropdown menu:

get data 2

Let's suppose we imported a couple of tables from the SQL Server sample database WideWorldImporters (you can download the .pbix file here). In the Model view, you can see the tables and their relationships:

model view

In the Data view, you can see the actual data of one table at a time.

data view

In the Report view, you can create, view and interact with visualizations build on top of the data and the model.

report view

You could say the dataset comprises the data and the model view.

Publishing a Power BI Dataset

Once your content with your report, you can publish the .pbix file to the Power BI service. While PBI Desktop is free, you need a Power BI subscription to share your reports with other people. Simply click on Publish in the ribbon to start the process.

publish pbix

If necessary, you'll be asked to log into your Power BI account. Once logged in, you need to choose a workspace where you want to store your items. When publishing, the .pbix file will be split into two pieces: a report based on what you created in the report view and a dataset (the combination of the data and model view).

destination workspace

When publishing is complete, you can view the artifacts in the Power BI service.

publishing finished

Either click on the link as indicated in the screenshot, or go to app.powerbi.com to view the results. In the workspace, you can see there's a report and a dataset created with the name of the PBI Desktop file:

artifacts after publishing

If you click on your dataset, the service will open a new blank canvas where you can start creating reports, just as in Power BI Desktop:

new report

Another exciting feature is that you can re-use the dataset is a data source for a new Power BI Desktop file. In the Get Data dialog, you can select Power BI datasets as the data source.

dataset as a source

After connecting to the service, you can choose any dataset from any workspace you have permission to:

select dataset from service

This source behaves as a "live" source, meaning you only see the model. Just like you would when connecting to Analysis Services.

connecting to live dataset

The biggest advantage is that you can create one model for your company, publish it to the service and anyone else in your company can benefit from your modeling efforts. You'll have a centralized model which leads to less mistakes and less ambiguity.

Scheduling a Power BI Dataset Refresh

It's not practical to refresh your dataset manually in PBI Desktop and then upload it yourself to the service in order to have up-to-date data. Luckily you can schedule a refresh so your dataset contains the latest data automatically. It's also possible to refresh the data manually from the service for any uploaded dataset:

schedule refresh

If you click on the "schedule refresh" icon, you'll go to the properties page of the dataset:

dataset properties

There are two important source categories for a scheduled data refresh:

  • Cloud data sources. The Power BI service can connect to those sources without any intervention. Examples are: Azure SQL Database, Azure SQL Datawarehouse, Azure Analysis Services and SharePoint Online.
  • On-premises data sources. In this case, the Power BI service needs a gateway to connect to the data. The gateway is responsible for the connection between the data source and the PBI service. Keep in mind some cloud sources may still need a gateway, such as for example a data source inside an Azure Virtual Machine, but also some data sources from other parties, such as Snowflake for example.

Once you have a gateway configured, you can enter the data source credentials:

enter credentials

Finally, you can configure a schedule to refresh your data:

refresh schedule

Depending on your subscription, you can refresh your dataset more frequently.

Conclusion

In this part of the tip we gave an overview of the Power BI dataset object. It's the data and model from a Power BI Desktop file and can be compared with an Analysis Services Tabular model. A dataset can be uploaded to the Power BI service. It can serve as a source for new reports created in the service, but also for new Power BI Desktop reports. Finally, you can schedule a refresh of the dataset. Depending on the source data, you might need a gateway to establish a connection.

Next Steps


Last Updated: 2019-09-23


get scripts

next tip button



About the author
MSSQLTips author Koen Verbeeck Koen Verbeeck is a BI professional, specializing in the Microsoft BI stack with a particular love for SSIS.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.





Thursday, September 26, 2019 - 12:00:46 PM - J.M. Back To Top

Hi,

thanks for the comment to my question.

I look forward to the next parts of this tip.

kind regards

J.


Wednesday, September 25, 2019 - 11:41:52 PM - Jimmy Back To Top

Excellent article! Thanks a lot!! 


Monday, September 23, 2019 - 8:46:17 AM - Jeremy Kadlec Back To Top

J.M. and Pin,

Thank you for the feedback.  This is a 2 (or more) part tip.  This part focuses on the datasets and the next tip in the series will focus on the workbooks.

Sorry for any confusion.

Thank you,
Jeremy Kadlec
Community Co-Organizer


Monday, September 23, 2019 - 6:15:10 AM - J.M.-Schubert Back To Top

I can't find the answer of the question on the top of this article.

"...what the difference is between a dataset and a workbook. When would I use which solution?"

It's absolut possible that I don't understand your explanation in this article.

In this case, can you can you give me a hint where the point is where you explain the diffenrence and the advantages.

Thanks a lot.

Kind regards

M.-Schubert


Monday, September 23, 2019 - 5:26:17 AM - pin Back To Top

Title states: Workbooks vs Datasets, but u talk only about dataset...



download

























get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools