Combining multiple files from multiple folders in OneDrive for Business in Power BI


By:   |   Updated: 2020-12-09   |   Comments   |   Related: More > Power BI


Problem

I have written in a previous article on how to leverage Excel sources in Power BI Paginated reports of which I described how to connect to a single Excel file within OneDrive for Business and how to set up a scheduled refresh for this dataset in the Power BI service. However, there are situations where one may want to combine several files from several folders within OneDrive for Business or SharePoint sources, then this might pose some challenges. In this article I have described a detailed technique on how to achieve this in Power BI.

Solution

There have been several articles relating to how to solve this challenge, but most have been on combining files from single/same folder in SharePoint or OneDrive for Business, you can find an example of this here by Microsoft. I will now provide the needed solution in a series of steps.

Step 1: Get data from the OneDrive location

The diagram below shows the structure of the OneDrive for Business location with four different folders holding similar datasets, but of different months. This has been prefixed by "Data&AI" for a reason I will explain later (though not mandatory to include, but helpful).

Snapshot showing Folders in OneDrive location

When the "August" folder is opened for example, it contains three .csv files for the month of August only. Similar files are also in the "July", "September" and "October" folders for data for those months as seen in the diagram below. This can be in some other forms in your own business environment, but the solution is the same logic.

Snapshot showing csv files in OneDrive folders

We now understand the structure of how the datasets are saved within the OneDrive for Business location, and next is how to combine the datasets in each month automatically for reporting purposes. Let's connect to the OneDrive location where the datasets are saved so we can get the datasets into Power BI as seen in the diagram below.

Snapshot showing how to connect to OneDrive location in Power BI

In this case we need to select the "SharePoint folder" option as our source connector. Then copy the URL link for the OneDrive location of the folders as seen below.

Snapshot showing where to copy the url link to OneDrive location

The link should look like the one below. Please note x y and z represent sensitive information I needed to remove.

https://xyz-my.sharepoint.com/personal/x_y_z_com/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fx%5Fy%5Fz%5Fcom%2FDocuments%2FSample%20Data%20Sources

But the full link would not work when used for the connection, thus it should be edited to look like the one below.

https://xyz-my.sharepoint.com/personal/x_y_z_com

Then paste the shortened URL on the Site URL as seen in the below diagram.

Snapshot showing how to edit and paste the url link

You might need to sign into your Microsoft account if not done so already as seen below.

Snapshot showing connection authentication in Power BI

When you successfully sign in, you should see something like the output below, just go ahead a click on "Transform Data".

Snapshot showing initial landing page of the connection in Power BI

That should now take you to the Power Query Editor window.

Step 2: Filter out unwanted data

The initial look in the Power Query Editor provides us with all the datasets in the OneDrive for Business along with their extensions and other details as shown in diagram below.

Snapshot showing landing page of the connection in Power BI after selecting Transform

However, since our intention is to filter out other datasets and use only the ones we require in this project, then the need for the prefix at the front of each folder names as I mentioned earlier becomes evident. The prefix "Data&AI" would be used to filter for the datasets we need on the folder path as seen in the image below.

Snapshot showing how to filter the data to for required datasets 1

When we select the Contains text filter option, we enter "Data&AI" as seen below.

Snapshot showing how to filter the data to for required datasets 2

This would then filter it down to the datasets we need as seen in the diagram below.

Snapshot showing how to filter the data to for required datasets 3

Step 3: Work on a single dataset at a time

The technique here is to start by working on a single type of the dataset at a time. To do this we need to filter on the "Name" column and select a single dataset as seen in the diagram below.

Snapshot showing how to filter the data to for required datasets 4

Then we need to combine the files for "CurrencyRate.csv" from the different months as seen below.

Snapshot showing how to expand the view of the combined datasets

So, by now you can now see the "CurrencyRate" dataset has now been successfully combined. For the purpose of this tip, I will mention that Power Query has created a function for one of the files and then ran it through each of the files from each folder to combine them. You can learn more about functions and parameters in Power BI from this awesome blog.

Snapshot showing combined dataset

You can then rename the Query1 to an easily understandable table name like "Currency_Rate" for example and remove the "Source.Name" column as it is not required.

Step 4: Use the Advanced Editor Query to configure the combining of the other datasets

To do this we need to click on the Currency_Rate table and select the "Advanced Editor" to expose the M query behind it as seen in the diagram below.

Snapshot showing how to copy the M query of the dataset

Then we need to make just one change to be able to work on the next dataset. Let us work on the "Orders.csv" dataset. So, we go copy the M code, and close the window. We need to connect to a new Blank Query source as shown below.

Snapshot showing how to create a new dataset  using Blank Query

Again, with the new Query that opens, we need to click on the "Advanced Editor" again so we can paste the M code we just copied earlier into it as seen below. After that, we need to go copy the name of the new dataset (so we do not get the spelling wrong) and then replace it with that of the "CurrencyRate" as seen in the diagram below.

Snapshot showing how to alter the dataset being combined using the M Query

Initially, this might create a query with errors like the one below.

Snapshot showing where to look at for Applied Steps

To resolve this, we can take out the applied steps to fix the issue (note that this might not always be the first three steps, just ensure you can see the filter symbol on the "Name" column that should be where to combine) as shown below, and then we need to combine again for this dataset.

Snapshot showing applied steps window
Snapshot showing where to navigate to for correcting errors on applied steps

So, as can be seen in the diagram below, we have successfully combined the files for Orders in the different folders as we did for the CurrentRate dataset.

Snapshot showing new dataset combined as new query

This can be repeated for as many other datasets in different folders that need to be combined.

Note that for each dataset a new function is created as pointed out earlier. After combining the datasets, the queries can be loaded accordingly, and report visuals created and published to the Power BI service.

See my article on how you can schedule a refresh for the datasets within the Power BI service. Note that you would not require an On-Premises Gateway as the source is Online (i.e. OneDrive for Business or SharePoint source).

Next Steps
  • You can learn more about Parameters and Functions in Power Query from here.
  • Try this tip out in your own environment and adapt it with your business logic.


Last Updated: 2020-12-09


get scripts

next tip button



About the author
MSSQLTips author Kenneth A. Omorodion Kenneth A. Omorodion is a Microsoft Certified Data Analytics and BI Professional mostly in Microsoft BI stack of tools.

View all my tips



Comments For This Article





download





Recommended Reading

Using Power BI with JSON Data Sources and Files

Calculating MTD, QTD, YTD, Running and Cumulative Total in Power BI

Power BI Conditional Formatting for Matrix and Table Visuals

Calculate Percentage Growth Over Time with Power BI

Power BI Workspace Permissions and Roles














get free sql tips
agree to terms