Power BI Incremental Refresh for SQL Sources
Incremental refresh is now available for both Pro and Premium capacity in Power BI. There are three main benefits for implementing incremental refresh on a dataset including faster refreshes, more reliable refreshes and, a reduction in resource consumption. Moreover, since incremental refresh is usually much more useful in cases where we have potentially large datasets with millions upon millions of rows and there is a potential for the datasets to grow significantly overtime, then it is essential that the partition filters are pushed to the source system when queries are submitted for refresh operations.
However, these benefits cannot be attained if the data source used in the development of reports in Power BI do not support what is known as Query Folding. Most data sources that support SQL queries like Azure SQL database, SQL Server, Oracle, and Teradata usually support Query Folding. But these need to be implemented appropriately to ensure incremental refresh is working fine.
In this tip, I will be demonstrating how to implement an incremental refresh on a SQL Server data source the right way. This would be done in a series of steps.
I will be using the SalesOrderHeader table from AdventureWorks2014 database for this tip. In your case you might have more than one table as source data, you just need to figure out which of the tables you need to implement an incremental refresh on. This is usually the fact tables which usually grow with time more than the dimension tables.
Step 1: Connecting the data source with Power BI Desktop
The SQL Server database needs to be connected to Power BI desktop as shown in the diagrams below.
Enter the server name, the database name and optionally write out the SQL queries you would want to use. I will explain later how writing out the SQL Queries (not Views) may affect the implementation of incremental refresh.
Step 2: Configure the Power Query datetime parameters
It should be noted that to configure incremental refresh, datasets are filtered using the reserved and case sensitive datetime parameter names RangeStart and RangeEnd. The diagrams below show how to define these.
The "Type" should usually be entered "Date/Time", the "Suggested Values" should usually be "Any Value" and the "Current Value" should be any datetime value since Power BI service would auto detect this later. However, I just used a MAX of "Modifieddate" from the dataset in RangeEnd and a MIN of the "Modifieddate" in RangeStart.
Step 3: Configure the parameters to apply filters to the dataset
In this step we are going to locate the column in the dataset we are trying to apply the filter of incremental refresh on and map it to the parameters created by using custom filters as shown in the diagrams below.
Click on "Date/Time Filters" and navigate to "Custom Filters". In this case we used the Basic configuration as shown below.
Then input the values exactly as shown below. Click OK and Close & Apply the Query Editor window.
Step 4: Activate the incremental refresh on the dataset
Having finished the initial configuration, it is now time to activate the incremental refresh settings on the dataset as seen in the diagrams below.
Next you can see that we are getting a warning in the next diagram that our dataset cannot be folded or does not support Query Folding. But remember we are using a data source that supports SQL Queries in this tip you might say. We can verify this further by going back to the Power Query Editor and check if the data source supports Query Folding by right clicking on the source step or any step in the Applied Steps in the dataset and checking if the dataset "View Native Query" highlighted or greyed out. If greyed out it means the dataset cannot support Query Folding, thus even if we manage to setup the incremental refresh, it would be very slow and might not work as planned.
The way to resolve this is to go back to Power Query Editor, click on the properties icon of the source data and totally remove the SQL Statement query. Then only import all tables and select the tables required. As seen in diagrams below.
After this, redo Step 3 & Step 4 again. You should now be able to see the incremental refresh settings without the warnings. And you can verify if there is Query Folding taking place by following steps mentioned in step 4 (the View Native Query is now highlighted and not greyed out).
You can now view the Native Query behind this incremental refresh, should look something like the diagram below. Click Close & Apply.
select [_].[RevisionNumber], [_].[OrderDate], [_].[DueDate], [_].[ShipDate], [_].[Status], [_].[OnlineOrderFlag], [_].[SalesOrderNumber], [_].[PurchaseOrderNumber], [_].[AccountNumber], [_].[CustomerID], [_].[SalesPersonID], [_].[TerritoryID], [_].[BillToAddressID], [_].[ShipToAddressID], [_].[ShipMethodID], [_].[CreditCardID], [_].[CreditCardApprovalCode], [_].[CurrencyRateID], [_].[SubTotal], [_].[TaxAmt], [_].[Freight], [_].[TotalDue], [_].[Comment], [_].[rowguid], [_].[ModifiedDate] from ( select [RevisionNumber], [OrderDate], [DueDate], [ShipDate], [Status], [OnlineOrderFlag], [SalesOrderNumber], [PurchaseOrderNumber], [AccountNumber], [CustomerID], [SalesPersonID], [TerritoryID], [BillToAddressID], [ShipToAddressID], [ShipMethodID], [CreditCardID], [CreditCardApprovalCode], [CurrencyRateID], [SubTotal], [TaxAmt], [Freight], [TotalDue], [Comment], [rowguid], [ModifiedDate] from [Sales].[SalesOrderHeader] as [$Table] ) as [_] where [_].[ModifiedDate] >= convert(datetime2, '2011-05-31 00:00:00') and [_].[ModifiedDate] < convert(datetime2, '2014-06-30 00:00:00')
Next, try to activate the incremental refresh settings again as shown below. You would be able to now see that there is no more warning message.
This guarantees that our incremental refresh would work well as required since the filter partitioning is able to be pushed to the source data and thus guarantee that all the benefits of incremental refresh would be attained.
Step 5: Publish to Power BI Service schedule refresh frequency
The next step is to publish the created reports to Power BI Service and then where applicable configure a refresh schedule without which the refresh would not take place. Initially, this is usually not activated as seen in the diagram below.
The warning sign indicates that there are some issues with the scheduled refresh settings.
However, since the configuration of gateways and scheduled refresh are not the focus of this tip, I would advise that where applicable this should be looked at to ensure the refresh works properly.
- Learn more about how to configure refresh schedule here.
- You can get more information on Query Folding here.
- Learn more about how to use or configure parameters in Power BI here.
- Learn more about Power Query date filters here.
- Try this tip out with your data.
About the author
View all my tips