Combining Data from Multiple Relational Data Sources into One Table in PowerPivot for Excel
I have read a tip on Importing SQL Server Data from Multiple Data Sources into PowerPivot for Excel, which talks about bringing in data from two or more sources into PowerPivot and storing data from each source table/query as a separate PowerPivot table. I would like to know if there is a way to combine data from two or more databases/sources into one table in PowerPivot.
In this tip we will see how the above stated problem can be addressed. For the purpose of this tip, let's consider the following scenario.
We have a Sales Database which has a large amount of data. The data in this database is archived every year and stored in a separate database, which is on a different server. Let's assume that we have data starting from January 2010 and data for 2010 has been archived and stored in SalesDB_2010, data for 2011 has been archived and stored in SalesDB_2011, and the current data is present in SalesDB. All these three databases are present on different servers.
Sample Data Preparation
Let's first prepare the sample data for this demonstration using the following steps:
- Create SalesDB_2010 on Server1, SalesDB_2011 on Server2, and SalesDB on Server3 (Server which holds the current or most recent data, which in our case is 2012 data). Do not try any of these on your live or mission critical servers. Try these steps only on your development/test servers.
- Create the Sales table in each of these three databases using the following script.
CREATE TABLE dbo.FactSales( Id INT IDENTITY(1,1) PRIMARY KEY , DateKey INT , ProductKey INT , SalesAmount MONEY ) GO
- Run the following insert script on SalesDB_2010 on Server1.
INSERT INTO dbo.FactSales (DateKey,ProductKey,SalesAmount) VALUES (20100101,123,1000.26) , (20100201,456,1003.20) , (20100301,789,1007.856) , (20100401,123,1011.66) , (20100501,456,1016.982) , (20100601,789,1020.594) , (20100701,123,1028.70) , (20100801,456,1039.3515) , (20100901,789,1043.217) , (20101001,123,1046.28) , (20101101,456,1049.118) , (20101201,789,1065.90) GO
- Run the following insert script on SalesDB_2011 on Server2.
INSERT INTO dbo.FactSales (DateKey,ProductKey,SalesAmount) VALUES (20110101,123,2016.882) , (20110201,456,2023.32) , (20110301,789,2024.994) , (20110401,123,2033.964) , (20110501,456,2039.994) , (20110601,789,2041.188) , (20110701,123,2045.8791) , (20110801,456,2084.6752) , (20110901,789,2092.56) , (20111001,123,2097.2945) , (20111101,456,2101.2043) , (20111201,789,2108.4343) GO
- Run the following insert script on SalesDB_2012 on Server3.
INSERT INTO dbo.FactSales (DateKey,ProductKey,SalesAmount) VALUES (20120101,123,3001.3125) , (20120201,456,3011.73) , (20120301,789,3032.3036) , (20120401,123,3050.946) , (20120501,456,3051.6096) , (20120601,789,3061.782) , (20120701,123,3117.87) , (20120801,456,3123.2728) , (20120901,789,3123.4376) , (20121001,123,3124.1301) , (20121101,456,3212.082) , (20121201,789,3222.7484) GO
Combining data from three different source databases/servers into a Single PowerPivot Table
Follow the below steps to extract the data from the above three data sources and consolidate them into a single PowerPivot table. Here we are basically doing a UNION ALL operation.
- Connect to Server3 in SQL Server Management Studio (SSMS) and create two linked servers as "SalesDB_2010" pointing to Server1 and "SalesDB_2011" pointing to Server2.
- Open Excel, go to the PowerPivot tab and open the PowerPivot window. Note that you can either create a new Excel file or you can use the Excel file created in our previous tip Creating Linked Tables in PowerPivot for Excel.
- In the PowerPivot window, click on "From Database" under "Get External Data" section of Top Ribbon, and select "From SQL Server". This will start the "Table Import Wizard".
- Enter the connection information and click Next. Make sure to set the server and database to the one which holds the current (2012) data. In the above scenario, it should point to Server3.
- Click on "Select from a list of tables and views to choose the data to import" in the "Choose How to Import the Data" step.
- Select "FactSales" in the "Select Tables and Views" step.
- Click Close once the data import is completed. After the data import is completed, the PowerPivot window should look as shown below.
- With the first cell in the above table selected, go to Design Tab and click on "Table Properties" in the Top Ribbon. This will bring up the "Edit Table Properties" window.
- In the "Edit Table Properties" change the "Switch to" property drop-down from "Table Preview" to "Query Editor".
- Paste the following query into the "Edit Table Properties" window.
SELECT Id, DateKey, ProductKey, SalesAmount FROM OPENQUERY( SalesDB_2010 , 'SELECT Id, DateKey, ProductKey, SalesAmount FROM [SalesDB_2010].[dbo].[FactSales]' ) UNION ALL SELECT Id, DateKey, ProductKey, SalesAmount FROM OPENQUERY( SalesDB_2011 , 'SELECT Id, DateKey, ProductKey, SalesAmount FROM [SalesDB_2011].[dbo].[FactSales]' ) UNION ALL SELECT Id, DateKey, ProductKey, SalesAmount FROM [dbo].[FactSales]
- Ensure that the above query is prepared and tested in SSMS before using in PowerPivot. The updated "Edit Table Properties" window should look as shown below.
- Click Validate and then click Save in the "Edit Table Properties" window. Once the query is saved, the PowerPivot window will get refreshed and the data for all the three years will be loaded into a PowerPivot Table as shown below.
Note that the primary data source is Server3 and the linked servers are created from this data source to the other two data sources Server1 and Server2, which contain the 2010 and 2011 data respectively. As we can see, we are using a simple OPENQUERY and UNION ALL in the above query, which is verified and executed in SSMS. Based on this, we can clearly say that pretty much any query which works in SSMS (when executed on the Primary Data Source which is Server3 for this example), can be used in PowerPivot.
To refresh the data in PowerPivot, go to the Home Tab in PowerPivot ribbon and click on the down-arrow of the Refresh button and select "Refresh" or "Refresh All" depending upon whether you want to refresh the current table or all the tables in PowerPivot respectively.
So, combining data from multiple servers/databases/tables/queries into a single PowerPivot table is as simple as that.
- Try the above technique to load the data from different relational databases.
- Read Importing SQL Server Data from Multiple Data Sources into PowerPivot for Excel
- Read Creating Linked Tables in PowerPivot for Excel
- Check out my previous tips
Last Updated: 2013-02-25
About the author
View all my tips