Using Global Temporary Staging Tables for SQL Server Data Importing
By: Sergey Gigoyan | Updated: 2016-02-19 | Comments | Related: More > Import and Export
Sometimes we need to load data into the SQL Server database as a one-time process. Moreover some additional modifications (de-duplicating, cleaning, etc.) on the data combined from different sources may be needed before loading it into the final SQL Server database tables. Obviously, creating a new table in the database for loading the staging data and dropping it after finishing the task is not the best solution. In this tip we will illustrate how can we load such data using the Import and Export Wizard along with a global temporary table to store the staging information.
Suppose we have a UserInfo table in our database, which stores user related data. Occasionally we receive some updates about user related information in either Excel or Access files. We need to use the most current data and update our table. As these updates occur very rarely or even once, there is not a prebuilt SSIS package in our environment to automate this process.
Let's create a test environment and solve this task:
USE master GO --Creating the database CREATE DATABASE TestDB GO USE TestDB GO --Creating UserInfo CREATE TABLE UserInfo ( ID INT IDENTITY(1,1), FirstName NVARCHAR(50), LastName NVARCHAR(50), PhoneNumber CHAR(9), DateModified DATETIME DEFAULT GETDATE() CONSTRAINT [PK_ID] PRIMARY KEY CLUSTERED (ID ASC) ) --Creating unique index on FirstName and LastName CREATE UNIQUE NONCLUSTERED INDEX [UIX_FirstName_LastName] ON UserInfo (FirstName, LastName) --Insert data into UserInfo table INSERT INTO UserInfo (FirstName, LastName, PhoneNumber, DateModified) VALUES('User1Fname','User1Lname', '023235548', '2016-01-04'), ('User2Fname','User2Lname', NULL,'2016-01-04')
To facilitate testing, we created a UserInfo table which stores very simple data. We will receive updated phone numbers from some users in Excel ("PhoneNumbers.xlsx") and other data in Access ("PhoneNumbers.mdb") files. We need to combine this data, compare with relevant data in the UserInfo table and finally choose the most recent information.
To avoid creating additional regular tables in our database we will use a global temporary table to store the staging data gathered from these files. Considering the fact that global temporary tables are visible to any user and any connection and they exist as long as the creator session exists ( considering the table is not dropped manually ), we will create global temporary table ##tmpUserInfo with the same structure as the UserInfo table ( excluding the ID column which is not needed ). After this is created, we do not want to close this Query Window during the entire process.
--Creating an empty temporary table for storing staging data SELECT TOP(0) FirstName, LastName, PhoneNumber, DateModified INTO ##tmpUserInfo FROM TestDB.dbo.UserInfo
First let's load data from "PhoneNumbers.xlsx" file into ##tmpUserInfo table using the Import and Export Wizard. We can start by right clicking on the TestDB database in SQL Server Management Studio, then choose "Tasks" > "Import Data...":
Import and Export Wizard is opened:
By clicking the "Next" button we go to the next page where we choose the source file. As the data source we choose "Microsoft Excel":
As the destination database we choose tempdb because temporary tables are stored in the tempdb database:
Since the session that created temp table ##tmpUserInfo is still active, we can see this table in the list:
We are choosing this table as the destination table and then previewing the data:
In the next step, data is loaded into the UserInfo table. As we can see our import process completes successfully:
Now we will start the Import and Export Wizard again and as a data source we will choose the "PhoneNumbers.mdb" Access file:
We can see file content by clicking preview:
Then we will continue the same steps which we did for the previous file to load this data into the UserInfo table. By selecting all data in the ##tmpUserInfo table we can see that it contains all data from both files:
SELECT * FROM ##tmpUserInfo
The result shows there is more than one row for some users:
We need to choose the latest data for each user. (To not complicate our example, let's assume there are no records with different data and the same modification date):
--Query for deduplicating, finding users' the latest updated phone number SELECT DISTINCT FirstName, LastName, PhoneNumber, DateModified FROM ##tmpUserInfo o WHERE DateModified = (SELECT MAX(DateModified) FROM ##tmpUserInfo i WHERE o.FirstName=i.FirstName AND o.LastName=i.LastName GROUP BY i.FirstName, i.LastName)
This result contains the latest modified data for users provided from the source files.
Now we will use this data to update information in UserInfo table:
USE TestDB GO --Selecting UserInfo table data before update SELECT * FROM UserInfo --Updating data in UserInfo table, merging by staging table MERGE UserInfo AS target USING ( SELECT DISTINCT FirstName, LastName, PhoneNumber, DateModified FROM ##tmpUserInfo o WHERE DateModified = (SELECT MAX(DateModified) FROM ##tmpUserInfo i WHERE o.FirstName=i.FirstName AND o.LastName=i.LastName GROUP BY i.FirstName, i.LastName) ) AS source ON (target.FirstName = source.FirstName AND target.LastName = source.LastName) WHEN MATCHED AND target.DateModified < source.DateModified AND ISNULL(target.PhoneNumber,'0') <> ISNULL(source.PhoneNumber,'0') THEN UPDATE SET target.DateModified = source.DateModified, target.PhonenUmber = source.PhoneNumber WHEN NOT MATCHED THEN INSERT (FirstName, LastName, PhoneNumber, DateModified) VALUES (source.FirstName, source.LastName, source.PhoneNumber, source.DateModified) ; --Selecting UserInfo table data after update SELECT * FROM UserInfo
We can see that the data for user with "ID"=1 and "ID"=2 has been updated and 3 new users from the source files have been inserted:
Now our task is solved and we can close the Query Window where we created the ##tmpUserInfo table, because this table is no longer needed.
Thanks to the ability to be visible to any user and any connection, global temporary tables can serve for storing staging data imported by the Import and Export Wizard from different sources. It allows us not to have to create additional regular tables in our database.
- Read more Import and Export Tips
- SQL Server Import and Export Wizard
- Special Table Types
- SQL Server Global Temporary Table Visibility
About the author
View all my tips
Article Last Updated: 2016-02-19