solving sql server problems for millions of dbas and developers since 2006



SQL Server DBA Tips SQL Server Developer Tips SQL Server Business Intelligence Tips SQL Server Career Tips SQL Server Tip Categories SQL Server Tutorials SQL Server Webcasts SQL Server Whitepapers SQL Server Tools SQL Server Questions and Answers MSSQLTips Authors About MSSQLTips SQL Server User Groups MSSLQTips Giveaways MSSQLTips Advertising Options

MSSQLTips Facebook Page MSSQLTips LinkedIn Page MSSQLTips RSS Feed MSSQLTips Twitter Page MSSQLTips Google+ Page





SQL Product Highlight

Idera - SQL safe backup

Idera's SQL safe provides a high-performance backup and recovery solution for Microsoft SQL Server. SQL safe saves money by reducing database backup time by up to 50% over native backups and reducing backup disk space requirements by up to 95%. SQL safe also enables complete "hands-free" automated backup of your entire SQL Server infrastructure and ensures compliance with your organization's backup and recovery policies. From implementations with tens of SQL servers to enterprises with hundreds of servers spread around the globe, SQL safe is the only SQL Server backup and recovery solution that scales to meet the challenge.

Learn more!




Joining data using UNION and UNION ALL in SQL Server

By: | Read Comments | Print

Greg is the President of Edgewood Solutions and a co-founder of MSSQLTips.com.

Related Tips: 1 | 2 | 3 | 4 | More

Problem
Sometimes there is a need to combine data from multiple tables or views into one comprehensive dataset. This may be for like tables within the same database or maybe there is a need to combine like data across databases or even across servers.  I have read about the UNION and UNION ALL commands, but how do these work and how do they differ?

Solution
In SQL Server you have the ability to combine multiple datasets into one comprehensive dataset by using the UNION or UNION ALL operators.  There is a big difference in how these work as well as the final result set that is returned, but basically these commands join multiple datasets that have similar structures into one combined dataset.

Here is a brief description:

  • UNION - this command will allow you to join multiple datasets into one dataset and will remove any duplicates that exist.  Basically it is performing a DISTINCT operation across all columns in the result set.
  • UNION ALL - this command again allows you to join multiple datasets into one dataset, but it does not remove any duplicate rows.  Because this does not remove duplicate rows this process is faster, but if you don't want duplicate records you will need to use the UNION operator instead.

Rules to union data:

  • Each query must have the same number of columns
  • Each column must have compatible data types
  • Column names for the final result set are taken from the first query
  • ORDER BY and COMPUTE clauses can only be issued for the overall result set and not within each individual result set
  • GROUP BY and HAVING clauses can only be issued for each individual result set and not for the overall result set
Tip

If you don't have the exact same columns in all queries use a default value or a NULL value such as:

SELECT firstName, lastName, company FROM businessContacts
UNION ALL
SELECT firstName, lastName, NULL FROM nonBusinessContacts

or

SELECT firstName, lastName, createDate FROM businessContacts
UNION ALL
SELECT firstName, lastName, getdate() FROM nonBusinessContacts

Examples:

Let's take a look at a few simple examples of how these commands work and how they differ.  As you will see the final resultsets will differ, but there is some interesting info on how SQL Server actually completes the process.

In this first example we are using the UNION ALL operator against the Employee table from the AdventureWorks database.  This is probably not something you would do, but this helps illustrate the differences of these two operators. 

There are 290 rows in table dbo.Employee.

SELECT * FROM dbo.Employee
UNION ALL
SELECT * FROM dbo.Employee
UNION ALL
SELECT * FROM dbo.Employee

When this query is run the result set has 870 rows.  This is the 290 rows returned 3 times.  The data is just put together one dataset on top of the other dataset.

Here is the execution plan for this query.  We can see that the table was queried 3 times and SQL Server did a Concatenation step to concatenate all of the data.


In this next example we are using the UNION operator against the Employee table again from the AdventureWorks database. 

SELECT * FROM dbo.Employee
UNION
SELECT * FROM dbo.Employee
UNION
SELECT * FROM dbo.Employee

When this query is run the result set has 290 rows.  Even though we combined the data three times the UNION operator removed the duplicate records and therefore returns just the 290 unique rows.

Here is the execution plan for this query.  We can see that SQL Server first queried 2 of the tables, then did a Merge Join operation to combine the first two tables and then it did another Merge Join along with querying the third table in the query.  So we can see there was much more worked that had to be performed to get this result set compared to the UNION ALL.


If we take this a step further and do a SORT of the data using the Clustered Index column we get these execution plans. From this we can see that the execution plan that SQL Server is using is identical for each of these operations even though the final result sets will still contain 870 rows for the UNION ALL and 290 rows for the UNION.

UNION ALL query

UNION query


Here is another example doing the same thing, but this time doing a SORT on a non indexed column. As you can see the execution plans are again identical for these two queries, but this time instead of using a MERGE JOIN, a CONCATENATION and SORT operations are used.

UNION ALL query

UNION query


Next Steps

 



Related Tips: 1 | 2 | 3 | 4 | More | Become a paid author


Last Update: 12/5/2007

Share: Share 






Comments and Feedback:


Post a Comment or Question

Keep it clean and stay on the subject or we may delete your comment.
Your email address is not published. Required fields are marked with an asterisk (*)

*Name   *Email   Notify for updates
Comments
*Enter Code refresh code


 
Sponsor Information
"SQL doctor is the best SQL product on the market, by far. All of Idera's tools are great, but this is the icing on the cake!"

SQL Monitor – For database professionals who need results on Day One. Try it online.

What grade do you think your SQL Servers get? Find out with Edgewood's SQL Server Health Check starting at $995.

Free Trial: Get Proactive Insight with Spotlight® for SQL Server Enterprise.

Join the over million SQL Server Professionals who get their issues resolved daily.

Learn SQL Server 2012, Performance Tuning, Development, Administration, Replication and more - free webcasts


Copyright (c) 2006-2012 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | user groups
Some names and products listed are the registered trademarks of their respective owners.


Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com