solving sql server problems for millions of dbas and developers since 2006



SQL Server DBA Tips SQL Server Developer Tips SQL Server Business Intelligence Tips SQL Server Career Tips SQL Server Tip Categories SQL Server Tutorials SQL Server Webcasts SQL Server Whitepapers SQL Server Tools SQL Server Questions and Answers MSSQLTips Authors About MSSQLTips SQL Server User Groups MSSLQTips Giveaways MSSQLTips Advertising Options

MSSQLTips Facebook Page MSSQLTips LinkedIn Page MSSQLTips RSS Feed MSSQLTips Twitter Page MSSQLTips Google+ Page








Compare SQL Server Datasets with INTERSECT and EXCEPT

By: | Read Comments (1) | Print

Greg is the President of Edgewood Solutions and a co-founder of MSSQLTips.com.

Related Tips: 1 | 2 | 3 | 4 | More

Problem
When joining multiple datasets you have always had the ability to use the UNION and UNION ALL operator to allow you to pull a distinct result set (union) or a complete result set (union all).  These are very helpful commands when you need to pull data from different tables and show the results as one unified distinct result set.  On the opposite side of this it would be helpful to only show a result set where both sets of data match or only where data exists in one of the tables and not the other. This could be done with using different join types, but what other options does SQL Server offer?

Solution
With SQL Server 2005, Microsoft introduced the INTERSECT and EXCEPT operators to further extend what you could already do with the UNION and UNION ALL operators.

  • INTERSECT - gives you the final result set where values in both of the tables match
  • EXCEPT - gives you the final result set where data exists in the first dataset and not in the second dataset

The advantage of these commands is that it allows you to get a distinct listing across all of the columns such as the UNION and UNION ALL operators do without having to do a group by or do a comparison of every single column.  

Like the UNION and UNION ALL operators the table structures need to be consistent as well as the columns need to have compatible data types.

Let's take for example we have two tables manager and customer.  Both of these tables have somewhat the same structure such as the following columns:

  • FirstName
  • LastName
  • AddressLine1
  • City
  • StateProvinceCode
  • PostalCode

Manager table sample data

Customer table sample data

We want to do two queries:

  1. Find the occurrences where a manager is a customer (intersect)
  2. Find the occurrences where the manager is not a customer (except)

INTERSECT
If we want to find out which people exist in both the customer table and the manager table and get a distinct list back we can issue the following command:

SELECT FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   MANAGER
INTERSECT 
SELECT 
FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   CUSTOMER

Here is the result set:

To do this same thing with a regular T-SQL command we would have to write the following:

SELECT   M.FIRSTNAME,
         
M.LASTNAME,
         
M.ADDRESSLINE1,
         
M.CITY,
         
M.STATEPROVINCECODE,
         
M.POSTALCODE
FROM     MANAGER M
         
INNER JOIN CUSTOMER C
           
ON M.FIRSTNAME C.FIRSTNAME
              
AND M.LASTNAME C.LASTNAME
              
AND M.ADDRESSLINE1 C.ADDRESSLINE1
              
AND M.CITY C.CITY
              
AND M.POSTALCODE C.POSTALCODE
GROUP BY M.FIRSTNAME,M.LASTNAME,M.ADDRESSLINE1,M.CITY,
         
M.STATEPROVINCECODE,M.POSTALCODE

EXCEPT
If we want to find out which people exists in the manager table, but not in the customer table and get a distinct list back we can issue the following command:

SELECT FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   MANAGER 
EXCEPT
SELECT 
FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   CUSTOMER

Here is the result set:

To do this same thing with a regular T-SQL command we would have to write the following:

SELECT   M.FIRSTNAME,
         
M.LASTNAME,
         
M.ADDRESSLINE1,
         
M.CITY,
         
M.STATEPROVINCECODE,
         
M.POSTALCODE
FROM     MANAGER M
WHERE    NOT EXISTS (SELECT *
                     
FROM   CUSTOMER C
                     
WHERE  M.FIRSTNAME C.FIRSTNAME
                            
AND M.LASTNAME C.LASTNAME
                            
AND M.ADDRESSLINE1 C.ADDRESSLINE1
                            
AND M.CITY C.CITY
                            
AND M.POSTALCODE C.POSTALCODE)
GROUP BY M.FIRSTNAME,M.LASTNAME,M.ADDRESSLINE1,M.CITY,
         
M.STATEPROVINCECODE,M.POSTALCODE

From the two examples above we can see that using the EXCEPT and INTERSECT commands are much simpler to write then having to write the join or exists statements.

 

To take this a step further if we had a third table (or forth...) that listed sales reps and we wanted to find out which managers were customers, but not sales reps we could do the following.

SalesRep table sample data

SELECT FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   MANAGER
INTERSECT 
SELECT 
FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   CUSTOMER 
EXCEPT
SELECT 
FIRSTNAME,
       
LASTNAME,
       
ADDRESSLINE1,
       
CITY,
       
STATEPROVINCECODE,
       
POSTALCODE
FROM   SALESREP

Here is the result set:

As you can see this is pretty simple to mix and match these statements.  In addition, you could also use the UNION and UNION ALL operators to further extend your final result sets.

Next Steps

  • Take a look at your existing code to see how the INTERSECT and EXCEPT operators could be used
  • Keep these new operators in mind next time you need to compare different datasets with like data


Related Tips: 1 | 2 | 3 | 4 | More | Become a paid author


Last Update: 9/11/2007

Share: Share 






Comments and Feedback:

Sunday, August 10, 2008 - 5:39:49 AM - naveentnk Read The Tip

Thanks for this article.Its a good one. its better to go for except and interset than to go for joins.

Thans

Naveen



Post a Comment or Question

Keep it clean and stay on the subject or we may delete your comment.
Your email address is not published. Required fields are marked with an asterisk (*)

*Name   *Email   Notify for updates
Comments
*Enter Code refresh code


 
Sponsor Information
"SQL diagnostic manager delivers response in minutes, not hours!"

Quickly and accurately deploy database changes with Red Gate's SQL Compare - the industry standard comparison and deployment tool.

What grade do you think your SQL Servers get? Find out with Edgewood's SQL Server Health Check consulting services.

Get SQL Server Tips Straight from Kevin Kline.

Join the over million SQL Server Professionals who get their issues resolved daily.

The SQL Server Security THREAT - It’s Closer Than You Think


Copyright (c) 2006-2012 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | user groups
Some names and products listed are the registered trademarks of their respective owners.


Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com