SELECT DISTINCT SQL Command to Return a Unique List of Values

By:   |   Updated: 2023-02-23   |   Comments (1)   |   Related: More > TSQL


Problem

You often find duplicate values in a field in a SQL Server database table. Duplicate records don't necessarily mean there is anything wrong with the data. It's not uncommon to have a table with customer addresses with multiple customers in the same city, state, and maybe even the same zip code. Sometimes we only want to see a distinct (or unique) set of those duplicate values.

Solution

This is where SELECT DISTINCT can be used. As its name implies, the DISTINCT keyword is commonly used in a SQL SELECT statement to return a distinct set of values. This tip will look at several simple examples of using SELECT DISTINCT that you can copy, paste, and edit for your needs.

Sample Database AdventureWorksLT2019

The following example queries have all been run in the AdventureWorksLT2019 sample database. AdventureWorksLT2019 is a free database from Microsoft for a fictitious bicycle manufacturer called Adventure Works Cycles. If you want to recreate these examples, you can follow the steps in this tip that will show you how to download and install it with SQL scripts or restore it from a database backup: AdventureWorks Database Installation Steps.

The examples will use the following tables in AdventureWorksLT2019:

  • SalesLT.Address
  • SalesLT.Customer
  • SalesLT.CustomerAddress
  • SalesLT.Product
  • SalesLT.SalesOrderDetail
  • SalesLT.SalesOrderHeader

Use the AdventureWorksLT2019 database.

USE [AdventureWorksLT2019];
GO

SQL Distinct Clause Example: Unique Locations for Customers

A business may ask, "What countries do we have customers in?" We would expect duplicate records for Country. This syntax will return a list of distinct values which are the countries ordered by CountryRegion:

SELECT DISTINCT [CountryRegion] -- single column
FROM [AdventureWorksLT2019].[SalesLT].[Address]
ORDER BY [CountryRegion];
GO

The results show that we have customers in Canada, the United Kingdom, and the United States.

Countries

The next question a business may ask is, "Now that I know the countries, what state/province do we have customers in the above countries?" For that, add the StateProvince field to the SELECT DISTINCT to the SQL statement, and we get the list.

SELECT DISTINCT [StateProvince],[CountryRegion] -- multiple columns
FROM [AdventureWorksLT2019].[SalesLT].[Address]
ORDER BY [CountryRegion],[StateProvince];
GO
States and Countries

The next obvious question is, "What cities are the customers in?" There is a pattern emerging. Just add City to the SELECT DISTINCT to get the City field and the distinct city, state/province, and country will be returned.

SELECT DISTINCT [City],[StateProvince],[CountryRegion]
FROM [AdventureWorksLT2019].[SalesLT].[Address]
ORDER BY [CountryRegion],[StateProvince],[City];
GO
Cities, States, and Countries

Another logical question to refine our result set: "What zip codes are the customers in?" Add the PostalCode field to show the distinct city, state/province, country, and postal code.

SELECT DISTINCT [City],[StateProvince],[CountryRegion],[PostalCode]
FROM [AdventureWorksLT2019].[SalesLT].[Address]
ORDER BY [CountryRegion],[StateProvince],[PostalCode];
GO
Cities, States, Countries, and Postal codes

SQL Distinct Statement Example: Unique List of Customers Who Have Made and Not Made Purchases

Another common question that a business user might ask is, "What are the names of the customers who have made purchases from us?" Query the Customer table and filter on a subquery of distinct CustomerIDs in the SalesOrderHeader table to give us a list of those customers.

SELECT DISTINCT [FirstName],[MiddleName],[LastName]
FROM [SalesLT].[Customer]
WHERE [CustomerID] IN (SELECT DISTINCT [CustomerID]
                       FROM [SalesLT].[SalesOrderHeader])
ORDER BY [FirstName],[MiddleName],[LastName];
GO
Customers Who Have Made Purchases

And conversely, you may want to see customers who have yet to make purchases. These could be prospects we want to make customers or data we may want to purge. All that's needed is to change the IN to NOT IN in the subquery filter.

SELECT DISTINCT [FirstName],[MiddleName],[LastName]
FROM [SalesLT].[Customer]
WHERE [CustomerID] NOT IN (SELECT DISTINCT [CustomerID]
                           FROM [SalesLT].[SalesOrderHeader])
ORDER BY [FirstName],[MiddleName],[LastName];
GO
Customers Who Have Not Made Purchases

SQL Distinct SQL Example: Unique List of Products That Have and Have Not Sold

Of course, a business would want to know what products they've sold. These are the products they need to be sure they have or can get.

SELECT  DISTINCT [Name]
FROM [SalesLT].[Product]
WHERE [ProductID] IN (SELECT DISTINCT [ProductID]
                      FROM [SalesLT].[SalesOrderDetail])
ORDER BY [Name];
GO
Products That Have Sold

It's also just as likely that they would be interested in products that have not sold and are costing money to keep in inventory. For that, change the 'IN' to a 'NOT IN' as we did with the query to show customers who have not made any purchases.

SELECT  [Name]
FROM [SalesLT].[Product]
WHERE [ProductID] NOT IN (SELECT DISTINCT [ProductID]
                      FROM [SalesLT].[SalesOrderDetail])
ORDER BY [Name];
GO
Products That Have Not Sold

SQL Distinct SQL Example: Unique List of Address Types

It's not uncommon to have a customer's billing and shipping addresses be different. If we want to see our distinct address types, query the SalesLT.CustomerAddress table with a SELECT DISTINCT on AddressType.

SELECT DISTINCT [AddressType]
FROM [SalesLT].[CustomerAddress]
ORDER BY [AddressType];
GO
Address Types

SQL Distinct SQL Example: Show Distinct Color Information of a Product

Someone may want to know the available colors of a particular product. Let's use HL Mountain Frames as an example. Filter on the product name, HL Mountain Frame, and we get a list of the colors.

SELECT DISTINCT [Color]
FROM [SalesLT].[Product]
WHERE [Name] LIKE 'HL Mountain Frame%'
ORDER BY [Color];
GO
Colors

If we don't need to know the colors and just how many colors the HL Mountain Frame comes in, add the COUNT function to the DISTINCT.

SELECT COUNT (DISTINCT [Color]) AS [HL Mountain Frame Colors]
FROM [SalesLT].[Product]
WHERE [Name] LIKE 'HL Mountain Frame%';
GO
Colors Count

Here, we can find out what products have a color associated with them, the color, and product name filtering on IS NOT NULL.

SELECT DISTINCT [Color],[Name]
FROM [SalesLT].[Product] 
WHERE [Color] IS NOT NULL
ORDER BY [Color],[Name];
Products With a Color

If we want to see what product names that do not have a color associated with them, remove the NOT from IS NOT NULL for the NULL values.

SELECT DISTINCT [Color],[Name]
FROM [SalesLT].[Product] 
WHERE [Color] IS NULL
ORDER BY [Color],[Name];
Products Without a Color
Next Steps

We've seen some simple, practical, and random examples of using SELECT DISTINCT. The following links are to SQL tips and SQL tutorials with additional information:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Joe Gavin Joe Gavin is from the Greater Boston area. He started working with SQL Server and Sybase in 1998 in the financial services industry and has been a SQL Server Database Administrator for a dairy cooperative since 2011. He graduated from Northeastern University in Boston with a Bachelor of Science in Engineering Technology (BSET) degree in Computer Technology. Joe has spoken at the Boston and Providence SQL Saturday events.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-02-23

Comments For This Article




Friday, March 10, 2023 - 8:57:21 AM - Salcio Back To Top (90995)
Not sure about using DISTINCT clause in the inner queries, that is in expression like
" ... WHERE ColumnId IN (SELECT DISTINCT ... ". I believe it may just slow the down query. It is not needed. Less code is usually better.
Also, what I am missing from this article is comparison, in performance, between GROUP BY and DISTINCT.