Use SQL Server's UNPIVOT operator to help normalize output

By:   |   Comments (25)   |   Related: 1 | 2 | > TSQL


Problem

I often see people struggle with poorly normalized schemas, where they have to transpose a query against a wide table into skinnier, more normalized output.

For example, imagine a table with data like this:

SELECT * FROM dbo.CustomerPhones

For a certain report, the desired output is actually as follows (apologies for the messy arrows):

Desired output

Ideally, you would just fix the schema, but I do acknowledge that this is not a realistic possibility in a lot of scenarios.

Solution

So, we need to code our way around this to present the data as it should be stored, in spite of how it is being stored. Here is some sample data:

CREATE TABLE dbo.CustomerPhones
(
  CustomerID INT PRIMARY KEY, -- FK
  Phone1 VARCHAR(32),
  Phone2 VARCHAR(32),
  Phone3 VARCHAR(32)
);
INSERT dbo.CustomerPhones
  (CustomerID, Phone1, Phone2, Phone3)
VALUES
  (1,'705-491-1111', '705-491-1110', NULL),
  (2,'613-492-2222', NULL, NULL),
  (3,'416-493-3333', '416-493-3330', '416-493-3339');

One way that might seem like a good idea is to perform a UNION:

SELECT CustomerID, Phone = Phone1 
  FROM dbo.CustomerPhones WHERE Phone1 IS NOT NULL
UNION ALL
SELECT CustomerID, Phone = Phone2 
  FROM dbo.CustomerPhones WHERE Phone2 IS NOT NULL
UNION ALL
SELECT CustomerID, Phone = Phone3 
  FROM dbo.CustomerPhones WHERE Phone3 IS NOT NULL;

The problem with this solution is that it does not scale - for every phone column you have, you add an additional scan, and you also have to add an entirely new UNION ALL query for every phone column.

Instead, we can use SQL Server's UNPIVOT operator. Which, contrary to popular belief, is not quite the opposite of PIVOT. An example using the above sample data:

SELECT CustomerID, Phone
FROM
(
  SELECT CustomerID, Phone1, Phone2, Phone3 
  FROM dbo.CustomerPhones
) AS cp
UNPIVOT 
(
  Phone FOR Phones IN (Phone1, Phone2, Phone3)
) AS up;

Results:

unpivot output

Now, I'll admit, the syntax is not as intuitive as you might expect. Most notably, that "Phones" alias is completely made up, and you could put anything there (except "Phone"). Basically what it is saying is, "extract a new row for every value you find in the columns Phone1, Phone2, and Phone3."

Another case

Another situation you might find is when you have multiple sets of related columns. So, for example, let's say you have a column for the type of each phone number:

CREATE TABLE dbo.CustomerPhones2
(
  CustomerID INT PRIMARY KEY, -- FK
  Phone1 VARCHAR(32),
  PhoneType1 CHAR(4),
  Phone2 VARCHAR(32),
  PhoneType2 CHAR(4),
  Phone3 VARCHAR(32),
  PhoneType3 CHAR(4)
);
INSERT dbo.CustomerPhones2 VALUES
  (1,'705-491-1111', 'cell', '705-491-1110', 'home', NULL,NULL),
  (2,'613-492-2222', 'home', NULL, NULL, NULL, NULL),
  (3,'416-493-3333', 'work', '416-493-3330', 'cell',
     '416-493-3339', 'home');

Now, we could also solve this with a UNION query, but let's take a look at another capability we have: using multiple UNPIVOTs. The only complication here is matching the output phone to the corresponding phone type - for this we need to do some string interrogation to ensure that Phone1 matches to PhoneType1, Phone2 matches to PhoneType2, etc.

SELECT CustomerID, Phone, PhoneType
FROM 
(
  SELECT CustomerID, Phone, PhoneType,
    idp = SUBSTRING(Phones, LEN(Phones) - PATINDEX('%[^0-9]%', REVERSE(Phones)) + 2, 32),
    idpt = SUBSTRING(PhoneTypes, LEN(PhoneTypes) - PATINDEX('%[^0-9]%', REVERSE(PhoneTypes)) + 2, 32)
  FROM
  (
    SELECT CustomerID, Phone1, Phone2, Phone3,
           PhoneType1, PhoneType2, PhoneType3
    FROM dbo.CustomerPhones2
  ) AS cp
  UNPIVOT 
  (
    Phone FOR Phones IN (Phone1, Phone2, Phone3)
  ) AS p
  UNPIVOT
  (
    PhoneType FOR PhoneTypes IN (PhoneType1, PhoneType2, PhoneType3)
  ) AS pt
) AS x
WHERE idp = idpt;

Results:

c

In my next tip, I'll show how to derive these UNPIVOT queries dynamically, without having advanced knowledge of how many Phone/PhoneType columns there will be. Since a weakness of this particular design pattern is that when a 4th phone is added, you add new columns and change all the queries, and when a 5th phone is added, you add new columns and change all the queries, a more dynamic query can be useful to minimize or eliminate queries of this type.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Wednesday, May 30, 2018 - 2:42:06 PM - Annie Back To Top (76054)

Hi Aaron, 

Apologies for not making myself clear. I am trying to use PIVOT in the below code to get the count of SalesOrderID per cust per month.

INITIAL CODE :

 

SELECT DATENAME(mm, OrderDate) AS [Month], CustomerID,

COUNT(SalesOrderID) AS TotalOrder

FROM Sales.SalesOrderHeader

WHERE CustomerID BETWEEN 30010 AND 30020

GROUP BY CustomerID, DATENAME(mm, OrderDate), MONTH(OrderDate)

ORDER BY MONTH(OrderDate)

GO

MY CODE TILL NOW:

SELECT DATENAME(mm, OrderDate) as [Month], 

[30010], [30011], [30012], [30013], [30014], [30015], [30016], [30017], [30018], [30019], [30020] 

FROM

(SELECT OrderDate, CustomerID, SalesOrderID

    FROM Sales.SalesOrderHeader

WHERE CustomerID BETWEEN 30010 AND 30020) AS SourceTable

PIVOT

(

COUNT(SalesOrderID) 

FOR CustomerID IN

([30010], [30011], [30012], [30013], [30014], [30015], [30016], [30017], [30018], [30019], [30020])

) AS PivotTable

ORDER BY MONTH(DATENAME(mm, OrderDate)+ ' 1 2018') 

GO


Tuesday, May 29, 2018 - 10:45:05 AM - Aaron Bertrand Back To Top (76044)

Annie, it's not quite clear what you're after. Do you want all of a customer's sales for March, for example, as a single number, summing up their sales in March 2018 + March 2017 + March 2016 + ...? Can yuo give an idea of your desired output?

 


Monday, May 28, 2018 - 11:48:40 PM - Annie Back To Top (76038)

 

  Using the similar concepts, I am trying to Unpivot the below code for AdventureWorks2012 DB but unable to get the desired output. Could someone please enlighten me :

 

SELECT DATENAME(mm, OrderDate) AS [Month], CustomerID,

COUNT(SalesOrderID) AS TotalOrder

FROM Sales.SalesOrderHeader

WHERE CustomerID BETWEEN 30010 AND 30020

GROUP BY CustomerID, DATENAME(mm, OrderDate), MONTH(OrderDate)

ORDER BY MONTH(OrderDate)

GO

 

 


Tuesday, May 1, 2018 - 4:07:29 PM - SqlMel Back To Top (75835)

 Very nice, Aaron. Thanks.

 

What I'm having right now is the exact opposite. I have a normalized phone table and have a requirement of putting all the data in one string. For example:

 

Name            Phone Numbers

John Doe       Home 999-999-9999, Cel 999-999-9999

Jane Doe       Work 888-888-8888, Cel 999-999-9999

 

 I'm thinking about the best method I could use to do this. Trying to avoid using a cursor. 


Friday, December 16, 2016 - 3:21:03 PM - Vinu Back To Top (44992)

 Thank you so much. It has been a great help!!! 

 


Friday, October 21, 2016 - 2:14:01 PM - Dave Back To Top (43612)

 Just bumped into this.  Excellent example.  Very well done here and thanks for posting!

 

Dave

 


Friday, January 29, 2016 - 3:00:04 PM - Kris Back To Top (40548)

 Awesome example for UNPIVOT

 


Tuesday, November 24, 2015 - 8:20:24 PM - ShanshanTsui Back To Top (39135)

Thanks, this is super helpful. I have one more questions. For unpivot, for phone1, phone2, and phone3, if I want to have a column as the following:

Phone

1

2

as extracting 1, 2, 3 from the wor phone, is there a way to do it? or should I use substring?  Thanks. 

 

I am referring to the following sample codes: 

SELECT CustomerID, Phone
FROM
(
  SELECT CustomerID, Phone1, Phone2, Phone3 
  FROM dbo.CustomerPhones
) AS cp
UNPIVOT 
(
  Phone FOR Phones IN (Phone1, Phone2, Phone3)
) AS up;

Thursday, September 24, 2015 - 5:35:56 PM - esau silva Back To Top (38750)

 

Hi

I'm trying to unpivot my data but getting some weird results. Below is my code and screenshot of the results.

select distinct recId, caseNumber, servtype, mins

from

(

select 

recid

,caseNumber

,[preEnrollment_type]

,[preEnrollment_minutes]

,[screening_type]

,[screeningEnA_minutes]

,[ifsp_type]

,[ifsp_minutes]

from

CaseManagementProgressNote

where

[formComplete]=1

and [reviewed]<>1

and [dataentry]<>1

and [caseManagementEntry]=1

and [serviceCoordinator] <> '[email protected]'

and [contactDateTime] >= '1/1/2015'

and [childID] is not null

) as cp

unpivot

(

servType for servTypes in ([preEnrollment_type],[screening_type],[ifsp_type])

) as up1

unpivot

(

mins for minutess in ([preEnrollment_minutes],[screeningEnA_minutes],[ifsp_minutes])

) as up2

order by

recId

 

Screenshot: https://www.dropbox.com/s/vxtoiq8q1i81pc3/unpivot.JPG?dl=0 


Thursday, May 7, 2015 - 9:46:13 AM - mary Back To Top (37122)

 

Thank you so much for the tip it is a huge help.


Monday, April 20, 2015 - 12:31:40 PM - Navitha Back To Top (36987)

Hey, 

Thanks for the tip but i want the null values as well in the desire table what should i do?

I want the table to look like

customerID Phone 

1              705-491-1111

1              705-491-1110

1              NULL

2               613-492-2222

2             NULL

2             NULL

3             416-493-3333

3             416-493-3330

3             416-493-3339

 

Thank you


Saturday, April 18, 2015 - 2:39:40 PM - yasar Back To Top (36978)

update table order the order table following column and values

user   phone1  phone2   dont_exist

A       1234      2343      2343

B        0987     2112       0987

    update the table order if dont_exist column have phone1 or phone2 update it as 1   otherwise 0  .  i need ans for this kindly help me


Friday, April 17, 2015 - 5:14:05 PM - Angela Back To Top (36969)

Really helpful thanks alot!


Friday, December 12, 2014 - 5:33:40 PM - kevin Back To Top (35602)

Based on your initial solution (CustomerPhones table), you used UNION ALL and UNPIVOT to normalize the output. Can you give an example of a PIVOT to reverse from UNPIVOT back to PIVOT (original) format using same example ?


Monday, December 8, 2014 - 6:39:40 AM - masud Back To Top (35543)

Can someone please explain to me, what is the purpose of this code? How does this work? 

idp = SUBSTRING(Phones, LEN(Phones) - PATINDEX('%[^0-9]%', REVERSE(Phones)) + 2, 32),
    idpt = SUBSTRING(PhoneTypes, LEN(PhoneTypes) - PATINDEX('%[^0-9]%', REVERSE(PhoneTypes)) + 2, 32)

How WHERE condition helping to achieve the target report?

WHERE idp = idpt

 

 


Tuesday, October 28, 2014 - 11:23:07 AM - Jeff Moden Back To Top (35097)

Apologies... I was thinking "Un-Cross Tab" in my previous post and meant to say "CROSS APPLY", instead.


Monday, October 27, 2014 - 11:14:29 PM - Jeff Moden Back To Top (35085)

Consider using CROSS TAB for simplicity, readability, and performance, innstead.

 SELECT cp.CustomerID, unpvt.Phone, unpvt.PhoneType
   INTO #Test1
   FROM dbo.CustomerPhones2 cp
  CROSS APPLY (SELECT Phone1, PhoneType1 UNION ALL
               SELECT Phone2, PhoneType2 UNION ALL
               SELECT Phone3, PhoneType3
              )unpvt (Phone , PhoneType)
  WHERE unpvt.Phone > ''

Friday, September 26, 2014 - 11:47:07 AM - Puru Back To Top (34732)

This tip helped a lot!


Wednesday, July 9, 2014 - 11:28:00 AM - Kate P Back To Top (32608)

This is a great post!  Thank you!  Do you have any advice on what you would do if you needed to keep all phone rows for each customer? So each customer would have three rows - Customer 1 would have one null row, Customer 2 two null rows, and Customer 3 no null rows.  In my specific example, I have survery responses for several questions, and I need to keep the all the question texts even when the patient did not answer it.  Thanks!


Monday, September 16, 2013 - 2:12:04 PM - Satish Back To Top (26823)

Can we just right the below

, IDP = RIGHT(PHONES, 1)

, IDPT = RIGHT(PHONE_TYPES, 1)

in place of 

idp = SUBSTRING(Phones, LEN(Phones) - PATINDEX('%[^0-9]%', REVERSE(Phones)) + 2, 32),
    idpt = SUBSTRING(PhoneTypes, LEN(PhoneTypes) - PATINDEX('%[^0-9]%', REVERSE(PhoneTypes)) + 2, 32)

you excellently explained  the general case.

Thanks for such a nice post.


Friday, August 30, 2013 - 9:04:10 AM - beatrice Back To Top (26547)

very excellent


Tuesday, July 16, 2013 - 8:49:21 AM - SMO Back To Top (25854)

Very helpful.


Monday, July 15, 2013 - 10:01:42 PM - Aaron Bertrand Back To Top (25845)

SMO, just the one linked in the article:

http://www.mssqltips.com/sqlservertip/2783/script-to-create-dynamic-pivot-queries-in-sql-server/


Sunday, July 14, 2013 - 1:24:33 PM - SMO Back To Top (25831)

Do you have any posts on Pivot?


Friday, July 12, 2013 - 8:58:22 AM - Stephen J Voss Back To Top (25812)

Wow - extremely helpful and timely. In one system, we have 500,000 client records and each one may have Home, Responsible Party, Emergency Contact, PCP (Primary Care Physician) phone numbers all in the same record.  In a second system, we have the other example, with each contact type getting its own set of phone numbers...but still related back to the client. Tremendous help for data analysts (like me) and our developers who are writing dozens of new custom reports.

 















get free sql tips
agree to terms