An Alternative (Better?) Method to UNPIVOT (SQL Spackle)

Question

Post reply

An Alternative (Better?) Method to UNPIVOT (SQL Spackle)

Dwain Camps

SSC Guru

Points: 86978
More actions
August 1, 2012 at 9:41 pm

#264561

Comments posted to this topic are about the item An Alternative (Better?) Method to UNPIVOT (SQL Spackle)
My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 15 (of 46 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004410 More actions · Answer 1

Very nicely done, Dwain.

You can also do similar with SELECT/UNION ALL in 2005.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 2

Jeff Moden (8/2/2012)
Very nicely done, Dwain.
You can also do similar with SELECT/UNION ALL in 2005.

I love it when I attract attention from the big boys!

Thanks for the praise :blush: and for stopping by.:hehe:

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

P Jones SSChampion Points: 12361 More actions · Answer 3

I was going to send the link to this to a colleague who's been learning unpivot BUT he'd probably be offended withthe mickey take at the top since his friend, another colleague, really is named Philip McCracken!

Basit Farooq Ten Centuries Points: 1151 More actions · Answer 4

Very nice article.

Regards,

Basit A. Farooq (MSC Computing, MCITP SQL Server 2005 & 2008, MCDBA SQL Server 2000)

http://basitaalishan.com

dennis.hafstrom SSC Enthusiast Points: 105 More actions · Answer 5

Nice article.

I see Itzik using this approach a lot as well. Personally I like it but is stuck due to a third party program in an lower compatibility level quite often :crying:

gsimard SSC-Addicted Points: 414 More actions · Answer 6

Thanks for the article, very interresting.

Basically, your "other Unpivot" technique is a brillant way to use the new (from 2008) Table Value Constructor

Using the same method, from your first example you could replace :

INSERT INTO #Orders

SELECT 1, NULL, 3 UNION ALL SELECT 2, 5, 4 UNION ALL SELECT 1, 3, 10

By the following :

INSERT INTO #Orders VALUES

(1, NULL, 3), (2, 5, 4), (1, 3, 10)

I don't think it's a real gain but maybe it's more readable.

Cadavre SSC-Forever Points: 41690 More actions · Answer 7

Very cool. Will have to dig into this and see whether or not it's something that is viable for use in my environment.

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

Another cracking good article Dwain!

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

WayneS SSC Guru Points: 95460 More actions · Answer 9

Great article Dwain. I hadn't thought of trying the value constructor in the apply operator before... something new to play with.

Thanks!

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

jfogel SSCarpal Tunnel Points: 4155 More actions · Answer 10

jfogel

SSCarpal Tunnel

Points: 4155

August 2, 2012 at 9:08 am

#1520917

Interesting read. Much appreciated.

Cheers

spaghettidba SSC Guru Points: 105732 More actions · Answer 11

Very interesting Dwain!

A very nice read.

Thanks for sharing.

-- Gianluca Sartori

cliffb SSCarpal Tunnel Points: 4598 More actions · Answer 12

There is a way to get NULL values back in an unpivot. It's not pretty but it works.

IF OBJECT_ID('tempdb..#Orders','U') IS NOT NULL

DROP TABLE #Orders

-- DDL and sample data for UNPIVOT Example 1

CREATE TABLE #Orders

(Orderid int identity, GiftCard int, TShirt int, Shipping int)

INSERT INTO #Orders

SELECT 1, NULL, 3 UNION ALL SELECT 2, 5, 4 UNION ALL SELECT 1, 3, 10

SELECT * FROM #Orders

DECLARE @Start DATETIME

DECLARE @End DATETIME

SET @Start = GETDATE()

-- Traditional UNPIVOT

SELECT OrderID, convert(varchar(15), ProductName) [ProductName], ProductQty

FROM (

SELECT OrderID, GiftCard, TShirt, Shipping

FROM #Orders) p

UNPIVOT

(ProductQty FOR ProductName IN ([GiftCard], [TShirt], [Shipping])) as unpvt

SET @End = GETDATE()

SELECT DATEDIFF(ms,@Start,@End) AS ElapsedTime

-- Set a placeholder for NULL values.

-- Let's assume, the integers are always greater than zero.

-- if this is the case, we can use 0 to represent a NULL

DECLARE @NULL INT

SET @NULL = 0

SET @Start = GETDATE()

SELECT OrderID, convert(varchar(15), ProductName) [ProductName], NULLIF(ProductQty,0) AS ProductQty

FROM (

SELECT OrderID

, COALESCE(GiftCard,@NULL) AS GiftCard

, COALESCE(TShirt,@NULL) AS TShirt

, COALESCE(Shipping,@NULL) AS Shipping

FROM #Orders) p

UNPIVOT

(ProductQty FOR ProductName IN ([GiftCard], [TShirt], [Shipping])) as unpvt

SET @End=GETDATE()

SELECT DATEDIFF(ms,@Start,@End) AS ElapsedTime

Like i said, it's not pretty but will return your NULL values and in this case is faster than the traditional method on my laptop.

Mike McIver SSC-Addicted Points: 493 More actions · Answer 13

Good to know ... I was unawares.

Here's the last example done with pivot/unpivot ... a bit more code, performance not as good ... but much more maintainable/flexible (IMO) given the problem setup ... generalizes the computations and can adjust for the source month columns coming in and out of use. In such a real world situation Users would likely accept a performance goal of tolerable ...

Those of you who continue to profess a belief in the Users will receive the standard substandard training which will result in your eventual elimination. Tron (1982)

;WITH [CTE_DETAIL]([ID],[ProductLine],[JanRev],[JanExp],[FebRev],[FebExp],[MarRev],[MarExp],[AprRev],[AprExp],[MayRev],[MayExp],[JunRev],[JunExp],[JulRev],[JulExp],[AugRev],[AugExp],[SepRev],[SepExp],[OctRev],[OctExp],[NovRev],[NovExp],[DecRev],[DecExp])

AS

(

SELECT[ID],

[ProductLine],

--$0 AS

[JanRev],

--$0 AS

[JanExp],

--$0 AS

[FebRev],

--$0 AS

[FebExp],

--$0 AS

[MarRev],

--$0 AS

[MarExp],

$0 AS

[AprRev],

$0 AS

[AprExp],

$0 AS

[MayRev],

$0 AS

[MayExp],

$0 AS

[JunRev],

$0 AS

[JunExp],

$0 AS

[JulRev],

$0 AS

[JulExp],

$0 AS

[AugRev],

$0 AS

[AugExp],

$0 AS

[SepRev],

$0 AS

[SepExp],

$0 AS

[OctRev],

$0 AS

[OctExp],

$0 AS

[NovRev],

$0 AS

[NovExp]

,

$0 AS

[DecRev],

$0 AS

[DecExp]

FROM#ProfitLoss

),

[CTE_DETAIL_UNPIVOTED]([ID],[ProductLine],[sMONTH],[iMONTH],[sACCT],[AMOUNT])

AS

(

SELECT[ID],

[ProductLine],

LEFT([PERIOD_ITEM], 3) AS [sMONTH],

MONTH(LEFT([PERIOD_ITEM], 3) + '1, 2000') AS [iMONTH],

RIGHT([PERIOD_ITEM], LEN([PERIOD_ITEM]) - 3) AS [sACCT],

[AMOUNT]

FROM(

SELECT[ID],

[ProductLine],

[JanRev],

[JanExp],

[JanRev]-[JanExp] AS [JanMargin],

[FebRev],

[FebExp],

[FebRev]-[FebExp] AS [FebMargin],

[MarRev],

[MarExp],

[MarRev]-[MarExp] AS [MarMargin],

[AprRev],

[AprExp],

[AprRev]-[AprExp] AS [AprMargin],

[MayRev],

[MayExp],

[MayRev]-[MayExp] AS [MayMargin],

[JunRev],

[JunExp],

[JunRev]-[JunExp] AS [JunMargin],

[JulRev],

[JulExp],

[JulRev]-[JulExp] AS [JulMargin],

[AugRev],

[AugExp],

[AugRev]-[AugExp] AS [AugMargin],

[SepRev],

[SepExp],

[SepRev]-[SepExp] AS [SepMargin],

[OctRev],

[OctExp],

[OctRev]-[OctExp] AS [OctMargin],

[NovRev],

[NovExp],

[NovRev]-[NovExp] AS [NovMargin],

[DecRev],

[DecExp],

[DecRev]-[DecExp] AS [DecMargin]

FROM[CTE_DETAIL]

) AS [PIVOTED] UNPIVOT

(

[AMOUNT] FOR [PERIOD_ITEM] IN

(

[JanRev],

[JanExp],

[JanMargin],

[FebRev],

[FebExp],

[FebMargin],

[MarRev],

[MarExp],

[MarMargin],

[AprRev],

[AprExp],

[AprMargin],

[MayRev],

[MayExp],

[MayMargin],

[JunRev],

[JunExp],

[JunMargin],

[JulRev],

[JulExp],

[JulMargin],

[AugRev],

[AugExp],

[AugMargin],

[SepRev],

[SepExp],

[SepMargin],

[OctRev],

[OctExp],

[OctMargin],

[NovExp],

[NovRev],

[NovMargin],

[DecRev],

[DecExp],

[DecMargin]

)

) AS [UNPIVIOTED]

),

[CTE_DETAIL_CALC]

(

[ID],

[ProductLine],

[sMONTH],

[sACCT],

[AMOUNT],

[ $ O/(U) Prev Mo],

[ % O/(U) Prev Mo]

)

AS

(

SELECTA.[ID],

A.[ProductLine],

A.[sMONTH],

A.[sACCT],

A.[AMOUNT],

COALESCE(A.[AMOUNT] - B.[AMOUNT], $0) AS [ $ O/(U) Prev Mo],

ROUND(100*COALESCE((A.[AMOUNT] - B.[AMOUNT])/NULLIF(B.[AMOUNT],$0),$0), 1) AS [ % O/(U) Prev Mo]

FROM[CTE_DETAIL_UNPIVOTED] A LEFT JOIN

[CTE_DETAIL_UNPIVOTED] B

ONA.[ID] = B.[ID]

ANDA.[sACCT] = B.[sACCT]

ANDA.[iMONTH] = B.[iMONTH] + 1

)

SELECT[ID],

[ProductLine],

[LineItem],

MAX([Jan]) AS [Jan],

MAX([Feb]) AS [Feb],

MAX([Mar]) AS [Mar],

MAX([Apr]) AS [Apr],

MAX([May]) AS [May],

MAX([Jun]) AS [Jun],

MAX([Jul]) AS [Jul],

MAX([Aug]) AS [Aug],

MAX([Sep]) AS [Sep],

MAX([Oct]) AS [Oct],

MAX([Nov]) AS [Nov],

MAX([Dec]) AS [Dec]

FROM(

SELECT[ID],

[ProductLine],

[sACCT],

CASE [sACCT]

WHEN 'Rev' THEN 'Revenues'

WHEN 'Exp' THEN 'Expenses'

WHEN 'Margin' THEN 'Margin'

ELSE 'UNKNOWN'

END AS [LineItem],

[Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec]

FROM(SELECT[ID], [ProductLine], [sMONTH], [sACCT], [AMOUNT], [ $ O/(U) Prev Mo], [ % O/(U) Prev Mo]FROM [CTE_DETAIL_CALC]) AS [UNPIVOTED] PIVOT

(MAX([AMOUNT]) FOR [sMONTH] IN ([Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec])) AS [PIVOTED_AMOUNT]

UNION

SELECT[ID],

[ProductLine],

[sACCT],

' $ O/(U) Prev Mo' AS [LineItem],

[Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec]

FROM(SELECT[ID], [ProductLine], [sMONTH], [sACCT], [AMOUNT], [ $ O/(U) Prev Mo], [ % O/(U) Prev Mo]FROM [CTE_DETAIL_CALC]) AS [UNPIVOTED] PIVOT

(MAX([ $ O/(U) Prev Mo]) FOR [sMONTH] IN ([Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec])) AS [PIVOTED_AMOUNT]

UNION

SELECT[ID],

[ProductLine],

[sACCT],

' % O/(U) Prev Mo' AS [LineItem],

[Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec]

FROM(SELECT[ID], [ProductLine], [sMONTH], [sACCT], [AMOUNT], [ $ O/(U) Prev Mo], [ % O/(U) Prev Mo]FROM [CTE_DETAIL_CALC]) AS [UNPIVOTED] PIVOT

(MAX([ % O/(U) Prev Mo]) FOR [sMONTH] IN ([Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec])) AS [PIVOTED_AMOUNT]

) A

GROUP BY [ID],[ProductLine],[LineItem],[sACCT]

ORDER BY

[ID],

[ProductLine],

CASE [sACCT]

WHEN 'Rev' THEN 10

WHEN 'Exp' THEN 20

WHEN 'Mar' THEN 30

ELSE 1000

END,

CASE [LineItem]

WHEN 'Revenues' THEN 10

WHEN 'Expenses' THEN 10

WHEN 'Margin' THEN 10

WHEN ' $ O/(U) Prev Mo' THEN 20

WHEN ' % O/(U) Prev Mo' THEN 30

ELSE 1000

END

Mike
MacroTrenz.Com
LinkedIn

Dwain Camps SSC Guru Points: 86978 More actions · Answer 14

Before I get into thanking everybody for their comments, last weekend I looked into the performance gains achieved by the "other UNPIVOT" approach because I believe there's no such thing as a free lunch. I got some interesting results that I thought worthy of an update.

I reran the single UNPIVOT example with larger record sets and averaged results over 5 runs appear below.

I have highlighted in red the slower of the two approaches in case. In this test harness, I dumped the display results to local variables to suppress them.

From the results, we see the CROSS APPLY VALUES consistently beats the elapsed time of UNPIVOT up through 5,000,000 rows. Two specific points though:

1. There seems to be a breakeven point where CPU time for CAV begins to exceed that of UNPIVOT (at perhaps 800K rows).

2. The CAV results that show faster elapsed times than CPU (starting at around 500K rows) seemed odd to me, until I realized that I'm running these tests on a Core i5 processor machine, so this would imply that SQL is parallelizing the query. I tested this theory by adding OPTION (MAXDOP 1) and found that CPU/Elapsed time results were more normal again (i.e., elapsed time slightly greater than CPU time).

However I got these results at 5,000,000 rows (single run):

(5000000 row(s) affected)

SQL Server Execution Times:

CPU time = 4072 ms, elapsed time = 4304 ms.

SQL Server Execution Times:

CPU time = 2652 ms, elapsed time = 2688 ms.

So in other words, for CAV CPU time dropped from about 6989ms (this was the 75% sparseness case) to 2652ms. I am not sure why UNPIVOT was also reduced (4539ms to 4072ms) but obviously its a lesser drop. In any event, now CAV is beating UNPIVOT in CPU as well and also runs in lesser elapsed time! Seems to be a double benefit.

While this does seem to confirm my theory about parallelism, I'm not sure exactly what to make of the overall drop in CPU usage. Perhaps with lower parallelism the CPU needs to do less work?

As usual, the answer to "which approach should I use in my query?" is "it depends!"

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St