August 1, 2012 at 9:41 pm
Comments posted to this topic are about the item An Alternative (Better?) Method to UNPIVOT (SQL Spackle)
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 2, 2012 at 12:39 am
Very nicely done, Dwain.
You can also do similar with SELECT/UNION ALL in 2005.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 2, 2012 at 12:42 am
Jeff Moden (8/2/2012)
Very nicely done, Dwain.You can also do similar with SELECT/UNION ALL in 2005.
I love it when I attract attention from the big boys!
Thanks for the praise :blush: and for stopping by.:hehe:
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 2, 2012 at 1:22 am
I was going to send the link to this to a colleague who's been learning unpivot BUT he'd probably be offended withthe mickey take at the top since his friend, another colleague, really is named Philip McCracken!
August 2, 2012 at 2:27 am
Very nice article.
Regards,
Basit A. Farooq (MSC Computing, MCITP SQL Server 2005 & 2008, MCDBA SQL Server 2000)
http://basitaalishan.comAugust 2, 2012 at 7:01 am
Thanks for the article, very interresting.
Basically, your "other Unpivot" technique is a brillant way to use the new (from 2008) Table Value Constructor
Using the same method, from your first example you could replace :
INSERT INTO #Orders
SELECT 1, NULL, 3 UNION ALL SELECT 2, 5, 4 UNION ALL SELECT 1, 3, 10
By the following :
INSERT INTO #Orders VALUES
(1, NULL, 3), (2, 5, 4), (1, 3, 10)
I don't think it's a real gain but maybe it's more readable.
August 2, 2012 at 7:33 am
Very cool. Will have to dig into this and see whether or not it's something that is viable for use in my environment.
August 2, 2012 at 7:41 am
Another cracking good article Dwain!
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 2, 2012 at 8:02 am
Great article Dwain. I hadn't thought of trying the value constructor in the apply operator before... something new to play with.
Thanks!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
August 2, 2012 at 9:08 am
Interesting read. Much appreciated.
Cheers
August 2, 2012 at 9:50 am
Very interesting Dwain!
A very nice read.
Thanks for sharing.
-- Gianluca Sartori
August 2, 2012 at 9:50 am
There is a way to get NULL values back in an unpivot. It's not pretty but it works.
IF OBJECT_ID('tempdb..#Orders','U') IS NOT NULL
DROP TABLE #Orders
-- DDL and sample data for UNPIVOT Example 1
CREATE TABLE #Orders
(Orderid int identity, GiftCard int, TShirt int, Shipping int)
INSERT INTO #Orders
SELECT 1, NULL, 3 UNION ALL SELECT 2, 5, 4 UNION ALL SELECT 1, 3, 10
SELECT * FROM #Orders
DECLARE @Start DATETIME
DECLARE @End DATETIME
SET @Start = GETDATE()
-- Traditional UNPIVOT
SELECT OrderID, convert(varchar(15), ProductName) [ProductName], ProductQty
FROM (
SELECT OrderID, GiftCard, TShirt, Shipping
FROM #Orders) p
UNPIVOT
(ProductQty FOR ProductName IN ([GiftCard], [TShirt], [Shipping])) as unpvt
SET @End = GETDATE()
SELECT DATEDIFF(ms,@Start,@End) AS ElapsedTime
-- Set a placeholder for NULL values.
-- Let's assume, the integers are always greater than zero.
-- if this is the case, we can use 0 to represent a NULL
DECLARE @NULL INT
SET @NULL = 0
SET @Start = GETDATE()
SELECT OrderID, convert(varchar(15), ProductName) [ProductName], NULLIF(ProductQty,0) AS ProductQty
FROM (
SELECT OrderID
, COALESCE(GiftCard,@NULL) AS GiftCard
, COALESCE(TShirt,@NULL) AS TShirt
, COALESCE(Shipping,@NULL) AS Shipping
FROM #Orders) p
UNPIVOT
(ProductQty FOR ProductName IN ([GiftCard], [TShirt], [Shipping])) as unpvt
SET @End=GETDATE()
SELECT DATEDIFF(ms,@Start,@End) AS ElapsedTime
Like i said, it's not pretty but will return your NULL values and in this case is faster than the traditional method on my laptop.
August 2, 2012 at 2:55 pm
Good to know ... I was unawares.
Here's the last example done with pivot/unpivot ... a bit more code, performance not as good ... but much more maintainable/flexible (IMO) given the problem setup ... generalizes the computations and can adjust for the source month columns coming in and out of use. In such a real world situation Users would likely accept a performance goal of tolerable ...
Those of you who continue to profess a belief in the Users will receive the standard substandard training which will result in your eventual elimination. Tron (1982)
;WITH [CTE_DETAIL]([ID],[ProductLine],[JanRev],[JanExp],[FebRev],[FebExp],[MarRev],[MarExp],[AprRev],[AprExp],[MayRev],[MayExp],[JunRev],[JunExp],[JulRev],[JulExp],[AugRev],[AugExp],[SepRev],[SepExp],[OctRev],[OctExp],[NovRev],[NovExp],[DecRev],[DecExp])
AS
(
SELECT[ID],
[ProductLine],
--$0 AS
[JanRev],
--$0 AS
[JanExp],
--$0 AS
[FebRev],
--$0 AS
[FebExp],
--$0 AS
[MarRev],
--$0 AS
[MarExp],
$0 AS
[AprRev],
$0 AS
[AprExp],
$0 AS
[MayRev],
$0 AS
[MayExp],
$0 AS
[JunRev],
$0 AS
[JunExp],
$0 AS
[JulRev],
$0 AS
[JulExp],
$0 AS
[AugRev],
$0 AS
[AugExp],
$0 AS
[SepRev],
$0 AS
[SepExp],
$0 AS
[OctRev],
$0 AS
[OctExp],
$0 AS
[NovRev],
$0 AS
[NovExp]
,
$0 AS
[DecRev],
$0 AS
[DecExp]
FROM#ProfitLoss
),
[CTE_DETAIL_UNPIVOTED]([ID],[ProductLine],[sMONTH],[iMONTH],[sACCT],[AMOUNT])
AS
(
SELECT[ID],
[ProductLine],
LEFT([PERIOD_ITEM], 3) AS [sMONTH],
MONTH(LEFT([PERIOD_ITEM], 3) + '1, 2000') AS [iMONTH],
RIGHT([PERIOD_ITEM], LEN([PERIOD_ITEM]) - 3) AS [sACCT],
[AMOUNT]
FROM(
SELECT[ID],
[ProductLine],
[JanRev],
[JanExp],
[JanRev]-[JanExp] AS [JanMargin],
[FebRev],
[FebExp],
[FebRev]-[FebExp] AS [FebMargin],
[MarRev],
[MarExp],
[MarRev]-[MarExp] AS [MarMargin],
[AprRev],
[AprExp],
[AprRev]-[AprExp] AS [AprMargin],
[MayRev],
[MayExp],
[MayRev]-[MayExp] AS [MayMargin],
[JunRev],
[JunExp],
[JunRev]-[JunExp] AS [JunMargin],
[JulRev],
[JulExp],
[JulRev]-[JulExp] AS [JulMargin],
[AugRev],
[AugExp],
[AugRev]-[AugExp] AS [AugMargin],
[SepRev],
[SepExp],
[SepRev]-[SepExp] AS [SepMargin],
[OctRev],
[OctExp],
[OctRev]-[OctExp] AS [OctMargin],
[NovRev],
[NovExp],
[NovRev]-[NovExp] AS [NovMargin],
[DecRev],
[DecExp],
[DecRev]-[DecExp] AS [DecMargin]
FROM[CTE_DETAIL]
) AS [PIVOTED] UNPIVOT
(
[AMOUNT] FOR [PERIOD_ITEM] IN
(
[JanRev],
[JanExp],
[JanMargin],
[FebRev],
[FebExp],
[FebMargin],
[MarRev],
[MarExp],
[MarMargin],
[AprRev],
[AprExp],
[AprMargin],
[MayRev],
[MayExp],
[MayMargin],
[JunRev],
[JunExp],
[JunMargin],
[JulRev],
[JulExp],
[JulMargin],
[AugRev],
[AugExp],
[AugMargin],
[SepRev],
[SepExp],
[SepMargin],
[OctRev],
[OctExp],
[OctMargin],
[NovExp],
[NovRev],
[NovMargin],
[DecRev],
[DecExp],
[DecMargin]
)
) AS [UNPIVIOTED]
),
[CTE_DETAIL_CALC]
(
[ID],
[ProductLine],
[sMONTH],
[sACCT],
[AMOUNT],
[ $ O/(U) Prev Mo],
[ % O/(U) Prev Mo]
)
AS
(
SELECTA.[ID],
A.[ProductLine],
A.[sMONTH],
A.[sACCT],
A.[AMOUNT],
COALESCE(A.[AMOUNT] - B.[AMOUNT], $0) AS [ $ O/(U) Prev Mo],
ROUND(100*COALESCE((A.[AMOUNT] - B.[AMOUNT])/NULLIF(B.[AMOUNT],$0),$0), 1) AS [ % O/(U) Prev Mo]
FROM[CTE_DETAIL_UNPIVOTED] A LEFT JOIN
[CTE_DETAIL_UNPIVOTED] B
ONA.[ID] = B.[ID]
ANDA.[sACCT] = B.[sACCT]
ANDA.[iMONTH] = B.[iMONTH] + 1
)
SELECT[ID],
[ProductLine],
[LineItem],
MAX([Jan]) AS [Jan],
MAX([Feb]) AS [Feb],
MAX([Mar]) AS [Mar],
MAX([Apr]) AS [Apr],
MAX([May]) AS [May],
MAX([Jun]) AS [Jun],
MAX([Jul]) AS [Jul],
MAX([Aug]) AS [Aug],
MAX([Sep]) AS [Sep],
MAX([Oct]) AS [Oct],
MAX([Nov]) AS [Nov],
MAX([Dec]) AS [Dec]
FROM(
SELECT[ID],
[ProductLine],
[sACCT],
CASE [sACCT]
WHEN 'Rev' THEN 'Revenues'
WHEN 'Exp' THEN 'Expenses'
WHEN 'Margin' THEN 'Margin'
ELSE 'UNKNOWN'
END AS [LineItem],
[Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec]
FROM(SELECT[ID], [ProductLine], [sMONTH], [sACCT], [AMOUNT], [ $ O/(U) Prev Mo], [ % O/(U) Prev Mo]FROM [CTE_DETAIL_CALC]) AS [UNPIVOTED] PIVOT
(MAX([AMOUNT]) FOR [sMONTH] IN ([Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec])) AS [PIVOTED_AMOUNT]
UNION
SELECT[ID],
[ProductLine],
[sACCT],
' $ O/(U) Prev Mo' AS [LineItem],
[Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec]
FROM(SELECT[ID], [ProductLine], [sMONTH], [sACCT], [AMOUNT], [ $ O/(U) Prev Mo], [ % O/(U) Prev Mo]FROM [CTE_DETAIL_CALC]) AS [UNPIVOTED] PIVOT
(MAX([ $ O/(U) Prev Mo]) FOR [sMONTH] IN ([Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec])) AS [PIVOTED_AMOUNT]
UNION
SELECT[ID],
[ProductLine],
[sACCT],
' % O/(U) Prev Mo' AS [LineItem],
[Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec]
FROM(SELECT[ID], [ProductLine], [sMONTH], [sACCT], [AMOUNT], [ $ O/(U) Prev Mo], [ % O/(U) Prev Mo]FROM [CTE_DETAIL_CALC]) AS [UNPIVOTED] PIVOT
(MAX([ % O/(U) Prev Mo]) FOR [sMONTH] IN ([Jan],[Feb],[Mar],[Apr],[May],[Jun],[Jul],[Aug],[Sep],[Oct],[Nov],[Dec])) AS [PIVOTED_AMOUNT]
) A
GROUP BY [ID],[ProductLine],[LineItem],[sACCT]
ORDER BY
[ID],
[ProductLine],
CASE [sACCT]
WHEN 'Rev' THEN 10
WHEN 'Exp' THEN 20
WHEN 'Mar' THEN 30
ELSE 1000
END,
CASE [LineItem]
WHEN 'Revenues' THEN 10
WHEN 'Expenses' THEN 10
WHEN 'Margin' THEN 10
WHEN ' $ O/(U) Prev Mo' THEN 20
WHEN ' % O/(U) Prev Mo' THEN 30
ELSE 1000
END
August 2, 2012 at 7:48 pm
Before I get into thanking everybody for their comments, last weekend I looked into the performance gains achieved by the "other UNPIVOT" approach because I believe there's no such thing as a free lunch. I got some interesting results that I thought worthy of an update.
I reran the single UNPIVOT example with larger record sets and averaged results over 5 runs appear below.
I have highlighted in red the slower of the two approaches in case. In this test harness, I dumped the display results to local variables to suppress them.
From the results, we see the CROSS APPLY VALUES consistently beats the elapsed time of UNPIVOT up through 5,000,000 rows. Two specific points though:
1. There seems to be a breakeven point where CPU time for CAV begins to exceed that of UNPIVOT (at perhaps 800K rows).
2. The CAV results that show faster elapsed times than CPU (starting at around 500K rows) seemed odd to me, until I realized that I'm running these tests on a Core i5 processor machine, so this would imply that SQL is parallelizing the query. I tested this theory by adding OPTION (MAXDOP 1) and found that CPU/Elapsed time results were more normal again (i.e., elapsed time slightly greater than CPU time).
However I got these results at 5,000,000 rows (single run):
(5000000 row(s) affected)
SQL Server Execution Times:
CPU time = 4072 ms, elapsed time = 4304 ms.
SQL Server Execution Times:
CPU time = 2652 ms, elapsed time = 2688 ms.
So in other words, for CAV CPU time dropped from about 6989ms (this was the 75% sparseness case) to 2652ms. I am not sure why UNPIVOT was also reduced (4539ms to 4072ms) but obviously its a lesser drop. In any event, now CAV is beating UNPIVOT in CPU as well and also runs in lesser elapsed time! Seems to be a double benefit.
While this does seem to confirm my theory about parallelism, I'm not sure exactly what to make of the overall drop in CPU usage. Perhaps with lower parallelism the CPU needs to do less work?
As usual, the answer to "which approach should I use in my query?" is "it depends!"
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 1 through 15 (of 46 total)
You must be logged in to reply to this topic. Login to reply