I want to pivot/transform the records as below:
Is there a better way to do this?
TIA
February 13, 2023 at 6:29 pm
Does the number of columns vary, depending on the number of items in the order?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
February 13, 2023 at 6:36 pm
There will only be 3 items but an order can have either 1, 2 or all 3 items.
So the final result will always have the 5 columns --> ID, OrderNo, OrderDate, Item1, Item2, Item3
February 13, 2023 at 6:42 pm
You may want to read these articles
https://www.sqlservercentral.com/articles/cross-tabs-and-pivots-part-1-converting-rows-to-columns-1
https://www.sqlservercentral.com/articles/cross-tabs-and-pivots-part-2-dynamic-cross-tabs
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
Set up some data (you should have done this when asking the question)
DROP TABLE IF EXISTS #temp
GO
SELECT *
INTO #temp
FROM (VALUES (1, 'ON_001', '12/5/2022'),
(1, 'ITEM1', 'APPLE'),
(2, 'ON_001', '12/5/2022'),
(2, 'ITEM1', 'APPLE'),
(2, 'ITEM2', 'BANANA'),
(3, 'ON_003', '12/6/2022'),
(3, 'ITEM3', 'ORANGE')
) T(ID, Referrence, Value);
A solution:
SELECT T1.ID,
T1.Referrence OrderNo,
T1.Value OrderDate,
T2.Value Item1,
T3.Value Item2,
T4.Value Item3
FROM #temp T1
LEFT JOIN #temp T2
ON T2.ID = T1.ID
AND T2.Referrence = 'ITEM1'
LEFT JOIN #temp T3
ON T3.ID = T1.ID
AND T3.Referrence = 'ITEM2'
LEFT JOIN #temp T4
ON T4.ID = T1.ID
AND T4.Referrence = 'ITEM3'
WHERE T1.Referrence LIKE 'ON[_]%'
;
February 13, 2023 at 6:59 pm
Here's the "standard" cross-tab method:
SELECT
ID,
MAX(CASE WHEN Referrence LIKE 'ON%' THEN Referrence END) AS OrderNo,
MAX(CASE WHEN Referrence LIKE 'ON%' THEN Value END) AS OrderDate,
MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
FROM #temp
GROUP BY ID
ORDER BY ID
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 13, 2023 at 8:02 pm
Thank you all!
Sorry but I made a mistake on my original posting. Revised DATA representation is below
IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE [#TestTable](
[ID] [int] NULL,
[REFERRENCE] [varchar](50) NULL,
[VALUE] [varchar](50) NULL
)
INSERT INTO #TestTable VALUES (1,'OrderNo','ON_0001')
INSERT INTO #TestTable VALUES (1,'OrderDate','12/5/2022')
INSERT INTO #TestTable VALUES (1,'ITEM1','APPLE')
INSERT INTO #TestTable VALUES (2,'OrderNo','ON_0002')
INSERT INTO #TestTable VALUES (2,'OrderDate','12/5/2022')
INSERT INTO #TestTable VALUES (2,'ITEM1','APPLE')
INSERT INTO #TestTable VALUES (2,'ITEM2','BANANA')
INSERT INTO #TestTable VALUES (3,'OrderNo','ON_0003')
INSERT INTO #TestTable VALUES (3,'OrderDate','12/6/2022')
INSERT INTO #TestTable VALUES (3,'ITEM3','ORANGE')
SELECT T1.ID,
T2.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
FROM #TestTable T1
LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'
February 13, 2023 at 8:29 pm
SELECT
ID,
MAX(CASE WHEN Referrence = 'OrderNo' THEN Value END) AS OrderNo,
MAX(CASE WHEN Referrence = 'OrderDate' THEN Value END) AS OrderDate,
MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
FROM #temp
GROUP BY ID
ORDER BY ID
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 13, 2023 at 8:32 pm
ARPRINCE,
Have you looked at the query plans for the two queries? The LEFT JOIN method is doing a full table scan per join. That could be a (big) performance issue, depending on the size of the original table.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 13, 2023 at 8:49 pm
ARPRINCE,
Have you looked at the query plans for the two queries? The LEFT JOIN method is doing a full table scan per join. That could be a (big) performance issue, depending on the size of the original table.
Yes agree, don't have a lot of records now. Thank you for your input!
February 13, 2023 at 9:41 pm
Thank you all!
Sorry but I made a mistake on my original posting. Revised DATA representation is below
IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE [#TestTable](
[ID] [int] NULL,
[REFERRENCE] [varchar](50) NULL,
[VALUE] [varchar](50) NULL
)
INSERT INTO #TestTable VALUES (1,'OrderNo','ON_0001')
INSERT INTO #TestTable VALUES (1,'OrderDate','12/5/2022')
INSERT INTO #TestTable VALUES (1,'ITEM1','APPLE')
INSERT INTO #TestTable VALUES (2,'OrderNo','ON_0002')
INSERT INTO #TestTable VALUES (2,'OrderDate','12/5/2022')
INSERT INTO #TestTable VALUES (2,'ITEM1','APPLE')
INSERT INTO #TestTable VALUES (2,'ITEM2','BANANA')
INSERT INTO #TestTable VALUES (3,'OrderNo','ON_0003')
INSERT INTO #TestTable VALUES (3,'OrderDate','12/6/2022')
INSERT INTO #TestTable VALUES (3,'ITEM3','ORANGE')
SELECT T1.ID,
T2.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
FROM #TestTable T1
LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'
Your query has some redundancy in it. You are selecting:
FROM #TestTable T1 WHERE T1.REFERRENCE = 'OrderNo'
and LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
so your query can be simplified to this:
SELECT T1.ID,
T1.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
FROM #TestTable T1
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'
;
The left join method is actually much faster than the "standard" cross tab method Scott provided.
Here is a test with about 10 million rows, you will need to install Jeff Moden's amazing dbo.fnTally function to insert the test data.
Create 10 million rows of test data:
IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE [#TestTable](
[ID] [int] NOT NULL,
[REFERRENCE] [varchar](50) NOT NULL,
[VALUE] [varchar](50) NULL
-- ,CONSTRAINT PK_#TestTable PRIMARY KEY CLUSTERED ([ID],[REFERRENCE])
)
;
INSERT INTO #TestTable
SELECT x.*
FROM dbo.fnTally(0,1000000) t
CROSS APPLY (VALUES (3*t.n+1,'OrderNo','ON_0001'),
(3*t.n+1, 'OrderDate','12/5/2022'),
(3*t.n+1,'ITEM1','APPLE'),
(3*t.n+2,'OrderNo','ON_0002'),
(3*t.n+2,'OrderDate','12/5/2022'),
(3*t.n+2,'ITEM1','APPLE'),
(3*t.n+2,'ITEM2','BANANA'),
(3*t.n+3,'OrderNo','ON_0003'),
(3*t.n+3,'OrderDate','12/6/2022'),
(3*t.n+3,'ITEM3','ORANGE')) x(A,B,C)
;
Then to test it:
DROP TABLE IF EXISTS #CrossTab
DROP TABLE IF EXISTS #LeftJoin
SET STATISTICS IO, TIME ON
GO
PRINT '**************************************** cross tab'
GO
SELECT
ID,
MAX(CASE WHEN Referrence = 'OrderNo' THEN Value END) AS OrderNo,
MAX(CASE WHEN Referrence = 'OrderDate' THEN Value END) AS OrderDate,
MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
into #CrossTab
FROM #TestTable
GROUP BY ID
ORDER BY ID
;
GO
PRINT '**************************************** left join'
GO
SELECT T1.ID,
T1.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
INTO #LeftJoin
FROM #TestTable T1
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'
Results
Cross Tab: CPU time = 17968 ms, elapsed time = 125893 ms.
Left Join: CPU time = 9047 ms, elapsed time = 2463 ms.
So according to my results the left join method is over 50 times faster than the cross tab method if you have 10 million rows of data.
February 13, 2023 at 9:55 pm
ARPRINCE wrote:Thank you all!
Sorry but I made a mistake on my original posting. Revised DATA representation is below
IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE [#TestTable](
[ID] [int] NULL,
[REFERRENCE] [varchar](50) NULL,
[VALUE] [varchar](50) NULL
)
INSERT INTO #TestTable VALUES (1,'OrderNo','ON_0001')
INSERT INTO #TestTable VALUES (1,'OrderDate','12/5/2022')
INSERT INTO #TestTable VALUES (1,'ITEM1','APPLE')
INSERT INTO #TestTable VALUES (2,'OrderNo','ON_0002')
INSERT INTO #TestTable VALUES (2,'OrderDate','12/5/2022')
INSERT INTO #TestTable VALUES (2,'ITEM1','APPLE')
INSERT INTO #TestTable VALUES (2,'ITEM2','BANANA')
INSERT INTO #TestTable VALUES (3,'OrderNo','ON_0003')
INSERT INTO #TestTable VALUES (3,'OrderDate','12/6/2022')
INSERT INTO #TestTable VALUES (3,'ITEM3','ORANGE')
SELECT T1.ID,
T2.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
FROM #TestTable T1
LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'
Your query has some redundancy in it. You are selecting:
FROM #TestTable T1 WHERE T1.REFERRENCE = 'OrderNo'
and
LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
so your query can be simplified to this:
SELECT T1.ID,
T1.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
FROM #TestTable T1
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'
;The left join method is actually much faster than the "standard" cross tab method Scott provided.
Here is a test with about 10 million rows, you will need to install Jeff Moden's amazing dbo.fnTally function to insert the test data.
Create 10 million rows of test data:
IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
DROP TABLE #TestTable;
CREATE TABLE [#TestTable](
[ID] [int] NOT NULL,
[REFERRENCE] [varchar](50) NOT NULL,
[VALUE] [varchar](50) NULL
-- ,CONSTRAINT PK_#TestTable PRIMARY KEY CLUSTERED ([ID],[REFERRENCE])
)
;
INSERT INTO #TestTable
SELECT x.*
FROM dbo.fnTally(0,1000000) t
CROSS APPLY (VALUES (3*t.n+1,'OrderNo','ON_0001'),
(3*t.n+1, 'OrderDate','12/5/2022'),
(3*t.n+1,'ITEM1','APPLE'),
(3*t.n+2,'OrderNo','ON_0002'),
(3*t.n+2,'OrderDate','12/5/2022'),
(3*t.n+2,'ITEM1','APPLE'),
(3*t.n+2,'ITEM2','BANANA'),
(3*t.n+3,'OrderNo','ON_0003'),
(3*t.n+3,'OrderDate','12/6/2022'),
(3*t.n+3,'ITEM3','ORANGE')) x(A,B,C)
;Then to test it:
DROP TABLE IF EXISTS #CrossTab
DROP TABLE IF EXISTS #LeftJoin
SET STATISTICS IO, TIME ON
GO
PRINT '**************************************** cross tab'
GO
SELECT
ID,
MAX(CASE WHEN Referrence = 'OrderNo' THEN Value END) AS OrderNo,
MAX(CASE WHEN Referrence = 'OrderDate' THEN Value END) AS OrderDate,
MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
into #CrossTab
FROM #TestTable
GROUP BY ID
ORDER BY ID
;
GO
PRINT '**************************************** left join'
GO
SELECT T1.ID,
T1.VALUE OrderNo,
T3.VALUE OrderDate,
T4.VALUE ITEM1,
T5.VALUE ITEM2,
T6.VALUE ITEM3
INTO #LeftJoin
FROM #TestTable T1
LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
WHERE T1.REFERRENCE = 'OrderNo'Results
Cross Tab: CPU time = 17968 ms, elapsed time = 125893 ms.
Left Join: CPU time = 9047 ms, elapsed time = 2463 ms.
So according to my results the left join method is over 50 times faster than the cross tab method if you have 10 million rows of data.
Where are the IO stats?
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
February 13, 2023 at 11:20 pm
Where are the IO stats?
**************************************** cross tab
Table '#TestTable'. Scan count 5, logical reads 40651, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Workfile'. Scan count 12, logical reads 53040, physical reads 5862, page server reads 0, read-ahead reads 47178, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 14733 ms, elapsed time = 124605 ms.
Warning: Null value is eliminated by an aggregate or other SET operation.
(3000003 rows affected)
**************************************** left join
Table '#TestTable'. Scan count 25, logical reads 203255, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 9018 ms, elapsed time = 2479 ms.
(3000003 rows affected)
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply