PIVOT, is there a better way to do this?

  • I want to pivot/transform the records as below:

    Transform

    • Currently, I create a table with ID,OrderNo,OrderDate,Item1,Item2,Item3.
    • Then I add records with the DISTINCT ID,OrderNo,OrderDate,NULL,NULL,NULL
    • The I update each record where the ItemX exists.

    Is there a better way to do this?

    TIA

  • Does the number of columns vary, depending on the number of items in the order?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • There will only be 3 items but an order can have either 1, 2 or all 3 items.

    So the final result will always have the 5 columns -->  ID, OrderNo, OrderDate, Item1, Item2, Item3

     

  • Set up some data (you should have done this when asking the question)

    DROP TABLE IF EXISTS #temp
    GO
    SELECT *
    INTO #temp
    FROM (VALUES (1, 'ON_001', '12/5/2022'),
    (1, 'ITEM1', 'APPLE'),
    (2, 'ON_001', '12/5/2022'),
    (2, 'ITEM1', 'APPLE'),
    (2, 'ITEM2', 'BANANA'),
    (3, 'ON_003', '12/6/2022'),
    (3, 'ITEM3', 'ORANGE')
    ) T(ID, Referrence, Value);

    A solution:

    SELECT T1.ID, 
    T1.Referrence OrderNo,
    T1.Value OrderDate,
    T2.Value Item1,
    T3.Value Item2,
    T4.Value Item3
    FROM #temp T1
    LEFT JOIN #temp T2
    ON T2.ID = T1.ID
    AND T2.Referrence = 'ITEM1'
    LEFT JOIN #temp T3
    ON T3.ID = T1.ID
    AND T3.Referrence = 'ITEM2'
    LEFT JOIN #temp T4
    ON T4.ID = T1.ID
    AND T4.Referrence = 'ITEM3'
    WHERE T1.Referrence LIKE 'ON[_]%'
    ;

    q7

  • Here's the "standard" cross-tab method:

    SELECT 
    ID,
    MAX(CASE WHEN Referrence LIKE 'ON%' THEN Referrence END) AS OrderNo,
    MAX(CASE WHEN Referrence LIKE 'ON%' THEN Value END) AS OrderDate,
    MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
    MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
    MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
    FROM #temp
    GROUP BY ID
    ORDER BY ID

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Thank you all!

    Sorry but I made a mistake on my original posting. Revised DATA representation is below

    Transform2

     

     

     

    IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable;

    CREATE TABLE [#TestTable](
    [ID] [int] NULL,
    [REFERRENCE] [varchar](50) NULL,
    [VALUE] [varchar](50) NULL
    )


    INSERT INTO #TestTable VALUES (1,'OrderNo','ON_0001')
    INSERT INTO #TestTable VALUES (1,'OrderDate','12/5/2022')
    INSERT INTO #TestTable VALUES (1,'ITEM1','APPLE')
    INSERT INTO #TestTable VALUES (2,'OrderNo','ON_0002')
    INSERT INTO #TestTable VALUES (2,'OrderDate','12/5/2022')
    INSERT INTO #TestTable VALUES (2,'ITEM1','APPLE')
    INSERT INTO #TestTable VALUES (2,'ITEM2','BANANA')
    INSERT INTO #TestTable VALUES (3,'OrderNo','ON_0003')
    INSERT INTO #TestTable VALUES (3,'OrderDate','12/6/2022')
    INSERT INTO #TestTable VALUES (3,'ITEM3','ORANGE')




    SELECT T1.ID,
    T2.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    FROM #TestTable T1
    LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'

  • Jonathan AC Roberts wrote:

    Set up some data (you should have done this when asking the question)[/

    Thank you and you are right, will put in data script in the future.

    • This reply was modified 1 year, 9 months ago by  ARPRINCE.
  • SELECT 
    ID,
    MAX(CASE WHEN Referrence = 'OrderNo' THEN Value END) AS OrderNo,
    MAX(CASE WHEN Referrence = 'OrderDate' THEN Value END) AS OrderDate,
    MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
    MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
    MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
    FROM #temp
    GROUP BY ID
    ORDER BY ID

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ARPRINCE,

    Have you looked at the query plans for the two queries?  The LEFT JOIN method is doing a full table scan per join.  That could be a (big) performance issue, depending on the size of the original table.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    ARPRINCE,

    Have you looked at the query plans for the two queries?  The LEFT JOIN method is doing a full table scan per join.  That could be a (big) performance issue, depending on the size of the original table.

     

    Yes agree, don't have a lot of records now. Thank you for your input!

    ExecutionPlan

     

     

     

  • ARPRINCE wrote:

    Thank you all!

    Sorry but I made a mistake on my original posting. Revised DATA representation is below

    Transform2

    IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable;

    CREATE TABLE [#TestTable](
    [ID] [int] NULL,
    [REFERRENCE] [varchar](50) NULL,
    [VALUE] [varchar](50) NULL
    )


    INSERT INTO #TestTable VALUES (1,'OrderNo','ON_0001')
    INSERT INTO #TestTable VALUES (1,'OrderDate','12/5/2022')
    INSERT INTO #TestTable VALUES (1,'ITEM1','APPLE')
    INSERT INTO #TestTable VALUES (2,'OrderNo','ON_0002')
    INSERT INTO #TestTable VALUES (2,'OrderDate','12/5/2022')
    INSERT INTO #TestTable VALUES (2,'ITEM1','APPLE')
    INSERT INTO #TestTable VALUES (2,'ITEM2','BANANA')
    INSERT INTO #TestTable VALUES (3,'OrderNo','ON_0003')
    INSERT INTO #TestTable VALUES (3,'OrderDate','12/6/2022')
    INSERT INTO #TestTable VALUES (3,'ITEM3','ORANGE')




    SELECT T1.ID,
    T2.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    FROM #TestTable T1
    LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'

    Your query has some redundancy in it. You are selecting:

    FROM #TestTable T1 WHERE T1.REFERRENCE = 'OrderNo'

    and LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'

    so your query can be simplified to this:

    SELECT T1.ID,
    T1.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    FROM #TestTable T1
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'
    ;

    The left join method is actually much faster than the "standard" cross tab method Scott provided.

    Here is a test with about 10 million rows, you will need to install Jeff Moden's amazing dbo.fnTally function to insert the test data.

    Create 10 million rows of test data:

    IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable;

    CREATE TABLE [#TestTable](
    [ID] [int] NOT NULL,
    [REFERRENCE] [varchar](50) NOT NULL,
    [VALUE] [varchar](50) NULL
    -- ,CONSTRAINT PK_#TestTable PRIMARY KEY CLUSTERED ([ID],[REFERRENCE])
    )
    ;

    INSERT INTO #TestTable
    SELECT x.*
    FROM dbo.fnTally(0,1000000) t
    CROSS APPLY (VALUES (3*t.n+1,'OrderNo','ON_0001'),
    (3*t.n+1, 'OrderDate','12/5/2022'),
    (3*t.n+1,'ITEM1','APPLE'),
    (3*t.n+2,'OrderNo','ON_0002'),
    (3*t.n+2,'OrderDate','12/5/2022'),
    (3*t.n+2,'ITEM1','APPLE'),
    (3*t.n+2,'ITEM2','BANANA'),
    (3*t.n+3,'OrderNo','ON_0003'),
    (3*t.n+3,'OrderDate','12/6/2022'),
    (3*t.n+3,'ITEM3','ORANGE')) x(A,B,C)
    ;

    Then to test it:

    DROP TABLE IF EXISTS #CrossTab
    DROP TABLE IF EXISTS #LeftJoin
    SET STATISTICS IO, TIME ON
    GO
    PRINT '**************************************** cross tab'
    GO
    SELECT
    ID,
    MAX(CASE WHEN Referrence = 'OrderNo' THEN Value END) AS OrderNo,
    MAX(CASE WHEN Referrence = 'OrderDate' THEN Value END) AS OrderDate,
    MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
    MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
    MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
    into #CrossTab
    FROM #TestTable
    GROUP BY ID
    ORDER BY ID
    ;
    GO
    PRINT '**************************************** left join'
    GO
    SELECT T1.ID,
    T1.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    INTO #LeftJoin
    FROM #TestTable T1
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'

    Results

    Cross Tab: CPU time = 17968 ms, elapsed time = 125893 ms.

    Left Join: CPU time = 9047 ms, elapsed time = 2463 ms.

    So according to my results the left join method is over 50 times faster than the cross tab method if you have 10 million rows of data.

  • Jonathan AC Roberts wrote:

    ARPRINCE wrote:

    Thank you all!

    Sorry but I made a mistake on my original posting. Revised DATA representation is below

    Transform2

    IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable;

    CREATE TABLE [#TestTable](
    [ID] [int] NULL,
    [REFERRENCE] [varchar](50) NULL,
    [VALUE] [varchar](50) NULL
    )


    INSERT INTO #TestTable VALUES (1,'OrderNo','ON_0001')
    INSERT INTO #TestTable VALUES (1,'OrderDate','12/5/2022')
    INSERT INTO #TestTable VALUES (1,'ITEM1','APPLE')
    INSERT INTO #TestTable VALUES (2,'OrderNo','ON_0002')
    INSERT INTO #TestTable VALUES (2,'OrderDate','12/5/2022')
    INSERT INTO #TestTable VALUES (2,'ITEM1','APPLE')
    INSERT INTO #TestTable VALUES (2,'ITEM2','BANANA')
    INSERT INTO #TestTable VALUES (3,'OrderNo','ON_0003')
    INSERT INTO #TestTable VALUES (3,'OrderDate','12/6/2022')
    INSERT INTO #TestTable VALUES (3,'ITEM3','ORANGE')




    SELECT T1.ID,
    T2.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    FROM #TestTable T1
    LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'

    Your query has some redundancy in it. You are selecting:

    FROM #TestTable T1 WHERE T1.REFERRENCE = 'OrderNo'

    and LEFT JOIN #TestTable T2 ON T1.ID = T2.ID AND T2.REFERRENCE = 'OrderNo'

    so your query can be simplified to this:

    SELECT T1.ID,
    T1.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    FROM #TestTable T1
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'
    ;

    The left join method is actually much faster than the "standard" cross tab method Scott provided.

    Here is a test with about 10 million rows, you will need to install Jeff Moden's amazing dbo.fnTally function to insert the test data.

    Create 10 million rows of test data:

    IF OBJECT_ID('tempdb..#TestTable') IS NOT NULL
    DROP TABLE #TestTable;

    CREATE TABLE [#TestTable](
    [ID] [int] NOT NULL,
    [REFERRENCE] [varchar](50) NOT NULL,
    [VALUE] [varchar](50) NULL
    -- ,CONSTRAINT PK_#TestTable PRIMARY KEY CLUSTERED ([ID],[REFERRENCE])
    )
    ;

    INSERT INTO #TestTable
    SELECT x.*
    FROM dbo.fnTally(0,1000000) t
    CROSS APPLY (VALUES (3*t.n+1,'OrderNo','ON_0001'),
    (3*t.n+1, 'OrderDate','12/5/2022'),
    (3*t.n+1,'ITEM1','APPLE'),
    (3*t.n+2,'OrderNo','ON_0002'),
    (3*t.n+2,'OrderDate','12/5/2022'),
    (3*t.n+2,'ITEM1','APPLE'),
    (3*t.n+2,'ITEM2','BANANA'),
    (3*t.n+3,'OrderNo','ON_0003'),
    (3*t.n+3,'OrderDate','12/6/2022'),
    (3*t.n+3,'ITEM3','ORANGE')) x(A,B,C)
    ;

    Then to test it:

    DROP TABLE IF EXISTS #CrossTab
    DROP TABLE IF EXISTS #LeftJoin
    SET STATISTICS IO, TIME ON
    GO
    PRINT '**************************************** cross tab'
    GO
    SELECT
    ID,
    MAX(CASE WHEN Referrence = 'OrderNo' THEN Value END) AS OrderNo,
    MAX(CASE WHEN Referrence = 'OrderDate' THEN Value END) AS OrderDate,
    MAX(CASE WHEN Referrence = 'ITEM1' THEN Value END) AS Item1,
    MAX(CASE WHEN Referrence = 'ITEM2' THEN Value END) AS Item2,
    MAX(CASE WHEN Referrence = 'ITEM3' THEN Value END) AS Item3
    into #CrossTab
    FROM #TestTable
    GROUP BY ID
    ORDER BY ID
    ;
    GO
    PRINT '**************************************** left join'
    GO
    SELECT T1.ID,
    T1.VALUE OrderNo,
    T3.VALUE OrderDate,
    T4.VALUE ITEM1,
    T5.VALUE ITEM2,
    T6.VALUE ITEM3
    INTO #LeftJoin
    FROM #TestTable T1
    LEFT JOIN #TestTable T3 ON T1.ID = T3.ID AND T3.REFERRENCE = 'OrderDate'
    LEFT JOIN #TestTable T4 ON T1.ID = T4.ID AND T4.REFERRENCE = 'ITEM1'
    LEFT JOIN #TestTable T5 ON T1.ID = T5.ID AND T5.REFERRENCE = 'ITEM2'
    LEFT JOIN #TestTable T6 ON T1.ID = T6.ID AND T6.REFERRENCE = 'ITEM3'
    WHERE T1.REFERRENCE = 'OrderNo'

    Results

    Cross Tab: CPU time = 17968 ms, elapsed time = 125893 ms.

    Left Join: CPU time = 9047 ms, elapsed time = 2463 ms.

    So according to my results the left join method is over 50 times faster than the cross tab method if you have 10 million rows of data.

    Where are the IO stats?

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    Where are the IO stats?

    **************************************** cross tab

    Table '#TestTable'. Scan count 5, logical reads 40651, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    Table 'Workfile'. Scan count 12, logical reads 53040, physical reads 5862, page server reads 0, read-ahead reads 47178, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 14733 ms, elapsed time = 124605 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (3000003 rows affected)

    **************************************** left join

    Table '#TestTable'. Scan count 25, logical reads 203255, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 9018 ms, elapsed time = 2479 ms.

    (3000003 rows affected)

  • ScottPletcher wrote:

    Your query has some redundancy in it. You are selecting:

    Yes- thank you!

     

     

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply