How to transpose rows to columns without using PIVOT

  • Hi

    I'm using SQL Server 2005 and have a table with telephone numbers in it. When I link it back to the main table with the customer details, I get duplicate rows for the same client, since a client can have more than one telephone number.

    The table is as follows, and I would like to transpose just the first three telephone numbers for each client.

    Table:

    Client Nr | Occurrence | Telephone
    ----------|------------|-----------
    A1        | 1          | 0115551234
    A1        | 2          | 0115554321
    A1        | 3          | 0115559999
    A2        | 1          | 0115558888
    A3        | 1          | 0115550000
    A3        | 2          | 0115551111

    Becomes:

    Client Nr | Telephone 1 | Telephone 2 | Telephone 3
    ----------|-------------|-------------|------------
    A1        | 0115551234  | 0115554321  | 0115559999
    A2        | 0115558888  | NULL        | NULL
    A3        | 0115550000  | 0115551111  | NULL

    I see that PIVOT uses an aggregation function every time, so I assume it will not work in this case?

    Any suggestions are most welcome.

    Thx

  • Thanks, I found this on this site, and it works perfectly:

    ```sql
    SELECT
        ACCT_DEBTOR,
        MAX(LandLineNumber1) AS LandLineNumber1,
        MAX(LandLineNumber2) AS LandLineNumber2,
        MAX(LandLineNumber3) AS LandLineNumber3
    FROM (
        SELECT
            ACCT_DEBTOR,
            OCCURRENCE,
            CASE WHEN OCCURRENCE = 1 THEN LANDLINE_CONTACT_NO ELSE NULL END AS LandLineNumber1,
            CASE WHEN OCCURRENCE = 2 THEN LANDLINE_CONTACT_NO ELSE NULL END AS LandLineNumber2,
            CASE WHEN OCCURRENCE = 3 THEN LANDLINE_CONTACT_NO ELSE NULL END AS LandLineNumber3
        FROM Post_File082_Landline_No
    ) AS LandlineNumbers
    GROUP BY ACCT_DEBTOR  -- needed for the outer MAX() aggregates
    ```

  • You don't need the subselect:

    ```sql
    SELECT
        ACCT_DEBTOR,
        MAX(CASE WHEN OCCURRENCE = 1 THEN LANDLINE_CONTACT_NO ELSE NULL END) AS LandLineNumber1,
        MAX(CASE WHEN OCCURRENCE = 2 THEN LANDLINE_CONTACT_NO ELSE NULL END) AS LandLineNumber2,
        MAX(CASE WHEN OCCURRENCE = 3 THEN LANDLINE_CONTACT_NO ELSE NULL END) AS LandLineNumber3
    FROM Post_File082_Landline_No
    GROUP BY ACCT_DEBTOR
    ```
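    Run against the sample data from the first post, this pattern produces exactly the requested cross tab. A quick sketch (the table and column names here follow the example data, not the poster's actual schema):

    ```sql
    -- Each CASE isolates one occurrence into its own column;
    -- MAX() then collapses the rows per client into a single row,
    -- because MAX ignores NULLs.
    SELECT
        Client_Nr,
        MAX(CASE WHEN Occurrence = 1 THEN Telephone END) AS Telephone1,
        MAX(CASE WHEN Occurrence = 2 THEN Telephone END) AS Telephone2,
        MAX(CASE WHEN Occurrence = 3 THEN Telephone END) AS Telephone3
    FROM ClientPhones
    GROUP BY Client_Nr;
    -- A1 | 0115551234 | 0115554321 | 0115559999
    -- A2 | 0115558888 | NULL       | NULL
    -- A3 | 0115550000 | 0115551111 | NULL
    ```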



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs, Dynamic Cross Tabs, Delimited Split Function

  • Tested it without the subselect, and it works perfectly. Thanks!

  • You can also achieve the same result using PIVOT, e.g.:

    ```sql
    SELECT ACCT_DEBTOR,
        [1] AS [LandLineNumber1],
        [2] AS [LandLineNumber2],
        [3] AS [LandLineNumber3]
    FROM (
        SELECT ACCT_DEBTOR, OCCURRENCE, LANDLINE_CONTACT_NO
        FROM Post_File082_Landline_No
    ) a
    PIVOT (MAX(LANDLINE_CONTACT_NO) FOR OCCURRENCE IN ([1], [2], [3])) p
    ```
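    One detail worth noting about the derived table a: PIVOT implicitly groups by every input column that is neither the aggregate nor the spreading column, so the subquery should select only the columns you actually need. As a sketch of the failure mode (SOME_OTHER_COL is a made-up name), this variant could return several partially filled rows per client:

    ```sql
    -- BAD: SOME_OTHER_COL silently joins the implicit GROUP BY,
    -- splitting each client's phone numbers across multiple rows.
    SELECT ACCT_DEBTOR, [1], [2], [3]
    FROM (
        SELECT ACCT_DEBTOR, OCCURRENCE, LANDLINE_CONTACT_NO, SOME_OTHER_COL
        FROM Post_File082_Landline_No
    ) a
    PIVOT (MAX(LANDLINE_CONTACT_NO) FOR OCCURRENCE IN ([1], [2], [3])) p
    ```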

    Far away is close at hand in the images of elsewhere.
    Anon.

  • Hi, yes, my original post asked how to get the values without using an aggregation, but I see that the solution I'm using includes one in any case! I didn't think it through; PIVOT would work here as well.

    I'll test the PIVOT too, since I could create an index on the view I'm currently running the query in, and returning any data takes about 90 seconds or so. I'll give some feedback on the performance as well, but I don't want to spend too much effort on testing since this is only temporary data.

    Many thanks

  • The PIVOT solution on 9.5 million records returned in 134 seconds, and the first solution in 132 seconds.

  • Thank you for posting the results!

    Since they are almost equal, I'm wondering what the execution plans look like...

    Would you mind posting the actual execution plans for both options (saved as .sqlplan and attached as a zip file), or checking yourself whether they differ?

    Maybe one of the performance gurus will come across this thread and find some additional tuning options...



    Lutz

  • Attached is the SQL plan. I never really understood execution plans, but hopefully someone will make sense of it.

    The only thing I can add is that the GROUP BY query ranges from 116 to 242 seconds depending on the indexes used, while the PIVOT completed in 704 seconds with an index on OCCURRENCE alone, and otherwise in between 131 and 208 seconds.

  • I'm by no means an expert on execution plans, but to me it looks like a missing index on column ACCT_DEBTOR causes the non-pivot query to spend most of its time sorting. So I would use an index on the two columns ACCT_DEBTOR, OCCURRENCE with column LANDLINE_CONTACT_NO included. Including the remaining column avoids a bookmark lookup, which might otherwise drive the query optimizer not to use the index at all.

    If my interpretation is correct, SQL Server would pick the index and, having all the data already in the right order, eliminate the sort. The index should also help the PIVOT.

    Running over 2 minutes for those 12 million rows seems a little slow (depending on the hardware, of course).

    If you have a test system, you could verify how the index affects the query and your overall performance (including disk space, since the index will probably need more than 100 MB...).
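    The index described above would look something like this (the index name is made up; INCLUDE is available from SQL Server 2005 on):

    ```sql
    -- Covering index: key columns match the grouping (ACCT_DEBTOR) and
    -- spreading (OCCURRENCE) columns; the phone number is carried at the
    -- leaf level so the query never needs a bookmark lookup.
    CREATE NONCLUSTERED INDEX IX_Landline_Debtor_Occurrence
    ON Post_File082_Landline_No (ACCT_DEBTOR, OCCURRENCE)
    INCLUDE (LANDLINE_CONTACT_NO);
    ```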



    Lutz

  • An actual execution plan would be much more useful than estimated plans (we could see the run-time distribution of data across the threads for example).

    In the posted plans, the PIVOT plan uses a hash aggregate whereas the non-PIVOT chooses a sort followed by a stream aggregate. This may be optimal, but it would be interesting to add OPTION (HASH GROUP) to the non-PIVOT query, to see a closer apples-to-apples comparison.
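    For reference, the hint would be appended to the GROUP BY query posted earlier like this (a sketch, not a tested tuning recommendation):

    ```sql
    SELECT
        ACCT_DEBTOR,
        MAX(CASE WHEN OCCURRENCE = 1 THEN LANDLINE_CONTACT_NO END) AS LandLineNumber1,
        MAX(CASE WHEN OCCURRENCE = 2 THEN LANDLINE_CONTACT_NO END) AS LandLineNumber2,
        MAX(CASE WHEN OCCURRENCE = 3 THEN LANDLINE_CONTACT_NO END) AS LandLineNumber3
    FROM Post_File082_Landline_No
    GROUP BY ACCT_DEBTOR
    -- Ask the optimizer for a hash aggregate instead of sort + stream aggregate
    OPTION (HASH GROUP);
    ```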

    I guess we should also check that DBCC DROPCLEANBUFFERS was run before each test to ensure a level playing field (cold cache).

    It's all a bit academic without appropriate indexes, or statistics at least.

    Interesting though.

    Paul

  • Paul White (6/24/2009)


    An actual execution plan would be much more useful than estimated plans (we could see the runtime distribution of data across the threads for example).

    ...

    It's all a bit academic without appropriate indexes, or statistics at least.

    Interesting though.

    Paul

    Thanks for looking into it!

    What indexes would you consider as "appropriate"?

    As far as I can see, it doesn't really make sense to compare the two performance-wise as long as neither is "tuned".

    The next steps I was thinking of to get towards comparable results were:

    1) getting the index issue fixed

    2) getting actual execution plans using the indexes defined above (for both non-pivot and pivot)

    3) If the execution plans look OK, start comparing apples to apples and see where we end up.

    Looking at the exec plans, there seems to be room for significant performance improvement in the non-pivot solution, whereas I couldn't really see a way to make the pivot faster.

    To make a long story short:

    This thread made me wonder about the benefit of using PIVOT at all (even though I've recommended it quite often in the past... :unsure:)



    Lutz

  • lmu92 (6/24/2009)


    What indexes would you consider as "appropriate"?

    A nonclustered index on ACCT_DEBTOR with OCCURRENCE and LANDLINE_CONTACT_NO as included columns, I think.

    (I would also be naturally inclined to create a clustered index on something (and then drop it again if it isn't needed), just to get rid of any nastiness like forwarded records or fragmentation. Precise indexing choices depend on information we don't have.)

    lmu92 (6/24/2009)


    As far as I can see, it doesn't really make sense to compare the two performance-wise as long as neither is "tuned".

    The next steps I was thinking of to get towards comparable results were:

    1) getting the index issue fixed

    2) getting actual execution plans using the indexes defined above (for both non-pivot and pivot)

    3) If the execution plans look OK, start comparing apples to apples and see where we end up.

    Sounds great.

    lmu92 (6/24/2009)


    Looking at the exec plans, there seems to be room for significant performance improvement in the non-pivot solution, whereas I couldn't really see a way to make the pivot faster.

    The PIVOT will benefit from the nice new index. The QO may choose a Stream Aggregate with that index, but I would test both grouping options - unless sorted results are required the hash group normally performs better on large result sets.

    lmu92 (6/24/2009)


    This thread made me wonder about the benefit of using PIVOT at all (even though I've recommended it quite often in the past... :unsure:)

    Apparently some people find it more readable (!)

    I'm not one of those people - and I rarely use PIVOT for probably the same reasons you have in mind.

    Paul

  • Hi guys, I thought I would paste some of the performance figures.

    I used the following statements each time I ran one of the tests:

    ```sql
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;
    CREATE NONCLUSTERED INDEX Test1 ON Post_File082_Landline_No .......
    -- run the test
    -- drop the index
    ```

    Sorry about the JPG attachment, but I could not get the table pasted in here correctly.

  • Cool stuff, Chris. Of course it leaves me wondering about the hash aggregate versus stream aggregate thing, but hey.

    So PIVOT can be slightly more efficient - if you can be bothered to fight the syntax, and if the query you need suits it.

    😎
