How to transpose from Row to Columns with out using Pivot

Question

Post reply

How to transpose from Row to Columns with out using Pivot

chris.stuart

SSCarpal Tunnel

Points: 4965
More actions
June 22, 2009 at 10:21 am

#384817

Hi
I'm using SQL Server 2005 and have a table with telephone nr in. When I link it back to main table with the customer details I get duplicate values for the same client seeing that the client can have more than 1 telephone nr.
The table is as follows and I would like to transpose just the first 3 telephone numbers for each client.
table :
Client NrOccuranceTelephone
A110115551234
A120115554321
A130115559999
A210115558888
A310115550000
A320115551111
Becomes
Client NrTelephone 1Telephone 2Telephone 3
A1011555123401155543210115559999
A20115558888NULLNULL
A301155500000115551111NULL
I see that the Pivot uses a aggregation function everytime so I assume that it will not work in this case?
Any suggestion are most welcome
Thx

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply

chris.stuart SSCarpal Tunnel Points: 4965 More actions · Answer 1

Thanks I found this on this site, works perfectly

SELECT

ACCT_DEBTOR,

max(LandLineNumber1) as LandLineNumber1,

max(LandLineNumber2) as LandLineNumber2,

max(LandLineNumber3) as LandLineNumber3

FROM (

SELECT

ACCT_DEBTOR,

OCCURRENCE,

(Case WHEN OCCURRENCE=1 THEN LANDLINE_CONTACT_NO ELSE null END) AS LandLineNumber1,

(Case WHEN OCCURRENCE=2 THEN LANDLINE_CONTACT_NO ELSE null END) AS LandLineNumber2,

(Case WHEN OCCURRENCE=3 THEN LANDLINE_CONTACT_NO ELSE null END) AS LandLineNumber3

FROM Post_File082_Landline_No

) AS LandlineNumbers

LutzM SSC Guru Points: 107049 More actions · Answer 2

You don't need the subselect:

SELECT

ACCT_DEBTOR,

MAX(Case WHEN OCCURRENCE=1 THEN LANDLINE_CONTACT_NO ELSE null END) AS LandLineNumber1,

MAX(Case WHEN OCCURRENCE=2 THEN LANDLINE_CONTACT_NO ELSE null END) AS LandLineNumber2,

MAX(Case WHEN OCCURRENCE=3 THEN LANDLINE_CONTACT_NO ELSE null END) AS LandLineNumber3

FROM Post_File082_Landline_No

GROUP BY ACCT_DEBTOR

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

chris.stuart SSCarpal Tunnel Points: 4965 More actions · Answer 3

Tested it without the sub select, and works perfectly. Thanks!

David Burrows SSC Guru Points: 65143 More actions · Answer 4

Also you can achieve the same using PIVOT

e.g.

SELECT ACCT_DEBTOR,

[1] AS [LandLineNumber1],

[2] AS [LandLineNumber2],

[3] AS [LandLineNumber3]

FROM (SELECT ACCT_DEBTOR,OCCURRENCE,LANDLINE_CONTACT_NO FROM Post_File082_Landline_No) a

PIVOT(MAX(LANDLINE_CONTACT_NO) FOR OCCURRENCE IN ([1],[2],[3])) p

Far away is close at hand in the images of elsewhere.
Anon.

chris.stuart SSCarpal Tunnel Points: 4965 More actions · Answer 5

HI, yes, my original post was to get the value without using a aggregation, but I see that the solution which i'm using includes it in any case! I didnt think it thru, seeing that the pivot would work in this case as well.

I'll test the pivot as well, seeing that I could create an index on the view which I'm using the query in now, and to return any data takes about 90 seconds or so. I'll test the pivot and give some feedback on the performance as well, but I dont want to spend to much effort on the testing of the data as this is only temp data.

Many thanks

chris.stuart SSCarpal Tunnel Points: 4965 More actions · Answer 6

Pivot table solution on 9.5 mil records returned : 134 seconds

and with first solution 132 seconds.

LutzM SSC Guru Points: 107049 More actions · Answer 7

Thank you for posting the results!

Since they are almost equal I'm wondering what the execution plans would look like...

Would you mind posting the actual execution plan for both options (saved as .sqlplan and attached as zip file) or check it yourself whether they are any different?

Maybe one of the performance gurus comes across and might find some additional tuning options...

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

chris.stuart SSCarpal Tunnel Points: 4965 More actions · Answer 8

Attached is the SQL Plan. I could never really understand the plans but hopefully someone will.

the only thing which I can add is that the group by query ranges from 116 seconds to 242 seconds depending on the indexes used on it, but the pivot with an index on OCCURRENCE only completed in 704 seconds but for the rest between 131 seconds and 208.

LutzM SSC Guru Points: 107049 More actions · Answer 9

I'm by far no expert on execution plans but to me it looks like there is a missing index on column ACCT_DEBTOR that causes the major time of the non-pivot query for sorting. So I would use an index on the two columns ACCT_DEBTOR, OCCURRENCE with column LANDLINE_CONTACT_NO included. I would add rsp. include the other two columns to avoid a bookmark lookup that may drive the query analyzer not to use this index at all.

If my interpretation is correct than SQL Server would pick the index, having all the data already in the right order eliminating the sort. The index should also help the pivot.

Running over 2 minutes for those 12mill rows seems a little bit slow (depending on the hardware, of course).

If you have a test system, you could verify how the index will affect the query and your overall performance (including disc space since the index will probably need more than 100Mb...).

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 10

An actual execution plan would be much more useful than estimated plans (we could see the run-time distribution of data across the threads for example).

In the posted plans, the PIVOT plan uses a hash aggregate whereas the non-PIVOT chooses a sort followed by a stream aggregate. This may be optimal, but it would be interesting to add OPTION (HASH GROUP) to the non-PIVOT query, to see a closer apples-to-apples comparison.

I guess we should also check that DBCC DROPCLEANBUFFERS was run before each test to ensure a level playing field (cold cache).

It's all a bit academic without appropriate indexes, or statistics at least.

Interesting though.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

LutzM SSC Guru Points: 107049 More actions · Answer 11

Paul White (6/24/2009)
An actual execution plan would be much more useful than estimated plans (we could see the runtime distribution of data across the threads for example).
...
It's all a bit academic without appropriate indexes, or statistics at least.
Interesting though.
Paul

Thanx for looking into it!

What indexes would you consider as "appropriate"?

As far as I can see it doesn't really make sense to compare the two performance wise as long as both aren't "tuned".

The next steps I was thinking of getting towards comparable results was:

1) getting the index issue fixed

2) getting actual execution plans using indexes defined above (for both, non-pivot and pivot)

3) If execution plans do look ok start comparing apples and apples and see where we end at.

What I've seen when looking at the exec plans was that there seems to be room for significant performance improvement for the non-pivot solution whereas I couldn't really see a way to get the pivot faster.

Making long story short:

This thread made me wondering about the benefit of using PIVOT at all (knowing that I've recommended it quite often in the past though... :unsure:)

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Paul White SSC Guru Points: 150468 More actions · Answer 12

lmu92 (6/24/2009)
What indexes would you consider as "appropriate"?

A nonclustered index on ACCT_DEBTOR with OCCURRENCE and LANDLINE_CONTACT_NO as included columns, I think.

(I would also be naturally inclined to create a clustered index on something (and then drop it again if it isn't needed), just to get rid of any nastiness like forwarded records or fragmentation. Precise indexing choices depend on information we don't have.)

lmu92 (6/24/2009)
As far as I can see it doesn't really make sense to compare the two performance wise as long as both aren't "tuned".
The next steps I was thinking of getting towards comparable results was:
1) getting the index issue fixed
2) getting actual execution plans using indexes defined above (for both, non-pivot and pivot)
3) If execution plans do look ok start comparing apples and apples and see where we end at.

Sounds great.

lmu92 (6/24/2009)
What I've seen when looking at the exec plans was that there seems to be room for significant performance improvement for the non-pivot solution whereas I couldn't really see a way to get the pivot faster.

The PIVOT will benefit from the nice new index. The QO may choose a Stream Aggregate with that index, but I would test both grouping options - unless sorted results are required the hash group normally performs better on large result sets.

lmu92 (6/24/2009)
This thread made me wondering about the benefit of using PIVOT at all (knowing that I've recommended it quite often in the past though... :unsure:)

Apparently some people find it more readable (!)

I'm not one of those people - and I rarely use PIVOT for probably the same reasons you have in mind.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

chris.stuart SSCarpal Tunnel Points: 4965 More actions · Answer 13

Hi guys, thought I would paste some of the performance figures.

I used the following statements each time I ran 1 of the test

CHECKPOINT;

DBCC DROPCLEANBUFFERS;

CREATE NONCLUSTERED INDEX Test1 ON Post_File082_Landline_No .......

RUN TEST

DROP INDEX

Sorry about the attachment JPG, but could not get the table correctly pasted in here

Paul White SSC Guru Points: 150468 More actions · Answer 14

Cool stuff Chris. Of course it leaves me wondered about the hash aggregate versus stream aggregate thing but hey.

So PIVOT can be slightly more efficient - if you can be bothered to fight the syntax, and if the query you need suits it.

😎

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi