sql pivot problem with unknown number of records to columns.

Question

Post reply

sql pivot problem with unknown number of records to columns.

skanth

Right there with Babe

Points: 730
More actions
February 28, 2013 at 2:23 am

#271016

Hi Please help me out in this scenario.
Table
IDNAME
1a
1b
1c
1d
2e
2f
3g
3h
3i
4j
5K
5L
5m
5N
5O
5P
required output
id name1name 2name3name4name 5name6
1abc
2ef
3ghi
4j
5klmnop
i need a query which gives the out put

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Cadavre SSC-Forever Points: 41690 More actions · Answer 1

Using this sample data: -

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment;

END;

SELECT ID, NAME

INTO #testEnvironment

FROM (VALUES(1,'a'),(1,'b'),(1,'c'),(1,'d'),(2,'e'),(2,'f'),

(3,'g'),(3,'h'),(3,'i'),(4,'j'),(5,'K'),(5,'L'),

(5,'m'),(5,'N'),(5,'O'),(5,'P')

)a(ID, NAME);

Then something like this: -

DECLARE @SQL NVARCHAR(MAX);

SELECT @SQL='SELECT ID,'+CHAR(13)+CHAR(10)+

STUFF((SELECT ','+CHAR(13)+CHAR(10)+'NULLIF(MAX(CASE WHEN POS = '+CAST(POS AS VARCHAR(3))+' THEN NAME ELSE '+CHAR(39)+CHAR(39)+' END),'+CHAR(39)+CHAR(39)+

') AS '+QUOTENAME('NAME'+CAST(POS AS VARCHAR(3)))

FROM (SELECT DISTINCT ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NAME)

FROM #testEnvironment

)a(POS)

FOR XML PATH(''), TYPE).value('.','VARCHAR(MAX)'),1,3,'')+CHAR(13)+CHAR(10)+'FROM (SELECT ID, NAME, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NAME)'+CHAR(13)+CHAR(10)+

SPACE(6)+'FROM #testEnvironment'+CHAR(13)+CHAR(10)+SPACE(6)+')a(ID,NAME,POS)'+CHAR(13)+CHAR(10)+'GROUP BY ID;';

EXECUTE sp_executesql @SQL;

Which returns: -

ID NAME1 NAME2 NAME3 NAME4 NAME5 NAME6

----------- ----- ----- ----- ----- ----- -----

1 a b c d NULL NULL

2 e f NULL NULL NULL NULL

3 g h i NULL NULL NULL

4 j NULL NULL NULL NULL NULL

5 K L m N O P

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Steve JP Ten Centuries Points: 1070 More actions · Answer 2

Using the above sample from Cadavre to load the data here is an alternative. Admittedly you will need to prejudge the possible number of options and pad out the sql but this code can be used in views as it isnt dynamic:

Select ID

,[Name1] = [1]

,[Name2] = [2]

,[Name3] = [3]

,[Name4] = [4]

,[Name5] = [5]

,[Name6] = [6]

,[Name7] = [7]

,[Name8] = [8]

from (

select distinct

ID

,Name

,RID = Dense_Rank() OVER(PARTITION BY ID ORDER BY NAME)

from #testEnvironment

) as XX

pivot ( min(Name) for RID in ( [1],[2],[3],[4],[5],[6],[7],[8] )

) as pvt

order by ID

Cadavre SSC-Forever Points: 41690 More actions · Answer 3

Steve JP (2/28/2013)
Using the above sample from Cadavre to load the data here is an alternative. Admittedly you will need to prejudge the possible number of options and pad out the sql but this code can be used in views as it isnt dynamic:
Select ID
,[Name1] = [1]
,[Name2] = [2]
,[Name3] = [3]
,[Name4] = [4]
,[Name5] = [5]
,[Name6] = [6]
,[Name7] = [7]
,[Name8] = [8]
from (
select distinct
ID
,Name
,RID = Dense_Rank() OVER(PARTITION BY ID ORDER BY NAME)
from #testEnvironment
) as XX
pivot ( min(Name) for RID in ( [1],[2],[3],[4],[5],[6],[7],[8] )
) as pvt
order by ID

The OP said for an "unknown" number, which means that realistically, dynamic is the only way to go.

If it was for a known number, then I find that the PIVOT operator is generally slower than using CASE statements then aggregating, e.g.

SELECT ID,

NULLIF(MAX(CASE WHEN POS = 1 THEN NAME ELSE '' END),'') AS [NAME1],

NULLIF(MAX(CASE WHEN POS = 2 THEN NAME ELSE '' END),'') AS [NAME2],

NULLIF(MAX(CASE WHEN POS = 3 THEN NAME ELSE '' END),'') AS [NAME3],

NULLIF(MAX(CASE WHEN POS = 4 THEN NAME ELSE '' END),'') AS [NAME4],

NULLIF(MAX(CASE WHEN POS = 5 THEN NAME ELSE '' END),'') AS [NAME5],

NULLIF(MAX(CASE WHEN POS = 6 THEN NAME ELSE '' END),'') AS [NAME6]

FROM (SELECT ID, NAME, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NAME)

FROM #testEnvironment

)a(ID,NAME,POS)

GROUP BY ID;

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Steve JP Ten Centuries Points: 1070 More actions · Answer 4

Throwing the 2 versions of the code through a test rig shows that the methods are reasonable similar on speed. Its how the methods get used is probable more important than the milliseconds difference in its perfromance.

For each sceniaro there was 50 interations which had a random number of values for the each row and the time taken to run each statement was recorded. For my code I padded the code out to hold a potential of 15 columns [Name1].....[Name15] just to make the testing easier & trouble freee 🙂 I havent tested the width of the values that populate the columns and at a guess this will have an effect.

The first run had a random number of columns upto 5 values for each row.

RowsCadavre Steve_JP

10125

10032

500137

1000159

25002018

50003029

75004144

100004549

50000210242

100000370442

The second run had a random number of columns upto 10 values for each row.

RowsCadavre Steve_JP

10103

10051

500157

10002313

25003825

50007050

750010075

1000012697

50000649487

1000001170864

Hopefully this helpful. At least both methods avoid RBAR

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 5

Another version with use of dynamic SQL produces required output:

SET NOCOUNT ON;

-- setup sample table

SELECT ID, NAME

INTO #testEnvironment

FROM (VALUES(1,'a'),(1,'b'),(1,'c'),(1,'d'),(2,'e'),(2,'f'),

(3,'g'),(3,'h'),(3,'i'),(4,'j'),(5,'K'),(5,'L'),

(5,'m'),(5,'N'),(5,'O'),(5,'P')

)a(ID, NAME);

-- Get required output

DECLARE @SQL NVARCHAR(4000) = 'SELECT ID';

SELECT @SQL = @SQL + '

,MAX(CASE N WHEN ' + N + ' THEN NAME ELSE '''' END) as Name' + N

FROM (SELECT DISTINCT CAST(ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NAME) AS VARCHAR) N

FROM #testEnvironment) Q;

SELECT @SQL = @SQL + '

FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NAME) N

FROM #testEnvironment) Q

GROUP BY ID';

EXEC (@SQL);

DROP TABLE #testEnvironment;

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Cadavre SSC-Forever Points: 41690 More actions · Answer 6

Steve JP (2/28/2013)
Throwing the 2 versions of the code through a test rig shows that the methods are reasonable similar on speed. Its how the methods get used is probable more important than the milliseconds difference in its perfromance.

Interesting, can you provide the code you used in your test rig? I find that the CASE versions are generally much quicker than PIVOT, so I'd like to see why we have a different result here.

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/