sql query help

Question

Post reply

sql query help

sql_fan

SSC-Addicted

Points: 462
More actions
January 23, 2006 at 8:47 am

#96450

id Name
1 xxxx
2 yyyy
3 zzzz
1 pppp
3 dfgf
2 dhfl
I have this data in my sql table and I am trying to write a query to get the data like shown below.
id name
1 xxxx,pppp
2 yyyy,dhfl
3 zzzz,dfgf
Can anyone help in this.
Thanks.

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 1

The usual answer is 'do it on the client, SQL is no good at this sort of thing'

However, if you really need to anyway, there are a few different ways to achieve it.

Search the forum for 'PIVOT' and you'll get quite a few suggestions.

/Kenneth

Ray M SSC-Insane Points: 21093 More actions · Answer 2

Reference post I just did with same basic question.

http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=252864&post=true

sc1 Grasshopper Points: 11 More actions · Answer 3

Try This:

CREATE FUNCTION dbo.MyFunction

(@ID int)

returns varchar(2000)

as

begin

declare @rtrn varchar(2000)

declare @temp varchar(500)

declare @rowcount int

set @temp = ''

set @rowcount = 1

set @rtrn = ''

while(@rowcount > 0)

begin

select top 1 @temp = [Name]

from MyTable

where [id] = @ID

and cast(cast([Name] as varbinary)as int) >

cast(cast(@temp as varbinary)as int)

order by [id],cast(cast([Name] as varbinary)as int)

set @rowcount = @@rowcount

if(@rowcount > 0)

begin

select @rtrn = @rtrn + case when len(@rtrn) > 0

then ',' + @temp else @temp end

where len(@rtrn + case when len(@rtrn) > 0

then ',' + @temp else @temp end)<2000

set @rowcount = @@rowcount

end

return @rtrn

end

select distinct id, dbo.MyFunction(id) as [name]

from Mytable

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 4

I really wish people would stop saying that... sometimes SQL is the only client application you have.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 5

To add to my previous, RayM's answer is absolutely correct. Further, if you have to do such a thing on a hundred thousand rows to send as a file, then the "client" code will be comparatively dog slow. RayM's answer is simple and quick... to do such a thing in Java or VB or whatever, would certainly be at least, if not more, complex and will always be slower.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 6

Well, 'sometimes' isn't making it standard, so why stop saying how something 'should' be done..?

The fact remains, SQL sucks at row-by-row operations, doing it by hiding a loop in a UDF doesn't make the fact go away. The language just isn't designed for that type of operations. So, the 'best' is always to rip out the data as a set, and do the 'stuff' elsewhere.

The above is the starting point when no other details are known, right? (as the poster didn't provide any other details than just asking 'how can I do x'?) Thus, the generic recommendation for this generic question is: 'it is best to do on the client, not the server'. (not involving 2005 and clr etc here )

Enter real life.. Sure, sometimes SQL is the only 'client', that doesn't change what to be said when the client is unknown. Relative to 'best' is 'good enough', which in the end is the deciding factor anyway, regardless of what's concieved to be 'best'. An example of 'good enough' is many times to do exactly as Ray's UDF.

Still, from a SQL perspective, it's not 'optimal' in any way, but may for each and everyone's purpose be 'good enough'. It depends. Rules are made to be broken, the thing is just that it's better to break them intentionally than being unaware of it.

So, I disagree. I don't think we should, as a first response to questions on 'how do I write SQL to do a pivot', not say 'It's 'best' done on the 'client'.

just my .02, though

/Kenneth

Jesper-244176 SSCertifiable Points: 7142 More actions · Answer 7

Do you have precisely two occurencies of each Id in your data (like in your test example)? Then you could try the following:

select Id, max(Name) + ',' + min(Name) from yourtable group by Id

If this is not the case, please state what the max number of occurencies of each Id is (or is it unbounded?).

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 8

Hmmmm... Ken, I can't say I agree... here's some test code to demonstrate why...

The first part of the code makes a test table of a million rows. After all, if I'm to demonstrate that value and power of doing such a task in SQL instead of an application is the right way to go, I might as well make it a good one... the code takes about 40 seconds or less to run on a properly configured server...

--===== Create a populate a million row test table to demonstrate the

-- the power of SQL as compared to an "application". A "real life"

-- example would be much wider but this will suffice for test data

-- and is easy to write a comparison test for in an "application".

SELECT TOP 1000000

IDENTITY(INT,1,1) AS RowNum,

CAST(RAND(CAST(NEWID() AS VARBINARY))*9999+1 AS INT) AS UserID,

CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

+ CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS SomeValue

INTO dbo.CSVTest

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

--===== A table is not properly formed unless a Primary Key has been assigned

ALTER TABLE dbo.CSVTest

ADD PRIMARY KEY CLUSTERED (RowNum)

--===== ...and, we'll do a little code optimization by adding an index to

-- support the upcoming query.

CREATE NONCLUSTERED INDEX CSVTest_UserID_SomeValue

ON dbo.CSVTest (UserID,SomeValue)

The code that forms the UserID column creates random numbers from 1 to 9999 throughout the million rows. The code that forms the SomeValue column creates random occurances of the values "AA" through "AZ" throughout the million rows. To keep the test uncomplicated and true, dupes are allowed and will be present.

Ok, now for a function very similar to that function that RAYM wrote...

--===== Create a function to produce a CSV column from a set of data

-- identified by the input parameter @user-id

CREATE FUNCTION dbo.fTestCSV(@UserID INT)

RETURNS VARCHAR(8000)

AS

BEGIN

--===== Declare a working variable

DECLARE @CSV VARCHAR(8000)

--===== Concatenate the "SomeValue" column into a single CSV value

-- for a given user ID using a set based "loop"

SELECT @CSV = COALESCE(@CSV+',','')+CAST(SomeValue AS VARCHAR(10))

FROM CSVTest

WHERE UserID = @user-id

RETURN @CSV

END

We're all set for the test... the goal is to create a single CSV column for each UserID that contains all of the values of the SomeValue column in a million row table. Run this in Query Analyzer in the grid mode to simulate something like a "grid object" in an application. The actual code to run this test is surrounded by other code to measure the duration and performance... the code under test looks like this...

SELECT UserID,dbo.fTestCSV(UserID)

FROM dbo.CSVTest

GROUP BY UserID

...and here's that rascal embedded in parametric code... to keep from having to calculate the differences between the start and end parametrics, open a new window in Query Analyzer and paste the following code in it and then run it...

--===== Ensure that the data is NOT cached so we can get a

-- true measure of performance of the code being tested.

DBCC DROPCLEANBUFFERS

DBCC FREEPROCCACHE

--===== Declare a couple of variables to measure duration

DECLARE @StartTime DATETIME

DECLARE @EndTime DATETIME

--===== Show the condition of the resources used so far

-- Should be very close to zero if you run this in

-- a new window of Query Analyzer

SELECT CPU/1000.0 AS [CPU Seconds],

Physical_IO AS [Disk Reads/Writes],

MemUsage * 8.096 AS [KBytes of Memory Used]

FROM Master.dbo.SysProcesses

WHERE SPID = @@SPID

--===== Start the duration "timer"

SET @StartTime = GETDATE()

--===== Run the code being tested

SELECT UserID,dbo.fTestCSV(UserID)

FROM dbo.CSVTest

GROUP BY UserID

--===== Stop the duration "timer

SET @EndTime = GETDATE()

--===== Again, show the condition of the resources used.

-- To get the true measure, subtract the previous

-- values of resources display from these.

SELECT CPU/1000.0 AS [CPU Seconds],

Physical_IO AS [Disk Reads/Writes],

MemUsage * 8.096 AS [KBytes of Memory Used]

FROM Master.dbo.SysProcesses

WHERE SPID = @@SPID

--===== Finally, display the total duration of the test

-- code in decimal seconds

SELECT DATEDIFF(ms,@StartTime,@EndTime)/1000.0

On my humble 1.8Ghz single processor box, I get the following stats...

3.516 Seconds Total duration including display time by the "application"

3.062 Seconds Total CPU time used (3.109 - 0.047)

90 Total Disk Read/Writes (97-7)

0.000 KBytes Total memory used (0.000-0.000)

9,999 Total number of records transmitted to the "application"

Basically, SQL Server processed more than 320,000 rows each second... I'm thinking that even a very well written application might not be able to keep up with that. Certainly, the network folks are going to love it because instead of transmitting a million rows to an app, this transmits less than 1/100^th of that.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jesper-244176 SSCertifiable Points: 7142 More actions · Answer 9

Very interesting - and surprising, at least to me...

I have removed the rest of my comments now that Jeff has rewritten his post

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 10

Jesper, thanks for pulling me off the ceiling... you're the best.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 11

No problems Jeff (just had to google on 'recipro....' that wierd word.. )

If you feel like blowing off steam offline you're more than welcome to pm me.

I do sense an amount of misunderstanding on the matter.

If not, that's fine too

/Kenneth

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 12

Thanks for the offer, Ken... we're all set.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)