February 22, 2009 at 1:39 pm
Hi All,
I have a table with the columns empname, dateofbirth, and group. An employee can be in different groups during his service.
Data is as follows.
henry - 10/20/1960 - 2
henry - 10/20/1960 - 3
henry - 10/20/1960 - 5
steeve - 08/17/1965 - 8
steeve - 08/17/1965 - 10
steeve - 08/17/1965 - 9
steeve - 08/17/1965 - 4
steeve - 08/17/1965 - 2
Laura - 09/12/1967 - 3
I need to display the data in a single row, with all groups as columns, as below.
henry - 10/20/1960 - 2 - 3 - 5
steeve - 08/17/1965 - 8 - 10 - 9 - 4 - 2
Laura - 09/12/1967 - 3
Regards
Mac
February 22, 2009 at 4:17 pm
This ought to get you started. The key is using FOR XML to concatenate the strings for the groups each employee belonged to. To understand it, focus on everything from STUFF down to [Concatenated].
The subquery selects the rows for a given empName/svcDate combination, ordered by the wrkGroup column, prefixing each group value with a comma. Because FOR XML PATH('') is given an empty element name and the column expression is unnamed, no XML tags are generated, so the result is a single concatenated, comma-delimited string. The STUFF removes the leading comma from position 1 of that string. The concatenated string is returned together with the empName and svcDate. The GROUP BY in the outer query is necessary to keep the concatenated string from being returned multiple times.
You should be able to adapt this to your data. Let me know if you have any questions.
Bob
-------------
declare @data table (empname varchar(30), svcdate varchar(10), wrkGroup int)

insert into @data
select 'Henry', '10/20/1960', 2 union all
select 'Henry', '10/20/1960', 3 union all
select 'Henry', '10/20/1960', 5 union all
select 'Steve', '08/17/1965', 8 union all
select 'Steve', '08/17/1965', 10 union all
select 'Steve', '08/17/1965', 9 union all
select 'Steve', '08/17/1965', 4 union all
select 'Steve', '08/17/1965', 2 union all
select 'Laura', '09/12/1967', 3

select * from @data

select empname, svcDate,
       stuff((select ',' + cast(wrkGroup as varchar(5)) -- prefix each group with a comma
              from @data d2
              where d2.empName = d1.empname and d2.svcDate = d1.svcDate
              order by wrkGroup
              for xml path('')                          -- empty path = no tags, just the string
             ), 1, 1, '') as [Concatenated]             -- strip the leading comma
from @data d1
group by empName, svcDate -- one row per employee/date
order by empName, svcDate
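For reference, here's the result set the final select should produce:

empname  svcDate     Concatenated
-------  ----------  ------------
Henry    10/20/1960  2,3,5
Laura    09/12/1967  3
Steve    08/17/1965  2,4,8,9,10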
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 22, 2009 at 8:38 pm
sriram (2/22/2009)
I need to display the data in a single row, with all groups as columns, as below.
henry - 10/20/1960 - 2 - 3 - 5
steeve - 08/17/1965 - 8 - 10 - 9 - 4 - 2
Laura - 09/12/1967 - 3
Regards
Mac
I'm curious, Mac... Why do you need to do this? I mean, what are the business requirements that demand this?
--Jeff Moden
Change is inevitable... Change for the better is not.
February 22, 2009 at 8:51 pm
Why ask why? 😉
I get these kinds of requests from our reports group. Some people want to squeeze as much information as they can onto a screen or page, so they can scan it with their eyes. With hundreds of employees the format
henry - 10/20/1960 - 2 - 3 - 5
steeve - 08/17/1965 - 8 - 10 - 9 - 4 - 2
Laura - 09/12/1967 - 3
reads (scans) easier than
henry - 10/20/1960
- 2
- 3
- 5
steeve - 08/17/1965
- 8
- 10
- 9
- 4
- 2
Laura - 09/12/1967
- 3
Obviously it could just be formatted as XML, or we could return the entire rowset to the user interface in the same format as his input data, but that multiplies how much data has to be moved and also adds complexity to the user interface.
You know I respect your opinions, Jeff, so please tell me why this bothers you enough to question the need to do it at all?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 22, 2009 at 9:28 pm
Bob Hovious (2/22/2009)
Why ask why? 😉 I get these kinds of requests from our reports group. Some people want to squeeze as much information as they can onto a screen or page, so they can scan it with their eyes. With hundreds of employees the format...
That would be precisely why I ask "why"... this type of formatting should be done in the GUI/report generator if one is available. The server is too valuable a resource to do GUI formatting with. Distribute the load... send the data to the GUI as a result set and let the GUI do the formatting.
And, no, it's not like you have to send more data to do this in this particular case.
Now, let's see what the OP has to say...
--Jeff Moden
Change is inevitable... Change for the better is not.
February 22, 2009 at 9:45 pm
To be more clear...
If it were a cross tab of aggregated numbers, it could save a huge amount on how much you stuff into the pipe. But, even then, the data (compressed by aggregation) could be sent vertically and pivoted at the GUI end.
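Just to illustrate what I mean (table and column names made up for the example), the server only needs to send the small, vertical result set...

SELECT empName, wrkGroup, SUM(hoursWorked) AS totalHours
FROM dbo.SomeTimeTable
GROUP BY empName, wrkGroup

... and the GUI can pivot that into one column per group, instead of making the server do the cross tab itself with CASE expressions:

SELECT empName,
       SUM(CASE WHEN wrkGroup = 1 THEN hoursWorked ELSE 0 END) AS [Group 1],
       SUM(CASE WHEN wrkGroup = 2 THEN hoursWorked ELSE 0 END) AS [Group 2],
       SUM(CASE WHEN wrkGroup = 3 THEN hoursWorked ELSE 0 END) AS [Group 3]
FROM dbo.SomeTimeTable
GROUP BY empName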
Yeah... I know... I just recently finished two articles on cross tabs... and I've been kicking myself ever since because of the way people abuse what the server can do instead of letting the GUI do what it should be doing.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 23, 2009 at 2:48 pm
We can get it using a cursor, but many people do not like it.
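Something along these lines, for example (a rough sketch against Bob's @data table from above, just to show the shape of it):

declare @emp varchar(30), @dt varchar(10), @grp int
declare @prevEmp varchar(30), @prevDt varchar(10), @list varchar(8000)
declare @out table (empname varchar(30), svcdate varchar(10), groups varchar(8000))

declare c cursor local fast_forward for
    select empname, svcdate, wrkGroup
    from @data
    order by empname, svcdate, wrkGroup

open c
fetch next from c into @emp, @dt, @grp
while @@fetch_status = 0
begin
    if @prevEmp = @emp and @prevDt = @dt
        set @list = @list + ',' + cast(@grp as varchar(5))   -- same employee: append
    else
    begin
        if @prevEmp is not null
            insert @out select @prevEmp, @prevDt, @list      -- new employee: flush the previous one
        select @prevEmp = @emp, @prevDt = @dt, @list = cast(@grp as varchar(5))
    end
    fetch next from c into @emp, @dt, @grp
end
if @prevEmp is not null
    insert @out select @prevEmp, @prevDt, @list              -- flush the last employee
close c
deallocate c

select * from @out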
February 23, 2009 at 3:09 pm
You said a dirty word... I can't hear you... lalalalalaLAlalALALALALAAAAAAAA
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 23, 2009 at 4:33 pm
Cursor... my word... A very effective non-cursor solution has already been posted. Why ya gotta go swearing like that? 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
February 23, 2009 at 4:36 pm
Sooooo.... Sriram.... your question has people using the "C" word. Can we please hear back from you as to why you want to do this? Not looking to hang you out to dry or anything... would just like to know and, maybe, just maybe, there might even be a better solution depending on why you think you need to do this.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 23, 2009 at 6:14 pm
Hey Jeff, thought this might interest you. The script generates just over 10,000 employee/department rows in #temp2 and does the concatenation using FOR XML.
Usually, the concatenation increases the I/O time by 15-25% over the time needed to simply display all the rows in #temp2. Sometimes the concatenation even seems to run faster than a simple display. ORDER BY doesn't seem to change the picture much, but I didn't do an order by on dept in addition to lastname/firstname when simply displaying the rows in #temp2. That would probably cut the margin by 5 percent or so.
I used the following to create a LastNames and a FirstNames table, then just typed in 50 names off the top of my head, in some cases repeating common last names like 'Smith' and 'Jones'. Any random character strings will do, though.
CREATE TABLE [dbo].[LastNames](
    [LastNameID] [int] IDENTITY(1,1) NOT NULL,
    [LastName] [varchar](20) NULL,
    CONSTRAINT [PK_LastNames] PRIMARY KEY CLUSTERED
    (
        [LastNameID] ASC
    )
);

CREATE TABLE [dbo].[FirstNames](
    [FirstNameID] [int] IDENTITY(1,1) NOT NULL,
    [FirstName] [varchar](20) NULL,
    CONSTRAINT [PK_FirstNames] PRIMARY KEY CLUSTERED
    (
        [FirstNameID] ASC
    )
);
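If you don't want to type 50 names yourself, any generated strings will do, e.g. (assuming your Tally table has the usual N column):

INSERT dbo.LastNames (LastName)
SELECT TOP 50 'LName' + CAST(N AS varchar(5)) FROM Tally ORDER BY N;

INSERT dbo.FirstNames (FirstName)
SELECT TOP 50 'FName' + CAST(N AS varchar(5)) FROM Tally ORDER BY N;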
------------------------------
-- concatenation test
------------------------------
set nocount on;
--- pseudo-random keys: 1-50 for the first/last name FKs, 1-8 for department
--- (requires a Tally table with at least 100,000 rows)
SELECT TOP 100000
IDENTITY(INT,1,1) AS RowID,
ABS(CHECKSUM(NEWID())) % 50 + 1 AS FKey,
ABS(CHECKSUM(NEWID())) % 50 + 1 AS LKey,
ABS(CHECKSUM(NEWID())) % 8 + 1 AS Dept
INTO #temp
FROM Tally;
select distinct firstName,lastname,dept
into #temp2
from #temp
join Lastnames on lastNameID = Lkey
join FirstNames on firstNameID = Fkey
select Firstname,Lastname,count(*)
from #temp2
group by firstName,LastName
having count(*) < 6
order by count(*) desc
select count(*) from #temp2
set statistics time on;
select firstName,LastName
,stuff((SELECT ',' + cast(dept as varchar(2))
FROM #temp2 t2
WHERE t2.LastName = t1.LastName and t2.firstName = t1.firstName -- must match GROUP BY below
ORDER BY dept
FOR XML PATH('')
),1,1,'') as [Departments]
from #temp2 t1
GROUP BY LastName,firstName -- without GROUP BY multiple rows are returned
--ORDER BY LastName,firstName
set statistics time off;
set statistics time on;
select * from #temp2
--ORDER BY LastName,firstName
set statistics time off;
drop table #temp
drop table #temp2
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 23, 2009 at 6:49 pm
Nice chunk of test generation code, Bob. Well done, and thanks for posting it.
Yeah, if ya just gotta do some concatenation, I very much like the FOR XML PATH method... it's surprisingly fast even if no indexes come into play.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 23, 2009 at 8:29 pm
You're welcome, Jeff. This has gotten me wondering about something I would appreciate your thoughts on. If we were storing results in a temp or permanent table, the concatenation should result in less logging, because we are only storing a fraction of the number of rows. When pushing results to any UI, I assume there is some load on the SQL Server for buffering and talking to the network. (Or are separate resources dedicated to that?) These are costs that aren't reflected in the SQL time and I/O stats we see.
My philosophy is always to minimize physical I/O, because drives run at mechanical speeds and RAM runs at the speed of light on a wire. Years ago, the first time I ever compressed a DOS drive, I was surprised to find that things loaded faster, not slower. Then I realized that the processor could uncompress data faster than the actuator could position itself to read more. (I'm assuming no RBAR is driving the number of calculations to infinity and beyond.) Concatenation like this is actually a form of compression, because we aren't having to repeat the LastName/FirstName combination for each department.
In any event, I'm comfortable enough with the time differential to keep concatenating with FOR XML.
I think that load is going to be amongst the least of my worries.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 23, 2009 at 8:46 pm
In full recovery mode, as with I/O, I don't believe it's the number of rows that actually matters... it's the byte count.
For I/O, using aggregation, like SUM, for numbers has the effect of reducing the number of bytes actually passed (send just the answers, not all the data).
With string aggregation (i.e., concatenation), the only thing you're saving is a CrLf EOL mark... not really worth taxing the server to do the concatenation, in my mind. I'd rather pass the rows (which must be read either way) and let the client do the concatenation.
I agree on the compression thing you spoke of... not very many people take into account the relatively huge amount of time it takes to position the read/write head arm. That's why some erroneously say that a fragmented drive doesn't really matter... they have no idea. It's also why I like SpeedDisk better than DiskKeeper... SpeedDisk actually reorganizes and repacks all the files into the outer cylinders of the hard drive, where the media speed is much higher than at the inner cylinders, and you can pack more data per cylinder, so the R/W head doesn't need to move as much to read more data.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 23, 2009 at 9:01 pm
Have things changed since I first learned about disk storage? Inner or outer tracks didn't use to matter, because both were moving at the same RPM. I was taught that, arguably, the outside is going faster because more disk travels under the read/write head in the same amount of time, but you couldn't read and write with the same precision, so basically you wound up with the same number of bits per track no matter how far out from the center you moved.
The fastest storage on disk was always in the middle, because the actuator could only move so far from the middle. Storage on the edge had longer to wait for a mechanical move and seek if the actuator was in close to the center, and vice versa. I haven't read up on it in 20 years, though, so I wouldn't be surprised if the rules were different now.
Also, I think you missed the point about compression.
If I return 1000 rows that read
John|Smith|1,2,3 (assume that's the average length)
instead of 3000 rows that read
John|Smith|1
John|Smith|2
John|Smith|3
it looks to me like I'm saving two copies of "John|Smith|" per employee in exchange for a couple of commas: roughly 36 bytes down to 16 per employee, before row overhead. That seems like a pretty good trade.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills