June 28, 2010 at 9:35 pm
Comments posted to this topic are about the item Generating a Distinct Delimited List Using XML
June 28, 2010 at 11:18 pm
Appreciate your article.
This is with lot of explanation, but we can achieve the output more efficient way with the sorting of rolename also:
select distinct a.userid, substring(b.rolename, 1, len(b.rolename )-1) from @UserRole a
cross apply (select distinct RoleName + ', ' from @UserRole where a.userid = userid
for xml path('')) b (rolename)
June 28, 2010 at 11:20 pm
WITH CTE
AS
(
SELECT ROW_NUMBER() OVER
(
PARTITION BY u2.UserID
ORDER BY MIN(u2.RoleAssignedDate)
) AS RowNum,
u2.UserID,
u2.RoleName
FROM @UserRole u2
GROUP BY u2.UserID,
u2.RoleName
)
SELECT DISTINCT u1.UserID,
SUBSTRING
(
(
SELECT ', ' + c.RoleName AS '*'
FROM CTE c
WHERE u1.UserID = c.UserID
ORDER BY c.RowNum
FOR XML PATH('')
), 3, 8000
) AS Roles
FROM @UserRole u1;
Should give the same result without the XQuery.
June 28, 2010 at 11:55 pm
Thanks Kailash Mishra.
The query that you gave will give result in an alphabetical order instead of the order in which the roles were assigned. The article is about retrieving delimited values in the order they were stored.
Thanks & regards,
Puja
June 29, 2010 at 12:00 am
Thanks Mike for new approach. 🙂
June 29, 2010 at 2:43 am
I like this article, it was a problem I had to solve previously.
I particularly like the way you solved the problem of the final ',' at the end of the string. Very neat.
I have to say I was totally lost by the end of your explanation and found the code too difficult to follow.
The following is an adaptation of your 2nd stage code, but using Group By RoleName which then allows Order By MIN(RoleAssignedDate) and thereby giving the order required.
SELECT DISTINCT u1.UserID,
REPLACE(
(SELECT u2.RoleName + ',' AS 'data()'
FROM @UserRole u2
WHERE u2.UserID = u1.UserID
GROUP BY RoleName
ORDER BY MIN(RoleAssignedDate)
FOR XML PATH('')
) + '$', ',$', ''
) AS Roles
FROM @UserRole u1
Cheers,
Nick
June 29, 2010 at 3:39 am
Let's do some testing...SET NOCOUNT ON
-- Prepare sample data
DECLARE@UserRole TABLE
(
UserID INT NOT NULL,
RoleName VARCHAR(100) NOT NULL,
ProjectID INT NOT NULL,
RoleAssignedDate DATETIME NOT NULL
)
-- Populate sample data
INSERT@UserRole
(
UserID,
RoleName,
ProjectID,
RoleAssignedDate
)
VALUES(1112, 'Technical Director', 2041, '1967-02-20 04:21:13.490'),
(1357, 'Training', 1614, '1961-09-14 16:18:59.990'),
(1836, 'Technical Director', 1628, '1987-07-30 11:22:45.060'),
( 715, 'Accounting', 1487, '1995-01-08 11:46:17.670'),
( 162, 'Approver', 1548, '2001-07-02 11:34:14.260'),
(1975, 'Technical Director', 1614, '1955-11-10 03:55:05.560'),
(1112, 'System Administrator', 831, '1956-04-25 08:26:54.040'),
( 162, 'Technical Director', 986, '1989-12-17 15:51:04.330'),
( 715, 'Accounting', 461, '1954-11-27 00:45:52.830'),
(1357, 'Developer', 2064, '2006-12-15 23:43:55.470'),
(1867, 'Technical Director', 1416, '2003-04-11 16:50:01.070'),
(1975, 'Developer', 1548, '1967-05-11 17:01:26.840'),
(1975, 'Accounting', 1089, '1988-06-20 00:52:16.070'),
( 162, 'Marketing', 1443, '1995-06-10 14:29:23.290'),
(1112, 'Accounting', 2109, '1999-12-05 09:07:46.620'),
( 162, 'Technical Director', 1089, '1975-09-25 12:13:12.590'),
( 162, 'International Sales Manager', 1628, '1984-05-30 09:25:18.330'),
(1836, 'Technical Customer', 420, '1993-05-28 00:49:31.090'),
(1357, 'Technical Director', 2036, '1979-04-25 23:38:48.120'),
(1836, 'Developer', 1628, '2006-07-26 20:36:53.420')
-- Start timing
SET STATISTICS IO ON
SET STATISTICS TIME ON
-- Peso
PRINT '*** Peso ***********************************************************************'
SELECTb.UserID,
STUFF(f.Roles, 1, 1, '') AS Roles
FROM(
SELECTUserID
FROM@UserRole
GROUP BYUserID
) AS b
CROSS APPLY(
SELECTTOP(100) PERCENT
',' + w.RoleName
FROM@UserRole AS w
WHEREw.UserID = b.UserID
GROUP BYw.RoleName
ORDER BYMIN(RoleAssignedDate)
FOR XMLPATH('')
) AS f(Roles)
ORDER BYb.UserID
-- Puja Shah
PRINT '*** Puja Shah ******************************************************************'
SELECTb.UserID,
LEFT(b.Roles, LEN(b.Roles) - CHARINDEX(',', REVERSE(b.Roles))) AS Roles
FROM(
SELECT DISTINCTa.UserID,
CAST(a.Roles.query('distinct-values(/Root/Roles)') AS VARCHAR(MAX)) AS Roles
FROM(
SELECTu1.UserID,
CAST(
(
SELECT DISTINCTu2.RoleName + ',' AS Roles,
ROW_NUMBER() OVER(PARTITION BY u2.UserID ORDER BY u2.RoleAssignedDate) AS RID
FROM@UserRole AS u2
WHEREu2.UserID = u1.UserID
ORDER BYRID
FOR XMLPATH(''),
ROOT('Root')
) AS XML
) AS Roles
FROM@UserRole AS u1
) AS a
) AS b
ORDER BYb.UserID
-- Kailash Mishra<- Fails ordering
PRINT '*** Kailash Mishra *************************************************************'
SELECT DISTINCTa.UserID,
SUBSTRING(b.RoleName, 1, LEN(b.RoleName) -1)
FROM@UserRole AS a
CROSS APPLY(
SELECT DISTINCTb.RoleName + ','
FROM@UserRole AS b
WHEREb.UserID = a.UserID
FOR XMLPATH('')
) AS b(RoleName)
ORDER BYa.UserID
-- Mike C
PRINT '*** Mike C *********************************************************************'
;WITH cte
AS (
SELECTROW_NUMBER() OVER (PARTITION BY u2.UserID ORDER BY MIN(u2.RoleAssignedDate)) AS RowNum,
u2.UserID,
u2.RoleName
FROM@UserRole AS u2
GROUP BYu2.UserID,
u2.RoleName
)
SELECT DISTINCTu1.UserID,
SUBSTRING(
(
SELECT',' + c.RoleName AS '*'
FROMcte AS c
WHEREu1.UserID = c.UserID
ORDER BYc.RowNum
FOR XMLPATH('')
), 2, 8000
) AS Roles
FROM@UserRole AS u1
ORDER BYu1.UserID
-- Nick Hanson
PRINT '*** Nick Hanson ****************************************************************'
SELECT DISTINCTu1.UserID,
REPLACE (
(
SELECTu2.RoleName + ',' AS 'data()'
FROM@UserRole AS u2
WHEREu2.UserID = u1.UserID
GROUP BYu2.RoleName
ORDER BYMIN(u2.RoleAssignedDate)
FOR XMLPATH('')
) + '$', ',$', ''
) AS Roles
FROM@UserRole AS u1
ORDER BYu1.UserID
-- Stop timing
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
N 56°04'39.16"
E 12°55'05.25"
June 29, 2010 at 7:13 am
Another variation on the previous suggestions. See what happens to all the other solutions when any of the roles is changed into for example 'R&D'. Plus, I don't like the replace()-approach to get rid of the last ','. I rather don't generate it if it's not needed instead of taking it off in an additional processing step:
SELECT
u1.UserID,
(
select case row_number() over (order by (select 1)) when 1 then '' else ', ' end
+ t.RoleName AS [text()]
from (
select row_number() over (order by min(u2.RoleAssignedDate)) as nr, u2.RoleName
from @UserRole u2
where u2.UserID = u1.UserID
group by u2.RoleName
) t
order by t.nr
for xml path(''), type
).value('.', 'nvarchar(max)') as Roles
from (
select
UserID
from @UserRole
group by
UserID
) u1
order by u1.UserID
June 29, 2010 at 8:04 am
I recommend using a CLR for this. Microsoft has a great example here http://msdn.microsoft.com/en-us/library/ms131056.aspx. I'm curious what the performance differences are. Admittedly, I have not compared the two because I've never had the CLR function not perform adequately.
You can then use this as an aggregate, for example.
SELECT dbo.List(myColumn)
FROM myTable
GROUP BY SomeOtherColumn
June 29, 2010 at 8:21 am
--An alternative without ROW_NUMBER()
--How about this?
select distinct
u.UserID
, stuff((
select ', ' + u2.RoleName
from @UserRole u2
where u2.UserID = u.UserID
and u2.RoleAssignedDate = (
select min(u3.RoleAssignedDate)
from @UserRole u3
where u3.UserID = u2.UserID
and u3.RoleName = u2.RoleName
)
order by u2.RoleAssignedDate asc
for xml path('')
), 1, 2, '') 'RoleName'
from @UserRole u
June 29, 2010 at 10:32 am
This is a good article thanks!
I was going to mention that using GROUP BY instead of DISTINCT/ROW_NUMBER might be a good way to solve this, but I see someone has already suggested that. IMO DISTINCT should hardly ever be used, as it usually isn't the best option, like in this case it prevents you from being able to order the results by what you want.
June 29, 2010 at 2:03 pm
Hi Brian, why would you want to eliminate row_number() from the query? Simply copy and paste your code into the test script that SwePeso posted and see for yourself how your version compares to the alternatives already presented. Peso's suggestion has by far the least IO and cpu time; I've taken that and added a fix for a functional problem in all suggestions: the special-characters &, < and > that for xml translates into & amp ;, & lt ; and & gt ; respectively (I put some additional spaces in that aren't in the real codes because the forum software shows the complete codes as &, < and > again). Your example re-introduces that problem and performs much worse because it re-introduces distinct instead of the group by in a sub query. So is there any specific reason why you think your version should be used instead of mine?
edit: codes were not shown correctly by the forum software.
June 29, 2010 at 2:48 pm
OK, then use GROUP:
--An alternative without ROW_NUMBER()
--How about this?
select
u.UserID
, stuff((
select ', ' + u2.RoleName
from @UserRole u2
where u2.UserID = u.UserID
and u2.RoleAssignedDate = (
select min(u3.RoleAssignedDate)
from @UserRole u3
where u3.UserID = u2.UserID
and u3.RoleName = u2.RoleName
)
order by u2.RoleAssignedDate asc
for xml path('')
), 1, 2, '') 'RoleName'
from @UserRole u
group by u.UserID
*** R.P.Rozema *****************************************************************
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#4F1BAE10'. Scan count 8, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
*** Brian Barkauskas ***********************************************************
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#4F1BAE10'. Scan count 8, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
June 29, 2010 at 3:10 pm
Actually for readability, this one's even better,
It has only 32 "words" compared to your 70,
182 non-space characters to your 363,
and runs just as fast:
select
u.UserID
, stuff((
select ', ' + u2.RoleName
from @UserRole u2
where u2.UserID = u.UserID
group by u2.RoleName
order by min(u2.RoleAssignedDate)
for xml path('')
),1,2,'') 'RoleName'
from @UserRole u
group by u.UserID
June 29, 2010 at 5:21 pm
Brian Barkauskas (6/29/2010)
OK, then use GROUP:
--An alternative without ROW_NUMBER()
--How about this?
select
u.UserID
, stuff((
select ', ' + u2.RoleName
from @UserRole u2
where u2.UserID = u.UserID
and u2.RoleAssignedDate = (
select min(u3.RoleAssignedDate)
from @UserRole u3
where u3.UserID = u2.UserID
and u3.RoleName = u2.RoleName
)
order by u2.RoleAssignedDate asc
for xml path('')
), 1, 2, '') 'RoleName'
from @UserRole u
group by u.UserID
*** R.P.Rozema *****************************************************************
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#4F1BAE10'. Scan count 8, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
*** Brian Barkauskas ***********************************************************
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
Table '#4F1BAE10'. Scan count 8, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
You good folks simply aren't using enough data to claim victory in any of the cases.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 26 total)
You must be logged in to reply to this topic. Login to reply