Without using group_by or xml_path how to concatenate multiple rows of a same id into a single row

Question

Without using group_by or xml_path how to concatenate multiple rows of a same id into a single row

manibad

SSCommitted

Points: 1556
More actions
May 9, 2018 at 4:12 am

#330801

Input:
ID             DESCRIPTION
1            An
1            Apple
1            a
1            day
1            keeps
1            doctor
1            away
2            hello
3            The
3
3            XYZ
3            XYZZ
OUTPUT
ID             DESCRIPTION
1            An,Apple,a,day,keeps,doctor,away
2            hello
3            The,,XYZ,XYZZ
Have implemented using xml_path with stuff.
But I will be having millions of records like this.So i need a query to achieve my output which does not use group by or xml_path..
is it possible?
thanks in advance

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 1

What do you mean you'll be having millions of records? There are only three rows in your required result set. Please show us your xml_path query, and explain in more detail why it doesn't meet your requirements.

John

Thom A SSC Guru Points: 99143 More actions · Answer 2

What's wrong with using FOR XML PATH? You could do it using a self reference variable, but performance with FOR XML PATH would be far better.

As a quick example, I just created these queries and ran them in my dev environment:

SET STATISTICS IO ON; SET STATISTICS TIME ON;

DECLARE @Refs varchar(MAX) = '';

SELECT @Refs = @Refs + ',' + Polref@ FROM ic_brpolicy WHERE B@ = 0;

SET @Refs = STUFF(@Refs,1,1,'');

SELECT @Refs; GO

DECLARE @Refs varchar(MAX)

SELECT @Refs = STUFF((SELECT ',' + Polref@ FROM ic_brpolicy WHERE B@ = 0 FOR XML PATH('')),1,1,'');

SELECT @Refs; GO

SET STATISTICS IO OFF; SET STATISTICS TIME OFF;
The first query (using the self referencing variable), well, as of writing this post it's still executing (I'm at 8 and a half minutes right now). The second query, however, well, that ran in 339ms. That was only against about 230,000 rows.

FOR XML PATH is the way to go.

Thom~

Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk

manibad SSCommitted Points: 1556 More actions · Answer 3

John Mitchell-245523 - Wednesday, May 9, 2018 4:21 AM
What do you mean you'll be having millions of records? There are only three rows in your required result set. Please show us your xml_path query, and explain in more detail why it doesn't meet your requirements.
John

By saying millions of records means in my source i have nearly 10 million records in my source.So the query which i use should be efficient.

By using xml_path:
select distinct t1.ID,
STUFF((SELECT distinct ',' + t2.desc
   from Tst t2
   where t1.ID = t2.ID
    FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)')
   ,1,0,'') data
from Tst t1;

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 4

manibad - Wednesday, May 9, 2018 4:29 AM
John Mitchell-245523 - Wednesday, May 9, 2018 4:21 AM
What do you mean you'll be having millions of records? There are only three rows in your required result set. Please show us your xml_path query, and explain in more detail why it doesn't meet your requirements.
John
By saying millions of records means in my source i have nearly 10 million records in my source.So the query which i use should be efficient.
By using xml_path:
select distinct t1.ID,
STUFF((SELECT distinct ',' + t2.desc
   from Tst t2
   where t1.ID = t2.ID
    FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)')
   ,1,0,'') data
from Tst t1;

OK then. What are your "efficiency" criteria, and in what way does the query you posted fail to meet them? Strikes me that the most efficient way to solve this would be to normalise your database. I know you don't always have control over the structure of your database, but it's worth considering if you do.

John

DesNorton SSC-Insane Points: 24688 More actions · Answer 5

manibad - Wednesday, May 9, 2018 4:29 AM
John Mitchell-245523 - Wednesday, May 9, 2018 4:21 AM
What do you mean you'll be having millions of records? There are only three rows in your required result set. Please show us your xml_path query, and explain in more detail why it doesn't meet your requirements.
John
By saying millions of records means in my source i have nearly 10 million records in my source.So the query which i use should be efficient.
By using xml_path:
select distinct t1.ID,
STUFF((SELECT distinct ',' + t2.desc
   from Tst t2
   where t1.ID = t2.ID
    FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)')
   ,1,0,'') data
from Tst t1;

I stand to be corrected, but I believe that Group By should give you better performance.
THIS post by Aaron Bertrand gives nice coverage of the difference.

How to post data/code on a forum to get the best help.

Jeff Moden SSC Guru Points: 1004706 More actions · Answer 6

manibad - Wednesday, May 9, 2018 4:12 AM
Input:
ID             DESCRIPTION
1            An
1            Apple
1            a
1            day
1            keeps
1            doctor
1            away
2            hello
3            The
3
3            XYZ
3            XYZZ
OUTPUT
ID             DESCRIPTION
1            An,Apple,a,day,keeps,doctor,away
2            hello
3            The,,XYZ,XYZZ
Have implemented using xml_path with stuff.
But I will be having millions of records like this.So i need a query to achieve my output which does not use group by or xml_path..
is it possible?
thanks in advance

In 2012, you probably won't find a more efficient method than FOR XML PATH using GROUP BY.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)