Concatenating Rows

  • Min-437192 (10/18/2009)


    Yours is pretty cool, but what do I need to modify if I want to get rid of duplicates? 🙂

    Here's one way that will handle the VARCHAR(MAX) datatype about 11 seconds faster on a million row test than the other non-XML "MAX" methods.

    SELECT t1.SomeRowID,STUFF((SELECT [font="Arial Black"]DISTINCT[/font] ',' + t2.SomeCharValue

    FROM dbo.SomeTable t2

    WHERE t2.SomeRowID = t1.SomeRowID --Correlation here

    FOR XML PATH(''),TYPE).value('.','VARCHAR(8000)')

    ,1,1,'') AS ConcatenatedValue

    FROM dbo.SomeTable t1

    GROUP BY t1.SomeRowID --- without GROUP BY multiple rows are returned

    ORDER BY t1.SomeRowID

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • And here's a function for items that don't exceed 8K...

    --===== Create a function that uses VARCHAR(8000) with a DUPE Eliminator

    CREATE FUNCTION dbo.Concat8KTestDupElim

    (@SomeInt INT)

    RETURNS VARCHAR(8000)

    AS

    BEGIN

    DECLARE @ReturnString VARCHAR(8000)

    SELECT @ReturnString = ISNULL(@ReturnString+',' ,'') + d.SomeCharValue

    FROM [font="Arial Black"](SELECT DISTINCT SomeCharValue FROM dbo.SomeTable --Need to hardcode

    WHERE SomeInt = @SomeInt) d[/font]

    RETURN @ReturnString

    END

    GO

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Min-437192 (10/18/2009)


    Yours is pretty cool, but what do I need to modify if I want to get rid of duplicates? 🙂

    Simply add DISTINCT clause.

    create table fruit(name varchar(20))

    insert into fruit SELECT 'apple'

    union all select 'banana'

    union all select 'banana'

    union all select 'orange'

    union all select 'orange'

    union all select 'melon'

    union all select 'melon'

    union all select 'grape'

    union all select 'grape'

    declare

    @l varchar(8000)

    ,@comma varchar(2)

    set @l = ''

    set @comma = ''

    select @l = @l + @comma + name

    ,@comma = ', '

    from (select DISTINCT name from fruit) AS fruit

    ORDER BY name

    select @l

  • Jeff Moden (10/18/2009)


    And here's a function for items that don't exceed 8K...

    --===== Create a function that uses VARCHAR(8000) with a DUPE Eliminator

    CREATE FUNCTION dbo.Concat8KTestDupElim

    (@SomeInt INT)

    RETURNS VARCHAR(8000)

    AS

    BEGIN

    DECLARE @ReturnString VARCHAR(8000)

    SELECT @ReturnString = ISNULL(@ReturnString+',' ,'') + d.SomeCharValue

    FROM [font="Arial Black"](SELECT DISTINCT SomeCharValue FROM dbo.SomeTable --Need to hardcode

    WHERE SomeInt = @SomeInt) d[/font]

    RETURN @ReturnString

    END

    GO

  • Carlo Romagnano (10/18/2009)


    Min-437192 (10/18/2009)


    Yours is pretty cool, but what do I need to modify if I want to get rid of duplicates? 🙂

    Simply add DISTINCT clause.

    create table fruit(name varchar(20))

    insert into fruit SELECT 'apple'

    union all select 'banana'

    union all select 'banana'

    union all select 'orange'

    union all select 'orange'

    union all select 'melon'

    union all select 'melon'

    union all select 'grape'

    union all select 'grape'

    declare

    @l varchar(8000)

    ,@comma varchar(2)

    set @l = ''

    set @comma = ''

    select @l = @l + @comma + name

    ,@comma = ', '

    from (select DISTINCT name from fruit) AS fruit

    ORDER BY name

    select @l

    Thank you. I like this solution because it's simpler for my simple mind. 🙂

  • No problem... just remember that it leaves a trailing comma and does the whole table unless you add a criteria.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have tried the SQL CLR sample in MSDN that accomplishes the same thing by using a User-Defined Aggregate function. I think it works great (no issue so far and the performance is pretty good) and I have since created another UDAGG for doing a custom MAX function. (it's MAX(value, rank) and will return the value with the greatest rank assigned.

    http://msdn.microsoft.com/en-us/library/ms165055.aspx

    What do you guys think of UDAGG functions in general? As a .Net developer I have found it pretty powerful but would like to know how the experienced DBA thinks of it in terms of ease of development, performance, security, maintainability, etc. I am particularly interesting in seeing performance comparison between the CLR solution with the XML/ CTE solutions posted here.

  • Solution 2 can better be accomplished as follows:

    select(select name as "data()"

    from fruit

    order by id

    for xml path('')) as FruitList

    The as "data()" part removes the xml and leaves you with a space delimited list

    🙂

  • I see it's your first post on SSC... Welcome aboard!

    Again, as with many of the other posts, that does the whole table. If that's what you want, then you're all set. If not, then you need to add a bit of correlation to the outside SELECT. It will also produce some undesireable results if certain special characters show up in a name. And how would you use that if you wanted a non-space delimiter?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Good question! In the nested select you can filter, manipulate and cleanup the data as much as necessary before replacing the space delimiter with your choice of delimiter. If the data is properly cleaned and sanitized, this method will give you a properly delimited list without a trailing delimiter.

    I would probably write a function which would sanitize the data rather than actually use replace or trim within the nested select statement below:

    Example: Cleanup the list and replace with a comma delimiter

    declare @cDelimiter varchar(1)

    set @cDelimiter = ',';

    select replace((select ltrim(rtrim(name)) as "data()"

    from fruit

    where name like '%a%'

    order by id

    for xml path('')),' ',@cDelimiter) as FruitList

    Returns: Apple,Banana,Orange,Grape

    However, if one of the names contains a space we still have work to do...

  • Roxy Wehner (10/23/2009)


    Good question! In the nested select you can filter, manipulate and cleanup the data as much as necessary before replacing the space delimiter with your choice of delimiter. If the data is properly cleaned and sanitized, this method will give you a properly delimited list without a trailing delimiter.

    I would probably write a function which would sanitize the data rather than actually use replace or trim within the nested select statement below:

    Example: Cleanup the list and replace with a comma delimiter

    declare @cDelimiter varchar(1)

    set @cDelimiter = ',';

    select replace((select ltrim(rtrim(name)) as "data()"

    from fruit

    where name like '%a%'

    order by id

    for xml path('')),' ',@cDelimiter) as FruitList

    Returns: Apple,Banana,Orange,Grape

    However, if one of the names contains a space we still have work to do...

    All those "problems" are solved with the code that some have previously posted. Here's the code...

    SELECT t1.SomeRowID,STUFF((SELECT ',' + t2.SomeCharValue

    FROM dbo.SomeTable t2

    WHERE t2.SomeRowID = t1.SomeRowID --Correlation here

    FOR XML PATH(''),TYPE).value('.','VARCHAR(8000)') --Allows special characters

    ,1,1,'') AS ConcatenatedValue

    FROM dbo.SomeTable t1

    GROUP BY t1.SomeRowID --- without GROUP BY multiple rows are returned

    ORDER BY t1.SomeRowID

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm not too sure about the without using a cursor and without excessive wasted performance." part

  • What would be best way to insert a word like "and" between last two items instead of that specific comma?

  • bnordberg (10/14/2009)


    I recently had to concatenate ~1.5 billion rows of text (average length of 110 characters). These were medical notes where each line was a sepearte row. I tested CLR, Stored procedures ... nothing was a fast as the XML method. However I ran into tons of unicode characters that I had to replace. So I had a yucky 18 or so level nested replace. I was able to get the concatenation to work in 2.4 hours creating 34 million documents out of the 1.5 billion. The CLR would have definitly been useful to avoid the unicode character problem, but still I can't get it to beat the xml method.

    Hi, have you tried to extract strings from XML correctly as it described here?

    http://www.codeproject.com/Tips/122630/String-concatenation-in-Transact-SQL.aspx

    Anton Burtsev

  • Another option is:

    DECLARE @FruitList NVARCHAR(max)

    with

    fruit as (

    select 'Apple' as name, 101 as id union all

    select 'Banana' as name, 102 as id union all

    select 'Orange' as name, 103 as id union all

    select 'Melon' as name, 104 as id union all

    select 'Grape' as name, 105 as id

    )

    SELECT @FruitList=COALESCE(@FruitList + ', ', '')+ name

    FROM fruit

    SELECT @FruitList

    y

Viewing 15 posts - 106 through 120 (of 159 total)

You must be logged in to reply to this topic. Login to reply