Concatenation of multiple records

  • Hi all, this is not an urgent question, it would however make a few processes here a fair amount simpler. In my SQL warehouse I am importing data from our production system (3rd party, so I can’t actually change the table/logic) for a set of custom notes entered by the users. However, the table is set up to hold 72 characters per record, when the user types more, a new record is created by the software and added with a sequence number. From the front end of the 3rd party system, these notes appear to the users as continuous.

    I would like to take that data once it enters the warehouse and merge the notes together. That is if customer ID AAB has a note which takes 10 records, I want to make one record with all 10 note records concatenated in a text field.

    Notes Table Layout (simplified)

    CREATE TABLE [dbo].[Notes] (

    [CustomerID] [char] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [NoteID] [char] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [Seqence] [int] NULL ,

    [Note] [char] (72) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ) ON [PRIMARY]

    Sample Data

    INSERT INTO Notes (CustomerID, NoteID, Seqence, Note)

    SELECT 'AAA1' AS CustomerID, 'XX12' AS NoteID, 10 AS Sequence, 'Note one for AAA1' AS Note

    union all

    SELECT 'AAA1' AS CustomerID, 'XX12' AS NoteID, 12 AS Sequence, 'Note two for AAA1' AS Note

    union all

    SELECT 'AAA1' AS CustomerID, 'XX12' AS NoteID, 25 AS Sequence, 'Note three for AAA1' AS Note

    union all

    SELECT 'BBB1' AS CustomerID, 'XX10' AS NoteID, 0 AS Sequence, 'Note One for BBB1' AS Note

    union all

    SELECT 'BBB1' AS CustomerID, 'XX10' AS NoteID, 30 AS Sequence, 'Note two for BBB1' AS Note

    union all

    SELECT 'AAA1' AS CustomerID, 'XX05' AS NoteID, 3 AS Sequence, 'Second Note for AAA1' AS Note

    From this I am attempting to arrive at…

    CustomerIDNoteIDNote

    AAA1XX12Note one for AAA1 Note two for AAA1 Note three for AAA1

    AAA1XX05Second Note for AAA1

    BBB1XX10Note One for BBB1 Note two for BBB1

    Some of the things which are giving me issues are, for any given customer ID, there can be any number of note sets (NoteID’s), and for any NoteID there can be any number of note sequences. (In the actual data it presently tops at 250 or so). The odd sequence numbers are intentional, the system assigns these via a method I don’t understand, I do know that the ascending sort on the sequence# does put the notes in the correct readable order.

    I have tried several ways to do this, and have had very limited success. If anyone has an idea on this I would be thrilled to hear it.

    Thanks!

  • What is the max that will be concatenated, i.e. the destination column datatype? char(8000), varchar(max), text?

    What version of SQL Server? SQL2005 and SQL2008 have more features related to this task.

  • As I look at the data I think I will be putting the result into a text data type, it is plausible that the data could grow beyond a varchar(8000).

    I am on SQL 2000, heh, apparently to make things more difficult it seems.

    Thanks!

  • Are you looking at an upgrade in the near future?

  • The text datatype means this must be done procedurally, i.e. in a loop of some kind.

    I'm not sure, but something like this may be more efficient outside of SQL Server. If you have another development environment, I recommend trying that first.

    Otherwise, you can use a cursor and loop through all records in order. On the first row of a particular CustomerID and NoteID, insert a row into your destination table. Get the pointer to the text data of that new row using the textptr() function. On the next row of the particular CustomerID and NoteID, append to the text data using that text pointer with the WRITETEXT statement.

    See "Modifying ntext, text, or image Values" in SQL Server Books Online for details.

  • We will be upgrading eventually to SQL2005, though it does not seem to be on any high priority list.

    I was starting to think the way to do this might end up procedural. Luckily this is something I want to do, not need to do with the data, gives me time to consider.

    Thanks everyone so far!!

  • You can select the distinct values of customerid and noteid, then pass these to a scalar function which queries the underly table something like this:

    create function gc

    @customerid varchar(50

    @noteID varchar(50)

    as

    declare @ss varchar(max)

    set @ss=''

    select @ss=dd + ', '

    from ctable

    where customerid=@customerid and nodeid=@noteid

    return @ss

    NB Code is very rough....

    Dan

  • I read that functions cannot return text datatype, at least in sql2000.

  • Ahh the joys of working with a poorly designed vendor software! (Watch out for that puddle of sarcasm...) Apparently they designed the note fields on the width of the screen the data is entered on. With each consecutive line being a new record.

    I am not keen on either a scalar function or cursor on this as the table is presently 1,000,000 records, which will continue to grow. It will more than likely grow faster going forward with greater emphasis being placed in documenting transactions here.

  • I have the same issue - except I have SQL 2005. I found this to be a great article http://www.projectdmx.com/tsql/rowconcatenate.aspx

    I am doing the for XML Path method and it is very fast. It concatenates ~4K records into ~ 380 records in ~30 seconds. The CTE was very slow - over 2 minutes for the same recordset.

  • 30 seconds for only 4K records? Take a look at the following...

    http://www.sqlservercentral.com/articles/Test+Data/61572/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hey, thanks Jeff!

    Great info in the article... now to find some spare time to try to adapt that.

    There are times I want regular SQL experiment times here. (my inner mad scientist demands it.) Hmm.. perhaps it is the mad scientist attitude which makes the rest of IT get that look of fear.. :hehe:

  • Heh... it's the twiching eye that really gets to them. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The eye! Of course, and here I thought it might be the lab coat and the hunchback ....

  • Heh... nah... I've got a couple of folks like that at work... they're actually pretty nice if you can get them to stop talking about their damned hump! 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply