Reporting in A comma delimited list

  • I have a table that has 160,000 Records that has Clientid,Publication.  I have to create a table where I have the clientid and then a comma delimited list of all the publications related to this clientid so I can spit it out into a report.    I created a udf (See Below) But this is taking forever to run because of the number of records.

    Does anyone have any other ideas?

     

    as

    begin

    declare @fields varchar(8000)

    set

    @fields = null

    select

    @fields = coalesce( @fields + ', ', '' ) + rtrim( Description )

    from

    ClientPublication

    where

    clientid = @tableid

    return

    @fields

    end

    --function

  • What does you query look like for the actually run that calls the UDF?

  • select

    distinct Clientid, dbo.concatFieldNames(clientID)Publication

    INTO

    #clientpublication

    from

    ClientPublication

     

     

    It takes about 20 minutes to run

  • OK I thought so. The thing is this, even thou you have distinct all the rows will be handled first (meaning the function is fired for all rows) then distinct is applied which means the more rows the longer it takes.

    Now I don't know for sure this will be super significant but it should be a great deal better. The below query get's the disting lisrt first then applies the function on fewer rows than your current query does.

    Try this

    select Clientid, dbo.concatFieldNames(clientID)Publication

    INTO #clientpublication

    FROM

     (SELECT

      DISTINCT Clientid

      from ClientPublication) AS DCP

    Also keep in mind because you have to build your comma delimited list even this may take a bit and will progressively get worse the more data that is added to your DB in the future.

  • There's really only one way to get performance out of this code...

    Without modification, the  original query/function on only 10,000 rows takes 19 seconds and Antares' takes 17.  On 20,000 rows, the original query/function takes a whopping 142 seconds and Antares', although less than half the time, still takes 62 seconds.  Just imagine what those numbers will become on 160,000 rows! 

    So, what to do... here's the answer...

    CREATE  INDEX BigTestIndex ON dbo.ClientPublication (ClientID,Description)

    With that in place, the original query/function on 20,000 rows drops to about 2 seconds and Antares' drops to about 1.  On 200,000 rows, the original takes 15 seconds and Antare's remains very practical at about 3 seconds. 

    Of course, times will be dependent on how big Description is and a couple of other factors, but it made no difference as to whether the new index is clustered or not.

    Hope this helps...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Performance will be even better with

    CREATE CLUSTERED INDEX BigTestIndex ON dbo.ClientPublication (ClientID,Description)

    _____________
    Code for TallyGenerator

  • Nope... it doesn't... I tested it.  See my previous...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It's probably because there are 2 or 3 descriptions per ID.

    And that index is the only one.

    It's not the case for a real system.

    Create clustered index on another column, make 30 - 50 descriptions per ID and you'll see the difference.

    _____________
    Code for TallyGenerator

  • Actually, I did that... and a couple of other renditions, as well. 

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Very strange.

    Last time I used it I've got noticeable performance improvement by making index on key column clustered.

    _____________
    Code for TallyGenerator

  • Yep... it is strange... the function does Index Seeks on the 2 part index and the outer Select that uses the function does and Index Scan because it's looking at the whole table.  I moved the clustering back and forth between the two and even tried it with no clustering... all with no noticeable changes in time.  And, the data I used is more or less random... you'd think it would make a huge difference.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks all of you for your help.

  • You bet, Pam.  Let us know how it works out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Huge Difference.  It only takes 3 seconds to run now.  I don't know why I didn't think of that.

  • Outstanding!  Thanks for the feedback!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply