Reporting in A comma delimited list

Question

Post reply

Reporting in A comma delimited list

Pam Ozer

SSCommitted

Points: 1722
More actions
March 28, 2007 at 10:01 am

#116769

I have a table that has 160,000 Records that has Clientid,Publication. I have to create a table where I have the clientid and then a comma delimited list of all the publications related to this clientid so I can spit it out into a report. I created a udf (See Below) But this is taking forever to run because of the number of records.
Does anyone have any other ideas?

as
begin
declare @fields varchar(8000)
set
@fields = null
select
@fields = coalesce( @fields + ', ', '' ) + rtrim( Description )
from
ClientPublication
where
clientid = @tableid
return
@fields
end
--function

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Antares686 SSC Guru Points: 125444 More actions · Answer 1

What does you query look like for the actually run that calls the UDF?

Pam Ozer SSCommitted Points: 1722 More actions · Answer 2

select

distinct Clientid, dbo.concatFieldNames(clientID)Publication

INTO

#clientpublication

from

ClientPublication

It takes about 20 minutes to run

Antares686 SSC Guru Points: 125444 More actions · Answer 3

OK I thought so. The thing is this, even thou you have distinct all the rows will be handled first (meaning the function is fired for all rows) then distinct is applied which means the more rows the longer it takes.

Now I don't know for sure this will be super significant but it should be a great deal better. The below query get's the disting lisrt first then applies the function on fewer rows than your current query does.

Try this

select Clientid, dbo.concatFieldNames(clientID)Publication

INTO #clientpublication

FROM

(SELECT

DISTINCT Clientid

from ClientPublication) AS DCP

Also keep in mind because you have to build your comma delimited list even this may take a bit and will progressively get worse the more data that is added to your DB in the future.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

There's really only one way to get performance out of this code...

Without modification, the original query/function on only 10,000 rows takes 19 seconds and Antares' takes 17. On 20,000 rows, the original query/function takes a whopping 142 seconds and Antares', although less than half the time, still takes 62 seconds. Just imagine what those numbers will become on 160,000 rows!

So, what to do... here's the answer...

CREATE INDEX BigTestIndex ON dbo.ClientPublication (ClientID,Description)

With that in place, the original query/function on 20,000 rows drops to about 2 seconds and Antares' drops to about 1. On 200,000 rows, the original takes 15 seconds and Antare's remains very practical at about 3 seconds.

Of course, times will be dependent on how big Description is and a couple of other factors, but it made no difference as to whether the new index is clustered or not.

Hope this helps...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sergiy SSC Guru Points: 110208 More actions · Answer 5

Performance will be even better with

CREATE CLUSTERED INDEX BigTestIndex ON dbo.ClientPublication (ClientID,Description)

_____________
Code for TallyGenerator

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 6

Nope... it doesn't... I tested it. See my previous...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sergiy SSC Guru Points: 110208 More actions · Answer 7

It's probably because there are 2 or 3 descriptions per ID.

And that index is the only one.

It's not the case for a real system.

Create clustered index on another column, make 30 - 50 descriptions per ID and you'll see the difference.

_____________
Code for TallyGenerator

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 8

Actually, I did that... and a couple of other renditions, as well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sergiy SSC Guru Points: 110208 More actions · Answer 9

Very strange.

Last time I used it I've got noticeable performance improvement by making index on key column clustered.

_____________
Code for TallyGenerator

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 10

Yep... it is strange... the function does Index Seeks on the 2 part index and the outer Select that uses the function does and Index Scan because it's looking at the whole table. I moved the clustering back and forth between the two and even tried it with no clustering... all with no noticeable changes in time. And, the data I used is more or less random... you'd think it would make a huge difference.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Pam Ozer SSCommitted Points: 1722 More actions · Answer 11

Pam Ozer

SSCommitted

Points: 1722

March 29, 2007 at 6:00 am

#697617

Thanks all of you for your help.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 12

You bet, Pam. Let us know how it works out.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Pam Ozer SSCommitted Points: 1722 More actions · Answer 13

Huge Difference. It only takes 3 seconds to run now. I don't know why I didn't think of that.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 14

Outstanding! Thanks for the feedback!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)