March 28, 2007 at 10:01 am
I have a table that has 160,000 Records that has Clientid,Publication. I have to create a table where I have the clientid and then a comma delimited list of all the publications related to this clientid so I can spit it out into a report. I created a udf (See Below) But this is taking forever to run because of the number of records.
Does anyone have any other ideas?
as
begin
declare @fields varchar(8000)
set
@fields = null
select
@fields = coalesce( @fields + ', ', '' ) + rtrim( Description )
from
ClientPublication
where
clientid = @tableid
return
@fields
end
--function
March 28, 2007 at 10:08 am
What does you query look like for the actually run that calls the UDF?
March 28, 2007 at 11:14 am
select
distinct Clientid, dbo.concatFieldNames(clientID)Publication
INTO
#clientpublication
from
ClientPublication
It takes about 20 minutes to run
March 28, 2007 at 11:18 am
OK I thought so. The thing is this, even thou you have distinct all the rows will be handled first (meaning the function is fired for all rows) then distinct is applied which means the more rows the longer it takes.
Now I don't know for sure this will be super significant but it should be a great deal better. The below query get's the disting lisrt first then applies the function on fewer rows than your current query does.
Try this
select Clientid, dbo.concatFieldNames(clientID)Publication
INTO #clientpublication
FROM
(SELECT
DISTINCT Clientid
from ClientPublication) AS DCP
Also keep in mind because you have to build your comma delimited list even this may take a bit and will progressively get worse the more data that is added to your DB in the future.
March 28, 2007 at 7:34 pm
There's really only one way to get performance out of this code...
Without modification, the original query/function on only 10,000 rows takes 19 seconds and Antares' takes 17. On 20,000 rows, the original query/function takes a whopping 142 seconds and Antares', although less than half the time, still takes 62 seconds. Just imagine what those numbers will become on 160,000 rows!
So, what to do... here's the answer...
CREATE INDEX BigTestIndex ON dbo.ClientPublication (ClientID,Description)
With that in place, the original query/function on 20,000 rows drops to about 2 seconds and Antares' drops to about 1. On 200,000 rows, the original takes 15 seconds and Antare's remains very practical at about 3 seconds.
Of course, times will be dependent on how big Description is and a couple of other factors, but it made no difference as to whether the new index is clustered or not.
Hope this helps...
--Jeff Moden
Change is inevitable... Change for the better is not.
March 28, 2007 at 7:50 pm
Performance will be even better with
CREATE CLUSTERED INDEX BigTestIndex ON dbo.ClientPublication (ClientID,Description)
_____________
Code for TallyGenerator
March 28, 2007 at 8:14 pm
Nope... it doesn't... I tested it. See my previous...
--Jeff Moden
Change is inevitable... Change for the better is not.
March 28, 2007 at 8:19 pm
It's probably because there are 2 or 3 descriptions per ID.
And that index is the only one.
It's not the case for a real system.
Create clustered index on another column, make 30 - 50 descriptions per ID and you'll see the difference.
_____________
Code for TallyGenerator
March 28, 2007 at 10:07 pm
Actually, I did that... and a couple of other renditions, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 28, 2007 at 11:04 pm
Very strange.
Last time I used it I've got noticeable performance improvement by making index on key column clustered.
_____________
Code for TallyGenerator
March 29, 2007 at 5:57 am
Yep... it is strange... the function does Index Seeks on the 2 part index and the outer Select that uses the function does and Index Scan because it's looking at the whole table. I moved the clustering back and forth between the two and even tried it with no clustering... all with no noticeable changes in time. And, the data I used is more or less random... you'd think it would make a huge difference.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 29, 2007 at 6:00 am
Thanks all of you for your help.
March 29, 2007 at 6:05 am
You bet, Pam. Let us know how it works out.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 29, 2007 at 6:56 am
Huge Difference. It only takes 3 seconds to run now. I don't know why I didn't think of that.
March 29, 2007 at 5:22 pm
Outstanding! Thanks for the feedback!
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply