June 16, 2008 at 7:30 am
Jeff - I will post my own script later. I kind of got busy yesterday and didn't get a chance to come back and finish the thought so to speak.
I basically gussied this particular scenario up to a 1M test, playing with a few scenarios, making the "average size of the group" a variable. The break-even point was somewhere around 20 (meaning - if each person had 20 pets in this scenario); more pets, and the function beat out the running total; more owners, and the running total setup won.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 16, 2008 at 7:40 am
Matt Miller (6/16/2008)
Jeff - I will post my own script later. I kind of got busy yesterday and didn't get a chance to come back and finish the thought so to speak.I basically gussied this particular scenario up to a 1M test, playing with a few scenarios, making the "average size of the group" a variable. The break-even point was somewhere around 20 (meaning - if each person had 20 pets in this scenario); more pets, and the function beat out the running total; more owners, and the running total setup won.
With a few exceptions (Steve and his ranch possibly) if an owner has 20 pets, you should just run sp_ASPCA and solve the problem
---------------------------------------------------------
How best to post your question[/url]
How to post performance problems[/url]
Tally Table:What it is and how it replaces a loop[/url]
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
June 16, 2008 at 1:36 pm
Jeff Moden (6/15/2008)
Then the XML solution will work.Just curious... why did you post on a 2k forum instead of a 2k5 forum? Folks wouldn't have wasted their time trying to come up with a 2k solution if you had...
Ohh... got me a litle confused..... I didn't notice I'm publishing in 2k forum, just saw this now (after you paid my attension to this).
I'm realy sorry for this ...
thanks alot to anyone that helped!
Adi
June 16, 2008 at 2:40 pm
jcrawf02 (6/16/2008)
...you should just run sp_ASPCA and solve the problem
Heh... I've found that it also works great on some developers 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
June 16, 2008 at 2:42 pm
adi.cohen (6/16/2008)
Ohh... got me a litle confused..... I didn't notice I'm publishing in 2k forum, just saw this now (after you paid my attension to this).I'm realy sorry for this ...
thanks alot to anyone that helped!
Adi
Sorry my own self... I sometimes come across as being a little bit harsh on things like this... I don't really mean to. I guess you can blame it on the users I have to put up with at work 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
June 16, 2008 at 2:44 pm
Jeff Moden (6/16/2008)
jcrawf02 (6/16/2008)
...you should just run sp_ASPCA and solve the problemHeh... I've found that it also works great on some developers 😀
You can try that, but the invisible fence collars are effective as well!
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 16, 2008 at 2:50 pm
Matt Miller (6/16/2008)
Jeff Moden (6/16/2008)
jcrawf02 (6/16/2008)
...you should just run sp_ASPCA and solve the problemHeh... I've found that it also works great on some developers 😀
You can try that, but the invisible fence collars are effective as well!
Yeah, but not nearly as much fun as euthanasia... or spaying/neutering...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 16, 2008 at 3:59 pm
Jeff Moden (6/16/2008)
Matt Miller (6/16/2008)
Jeff Moden (6/16/2008)
jcrawf02 (6/16/2008)
...you should just run sp_ASPCA and solve the problemHeh... I've found that it also works great on some developers 😀
You can try that, but the invisible fence collars are effective as well!
Yeah, but not nearly as much fun as euthanasia... or spaying/neutering...
I don't know...those collars hooked up to a "human-sized" battery, say - a 9V or better, that could be some serious fun...:)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 16, 2008 at 4:04 pm
All right - back to reality for one post....
This was what I was using to test the performance I mentioned earlier. Essentially - I'm using a @sample to control the average size of the groups. I now have it essentially at break-even (average grouping = 12.5). Drop the sample to 50K (average grouping = 20), and the function wins; increase to 125000 (grouping = 8) and the running total wins in reverse.
drop table #petnames
drop table #petnames2
drop table persons
drop table person_pets
drop table #temp
go
drop function concatf
go
create function concatf(@id as int)
RETURNS varchar(max)
as begin
declare @retstr as varchar(max)
set @retstr='';
select @retstr=@retstr+','+name
from person_pets
where person_id=@id;
set @retstr=substring(@retstr,2,len(@retstr));
return @retstr
end
go
create table persons(id int identity(1,1) primary key clustered, name varchar(20))
create table person_pets(person_ID int, name varchar(20))
declare @sample int
set @sample=80000;
insert persons(name)
select top(@sample) left(cast(newId() as varchar(100)),10)
from sys.columns sc1, sys.columns sc2;
insert person_pets
select top 1000000 rand(checksum(newid()))*@sample+1,
left(cast(newId() as varchar(100)),10)
from sys.columns sc1, sys.columns sc2;
--a couple of vars
declare @g datetime
set @g=getdate();
declare @tmppetnames varchar(max)
set @tmppetnames='';
declare @dummy varchar(max)
declare @prevperson int
set @prevperson=0;
--====== capture all of the persons
Select person_Id,
identity(int,1,1) as rid,
[Name],
Cast('' as varchar(max)) as pet_name
Into #temp
From person_pets
create unique clustered index pktmp on #temp(person_Id,RID)
--====== add in all of the pet names
Update #temp
Set @tmppetnames=pet_name = case when @prevperson=person_ID then @tmppetnames +',' else '' end + Name,
@prevperson=person_id,
@dummy=@tmppetnames
From #temp with (index(pktmp), tablockx)
select persons.*, pets
into #petnames
from
persons
left outer join
(select person_id, max(pet_name) pets
from #temp
group by person_id) petlist
on persons.id=petlist.person_id
select 'running', datediff(ms,@g,getdate());
go
--concatenation method
declare @g datetime
set @g=getdate();
create index j on person_pets(person_ID,name)
select persons.*, dbo.concatf(ID) as pnames
into #petnames2
from persons
select 'func', datediff(ms,@g,getdate());
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 16, 2008 at 8:41 pm
Ummmm.... sorry Matt... the function is not written correctly and it has not been called correctly. The function has two extra SET statements which slows it down and the call to the function will produce duplicates (if I'm not mistaken and I sometimes am ;))
The function should be written as follows for the max performance...
CREATE FUNCTION Concatf(@ID AS INT)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @RetStr AS VARCHAR(MAX)
SELECT @RetStr=COALESCE(@RetStr+',','')+NAME
FROM Person_Pets
WHERE Person_ID = @ID;
RETURN @RetStr
END
Notice that VARCHAR(8000) is probably more than enough to handle even Steve Jone's animals and it makes the function nearly twice as fast as VARCHAR(MAX). Compared to VARCHAR(8000), VARCHAR(MAX) carries a relatively terrible overhead.
Remember the duplication that occurs... remember the following article...
http://www.sqlservercentral.com/articles/Test+Data/61572/
--Jeff Moden
Change is inevitable... Change for the better is not.
June 16, 2008 at 8:57 pm
You're right on at least one side. The function can be built better. I was aware that the concat wouldn't lead to that large of a string, so varchar(max) was a cost that could probably be avoided.
That being said - the dupes you're talking about aren't happening, since the "persons" table is the "distinct" list of personID (the ones that are duplicated throughout the petnames). Meaning - I am following your advice to make sure you don't run it against the 1M table, only against the smaller, distinct list.
So yes - you're correct - the better function makes life quite a bit harder to beat, but still - it does even out in the end. It looks to me that the break-even is now around 125000. Even after dropping the maximum down to 2000 (which was more than enough) and using your concatf function.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 16, 2008 at 9:10 pm
Outstanding... thanks for the feedback, Matt. And, I'm sorry... I didn't realize that you were already beating the distinct drum on the usage of the function...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 16, 2008 at 9:19 pm
Jeff Moden (6/16/2008)
Outstanding... thanks for the feedback, Matt. And, I'm sorry... I didn't realize that you were already beating the distinct drum on the usage of the function...
Oh - no issue at all. If we're going to test for performance - might as well test with the best possible setup, so the function needs to be tight. So - thanks for catching me on that one.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 17, 2008 at 5:48 am
Update.....
I kind of came across an ugly realization. The running total scenario has potential to "blow up" your tempDB rather spectacularly, if you play on the "extreme low end" of this scenario. As the UPDATE cycles through the rows in a group, the variable is holding on to ever larger strings, and pushing a copy into EACH RECORD as it navigates through. So - with very large groups, you end up with some HUGE temp tables.
I have something I'm ruminating, which might fix that problem, but as of now that algorithm could cause some real uglies. Amazing what a phrase like "processing duplicates" will jar out of you.
As of now - that still leaves the concat function as the overall best scenario in the case when you "don't know", since its perf doesn't balloon out on the "side it doesn't like" in the same way.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 17, 2008 at 7:46 am
Yup - I can fix the one issue on tempDB, but it still performs rather poorly on few, big groups. So - while it can beat the concat function under certain data distribution scenarios, I'd say the concat function is STILL the better option.
In other words - the running total scenario runs a LOT worse than the function in its worse case, but the function doesn't perform that much worse in its worse case. So - the function is a better/safer approach, especially when you don't have a good bead on data distribution.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Viewing 15 posts - 16 through 30 (of 34 total)
You must be logged in to reply to this topic. Login to reply