September 22, 2009 at 2:01 pm
the sqlist (9/22/2009)
Jeff Moden (9/22/2009)
I'm using SQL SERVER Profiler with the data produced by the queries routed to a temp table. Routing the output to the screen gives false readings because it takes so long to display the data that it looks like all methods are the same. That's why I called outputting to the screen the "great equalizer".
I guess you're right, here is a much simpler way to test performance though:
set statistics time on
select number into #tmp from [dbo].[ufn_Tally_zb] (1,100000,1) order by number
drop table #tmp
Of course, you have to execute both functions using the example template above, in different windows and not at the same time. Mine is indeed much slower because it creates all 1M rows from the start and filters them afterwards. For big numbers, though, the difference drops drastically.
However, in order to make the function run on SQL 2000 and earlier you have to use my approach. Performance wasn't really my goal here, and I assume I could tweak it a little bit. Not today though. 😛
Heh.... now I'm not sure how you are testing. Like I said before... Lynn's runs in under 6 seconds for a 10M gen... yours (after adding 1 extra cross join to get there) took 65 seconds.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2009 at 2:02 pm
Lynn Pettis (9/22/2009)
row_number() over (order by a1.N)
Something I have noticed when working with the windowing functions: if the query itself does not have an ORDER BY, the ORDER BY in the windowing function appears to order the output.
I'd be interested if anyone else has seen this behaviour as well.
I have. And it makes sense. The engine has to sort the data to assign row numbers to it (or rank/dense rank), and changing it would actually add work.
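If anyone wants to see it in action, here's a minimal sketch (a throwaway temp table with made-up values). Without an outer ORDER BY the rows typically come back in the order ROW_NUMBER() sorted them, although that order isn't guaranteed by the engine:
-- minimal sketch; #Demo and its values are made up for illustration
create table #Demo (Val int)
insert into #Demo (Val) select 3
insert into #Demo (Val) select 1
insert into #Demo (Val) select 2
select
Val,
row_number() over (order by Val) as RowNum
from
#Demo -- note: no outer ORDER BY here
drop table #Demo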
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 22, 2009 at 2:04 pm
Lynn Pettis (9/22/2009)
First, leave it to you to squeeze another 33% out of a routine, awesome. I'll have to incorporate that change into my code. I think Steve needs to add a disclaimer to the top and bottom of this article:
Heh... I just get lucky. And disclaimers are for bad articles... not this one.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2009 at 2:06 pm
Jeff Moden (9/22/2009)
Heh.... now I'm not sure how you are testing. Like I said before... Lynn's runs in under 6 seconds for a 10M gen... yours (after adding 1 extra cross join to get there) took 65 seconds.
You're right, here are my results,
test:
set statistics time on
select * into #tmp from [dbo].[ufn_Tally2] (1,10000000,1) --order by number
drop table #tmp
results:
SQL Server Execution Times:
CPU time = 4890 ms, elapsed time = 4933 ms.
(10000000 row(s) affected)
under 5 secs.
Don't just give the hungry man a fish, teach him how to catch it as well.
the sqlist
September 22, 2009 at 2:09 pm
Jeff Moden (9/22/2009)
timothyawiseman (9/22/2009)
Great article Lynn. It seems that even if you aren't permitted to change the schema, if you are going to use it multiple times you might get better results by putting it into a temp table instead of generating it dynamically over and over. Still, if it must be dynamically generated, this is a great technique.
True enough. But I've found that the same sites that don't allow a permanent Tally table also have an "angry" DBA that will frequently not allow any TempDB usage nor any access to things like sysColumns in any code... that's where dynamic generation can really save the day, especially if it's nasty fast like this one is.
Heh... here's to "angry" DBA's... I get to charge for the time I take to work around their restrictions. 😛
And most of them won't realize that "you can't do anything in tempdb" really means "you can't do anything at all in the whole database except for simple CRUD, and don't you dare use Order By in any of your selects". SQL Server has a real tendency to do a lot of work in tempdb behind the scenes, even if you don't explicitly tell it to.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 22, 2009 at 2:19 pm
BWAA-HAA!!! Correct... guess it really matters how "angry" the DBA really is. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2009 at 2:20 pm
Jeff Moden (9/22/2009)
Lynn Pettis (9/22/2009)
First, leave it to you to squeeze another 33% out of a routine, awesome. I'll have to incorporate that change into my code. I think Steve needs to add a disclaimer to the top and bottom of this article:
Heh... I just get lucky. And disclaimers are for bad articles... not this one.
I don't know, Jeff, even good articles have great discussions attached, just like this one. You call it luck squeezing out more performance, I call it applied mathematics. Had I taken the time to look at the math, I would have seen what you did as well. I was so busy looking at the routine itself and comparing it to that other one, I didn't even consider tuning mine more.
September 22, 2009 at 2:48 pm
Jeff beat me to the punch on limiting the total number of rows processed as an added speed boost.
Lynn, try this with the function:
select *
from dbo.ufn_tally2(1,-1,1);
You can correct for that one by changing the final select:
select
((N - 1) * case when @pEndValue < @pStartValue and @pIncrement > 0 then @pIncrement * -1 else @pIncrement end) + @pStartValue as N
from
Tally
The added case statement handles having a non-negative increment when the top number is less than the bottom number. My assumption on this one is that the increment should have been negative. Another valid assumption would be that it should just return the starting number, since a single increment exceeds the value of the end number. That could also be coded, but would require a case statement in the Top value calculation instead of the final select.
Also:
select *
from dbo.ufn_tally2(1,20,0);
If the increment is 0, you'll get a div-0 error. If you want to avoid that, I assume you just want the min number and nothing else (increment 0 = add 0 as many times as you want, and you still have the base number). You can correct for that by modifying this part:
select top (isnull((abs(case when @pStartValue < @pEndValue
then @pEndValue
else @pStartValue
end -
case when @pStartValue < @pEndValue
then @pStartValue
else @pEndValue
end))/abs(nullif(@pIncrement, 0))+ 1, 1))
I added a NullIf around the increment value, and an IsNull around the whole thing, with a 1 as the alternate return value. That gives you just the base number, nothing else.
Not that anyone is likely to deliberately feed the function nonsense parameters like these, but in a dynamic environment, you never know what you're going to end up with if you use any sort of math to generate the parameters, or if you use user-entered values. If any part of the code calling the function ends up with a floating point calculation that rounds in a funky way, these minor changes will help keep it from crashing.
I haven't tested the performance on these changes. They probably add to the cost. (Edit: Of course, the competing function also doesn't correct for these errors, and adding correction to it would add to its cost as well.)
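If you want a couple of quick spot checks on those two fixes (assuming both of the changes above are in place), the intent is:
select * from dbo.ufn_tally2(1,-1,1)
-- expected: 1, 0, -1 -- the positive increment gets flipped because the end value is below the start value
select * from dbo.ufn_tally2(1,20,0)
-- expected: a single row, 1 -- the NULLIF/ISNULL pair avoids the div-0 and returns just the base number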
Other than those couple of suggestions, it looks like reasonably good code for what it's meant to do.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 22, 2009 at 2:51 pm
Lynn Pettis (9/22/2009)
Jeff Moden (9/22/2009)
Lynn Pettis (9/22/2009)
First, leave it to you to squeeze another 33% out of a routine, awesome. I'll have to incorporate that change into my code. I think Steve needs to add a disclaimer to the top and bottom of this article:
Heh... I just get lucky. And disclaimers are for bad articles... not this one.
I don't know, Jeff, even good articles have great discussions attached, just like this one. You call it luck squeezing out more performance, I call it applied mathematics. Had I taken the time to look at the math, I would have seen what you did as well. I was so busy looking at the routine itself and comparing it to that other one, I didn't even consider tuning mine more.
I agree. Even the best articles benefit from the discussion. That's what makes this site better than a print magazine.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
September 22, 2009 at 2:52 pm
Hmmmm.... hadn't thought about that. A note about the discussions would probably be appropriate here.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2009 at 2:59 pm
See, and now we are finding some very interesting errors in the code that I missed. Thanks, I do appreciate it. 🙂
September 22, 2009 at 3:13 pm
I incorporated the changes that both Gus and Jeff suggested, and using the simple SET STATISTICS TIME ON method of testing, the routine generates 1,000,000 rows in approximately 900ms.
Here is the code as an alter function:
USE [SandBox]
GO
/****** Object: UserDefinedFunction [dbo].[ufn_Tally2] Script Date: 09/22/2009 09:06:54 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
alter function [dbo].[ufn_Tally2](
@pStartValue bigint = 1,
@pEndValue bigint = 1000000,
@pIncrement bigint = 1
)
returns table
as
return(
with BaseNum (
N
) as (
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1
),
L1 (
N
) as (
select
bn1.N
from
BaseNum bn1
cross join BaseNum bn2
cross join BaseNum bn3
),
L2 (
N
) as (
select
a1.N
from
L1 a1
cross join L1 a2
),
L3 (
N
) as (
select top (isnull(abs(@pEndValue - @pStartValue) / abs(nullif(@pIncrement,0))+ 1,1)) --Limit N per parameters
a1.N
from
L2 a1
cross join L2 a2
),
Tally (
N
) as (
select
row_number() over (order by a1.N)
from
L3 a1
)
select
((N - 1) * case when @pEndValue < @pStartValue and @pIncrement > 0 then @pIncrement * -1 else @pIncrement end) + @pStartValue as N
from
Tally
);
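For reference, a timing harness along the lines of the one posted earlier in the thread (a sketch, not the exact script I ran):
set statistics time on
select N into #tmp from dbo.ufn_Tally2(1,1000000,1)
set statistics time off
drop table #tmp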
September 22, 2009 at 3:34 pm
Just to keep from doing the apples'n'oranges thing, I ran the new code with both changes. The optimization came pretty close to being negated by the extra checking. It took 5.670 seconds to gen the 10M rows into a temp table... but now it's just about "Gumby Proof".
As a sidebar, no matter what form the code has taken on this thread so far, all of the versions still blow the doors off of any form of recursion. I'm really surprised that no one caught that on SQL Mag.
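For anyone who hasn't seen the recursive alternative being compared against, here's a minimal sketch (not anyone's production code); it needs OPTION (MAXRECURSION 0) and falls way behind the cross-join methods at volume:
-- minimal sketch of a recursive-CTE number generator, for comparison only
with Nums (N) as (
select cast(1 as bigint)
union all
select N + 1 from Nums where N < 10000000
)
select N into #tmpRecursive from Nums
option (maxrecursion 0)
drop table #tmpRecursive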
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2009 at 4:21 pm
Jeff,
Test this one out:
USE [SandBox]
GO
/****** Object: UserDefinedFunction [dbo].[ufn_Tally2] Script Date: 09/22/2009 09:06:54 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
alter function [dbo].[ufn_Tally2](
@pStartValue bigint = 1,
@pEndValue bigint = 1000000,
@pIncrement bigint = 1
)
returns table
as
return(
with BaseNum (
N
) as (
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1
),
L1 (
N
) as (
select
bn1.N
from
BaseNum bn1
cross join BaseNum bn2
cross join BaseNum bn3
),
L2 (
N
) as (
select
a1.N
from
L1 a1
cross join L1 a2
),
L3 (
N
) as (
select top (isnull(abs(@pEndValue - @pStartValue) / abs(nullif(@pIncrement,0))+ 1,1)) --Limit N per parameters
a1.N
from
L2 a1
cross join L2 a2
),
Tally (
N
) as (
select
row_number() over (order by a1.N)
from
L3 a1
)
select
((N - 1) * (sign(@pEndValue - @pStartValue) * @pIncrement)) + @pStartValue as N
from
Tally
);
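A few quick sanity checks to show what the SIGN() change does (assuming the ALTER above has been run):
select N from dbo.ufn_Tally2(1,5,1) -- expected: 1, 2, 3, 4, 5
select N from dbo.ufn_Tally2(5,1,1) -- expected: 5, 4, 3, 2, 1 (SIGN() flips the increment for descending ranges)
select N from dbo.ufn_Tally2(1,20,0) -- expected: one row, 1 (zero increment returns just the start value)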
September 22, 2009 at 5:00 pm
I'll have to test it tomorrow. I'm on a different machine for the night.
--Jeff Moden
Change is inevitable... Change for the better is not.