February 25, 2010 at 12:26 am
By the way... I took a look at most of the code in the contest you have going on... all you folks who are converting the number to VARCHAR(6), VARCHAR(7), VARCHAR(8), etc.: I'm thinking that your code may not actually produce the correct output when someone runs it for 150,000,000, which contains 9 digits... 😉
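For example, a minimal illustration of the failure mode (per the documented int-to-varchar conversion behavior, an int that won't fit in the target comes back as an asterisk rather than raising an error):
-- 150,000,000 has 9 digits, so varchar(8) silently returns '*'
SELECT CAST(150000000 AS VARCHAR(8)) AS TooNarrow,
       CAST(150000000 AS VARCHAR(10)) AS WideEnough;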
--Jeff Moden
Change is inevitable... Change for the better is not.
February 25, 2010 at 6:44 am
CirquedeSQLeil (2/24/2010)
WayneS (2/24/2010)
FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) - initial runs. But, when run repeatedly (and randomly), they actually flip-flop back and forth, but both stay consistently ~850ms. Both codes from that posted at ask...
What is your test method?
Nothing fancy... or noteworthy. Just copy each from ask.ssc into a different query window, and run repeatedly.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 25, 2010 at 6:49 am
reidres (2/24/2010)
...Gianluca's unpivot works for 1,000,000 rows only, no more or less. So it is neither scalable nor flexible - and who wants only a million rows?
That would be because he only does enough cross-joins in the millionrows CTE to get it to one million... add another cross-join of the thousandrows CTE, and it will blossom to 1 billion.
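To sketch the scaling (hypothetical stand-in seed here - see Gianluca's post for his actual UNPIVOT seed and CTE definitions):
with thousandrows(n) as
 (select top (1000) 1 from master.sys.all_columns),
millionrows(n) as
 (select 1 from thousandrows a cross join thousandrows b),
billionrows(n) as
 -- one more cross-join of the 1,000-row CTE: 1,000 x 1,000 x 1,000 = 1 billion
 (select 1 from millionrows m cross join thousandrows c)
select count_big(*) from billionrows; -- counting a billion rows will take a while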
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 25, 2010 at 8:52 am
WayneS (2/25/2010)
CirquedeSQLeil (2/24/2010)
WayneS (2/24/2010)
FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) - initial runs. But, when run repeatedly (and randomly), they actually flip-flop back and forth, but both stay consistently ~850ms. Both codes from that posted at ask...
What is your test method?
Nothing fancy... or noteworthy. Just copy each from ask.ssc into a different query window, and run repeatedly.
OK. I think you are comparing apples to oranges there. Did you modify them all so they were doing the same thing - all outputting results, all dumping into a temp table, or all just returning the last value?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 25, 2010 at 9:02 am
Lucky I went for varchar(255) then, Jeff.
That should cover it!
Cheers
Steve
February 26, 2010 at 8:24 am
I can cut somewhere between 0.2 and 4 per cent off the elapsed time at 150,000,000 rows for Jason's method (though limiting scalability to about 4.75 billion rows instead of 1,000 billion) by using a base CTE large enough that only 5 cross joins are needed, instead of the larger number needed with a 10-row base CTE. At least, on my machine the difference in run speed between the larger base CTE and the 10-row one varies between 0.2% and 3.8%; but we all know this stuff is environment-sensitive, and there may be no difference at all on another machine. Since the 6th root of 2**32 is about 40.3, this requires a 41-row base CTE if 2**32 is where I want to scale to.
Here's the code (it's clearly a tuned version of Jason's code, not anything new)
declare @Limit bigint set @Limit = 150000000 -- @Limit can be anything from 1 to 4750104241 (41**6)
set statistics time on
;with base(n) as
(select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all
select 6 union all select 7 union all select 8 union all select 9 union all select 10 union all
select 11 union all select 12 union all select 13 union all select 14 union all select 15 union all
select 16 union all select 17 union all select 18 union all select 19 union all select 20 union all
select 21 union all select 22 union all select 23 union all select 24 union all select 25 union all
select 26 union all select 27 union all select 28 union all select 29 union all select 30 union all
select 31 union all select 32 union all select 33 union all select 34 union all select 35 union all
select 36 union all select 37 union all select 38 union all select 39 union all select 40 union all
select 41 ),
bigun(n) as
(select 1 from
base b1
cross join base b2
cross join base b3
cross join base b4
cross join base b5
cross join base b6 )
SELECT count(all case
when n % 15 = 0 then 'FizzBuzz' --Multiples of 3 AND 5
when n % 3 = 0 then 'Fizz'
when n % 5 = 0 then 'Buzz'
else cast(n as VarChar(10))
end)
FROM ( SELECT ROW_NUMBER() OVER (ORDER BY n)
FROM bigun ) D (n)
WHERE n <= abs(@Limit) ;
set statistics time off;
go
Restricting the scalability to the range of a positive int would lead to a 36-row base CTE, but that doesn't have any useful effect on the speed (in fact it seems to have a bad effect: it always runs slightly slower with 36 base elements than with 41, which I don't understand, because with 41 all 5 joins are needed to get to 150,000,000, just as they are with 36 elements). I'm going to look at the actual query plans - predicted ones won't be useful - but I don't really expect to discover what's going on.
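For anyone wanting to repeat that, actual post-execution plans can be captured in a script like this (a general technique, not anything specific to this test):
set statistics xml on
-- the actual plan, including runtime row counts, is returned alongside the result set
select top (10) name from master.sys.all_columns
set statistics xml off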
Tom
February 26, 2010 at 1:49 pm
Tom's version sure beats this more conventional CTE approach - by about a factor of 4.
CAVEAT
Don't use this one. There are better ones out there!
set statistics time on
;
with
cte1 as
(
select top 150000000 row_number() over (order by (select 0)) as row from master.sys.all_columns
cross join master.sys.all_columns AS T2
cross join master.sys.all_columns AS T3
)
SELECT count(all case
when row%15 = 0 then 'FIZZBUZZ'
when row%5 = 0 then 'BUZZ'
when row%3 = 0 then 'FIZZ'
else cast(row as varchar(9))
end)
from cte1
set statistics time off
February 26, 2010 at 5:26 pm
However, if I rewrite my previous post using a WHERE clause instead of TOP, it makes a massive difference and comes very, very close to Tom's.
set statistics time on
;
with
cte1 as
(
select row_number() over (order by (select 0)) as row from master.sys.all_columns
cross join master.sys.all_columns AS T2
cross join master.sys.all_columns AS T3
)
SELECT count(all case
when row%15 = 0 then 'FIZZBUZZ'
when row%5 = 0 then 'BUZZ'
when row%3 = 0 then 'FIZZ'
else cast(row as varchar(9))
end)
from cte1 where row < 150000001
set statistics time off
February 26, 2010 at 5:36 pm
Tom.Thomson (2/26/2010)
I can cut somewhere between 0.2 and 4 per cent off the elapsed time at 150,000,000 rows for Jason's method (though limiting scalability to about 4.75 billion rows instead of 1,000 billion) by using a base CTE large enough that only 5 cross joins are needed, instead of the larger number needed with a 10-row base CTE. At least, on my machine the difference in run speed between the larger base CTE and the 10-row one varies between 0.2% and 3.8%; but we all know this stuff is environment-sensitive, and there may be no difference at all on another machine. Since the 6th root of 2**32 is about 40.3, this requires a 41-row base CTE if 2**32 is where I want to scale to.
Restricting the scalability to the range of a positive int would lead to a 36-row base CTE, but that doesn't have any useful effect on the speed (in fact it seems to have a bad effect: it always runs slightly slower with 36 base elements than with 41, which I don't understand, because with 41 all 5 joins are needed to get to 150,000,000, just as they are with 36 elements). I'm going to look at the actual query plans - predicted ones won't be useful - but I don't really expect to discover what's going on.
Thanks for doing that. I will test it out and see how it works. Do you mind if I use this code for an article (blog or otherwise) if my testing shows the improvements?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 26, 2010 at 7:43 pm
CirquedeSQLeil (2/26/2010)
Thanks for doing that. I will test it out and see how it works. Do you mind if I use this code for an article (blog or otherwise) if my testing shows the improvements?
Hi Jason, I reckon you did all the spade work on this and I've just played about at the edges, as it were hanging on to your coat-tails. So as far as I'm concerned it's all yours to do with as you wish.
Tom
February 26, 2010 at 8:04 pm
Jason... you might also want to try Gianluca's enhancement with the original "seed" using UNPIVOT. Between what he came up with, your original work, and Tom's enhancement, you should end up with the fastest code on the block.
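For reference, the general shape of an UNPIVOT seed (just a sketch of the idea with made-up column names, not Gianluca's actual code) is to turn one multi-column row into a multi-row base set without a stack of UNION ALLs:
select n
from (select 1 c1, 2 c2, 3 c3, 4 c4, 5 c5, 6 c6, 7 c7, 8 c8, 9 c9, 10 c10) s
unpivot (n for col in (c1, c2, c3, c4, c5, c6, c7, c8, c9, c10)) u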
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2010 at 9:42 pm
Jeff Moden (2/26/2010)
Jason... you might also want to try Gianluca's enhancement with the original "seed" using UNPIVOT. Between what he came up with, your original work, and Tom's enhancement, you should end up with the fastest code on the block.
I like the suggestion. I had tested Gianluca's code, but hadn't delved into it.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 26, 2010 at 9:46 pm
Tom.Thomson (2/26/2010)
CirquedeSQLeil (2/26/2010)
Thanks for doing that. I will test it out and see how it works. Do you mind if I use this code for an article (blog or otherwise) if my testing shows the improvements?
Hi Jason, I reckon you did all the spade work on this and I've just played about at the edges, as it were hanging on to your coat-tails. So as far as I'm concerned it's all yours to do with as you wish.
Thanks - you will get credit for your work.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 28, 2010 at 5:41 am
In the meantime is this a possibility? This variation also sets the threshold to 4750104241 (1681 x 1681 x 1681), the same as 41^6.
set statistics time on
;
with
cte1 as
(select TOP 1681 1 as col from master..spt_values)
,
cte2 as
(select row_number() over (order by (select 0)) as row from cte1
cross join cte1 AS T2
cross join cte1 AS T3)
SELECT count(all case
when row%15 = 0 then 'FIZZBUZZ'
when row%5 = 0 then 'BUZZ'
when row%3 = 0 then 'FIZZ'
else cast(row as varchar(9))
end)
from cte2 where row < 150000001
set statistics time off
February 28, 2010 at 9:09 am
steve-893342 (2/28/2010)
In the meantime is this a possibility? This variation also sets the threshold to 4750104241 (1681 x 1681 x 1681), the same as 41^6.
It's absolutely a possibility. Just understand that, since you're reading from a table, this will have a lot more reads than the memory-only solutions that others have offered. I've not tested physical cross-joins in a CTE, but I do know that simple cross-joins on a physical table to generate very large numbers of rows can cause explosive growth of log files, even when the database is in SIMPLE recovery mode. When I tested using more than one cross-join to return a billion rows several months ago, one of the log files (I forget which one) exploded to over 40GB. Doing the same test with the memory-only solutions caused zero growth.
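A quick way to see the read difference for yourself (an illustrative check, not from my original tests):
set statistics io on
-- table-backed row source: logical reads against spt_values show up in the IO stats
select count(*) from (select top (1681) 1 as c from master..spt_values) t;
-- memory-only row source: a constant scan reports no table reads at all
with base(n) as (select 1 union all select 2 union all select 3)
select count(*) from base;
set statistics io off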
--Jeff Moden
Change is inevitable... Change for the better is not.