February 24, 2010 at 2:24 pm
OK, Tom. The thing with your TSQL seems to be that while you are indeed generating the numbers, you are not specifically doing so in a way that the numbers can be converted to their mod-indicated values at the place/point-in-time you are generating your potential number list. Subject to further study...
February 24, 2010 at 2:48 pm
Lynn Pettis (2/24/2010)
Don't know, Tom; on the server I am using here at work yours actually runs about 4 times longer than my code, about 4 seconds. Really comes down to "It Depends".
I would expect mine to be slower on a reasonable machine, because it has to read 532 records (when going to 150000000) from a view in the master db and yours doesn't access any pre-existing data at all, but 4 times is a lot more than I thought it would be. I think from one of your earlier comments that you've changed yours since the version I have (moved the numbering to an extra layer outside the last join, or something like that, and it produced some sort of improvement) so that may account for some of the difference.
And yes, at the end of the day "It Depends". So when possible, I measure in the target environment (after thorough testing and a good deal of measurement in both development and test environments) - and that occasionally gives me a surprise.
Tom
February 24, 2010 at 2:56 pm
OK, I converted Tom's code so that the FizzBuzz conversion is done in the SELECT statement outside of the CTE, and at 1 000 000 rows the time is the same as for Lynn's code: 0:23 on my machine, with the Fizz Buzz in the display.
February 24, 2010 at 3:00 pm
Tom.Thomson (2/24/2010)
reidres (2/24/2010)
Well we are looking for apples, not oranges.
I'm slightly concerned about the apples and oranges issue: is what you are measuring just generating a cte with the numbers in and then counting them?
If so, I'm a bit surprised by the results you have quoted: I modified Lynn's version to just count instead of copying to a temp table, and on my machine that does 150,000,000 in roughly a quarter of the time Jason's takes to do its maximum 100,000,000. I'm wondering just what it is that I've got horribly wrong in copying the code, or maybe I've copied the wrong version from the wrong place.
Anyway, just for fun, here's an entry from me (based on the assumption that we are just generating the CTE and counting it). On my machine it's about 7% faster than Lynn's - assuming I'm measuring the right thing (maybe only because my machine is on my side - it's a very faithful old beast:hehe:).
use play -- choose a database to work in
go
declare @Limit bigint;
set @Limit = 150000000;
set statistics time on;
with some_columns as -- just as many as we need to get to the limit (cube root)
-- 1+POWER because @Limit is a bigint so POWER will return a bigint and may round down :(
(SELECT top (1+power(@Limit,1.0/3.0)) 0 as Z from Master.sys.all_columns),
numbers (J) as
(SELECT top (@Limit) ROW_NUMBER() OVER (ORDER BY AC1.Z) AS J
from some_columns AC1
CROSS JOIN some_columns AC2
CROSS JOIN some_columns AC3
)
select COUNT(*) from numbers
set statistics time off
go
Found a way to slow your code down: add the logic for FizzBuzz instead of just counting the numbers generated.
February 24, 2010 at 3:30 pm
reidres (2/24/2010)
OK, Tom. The thing with your TSQL seems to be that while you are indeed generating the numbers, you are not specifically doing so in a way that the numbers can be converted to their mod-indicated values at the place/point-in-time you are generating your potential number list. Subject to further study...
OK, I tried this instead (with smaller numbers) and it's completely dominated by streaming the output. I tried making the same change with the (outdated) version of Lynn's code that I have, with the same result: elapsed time is completely dominated by streaming the output. I agree with whoever it was (Lynn, ISTR) who pointed out that displaying 150000000 rows is going to be the great equaliser.
use play -- choose a database to work in
go
declare @Limit bigint;
set @Limit = 150000000;
set statistics time on;
with some_columns as -- just as many as we need to get to the limit (cube root)
-- 1+POWER because @Limit is a bigint so POWER will return a bigint and may round down :(
(SELECT top (1+power(@Limit,1.0/3.0)) 0 as Z from Master.sys.all_columns)
SELECT case when J%15=0 then 'FizzBuzz' -- multiple of 5 and of 3
when J%3=0 then 'Fizz'
when J%5=0 then 'Buzz'
else cast(J as varchar(8))
end as FizzBuzzNumber
from
(SELECT top (@Limit) ROW_NUMBER() OVER (ORDER BY AC1.Z) AS J
from some_columns AC1
CROSS JOIN some_columns AC2
CROSS JOIN some_columns AC3
) numbers
go
edit to put the code into a code box
Tom
February 24, 2010 at 4:25 pm
This will be the new testing standard, since I own that right now and no one is fighting me for it. You do a COUNT(ALL [put your mod conversion code here]). Tom tried to trick me by setting varchar(8) to see if I was really testing for the 150 000 000. Well, your little trick didn't work, Tom! So I have upped it to varchar(20) [which in an edit I just changed to plain varchar]. This is Tom's and it comes in @ 3:22 on my machine. The test shows that the numbers have been generated and forces the evaluation of the mod expression for every row.
declare @Limit bigint;
set @Limit = 150000000;
set statistics time on;
with some_columns as -- just as many as we need to get to the limit (cube root)
-- 1+POWER because @Limit is a bigint so POWER will return a bigint and may round down
(SELECT top (1+power(@Limit,1.0/3.0)) 0 as Z from Master.sys.all_columns)
SELECT COUNT(ALL case when J%15=0 then 'FizzBuzz'-- multiple of 5 and of 3
when J%3=0 then 'Fizz'
when J%5=0 then 'Buzz'
--else cast(J as varchar(20))
else cast(J as varchar)
end)
from
(SELECT top (@Limit) ROW_NUMBER() OVER (ORDER BY AC1.Z) AS J
from some_columns AC1
CROSS JOIN some_columns AC2
CROSS JOIN some_columns AC3
) numbers
go
It doesn't matter who tests it, but if you want me to test your stuff, post it here according to the form described in this post.
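For anyone who wants to post something in that form, here is a rough skeleton (a sketch only, not reidres's actual harness: the generator inside the CTE is just the cross-join approach already shown above, and you would swap in your own):
declare @Limit bigint;
set @Limit = 150000000;
set statistics time on;
with numbers (J) as
(
    -- replace this block with your own number generator;
    -- it only has to return J = 1 .. @Limit
    select top (@Limit) row_number() over (order by (select null)) as J
    from master.sys.all_columns ac1
    cross join master.sys.all_columns ac2
    cross join master.sys.all_columns ac3
)
select count(all case when J % 15 = 0 then 'FizzBuzz'
                      when J % 3 = 0 then 'Fizz'
                      when J % 5 = 0 then 'Buzz'
                      else cast(J as varchar(20))
                 end)
from numbers;
set statistics time off;
The COUNT(ALL ...) wrapper is what forces the CASE expression to be evaluated for every generated row without streaming 150 000 000 lines to the client.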
February 24, 2010 at 6:03 pm
OK, I have run all four of the scalable/credible contenders at 150 000 000 rows, using the COUNT(ALL xxxx) method described earlier. They are listed in the order I tested them, which turned out to be the order of the times.
Lynn : 3:51
Tom : 3:16
Kev : 1:35 (Itzik-based solution)
CirqueDeSQLeil: 0:43 (the solution he posted on this thread not on ASK thread)
I looked at some of the other solutions and there were none that seemed like contenders for the list above, whether because of scalability or other issues. Fatherjack's COALESCE solution looked interesting but is very slow and not set up for scalability. Some of the solutions are just set to do a certain number of rows, which cannot be altered as easily as changing a number.
February 24, 2010 at 7:49 pm
reidres (2/24/2010)
Tom tried to trick me by setting varchar(8) to see if I was really testing for the 150 000 000. Well your little trick didn't work Tom!
It wasn't a trick; it was a silly and careless mistake on my part.
The original question was only up to 100, but the output column had to be 8 wide to hold 'FizzBuzz', so I picked varchar(8) (and that has been lying around in my scraps of code since I first spent 5 minutes throwing a set-oriented solution and a non-set-oriented solution together). When I suggested that a better form of the question would be to specify 100000000, I changed my code not to display the results. I had a version that copied to a table, but I tested that only at 1000000, 10000000 and 25000000 because it was rather slow (copying to the table took far longer than building the numbers). The versions I tested at 100000000 and at 150000000 just counted and didn't do the fizzbuzz bit, so I didn't get any warnings there either. For the last version I just copied in the output code that had worked before but had only been tested up to 1000000, and because I didn't want to wait while my rather slow system pushed out 150000000 lines to the screen I only tested to 1000000, so I didn't trip over the too-small column. I broke one of my own rules and handed over code that hadn't been tested up to the design limit.
I guess I will have to upgrade my graphics driver and associated display performance optimisations - I don't like missing a bug like that just because the slowness of displaying results on my system makes me inclined to take shortcuts in testing.
Tom
February 24, 2010 at 9:15 pm
FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) - initial runs. But, when run repeatedly (and randomly), they actually flip-flop back and forth, but both stay consistently ~ 850ms. Both codes from that posted at ask...
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
February 24, 2010 at 9:57 pm
WayneS (2/24/2010)
FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) - initial runs. But, when run repeatedly (and randomly), they actually flip-flop back and forth, but both stay consistently ~ 850ms. Both codes from that posted at ask...
Over here we are testing at 150 000 000 rows. Gianluca's unpivot works for 1 000 000 rows only, no more and no less, so it is neither scalable nor flexible--and who only wants a million rows? If you just want the fastest list possible, perhaps a list of 15 would be fastest. The unpivot does not blow anyone out of the water even in a 1M-to-1M comparison, but if he designs it to scale we will see what it does @ 150 000 000 rows.
February 24, 2010 at 10:06 pm
WayneS (2/24/2010)
FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) - initial runs. But, when run repeatedly (and randomly), they actually flip-flop back and forth, but both stay consistently ~ 850ms. Both codes from that posted at ask...
What is your test method?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
February 24, 2010 at 10:26 pm
CirquedeSQLeil (2/24/2010)
WayneS (2/24/2010)
FYI, Gianluca's unpivot works the fastest on my system (813ms), though Lynn's is right behind (833ms) - initial runs. But, when run repeatedly (and randomly), they actually flip-flop back and forth, but both stay consistently ~ 850ms. Both codes from that posted at ask...
What is your test method?
I am not answering for Wayne, but I ran the unpivot vs. yours @ 1M rows, looping 20 executions of each, a couple of times, and they are a statistical tie in performance at 1M.
Lots of interesting approaches up here. Love it!
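For the curious, a minimal sketch of the kind of timing loop being described (not the exact harness that was run; dbo.FizzBuzz_Test here is a hypothetical wrapper around whichever candidate query is under test):
declare @i int;
declare @start datetime;
declare @total_ms bigint;
set @i = 1;
set @total_ms = 0;
while @i <= 20
begin
    set @start = getdate();
    exec dbo.FizzBuzz_Test;  -- hypothetical proc wrapping the candidate solution
    set @total_ms = @total_ms + datediff(millisecond, @start, getdate());
    set @i = @i + 1;
end
select @total_ms / 20 as avg_elapsed_ms;  -- average elapsed time over the 20 runs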
February 24, 2010 at 10:41 pm
Jeff Moden (2/24/2010)
Manie Verster (2/23/2010)
Jeff, I forgot some points in my previous post to you. To hardcode a query is never good, and there you have me. It looks like I am getting floored here today :hehe::hehe::hehe:! The other point (also a good one) is that I did not read the question properly and therefore did it wrong. I should have replaced the numbers and not added columns to the query. The global temp table point I cannot agree with. Using the global temp table was a trademark of my mentor, and in all the 10 years I have been working for him, not one query has failed with a global temp table. I will however research these temp tables and maybe I can write an article about it. :hehe::hehe::hehe::hehe: Jeff, as a last point, although I like to take you on, I like you, and in the time since I joined SSC I have learned a lot from you. I would like you to add a query to Steve's Ask competition because I want to kick your a...........backside.
Heh... that's the spirit, Manny. You took it all very well, thank you.
I'll try to remember to post some code here that shows when and how the use of Global Temp tables can cause a failure during multiple simultaneous runs.
As a side bar, I said I wouldn't post a solution to the FizzBuzz problem and I won't go back on my word to that effect just because there's a contest. I have to let you good folks have all the fun on that one. Sorry.
Here you go, Manny. I clipped the important part from your previous code and changed the data source of the FROM clause so it'll be able to build a few million rows (5 million in this case).
First, open one window in SSMS and paste the following code... change the time in the WAITFOR to, oh, 5 minutes from now (whatever "now" is when you do this) and then click "Execute".
WAITFOR TIME '23:27:00' -- both windows pause here until their scheduled start times
declare @end datetime
-- ##rowtest is a GLOBAL temp table: a single copy shared by every session on the instance
if isnull(object_id('tempdb..##rowtest'),0) <> 0 drop table ##rowtest
select top 5000000 identity(int,1,1) idkey into ##rowtest from MASTER.sys.All_Columns ac1, MASTER.sys.All_Columns ac2
select idkey, case when idkey % 3 = 0 then 'Fizz' else '' end [fizz],
case when idkey % 5 = 0 then 'Buzz' else '' end [buzz],
case when idkey%3 = 0 and idkey%5 = 0 then 'FizzBuzz' else '' end [fizzbuzz]
from ##rowtest
Then, open up a second window in SSMS and paste the same code from above. Change the time to be the same time as in the first window plus 1 second. For example, if the first window has a time of 23:27:00, then make the time in this second window 23:27:01. Then click 'Execute' and go back to watch what happens in the first window...
... here's what I get in the first window when the second window drops your Global Temp table.
(5000000 row(s) affected)
Msg 208, Level 16, State 1, Line 7
Invalid object name '##rowtest'.
If you've been using Global Temp tables without a problem for 10 years like you say, then you've been very, very lucky with both your table naming convention and your run timing, indeed.
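For contrast (and this is just an illustrative sketch, not something Jeff posted): the usual way to avoid that collision is a session-local temp table with a single #. Each window gets its own copy of #rowtest in tempdb, so a second session dropping and rebuilding its table cannot pull the rug out from under the first.
if object_id('tempdb..#rowtest') is not null drop table #rowtest
select top 5000000 identity(int,1,1) idkey into #rowtest from MASTER.sys.All_Columns ac1, MASTER.sys.All_Columns ac2
select idkey, case when idkey % 3 = 0 then 'Fizz' else '' end [fizz],
case when idkey % 5 = 0 then 'Buzz' else '' end [buzz],
case when idkey%3 = 0 and idkey%5 = 0 then 'FizzBuzz' else '' end [fizzbuzz]
from #rowtest  -- single #: visible only to the session that created it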
--Jeff Moden
Change is inevitable... Change for the better is not.
February 24, 2010 at 11:05 pm
reidres (2/24/2010)
OK, I have run all four of the scalable/credible contenders at 150 000 000 rows, using the COUNT(ALL xxxx) method described earlier. They are listed in the order I tested them, which turned out to be the order of the times.
Lynn : 3:51
Tom : 3:16
Kev : 1:35 (Itzik-based solution)
CirqueDeSQLeil: 0:43 (the solution he posted on this thread not on ASK thread)
I looked at some of the other solutions and there were none that seemed like contenders for the list above, whether because of scalability or other issues. Fatherjack's COALESCE solution looked interesting but is very slow and not set up for scalability. Some of the solutions are just set to do a certain number of rows, which cannot be altered as easily as changing a number.
BWAA-HAAA!!! Guess I need an upgrade for my poor old 8 year old home desktop box... it took over 6 minutes for the fastest code above to run. :hehe:
--Jeff Moden
Change is inevitable... Change for the better is not.
February 24, 2010 at 11:09 pm
Jeff Moden (2/24/2010)
reidres (2/24/2010)
OK, I have run all four of the scalable/credible contenders at 150 000 000 rows, using the COUNT(ALL xxxx) method described earlier. They are listed in the order I tested them, which turned out to be the order of the times.
Lynn : 3:51
Tom : 3:16
Kev : 1:35 (Itzik-based solution)
CirqueDeSQLeil: 0:43 (the solution he posted on this thread not on ASK thread)
I looked at some of the other solutions and there were none that seemed like contenders for the list above, whether because of scalability or other issues. Fatherjack's COALESCE solution looked interesting but is very slow and not set up for scalability. Some of the solutions are just set to do a certain number of rows, which cannot be altered as easily as changing a number.
BWAA-HAAA!!! Guess I need an upgrade for my poor old 8 year old home desktop box... it took over 6 minutes for the fastest code above to run. :hehe:
Ummm, that would be an understatement.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events