May 31, 2007 at 8:19 pm
BTW, Jeff, your solution took 32 seconds the first time, then about 13 seconds on average after that.
May 31, 2007 at 11:49 pm
I use the index almost exclusively. It seems to be the best way to find what I'm looking for.
May 31, 2007 at 11:53 pm
You mean like 14 (2 and 7 are prime numbers)?
May 31, 2007 at 11:55 pm
A syntax error comes up when I execute the last query in the article; the error is in the 'with' block. Please check this error and reply to me at kjamilpk@yahoo.com. Also, MAXRECURSION is not working.
regards.
kashif jamil
June 1, 2007 at 5:15 am
Great response. First, I was joking about actually using this as an interview question. However, the exercise is interesting because it does force you to apply logic and a bit of T-SQL knowledge to come up with a solution, any solution.
For myself, the issue comes down to one of balance. Yes, writing the most optimal code with clear documentation is a wonderful thing. Now, please do that on two different projects with varying delivery dates: one is for two people in HR to manage a series of health chats, and one is a new enterprise system that's going to have 4,000 simultaneous connections. Which one do you think will get a basic set of queries, probably not tuned to a fare-thee-well and probably lacking a complete set of documentation? Is this bad? No. First and foremost, our jobs are not about delivering documentation; they're about delivering functional systems. Some of those systems absolutely must have good documentation as part of the functionality and some don't. Easy statements such as "you ALWAYS have to write code that scales to 1,000,000 rows and you ALWAYS have to write complete documentation" are either crap or a sign that someone has an incredibly cushy job.
Good enough is usually good enough.
Nice job on the solution.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 1, 2007 at 5:17 am
I have to agree. In a real interview situation this code would scare me off.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 5, 2007 at 2:16 pm
Well, my goal was to come up with the fastest code. This can be rewritten with functions so we don't hardcode values (which is how I use it in code). The code below ran in under 10 seconds for 1,000,000 records and in under 1 ms for 100 records. It uses set-based theory. By the way, I copied Sergiy's solution (NULLIF); I thought it was beautiful.
SELECT ISNULL(
NULLIF(CASE WHEN i%3=0 THEN 'Bizz' ELSE '' END + CASE WHEN i%5=0 THEN 'Buzz' ELSE '' END, '')
, CAST(i AS varchar(50)))
FROM
(
(SELECT A.i + B.i + C.i + D.i + E.i + F.i as i
FROM
(
SELECT 0 i UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9
)A CROSS JOIN
(
SELECT 0 i UNION ALL
SELECT 10 UNION ALL
SELECT 20 UNION ALL
SELECT 30 UNION ALL
SELECT 40 UNION ALL
SELECT 50 UNION ALL
SELECT 60 UNION ALL
SELECT 70 UNION ALL
SELECT 80 UNION ALL
SELECT 90
)B CROSS JOIN
(
SELECT 0 i UNION ALL
SELECT 100 UNION ALL
SELECT 200 UNION ALL
SELECT 300 UNION ALL
SELECT 400 UNION ALL
SELECT 500 UNION ALL
SELECT 600 UNION ALL
SELECT 700 UNION ALL
SELECT 800 UNION ALL
SELECT 900
)C CROSS JOIN
(
SELECT 0 i UNION ALL
SELECT 1000 UNION ALL
SELECT 2000 UNION ALL
SELECT 3000 UNION ALL
SELECT 4000 UNION ALL
SELECT 5000 UNION ALL
SELECT 6000 UNION ALL
SELECT 7000 UNION ALL
SELECT 8000 UNION ALL
SELECT 9000
)D CROSS JOIN
(
SELECT 0 i UNION ALL
SELECT 10000 UNION ALL
SELECT 20000 UNION ALL
SELECT 30000 UNION ALL
SELECT 40000 UNION ALL
SELECT 50000 UNION ALL
SELECT 60000 UNION ALL
SELECT 70000 UNION ALL
SELECT 80000 UNION ALL
SELECT 90000
)E CROSS JOIN
(
SELECT 0 i UNION ALL
SELECT 100000 UNION ALL
SELECT 200000 UNION ALL
SELECT 300000 UNION ALL
SELECT 400000 UNION ALL
SELECT 500000 UNION ALL
SELECT 600000 UNION ALL
SELECT 700000 UNION ALL
SELECT 800000 UNION ALL
SELECT 900000
)F
)
UNION ALL
(
SELECT 1000000
))T
WHERE i > 0
ORDER BY i
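Following up on the remark above about rewriting this with functions so the values aren't hardcoded: here is a minimal sketch of one way it might be done, assuming SQL Server 2005 or later for ROW_NUMBER; the name dbo.fn_Numbers and the parameter @MaxValue are invented for illustration and are not from this thread.
-- Minimal sketch (assumed name and parameter): returns integers 1 through @MaxValue
-- so the Bizz/Buzz query no longer hardcodes the range.
CREATE FUNCTION dbo.fn_Numbers (@MaxValue INT)
RETURNS TABLE
AS
RETURN
    SELECT TOP (@MaxValue)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
      FROM sys.all_columns ac1            -- system views used only as a row source
     CROSS JOIN sys.all_columns ac2
GO
-- Usage: same expression as above, but the upper bound is now a parameter.
SELECT ISNULL(
       NULLIF(CASE WHEN i%3=0 THEN 'Bizz' ELSE '' END + CASE WHEN i%5=0 THEN 'Buzz' ELSE '' END, '')
       , CAST(i AS varchar(50)))
  FROM dbo.fn_Numbers(1000000)
 ORDER BY i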
June 5, 2007 at 6:50 pm
Yep... Developer Edition of SQL Server 2000 SP4. But, like you said, the environment you run it in makes all the difference in the world for output speed. OSQL via a CMD window would be one of the slowest, as would the text mode of QA. The grid mode of QA is more like what any GUI would produce because... it's a GUI! Just a reminder... the whole shootin' match worked on my box in 8.6 seconds the first and all subsequent times it was executed.
Some of the larger manufactured cross-joins do work faster than my humble 2-table cross-join. Michael Valentine Jones wrote one similar to the manufactured cross-joins folks have been posting, and it returns a million rows in the proverbial blink of an eye. I'll see if I can find the URL because it's a handy function he made... you can program both the start and end of the range, including negative numbers, but it's pretty darned hard to remember how to write it during an interview.
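(A rough sketch of that idea only, not Michael Valentine Jones's actual function: the name dbo.fn_NumberRange and its parameters are invented here, and it assumes SQL Server 2005 or later and that @End >= @Start.)
-- Sketch only: returns every integer from @Start to @End inclusive, negatives allowed.
CREATE FUNCTION dbo.fn_NumberRange (@Start INT, @End INT)
RETURNS TABLE
AS
RETURN
    SELECT TOP (@End - @Start + 1)
           @Start - 1 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Number
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2
GO
-- Example: -5 through 5
SELECT Number FROM dbo.fn_NumberRange(-5, 5) ORDER BY Number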
If we want to test for the true speed of the processing only, we should overwrite a variable instead of writing to the display or even another table.
One other thing, and I'm not sure what the heck it is... and I don't even have a scientific explanation... but it seems that laptops are a bit more expeditious at running loops than desktop boxes and servers are (judging by some folks' reports of times for the million-row tests). 'Course... could be my mind playing tricks on me.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 5, 2007 at 6:58 pm
Thanks, Igor... my solution takes 8.6 seconds on my box, first time and every time. It's a little ol' P5 1.8 GHz with 1 GB of RAM and twin 80 GB IDE hard drives. Because it's an "experimental" box, I've not done optimizations such as putting the MDF, LDF, and TempDB on separate drives.
What are you running on?
This also brings up the point that you have to be "box aware" and that a good tuning on one box might not be so handy on another. Still, there are some basic principles that are hard to ignore on any box...
--Jeff Moden
Change is inevitable... Change for the better is not.
June 5, 2007 at 7:06 pm
Hey folks...
I have seen lots of different code and lots of different opinions on this thread. I'd first like to congratulate Grant on producing an article that had such a tremendous and unexpected impact. I'd also like to compliment everyone who took part for their very professional demeanor, even in light of some highly opposing opinions. THAT's the way to share information and ideas! Great job, all!!!
--Jeff Moden
Change is inevitable... Change for the better is not.
June 5, 2007 at 7:09 pm
Jeff, I have a laptop with a 2.18 GHz Intel Duo, 4 GB RAM, and a single HD, running 2005 Developer Edition SP2.
I believe that for this example any solution should work practically the same, loop or set-based, because there is no magic: a set-based solution internally still contains loops.
-Igor
June 5, 2007 at 7:43 pm
Thanks for the H/W info, Igor.
Do me a favor, please... run the following code and let me know what times you get back. The reason I ask is that I'm seeing a trend for those who run newer high-speed CPU laptops... their loops run nasty fast compared to some set-based examples, and I think it may be because of the disk interface, but I don't have a laptop to test on. Thanks.
DECLARE @Counter INT --Just a loop counter
DECLARE @Bitbucket INT --Just a place to store data
DECLARE @StartTime DATETIME --Just to measure duration
--===== Do the WHILE loop test of counting to 1 million
PRINT 'WHILE LOOP TEST...'
SET @StartTime = GETDATE() --Start the duration timer
SET @Counter = 1
SET @Bitbucket = 0
WHILE @Counter <= 1000000
BEGIN
SELECT @Bitbucket = @Bitbucket + 1
SET @Counter = @Counter + 1
END
-- Display the stats
PRINT STR(@Bitbucket) + ' The final count (just to show work was done)'
PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Duration in milliseconds'
PRINT REPLICATE('-',78)
--===== Do the SET BASED test of counting to 1 million
PRINT 'SET BASED TEST...'
SET @StartTime = GETDATE() --Start the duration timer
SET @Bitbucket = 0
SELECT TOP 1000000
@Bitbucket = @Bitbucket + 1
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
-- Display the stats
PRINT STR(@Bitbucket) + ' The final count (just to show work was done)'
PRINT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Duration in milliseconds'
PRINT REPLICATE('-',78)
... and here are the results I got on my humble desktop machine...
WHILE LOOP TEST...
1000000 The final count (just to show work was done)
8186 Duration in milliseconds
------------------------------------------------------------------------------
SET BASED TEST...
1000000 The final count (just to show work was done)
390 Duration in milliseconds
------------------------------------------------------------------------------
--Jeff Moden
Change is inevitable... Change for the better is not.
June 5, 2007 at 7:52 pm
First time
WHILE LOOP TEST...
1000000 The final count (just to show work was done)
1073 Duration in milliseconds
------------------------------------------------------------------------------
SET BASED TEST...
1000000 The final count (just to show work was done)
263 Duration in milliseconds
------------------------------------------------------------------------------
Subsequent runs (the 156 ms stays the same for set-based; the loop-based time has some variance):
WHILE LOOP TEST...
1000000 The final count (just to show work was done)
1116 Duration in milliseconds
------------------------------------------------------------------------------
SET BASED TEST...
1000000 The final count (just to show work was done)
156 Duration in milliseconds
------------------------------------------------------------------------------
June 5, 2007 at 8:02 pm
Just for fun:
WHILE LOOP TEST...
100000000 The final count (just to show work was done)
111480 Duration in milliseconds
------------------------------------------------------------------------------
SET BASED TEST...
100000000 The final count (just to show work was done)
15296 Duration in milliseconds
------------------------------------------------------------------------------
Jeff, as you can see, the ratio is the same.
June 5, 2007 at 8:20 pm
Perfect. Thank you very much. That belays my previous suspicions about the new laptops. And thanks for taking it to the next couple of orders of magnitude! I'm always afraid to ask people to run a 10-million-row test, never mind a 100-million-row test. By the way, that's quite a laptop you have... what make is it? I might have to finally break down and get one.
While we're at it, and now that we've established that your disk interface isn't a bottleneck, let's review what you said...
I believe that for this example any solution should work practically the same, loop or set-based, because there is no magic: a set-based solution internally still contains loops.
Write or pick any of the loop-based solutions for this BizzBuzz example and compare its runtime (grid mode, please) to the set-based solution in my very first post (I'm comparing to mine because it's also "disk based" due to the reads from SysColumns... the other "memory only" set-based solutions are even faster). (Note: to compensate for any display-time differences, we should rewrite both to output only to a variable.) Then, you be the judge... do you still think there's no magic to the set-based solutions for this BizzBuzz example?
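(A minimal sketch of the kind of comparison being suggested, assuming SQL Server 2005 or later for ROW_NUMBER; this arrangement is illustrative only and is not a quoted solution from the thread.)
DECLARE @Result VARCHAR(50),  --Holds the Bizz/Buzz output so nothing hits the display
        @i      INT,
        @Start  DATETIME
--===== Loop-based BizzBuzz, output to a variable only
SET @Start = GETDATE()
SET @i = 1
WHILE @i <= 1000000
  BEGIN
        SET @Result = ISNULL(
                NULLIF(CASE WHEN @i%3=0 THEN 'Bizz' ELSE '' END + CASE WHEN @i%5=0 THEN 'Buzz' ELSE '' END, '')
                , CAST(@i AS VARCHAR(50)))
        SET @i = @i + 1
  END
PRINT STR(DATEDIFF(ms,@Start,GETDATE())) + ' Loop-based duration in milliseconds'
--===== Set-based BizzBuzz ("disk based" like the original post, reading from SysColumns)
SET @Start = GETDATE()
SELECT TOP 1000000
       @Result = ISNULL(
                NULLIF(CASE WHEN N.i%3=0 THEN 'Bizz' ELSE '' END + CASE WHEN N.i%5=0 THEN 'Buzz' ELSE '' END, '')
                , CAST(N.i AS VARCHAR(50)))
  FROM (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
          FROM Master.dbo.SysColumns sc1, Master.dbo.SysColumns sc2) N
PRINT STR(DATEDIFF(ms,@Start,GETDATE())) + ' Set-based duration in milliseconds'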
--Jeff Moden
Change is inevitable... Change for the better is not.