June 1, 2015 at 10:38 pm
Comments posted to this topic are about the item Writing Better T-SQL: Top-Down Design May Not be the Best Choice
Mickey Stuewe
Sr Database Developer
My blog
Follow me on twitter: @SQLMickey
Connect with me on LinkedIn
--------------------------------------------------------------------------
I laugh loudly and I laugh often. Just ask anyone who knows me.
June 2, 2015 at 2:25 am
That's fascinating, thank you. I come from an application developer background and top-down design was always the best approach - especially when trying to read back what it is that the code is trying to do.
Good comments are mandatory when trying to be more efficient, so I'll make doubly sure to include them in any new code I implement!
Thanks again. 🙂
June 2, 2015 at 4:01 am
Nice article.
What I found most interesting about this was the remark:
For most environments, having faster stored procedures and reducing any potential CPU pressure is more important than the number of reads.
Because it's the opposite of what I've come to believe - database systems are more likely to be IO-bound than CPU-bound, so if something takes a little more CPU time but reads less, it's generally better to use that method, because CPU cycles are cheap but disk accesses aren't.
Am I mistaken to think that?
On the other hand, the overall duration is significant, since a stored procedure that takes half the time to run is much less likely to create lock contention than one that takes longer. I was once given the advice that the first and foremost measure is the wall clock -- I was making the mistake of treating execution plans as my first measure, and it was explained to me that most of the time, something that executes quickly with an ugly execution plan is usually "better" than something that has a beautiful plan shape but takes longer.
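To put numbers on both of those (CPU, reads, and the wall clock), the simplest approach I know of is to turn on the statistics messages around each version of the procedure. A minimal sketch, with dbo.usp_GetSalesByCustomer standing in as a hypothetical procedure name:

-- Hedged sketch: compare CPU time, reads, and elapsed (wall-clock) time for one version.
-- dbo.usp_GetSalesByCustomer and @CustomerID are made-up names used only for illustration.
SET STATISTICS IO ON;    -- logical/physical reads per table, shown in the Messages tab
SET STATISTICS TIME ON;  -- CPU time and elapsed time, shown in the Messages tab

EXEC dbo.usp_GetSalesByCustomer @CustomerID = 42;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

Running each version this way gives CPU, reads, and duration side by side, so the trade-off is measured rather than guessed.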
June 2, 2015 at 4:04 am
Yes, it's not until you start to work in the real world that you find out 'elegant' is less important than 'efficient'. 🙂
June 2, 2015 at 4:38 am
Was thinking along the same lines for quite a while.
But the optimizer is not an angel, and in practice it is often much faster to use an intermediate result stored in a temp table rather than a sub-query, or to use two simple update passes on a table in place of one more complicated one. Found this out the hard way.
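A rough illustration of the temp-table point (table and column names below are made up for the example, not taken from the article): the two versions return the same rows, but the second hands the optimizer a small, already-materialized intermediate result.

-- Version 1: one query with an aggregate sub-query evaluated against the same table.
SELECT  o.OrderID, o.CustomerID, o.OrderTotal
FROM    dbo.Orders AS o
WHERE   o.OrderTotal > (SELECT AVG(o2.OrderTotal)
                        FROM   dbo.Orders AS o2
                        WHERE  o2.CustomerID = o.CustomerID);

-- Version 2: materialize the intermediate result into a temp table, then join to it.
SELECT  o.CustomerID,
        AVG(o.OrderTotal) AS AvgTotal
INTO    #CustomerAvg
FROM    dbo.Orders AS o
GROUP BY o.CustomerID;

SELECT  o.OrderID, o.CustomerID, o.OrderTotal
FROM    dbo.Orders AS o
JOIN    #CustomerAvg AS ca
        ON ca.CustomerID = o.CustomerID
WHERE   o.OrderTotal > ca.AvgTotal;

DROP TABLE #CustomerAvg;

Neither version is always faster; as others have said, only testing both against realistic data volumes settles it.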
June 2, 2015 at 4:59 am
For 15 years I was against splitting the code into parts, until I started working with hundreds of millions of records. And the interesting part is that in many cases (even after rebuilding statistics) the execution plan is different between what you see in SSMS and what you actually get when the SP is executed, especially when you have heavy sub-queries.
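One way to check the plan that was actually compiled and cached for a procedure, rather than the estimated plan SSMS draws for you, is to read it back from the plan cache. A minimal sketch, assuming a hypothetical procedure named dbo.usp_HeavyReport:

-- Pull the cached plan and recent execution numbers for one stored procedure.
-- dbo.usp_HeavyReport is a hypothetical name used only for illustration.
SELECT  OBJECT_NAME(ps.object_id)  AS procedure_name,
        ps.execution_count,
        ps.last_worker_time        AS last_cpu_microseconds,
        ps.last_logical_reads,
        qp.query_plan              -- the plan that was actually compiled and cached
FROM    sys.dm_exec_procedure_stats AS ps
CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) AS qp
WHERE   ps.object_id = OBJECT_ID(N'dbo.usp_HeavyReport');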
June 2, 2015 at 5:18 am
While this is a good example and it's a good article (thank you SQLMickey for taking the time to write it), it IS only one example. "Divide'n'Conquer" methods frequently ARE the answer, and single queries that do a lot of things frequently ARE a major performance problem.
Also, the article suggests the use of recursive CTEs. I want to remind everyone that certain types of recursive CTEs are frequently worse than either Cursors or While loops. Please see the following article for proof of that.
http://www.sqlservercentral.com/articles/T-SQL/74118/
As with all else in SQL Server, "It Depends". Keep both methods in mind because they both have their places.
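For anyone who has not read the linked article yet, the pattern it warns about is the counting style of recursive CTE. A hedged sketch of that pattern next to a set-based alternative (both simply generate the numbers 1 to 10,000; neither is taken from the article itself):

-- The problem pattern: counting with a recursive CTE, one row per level of recursion.
WITH Counter (N) AS
(
    SELECT 1
    UNION ALL
    SELECT N + 1 FROM Counter WHERE N < 10000
)
SELECT N
FROM   Counter
OPTION (MAXRECURSION 10000);

-- A set-based "tally" alternative; sys.all_columns is used only as a convenient row source.
SELECT TOP (10000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
FROM   sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;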
--Jeff Moden
Change is inevitable... Change for the better is not.
June 2, 2015 at 5:39 am
It was not so clear to me why SPs 2 and 4 run faster than 1 and 3.
June 2, 2015 at 5:54 am
Nice thought-provoking article SQLMickey. Here's a thought for you:
"Let’s take a look at a stored procedure query written four different ways."
Whilst a junior developer might select one of the four without considering the others, an experienced developer will try all four and compare metrics. An informed choice is always better than a guess.
Also:
• Running small queries and inserting the results in temp tables to be used later in the stored procedure. One execution plan will be created and executed for each of these.
Sometimes this is more efficient than integrating the table source into the main query, and in any case, the “extra” execution plan won’t disappear if you integrate the table source into the main query – it will become part of a bigger and more complex execution plan, running the risk of an optimiser timeout. As above, an experienced developer will test, and compare metrics.
• Running small queries and inserting a single value in a variable to be used later in the stored procedure. One execution plan will be created and executed for each of these.
I’ll trade an “extra” execution plan for performance every time. The query optimiser doesn’t always get it right and when it doesn’t, temporary storage structures are usually the tool to use to help it along. Divide'n'Conquer, as Jeff says above.
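As a rough sketch of the single-value case (table and column names are invented for the example), capturing the value once in a variable keeps that small lookup out of the larger plan:

-- Hypothetical names used only for illustration.
DECLARE @CutoffDate date;

-- Small query executed once; the single result is held in a variable.
SELECT @CutoffDate = MAX(pp.PeriodEndDate)
FROM   dbo.PostingPeriods AS pp
WHERE  pp.IsClosed = 1;

-- The main query uses the variable instead of embedding the sub-query.
SELECT  i.InvoiceID, i.InvoiceDate, i.Amount
FROM    dbo.Invoices AS i
WHERE   i.InvoiceDate > @CutoffDate;

The usual caveat applies: the optimizer does not sniff the variable's value when the main statement is compiled, so as always, test and compare metrics.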
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 2, 2015 at 5:59 am
The article should have emphasized the "small". When the data sets get large or the query is complex, you really do need a temp table.
And to agree with previous comments, you need to test your queries, one size never fits all.
June 2, 2015 at 6:45 am
Very good post! I am a big fan of CTEs and CROSS/OUTER APPLY, but as many have said above, sometimes temp tables make the code more efficient and maintainable, so experienced developers will test a few different methods.
Also, a lot of the time I am ready to sacrifice some performance to make code more readable and maintainable - at the end of the day, if your SP does some heavy ETL at night outside business hours, no one cares if it runs for 2 hours or 1 hour and 40 minutes 🙂
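For readers who have not used APPLY much, here is a small hedged sketch (made-up table names) of the "latest row per group" pattern that CROSS/OUTER APPLY handles nicely:

-- Hypothetical tables: latest order per customer, keeping customers with no orders at all.
SELECT  c.CustomerID,
        c.CustomerName,
        o.OrderID,
        o.OrderDate
FROM    dbo.Customers AS c
OUTER APPLY
(
    SELECT TOP (1) ord.OrderID, ord.OrderDate
    FROM   dbo.Orders AS ord
    WHERE  ord.CustomerID = c.CustomerID
    ORDER BY ord.OrderDate DESC
) AS o;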
June 2, 2015 at 7:07 am
Great article, making me think of why I do what I do and that maybe there is a better approach.
June 2, 2015 at 7:30 am
redtux1 (6/2/2015)
The article should have emphasized the "small". When the data sets get large or the query is complex, you really do need a temp table. And to agree with previous comments, you need to test your queries; one size never fits all.
Yes, I agree with redtux1; this will work on a small set of data.
June 2, 2015 at 8:04 am
Simple question: how are the stats for the different queries acquired?
June 2, 2015 at 9:01 am
Credibility is job one for me, so being able to see intermediate result sets helps me validate that the right data is flowing through my code. Obviously it has to be fast too, so one choice is never always right.
I've used CTEs in SSIS while performing some ETL work and had a "simple" series of steps take 1.5 hours to run; I changed the CTE to a stored proc using temp tables and called it from SSIS, and it completed in about 2 minutes. Took a while to find that "right" solution!
Also, I like the ease of documenting and reading code that is in blocks. I'm not the most experienced with complicated or tricky code, so maybe my thoughts will change as my SQL skills continue to improve.