Twenty tips to write a good stored procedure

  • GilaMonster (8/10/2009)


    rja.carnegie (8/10/2009)


    ... because it's no longer the case that your use of the command has to be considered as possibly "SELECT column FROM gregory.table" and mine is "SELECT column FROM robert.table".

    It still is possible. If a table (view, proc) isn't schema-qualified, SQL looks first in the user's default schema then in the dbo schema (I think). Worth testing.

    It always checks the sys schema first, then the default schema, then dbo.
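    A quick way to test the sys-first part of that: create a user table whose name collides with a catalog view and see which one an unqualified reference resolves to. A sketch, assuming tempdb and an illustrative table name:

    USE tempdb;
    GO
    -- dbo.objects deliberately collides by name with the sys.objects catalog view
    CREATE TABLE dbo.objects (note VARCHAR(30) NOT NULL);
    INSERT dbo.objects VALUES ('user table');
    GO
    -- If sys really is checked first, this returns catalog view rows, not 'user table'
    SELECT TOP (1) name FROM objects;
    -- Schema-qualifying removes the ambiguity
    SELECT note FROM dbo.objects;
    GO
    DROP TABLE dbo.objects;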

  • GSquared (8/10/2009)


    gregory.anderson (8/10/2009)


    Here's a question about one of the items. You mentioned that writing "master.dbo.proc_name" is better than just "proc_name", well, what about writing "master..proc_name"? My company does not specify the schema name so it's always "dbo". Is doing the ".." just a shortcut/default to "dbo"?

    Thanks

    Yes.

    No. 😀

    USE tempdb;
    GO
    CREATE SCHEMA Z;
    GO
    CREATE LOGIN A WITH PASSWORD = 'aA34%^&';
    GO
    CREATE USER A WITH DEFAULT_SCHEMA = Z;
    GO
    CREATE TABLE Z.A (A INT NULL);
    GO
    CREATE TABLE dbo.A (A INT NULL);
    GO
    INSERT Z.A VALUES (654321);
    INSERT dbo.A VALUES (-1);
    GO
    GRANT SELECT ON dbo.A TO A;
    GRANT SELECT ON Z.A TO A;
    GO
    EXECUTE AS USER = 'A';
    SELECT * FROM A;          -- 654321
    SELECT * FROM tempdb..A;  -- 654321
    SELECT * FROM dbo.A;      -- -1
    REVERT;
    GO
    DROP TABLE dbo.A, Z.A;
    DROP USER A;
    DROP LOGIN A;
    DROP SCHEMA Z;

    It seems to resolve as for any schema-unqualified reference: sys, then default schema, then dbo.

    Paul

  • steve_melchert (8/10/2009)


    I've seen a lot of good debate today. The problem here, it seems, is the rating system. The displayed number of stars is viewed as authoritative, but in practical terms is arrived at by mere democratic vote. (The sheep outnumber the shepherds.) I think everyone would be happy if they could reliably expect that bad articles show 1 star and good articles show 5 stars.

    Can the displayed rating be *weighted* according to the experience level of the raters?

    Simply, I reckon that if your vote were weighted by your SSC ranking (rather than by points, which would favour a handful of admittedly talented contributors), that might do it.

    Tim


  • Consider this as print media with a circulation in excess of 1,000,000.

    I don't think anybody who might wish to do so would just be allowed to write a front-page article unreviewed, would they?

    And if it were print media, whether a piece was opinion, editorial (comment), or fact would be made crystal clear, wouldn't it? Not left to the reader to work out somehow?

    I am only surmising from other comments here that Steve Jones (whose writing I have a lot of respect for, even when controversial, as it is EDITORIAL) allows people to post flawed articles for the community to sort out later, with no regard to the corresponding impact on the trustworthiness of SSC.

    If so, I think this might work with 1,000 contributing members ... but not with 1,000,000 members, most of whom are read-only, want to learn, and frankly probably don't know enough to contribute in detail.

    I stand by my previous comments.

    With Respect

    Tim

    EDIT: You can read the following comment if you need some support for my point...


    CONVERT also works with Sybase. How do I format a date with CAST as I can with CONVERT?

    Good Article.
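    For what it's worth, CAST takes no style argument in T-SQL; date formatting is CONVERT's job via its third parameter. A small sketch (style numbers as documented in Books Online):

    SELECT CAST(GETDATE() AS VARCHAR(30));        -- server default format only
    SELECT CONVERT(VARCHAR(10), GETDATE(), 103);  -- dd/mm/yyyy
    SELECT CONVERT(VARCHAR(8),  GETDATE(), 112);  -- yyyymmdd (ISO)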

  • timothyawiseman (8/10/2009)


    GSquared (8/10/2009)


    Have to sort of disagree with you on this one. Temp tables cause statement recompiles in 2005/8, and proc recompiles in 2000, but even those are often/usually better than the execution plans that you end up with if you use table variables.

    Table variables have no benefits over temp tables except that they are outside of transactions, and that can be an advantage in a few cases (storing data for Try-Catch handling to be used after a rollback is one example). Other than being non-transactional, they have no advantages that I know of. Temp tables, on the other hand, have plenty of advantages. They can be indexed beyond the PK clustered index, they have stats, they can easily pass data back and forth between procs, they can be accessed by dynamic SQL, they can have their first page of data cached for re-use (2005/8) if that's needed, they can have DDL run on them after creation, they can be created by Select Into (very useful when importing data from files where you don't know beforehand what the column sizes need to be), and more.

    I recommend using table variables only if you have to (UDFs, which should be avoided as much as possible anyway for their own reasons), or if you need to hang onto some data despite needing to issue a rollback command.

    You have a few excellent points about the advantages of temp tables. Their ease of use with dynamic SQL is a big reason I use them, the ability to index them has come into play for me on a few occasions, and your other points about the advantages are very good.

    I am not clear on whether temp tables generate better execution plans than table variables. I have not formally tested it, but intuitively I would expect the opposite. Eventually I will find time to test it, but if you have tests (or references) I would be very interested in seeing the results.

    There are, however, advantages to using table variables. The big one is that they exist only while the script is running, whereas temp tables exist for the duration of the session (or until explicitly destroyed). This means there is no risk of them hanging around and infecting another run of the script (or another script that uses the same name). They also will not clutter tempdb long after they were needed, as some temp tables can in some cases.

    Also, while I have not personally verified it, I understand that table variables generally create less overhead in terms of transaction logs and locks. Here is one reference, though again I have not personally verified it: http://databases.aspfaq.com/database/should-i-use-a-temp-table-or-a-table-variable.html

    The key is statistics. Temp tables have them, which is what leads to recompiles. Table variables don't, which is what leads to really poor choices from the optimizer. Statistics are largely what drives the optimizer. When it has good statistics, it makes good choices; when the statistics are poor, for whatever reason, but in this case because there are none, its choices are less than optimal.
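    A minimal repro of that difference, assuming SSMS with 'Include Actual Execution Plan' enabled (object names are illustrative):

    CREATE TABLE #t (id INT NOT NULL);
    DECLARE @t TABLE (id INT NOT NULL);

    -- Load 100,000 rows into each
    INSERT #t
    SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b;
    INSERT @t SELECT id FROM #t;

    -- Compare estimated vs actual rows in the two plans: the #t scan estimates
    -- close to 100,000 (it has statistics); the @t scan typically estimates 1 row
    SELECT COUNT(*) FROM #t WHERE id > 500;
    SELECT COUNT(*) FROM @t WHERE id > 500;

    DROP TABLE #t;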

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • GilaMonster (8/10/2009)


    timothyawiseman (8/10/2009)


    I am not clear on whether temp tables generate better execution plans than table variables. I have not formally tested it, but intuitively I would expect the opposite. Eventually I will find time to test it, but if you have tests (or references) I would be very interested in seeing the results.

    They usually do, and the reason is statistics. Temp tables can have column statistics; table variables cannot. The optimiser uses statistics to estimate the number of rows that various parts of the query will process. Without stats, it estimates 1 row. If the table variable has close to 1 row in it, all is well. If it has lots of rows, the query optimiser's estimates are off, so its costing will be off, and the plan it thinks is optimal may be very far from optimal.

    Doggone it. I read a couple of posts past to see if you had answered this one and thought it was safe to dive in.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • GSquared (8/10/2009)


    Have to sort of disagree with you on this one. Temp tables cause statement recompiles in 2005/8, and proc recompiles in 2000, but even those are often/usually better than the execution plans that you end up with if you use table variables.

    Table variables have no benefits over temp tables except that they are outside of transactions, and that can be an advantage in a few cases (storing data for Try-Catch handling to be used after a rollback is one example). Other than being non-transactional, they have no advantages that I know of. Temp tables, on the other hand, have plenty of advantages. They can be indexed beyond the PK clustered index, they have stats, they can easily pass data back and forth between procs, they can be accessed by dynamic SQL, they can have their first page of data cached for re-use (2005/8) if that's needed, they can have DDL run on them after creation, they can be created by Select Into (very useful when importing data from files where you don't know beforehand what the column sizes need to be), and more.

    I recommend using table variables only if you have to (UDFs, which should be avoided as much as possible anyway for their own reasons), or if you need to hang onto some data despite needing to issue a rollback command.

    That's less than entirely accurate. 🙂

    Rather than tackle each point, I would encourage you to read This TechNet Article, especially the section titled 'Improvements in SQL Server 2005'. Also check out This One to clarify the situation regarding (re)compilation.

    Some points are worth making explicitly:

    * Table variables are cached in tempdb exactly as for temporary tables, table-valued functions, and triggers. This includes the single data page and IAM page optimization.

    * Table variables can be declared with PRIMARY KEY and UNIQUE constraints.

    * Only table variables can be used in an OUTPUT clause (see the sketch below).

    * Sometimes the lack of statistics on table variables is desirable to produce a good 'FAST 1' plan - without the goal-seeking behaviour of TOP or FAST (n).
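    A short sketch of the second and third points (illustrative names, SQL Server 2005 syntax):

    -- A table variable declared with PRIMARY KEY and UNIQUE constraints
    DECLARE @audit TABLE
    (
        id   INT     NOT NULL PRIMARY KEY,
        code CHAR(4) NOT NULL UNIQUE
    );
    DECLARE @src TABLE (id INT NOT NULL, code CHAR(4) NOT NULL);
    INSERT @src (id, code) VALUES (1, 'AAAA');
    INSERT @src (id, code) VALUES (2, 'BBBB');

    -- OUTPUT ... INTO targeting the table variable
    DELETE @src
    OUTPUT deleted.id, deleted.code INTO @audit (id, code);

    SELECT id, code FROM @audit;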

    I could go on.

    It really isn't on to say that temporary tables are better than table variables, or indeed that UDFs are undesirable. It is perfectly possible to misuse just about any feature in SQL Server - each and every one has its strengths and weaknesses. Appropriate use is the name of the game: choosing the right tool for the job.

    Paul

    I do want to raise one point that was mentioned earlier. In 130 or so comments, where is the author? I think he put forward an honest, best-faith effort to put out a good article, no question. But some of it, based on review, appears to be off. Where is the author to address the information? Please. This isn't, nor should it be, a public lynching. Your peers have raised serious questions about the work done. You need to address them, even if it's to say, "oops, learned something new." For a reference, see here, where I fell, utterly and completely, on my sword, because I messed up a test and completely misinterpreted the results. It happens.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • GilaMonster (8/10/2009)


    Now that's interesting. Didn't realise it worked for anything other than procedures. That certainly has possibilities.....

    And none of them good I suspect 😛

  • Lynn Pettis (8/10/2009)


    So, are you volunteering to be part of a forced peer review group for articles? Then, the more important question: how long will you stay with it if you do volunteer? I know I don't have the time to volunteer for such a group. Between work and life, I barely have time for the things I want to do.

    There may be a case for SSC publishing fewer articles.

    A better balance of quality versus quantity might be desirable.

    I'm sure no-one expects perfection, but there has to be a line somewhere. I'm all for contentious articles that provoke a rich debate, but this is silly.

    Paul

  • GilaMonster (8/10/2009)


    They usually do, and the reason is statistics. Temp tables can have column statistics; table variables cannot. The optimiser uses statistics to estimate the number of rows that various parts of the query will process. Without stats, it estimates 1 row. If the table variable has close to 1 row in it, all is well. If it has lots of rows, the query optimiser's estimates are off, so its costing will be off, and the plan it thinks is optimal may be very far from optimal.

    ...unless a RECOMPILE hint is added, in which case the number of rows is available to the optimizer. The deficiency is in the lack of distribution statistics.

    This is not always a bad thing, especially in OLTP queries where loop joins are frequently to be preferred over memory-grant-requiring hashes or merges, and we are often just after the first row. As I mentioned before, the row-goal-seeking behaviour of TOP and FAST (1) is not always desirable either.
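    A sketch of the RECOMPILE point (illustrative names; compare the estimated row counts in the two plans):

    DECLARE @t TABLE (id INT NOT NULL);
    INSERT @t
    SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b;

    -- Compiled before @t is loaded: cardinality estimated at 1 row
    SELECT COUNT(*) FROM @t WHERE id > 100;

    -- Recompiled after the load: the 10,000-row cardinality is visible,
    -- though column distribution statistics still are not
    SELECT COUNT(*) FROM @t WHERE id > 100 OPTION (RECOMPILE);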

    Horses for courses.

    Paul

  • I agree - many of these are just plain wrong.

  • I found a great deal wrong with this article, but I think GilaMonster probably covered most of my objections the best.

    I think the suggestion that this should have been peer-reviewed first is an extremely valid point.

    I think this article should have taken a slightly different tack and mentioned that there are very few, if any, hard and fast rules that should never be broken. I can find reasons to breach very nearly every rule, and doing so will generally deliver significant performance enhancements; the difference is that I think about it before I violate them.

    I just hope that the comments don't dissuade Arup from additional writing.

    CEWII

    I have the same feeling when I read:

    SELECT emp_name FROM table_name WHERE LOWER(emp_name) = 'edu'

    :w00t:
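    The objection being that wrapping the column in LOWER() makes the predicate non-sargable, so an index on emp_name cannot be seeked; on a case-insensitive collation the function is simply unnecessary:

    -- Index-friendly version (names as in the query above)
    SELECT emp_name
    FROM table_name
    WHERE emp_name = 'edu';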

