SQL coding techniques

  • My predecessor came up with the following rule:

    When doing table joins, always use the INNER JOIN / LEFT OUTER JOIN syntax, including as many of the join conditions in the ON statement as possible. Filtering in the WHERE clause should be limited to the first table named (after the FROM verb) .

    "Including as many of the join conditions in the ON statement" -- isn't this going to cause performance degradation on really large tables?

    "Filtering in the WHERE clause should be limited to the first table named (after the FROM verb)" - this is probably just fine for inner joins.

    Do any of you use similar set of coding rules?

  • The join clause is for joins (how one table relates to the other), unless dealing with outer joins and wanting the table filtered before the join.

    Filters belong in the where clause.

    Your predecessor's aren't going to harm performance (SQL can convert one to the other), but it makes queries so incredibly much harder to read.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • No (we) don't.

    I use on to make the join

    then I use where to filter data I don't need after the join is made

    Then I use having to filter on aggregates

    Then I use Distinct to remove duplicates (but that is usually sign of bad design)

    Then I use TOP (n) ORDER BY to filter the "extra" rows.

    The query optimizer is MUCH smarter than people give it credit for. While far from perfect it will understand exactly what you want and use the fastest path possible* to access the data.

    *keep in mind that this can only be a guess on most cases, but it's a pretty darn good one.

  • Ninja's_RGR'us (1/6/2012)


    I use on to make the join

    then I use where to filter data I don't need after the join is made

    I think this is fairly standard and generally good practice, but I think it's important to point out that whether you put conditions in the 'where' or in the 'on' it doesn't matter. Both are handled at the same time and you'll end up with the exact same query plan. For instance, these queries result in the exact same plan:

    --condition in where clause

    select c.*

    from dbo.Content as c

    join dbo.Content_MTM_SearchCriteria as csc

    on csc.ContentID = c.ContentID

    where csc.SearchCriteriaID = 597

    --condition in join

    select c.*

    from dbo.Content as c

    join dbo.Content_MTM_SearchCriteria as csc

    on csc.ContentID = c.ContentID

    and csc.SearchCriteriaID = 597

    I think sometimes it's preferable for readability to have conditions in the join itself. For instance, when you're joining 6 tables, all of which have an IsActive bit and you only want the active ones. It may be clearer to see that in the join condition rather than a list at the end where you may be more likely to miss one. But to each his own.

    β””> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • bteraberry (1/6/2012)


    Ninja's_RGR'us (1/6/2012)


    I use on to make the join

    then I use where to filter data I don't need after the join is made

    I think this is fairly standard and generally good practice, but I think it's important to point out that whether you put conditions in the 'where' or in the 'on' it doesn't matter. Both are handled at the same time and you'll end up with the exact same query plan. For instance, these queries result in the exact same plan:

    --condition in where clause

    select c.*

    from dbo.Content as c

    join dbo.Content_MTM_SearchCriteria as csc

    on csc.ContentID = c.ContentID

    where csc.SearchCriteriaID = 597

    --condition in join

    select c.*

    from dbo.Content as c

    join dbo.Content_MTM_SearchCriteria as csc

    on csc.ContentID = c.ContentID

    and csc.SearchCriteriaID = 597

    I think sometimes it's preferable for readability to have conditions in the join itself. For instance, when you're joining 6 tables, all of which have an IsActive bit and you only want the active ones. It may be clearer to see that in the join condition rather than a list at the end where you may be more likely to miss one. But to each his own.

    You get a very different result with an LEFT OUTER JOIN when you move the filter condition from the ON clause to the WHERE clause.

    See example below:

    select x=1 into #t union all select x=2 union all select x=3 union all select x=4

    select x=1 into #y union all select x=2 union all select x=4 union all select x=5

    print 'Condition "b.x in (2,4)" in JOIN'

    select

    *

    from

    #t a

    left outer join

    #y b

    ona.x = b.x and

    b.x in (2,4)

    print 'Condition "b.x in (2,4)" in WHERE'

    select

    *

    from

    #t a

    left outer join

    #y b

    ona.x = b.x

    where

    b.x in (2,4)

    drop table #t

    drop table #y

    Results:

    (4 row(s) affected)

    (4 row(s) affected)

    Condition "b.x in (2,4)" in JOIN

    x x

    ----------- -----------

    1 NULL

    2 2

    3 NULL

    4 4

    (4 row(s) affected)

    Condition "b.x in (2,4)" in WHERE

    x x

    ----------- -----------

    2 2

    4 4

    (2 row(s) affected)

  • --Borrowed from bt ;-)

    --condition in where clause

    select c.*

    from dbo.Content as c

    join dbo.Content_MTM_SearchCriteria as csc

    on csc.ContentID = c.ContentID -- Foreign Key

    where csc.SearchCriteriaID = 597

    I prefer individual filters to appear in WHERE clause than JOIN. It helps to avoid confusion of filters & join criteria in LEFT JOIN specifically.

    For me, join conditions are basically links between two tables i.e. foreign keys. I prefer to see only that piece in JOIN conditions. I think it’s logical as well.

  • bteraberry (1/6/2012)


    I think this is fairly standard and generally good practice, but I think it's important to point out that whether you put conditions in the 'where' or in the 'on' it doesn't matter. Both are handled at the same time and you'll end up with the exact same query plan.

    Oh... be careful. Michael Valentine Jones beat me to it on this one but it's important enough that I thought I'd mention it again.

    It frequently DOES matter where you put filter criteria if some form of OUTER join is involved. Please see Michael's example above.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Very good point ... it most definitely does have a huge impact on outer joins.

    β””> bt



    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Luk (1/6/2012)


    When doing table joins, always use the INNER JOIN / LEFT OUTER JOIN syntax...

    This syntax is the most common in modern code. It brings consistency to inner and outer join syntaxes and lowers the chances of accidentally forgetting a join condition. A few people (Joe Celko is the only example that springs to mind) prefer the older syntax, with a comma-separated list of tables in the FROM clause, and join conditions in the WHERE clause. This syntax has never appealed to me, but it is just as valid and supported as the more common JOIN...ON...variety. I stress, this choice of syntaxes is only valid for inner (not outer or full) joins.

    Luk (1/6/2012)


    ...including as many of the join conditions in the ON statement as possible....

    Some might argue that moving WHERE clause conditions into the ON clause makes switching between inner and outer joins easier. Borrowing Michael's example:

    DECLARE @a TABLE (x INTEGER PRIMARY KEY)

    DECLARE @b-2 TABLE (x INTEGER PRIMARY KEY)

    INSERT @a VALUES (1), (2), (3), (4)

    INSERT @b-2 VALUES (1), (2), (4), (5)

    -- Inner join

    SELECT

    *

    FROM

    @a AS a

    JOIN @b-2 AS b ON

    a.x = b.x

    AND b.x IN (2,4)

    -- Outer join

    SELECT

    *

    FROM

    @a AS a

    LEFT JOIN @b-2 AS b ON

    a.x = b.x

    AND b.x IN (2,4)

    We only have to add or remove the LEFT keyword to change between inner and outer joins. This can help to preserve correct results when developing code, but overall I personally find this argument unconvincing.

    Luk (1/6/2012)


    ...Filtering in the WHERE clause should be limited to the first table named (after the FROM verb)...

    This is an extension of the idea discussed immediately above. It holds so long as only LEFT outer joins are ever specified (because the first table can never have NULL-extended rows). I remain unconvinced πŸ™‚

    Luk (1/6/2012)


    "Including as many of the join conditions in the ON statement" -- isn't this going to cause performance degradation on really large tables?

    No. SQL Server does not execute your SQL code literally. SQL queries describe logical results, SQL Server picks a physical plan that produces those results. The logical and physical forms frequently have a very different appearance.

    Luk (1/6/2012)


    "Filtering in the WHERE clause should be limited to the first table named (after the FROM verb)" - this is probably just fine for inner joins.

    These rules are unconventional, and seem to add little value overall. Most people write code using the more common styles already discussed on this thread. Doing something different may just confuse people in the future.

  • IIRC the older syntax is not supported in database compatibilty modes of 90 and higher

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" πŸ˜‰

  • Perry Whittle (1/8/2012)


    IIRC the older syntax is not supported in database compatibilty modes of 90 and higher

    The older style for outer joins isn't (*=, =*), but the older form for inner join is still supported. It's not something you can really remove.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Perry Whittle (1/8/2012)


    IIRC the older syntax is not supported in database compatibilty modes of 90 and higher

    That's the problematic old *= and =* outer join syntax. The inner join syntax is alive and well. This is fine on SQL Server 2012 at a compatibility level of 110:

    DECLARE @a TABLE (x INTEGER PRIMARY KEY)

    DECLARE @b-2 TABLE (x INTEGER PRIMARY KEY)

    INSERT @a VALUES (1), (2), (3), (4)

    INSERT @b-2 VALUES (1), (2), (4), (5)

    SELECT *

    FROM @a AS a, @b-2 AS b

    WHERE a.x = b.x

  • Thanks for clearing that up

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" πŸ˜‰

  • GilaMonster (1/8/2012)


    Perry Whittle (1/8/2012)


    IIRC the older syntax is not supported in database compatibilty modes of 90 and higher

    The older style for outer joins isn't (+=, =+), but the older form for inner join is still supported. It's not something you can really remove.

    I think old Oracle syntax used plus signs in brackets (+) somewhere in the syntax to indicate an outer join. SQL Server always used asterisks (*= and =*). I guess support for the old-style inner joins could be removed, by disallowing cross join using commas, and insisting on CROSS JOIN or JOIN...ON syntax, but AFAIK the old syntax is still in some SQL standard or other (ANSI-SQL 89?). I don't think Microsoft often remove support for a SQL standard once it is actually in the product.

  • SQL Kiwi (1/8/2012)


    GilaMonster (1/8/2012)


    Perry Whittle (1/8/2012)


    IIRC the older syntax is not supported in database compatibilty modes of 90 and higher

    The older style for outer joins isn't (+=, =+), but the older form for inner join is still supported. It's not something you can really remove.

    I think old Oracle syntax used plus signs in brackets (+) somewhere in the syntax to indicate an outer join.

    Yes it did (at least on 7, 8i and 9i), and iirc it was just += or maybe (+)= (its been a while), but my comment was just a typo from having spent the morning working on some calculations in Java (x += 5 and the like).

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply