Am I taking crazy pills? Dispute over inner vs. outer join.

  • TheGreenShepherd (2/14/2014)


    Unfortunately, this is a project still in the design phase, so it's not like we're trying to address an existing performance problem or anything.

    So premature optimisation in other words. Making non-standard design decisions based on unfounded assumptions as to what may or may not be faster.

    (I really need to write that article)

    If you do no have metrics, you have no reasons to make performance-related decisions which require unusual choices of query, design or code. Any performance-related decision should be based on hard numbers, metrics which show you that something is not performing acceptably and therefore you need to change some design or query aspect

    Oh, and query cost != query duration. It's an estimation, a unitless number that only has meaning to the query optimiser.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail. I know this may be a hard question to answer definitively, but in general for an issue like this, is it better design to use LEFT JOINs or default a nullable FK column to a non-meaningful value and use an INNER JOIN?

  • As quoted from someone's sig:

    "Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms."

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • TheGreenShepherd (2/14/2014)


    Yeah, I know, it's non-SARGable. That was really my last line of defense for shooting this idea down, because his next suggestion is to make that column non-nullable, and default the values to -1, which also seems like a bit of a hack.

    Having a reference value that means some sort of missing value can be a good way to go in applications where the value can be missing for multiple reasons.

    I have used this in data warehouse applications to mean things like:

    1. Value exists but we don't know what it is yet.

    2. Value does not exist for this row yet, but it will eventually.

    3. Value will never exist for this row.

    I think it makes it easier to create ad-hoc type reporting, especially using a view with a fact table joined to it's related dimensions.

    I don't think I have ever seen it used in an OLTP application.

  • Well, this is an OLTP-style database, but it's being used for reporting (a DW was not part of the project specification). Further, it is the recipient of replication from the master OLTP database.

  • TheGreenShepherd (2/14/2014)


    Yeah, I know, it's non-SARGable. That was really my last line of defense for shooting this idea down, because his next suggestion is to make that column non-nullable, and default the values to -1, which also seems like a bit of a hack.

    I've actually seen this work. It's not a hack at all to have a design that doesn't allow for ANY null values. In fact, it's considered a bit more of a pure design that way. But, as with all things, it's a giant pain to maintain and default values are very difficult to define.

    But stay away from functions on columns in the JOIN and WHERE criteria. That's a classic code smell.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply