Why does the query optimizer choose this way?

  • I read the classic article

    https://blogs.msdn.microsoft.com/craigfr/2006/07/13/index-examples-and-tradeoffs/

    Using the example in the section "Seek vs. scan example" I run this batch:

    Set Statistics IO On

    select a from T

    where a between 1001 and 9000

    select a from T with (index (Tb))

    where a between 1001 and 9000

    Set Statistics IO Off

    As one can see, estimated operator cost and IO cost are higher in the first case but optimizer sets this way as default. Does anybody know why?

    I have SQL Server 2012 SP3 11.0.6020.0 x64 Developer Edition.

  • Add time to the statistics you capture:

    Set Statistics IO,TIME On

    It will tell you why.

    _____________
    Code for TallyGenerator

  • The optimizer sometimes chooses a "good enough" solution rather than take the time to explore all possible query plans.

    The clustered index is sequenced on Column [a]. This means that the optimizer can do an index seek then start scanning starting at row 1000 and stopping at row 9000. This kinds of solution is sometimes referred to as "trivial" because it's an obvious technique for the optimizer to pick when building a query plan.

    Non-clustered index Tb is sequenced on column . SQL has to scan the entire index to find the values of [A] between 1000 and 9000. In this case (at least on my home PC) the Index Scan runs slightly faster, but the optimizer chose to do the obvious clustered index seek.

    The philosophy behind this is that exhaustively looking at query plans waists cpu cycles that could be spent just running a "good enough" plan and getting it out of the system. It's not perfect but it works mighty good. ๐Ÿ˜‰

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Sergiy, execution time is not the constant. I run query 5 times with Set Statistics IO,TIME On and here is the result:

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 305 ms.

    SQL Server Execution Times:

    CPU time = 47 ms, elapsed time = 195 ms.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 139 ms.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 215 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 324 ms.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 202 ms.

    SQL Server Execution Times:

    CPU time = 16 ms, elapsed time = 133 ms.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 222 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 297 ms.

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 234 ms.

    Actually, the second query more affects CPU but has lesser mean elapsed time. And this is elapsed time, not estimated. So it doesn't contain any additional information about how optimizer makes decision.

  • The Dixie Flatline, the concept of "good enough" solution makes sense.

  • Edit: ignore.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster, to judge by IO reads and execution plan estimated cost the better is obviously the second. But optimizer chooses the first and this was the question.

  • Can you post the actual execution plans of both please?

    It's probably a case if 'good enough', but there might also be some estimation differences resulting in the plan you didn't expect.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Here are the screenshots. Tell me if you prefer other format.

  • The execution plans please, not pictures of them. As in, the XML plans saved in as .sqlplan files.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Here are the plans, sorry for misunderstanding.

  • Abort reason is "Good enough plan found" (properties of the SELECT). Optimisation has multiple phases, "Good enough plan found" means that at the end of one phase the cheapest plan found was cheap enough that further optimisation isn't necessary. That's probably why it's the clustered index seek.

    Optimiser's job is to find a good plan fast, not to find the best plan.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (9/2/2016)


    Abort reason is "Good enough plan found" (properties of the SELECT). Optimisation has multiple phases, "Good enough plan found" means that at the end of one phase the cheapest plan found was cheap enough that further optimisation isn't necessary. That's probably why it's the clustered index seek.

    Optimiser's job is to find a good plan fast, not to find the best plan.

    OPTION (EXTRAOPTIMIZERTIME (10))

    Maybe one day.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for explaining.

  • ChrisM@Work (9/2/2016)


    GilaMonster (9/2/2016)


    Abort reason is "Good enough plan found" (properties of the SELECT). Optimisation has multiple phases, "Good enough plan found" means that at the end of one phase the cheapest plan found was cheap enough that further optimisation isn't necessary. That's probably why it's the clustered index seek.

    Optimiser's job is to find a good plan fast, not to find the best plan.

    OPTION (EXTRAOPTIMIZERTIME (10))

    Maybe one day.

    There is * Trace Flag 2301 which claims to ...

    ".. make your optimizer work harder by enabling advanced optimizations that are specific to decision support queries, applies to processing of large data sets..."

    https://support.microsoft.com/en-us/kb/920093

    https://blogs.msdn.microsoft.com/ianjo/2006/04/24/query-processor-modelling-extensions-in-sql-server-2005-sp1/

    * Possible side effects include: excessive plan recompilation, high CPU utilization, plan cache bloat, headaches, 2 AM support calls, and bloody stool.

    https://www.brentozar.com/archive/2008/10/before-you-upgrade-to-sql-server-2008/

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply