Batch Cost & Actual Time

  • Hi Guys

    I have two queries which gives same results

    Query 1-------------->

    SELECT

    ColA

    ,COUNT (DISTINCT ColB)

    ,COUNT (DISTINCT ColC)

    FROM Table_Data

    WHERE SUBSTRING(CAST(X_DATE AS VARCHAR),1,11) IN

    (SELECT Y_DATE FROM Table_DateRange)

    GROUP BY ColA

    Query 2 ----------------->

    SELECT

    ColA

    ,COUNT (DISTINCT ColB)

    ,COUNT (DISTINCT ColC)

    FROM Table_Data

    WHERE X_Date BETWEEN '2008-01-01 00:00:00' AND

    '2008-01-31 23:59:59'

    GROUP BY ColA

    1. Table_Data table is having approx 3 million rows.

    2. Table_DateRange table contains daterange values from

    '2008-01-01 00:00:00' AND '2008-01-31 23:59:59'

    3. The first Query takes 27 Seconds.

    4. The second query takes 9 Seconds.

    The point here is that when I execute both the queries in a batch and see the Actual execution plan then the Query Cost of First Query ( which is taking 27 Secs) relative to batch is 27 %

    While the query cost of second query relative to batch is 73 % ( which is taking 9 Secs to execute)

    Can any one explain as to why it is happening... What do we have to look out while optimising the queries

    Also can anyone give me some links as to useful optimisation related to joins.

    Thanks in Advance

    Amit Tiwari

  • That is a bit odd. Usually there's a pretty direct correlation between cost & time. Also, looking at the two queries, I would expect the second query to have a lower cost as well as a lower time. Sometimes the execution plan can show a higher cost, but execute faster. In this case, I'm not sure. If you want you could post the execution plans, they might contain more detail.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Also, remember that the query cost is relative to the batch. The in the faster query, the relative cost is 73%. In the slower batch, the cost you showed of 27% is relative to that batch. You are trying to compare apples to oranges between the two queries, as they are not identical.

    The first, without looking at the execution plan, I would guess is doing a table scan and the second is probably doing an index seek.

    😎

  • Yeah but when you run two queries each shows a relative cost to the execution as a whole. Usually the speedier, lower cost, process shows as a smaller percentage of the overall cost. Still, I've seen it go the other way, a bit, but not as much as demonstrated here.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • There's something strange there. The one with the sub-query is being picked up by the optimizer as the lower cost. That stinks something strange.

    Are the statistics anywhere near current?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I have sp2 installed and I noticed the same thing an a good number of queries that end up using a loop join...

    I've never trusted % if batch, anyway... it did this type of stuff even in 2000.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The problem here is that we haven't seen the entire execution plan, only what we have been told. Again, the cost is relative to the batch its self, and it is quite possible that the cost of the query (relative to the batch) is higher. How much work is occuring in the slower query that could be more costly relative to that batch?

    😎

  • Lynn Pettis (3/13/2008)


    Again, the cost is relative to the batch its self, and it is quite possible that the cost of the query (relative to the batch) is higher.

    I've found that it lies... a lot... especially in the estimated execution plan...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The costs are based on the plan using available stats ... if the stats don't represent the data then they will be off - check the execution plan for estimated and actual rows on each iterator. Also it's possible that the second query is using cached data that the first query loaded.

  • Have you tried reversing the order of the queries and compare the timings then? Or run each query individually, preceding each with a DBCC DROPCLEANBUFFERS to clear the buffer pool? This is on the assumption that the difference in timings was because the data was already in the buffer pool when the 2nd query ran.

    When optimising queries, I usually run SET STATISTICS IO ON, and view the actual execution plan. Large differences in actual and estimated rows in the execution would then suggest outdated statistics. Updating the statistics might then solve the issue.

    Otherwise, it boils down to tuning the query by hand. I always compare the logical reads for the baseline query against the new queries to get an idea if the new queries are better. Reducing the number of reads would usually mean a better query. I will also time the queries when ran from a empty buffer pool, which you do by running DBCC DROPCLEANBUFFERS.

    Sometimes, oddities do occur, like when a query which makes more reads is actually faster than a query that makes less reads (ok, that happens when it is actually reading the same data multiple times!). Or when a query ran with MAXDOP 1 outperforms the same query ran using a parallellised plan on a multi-processor system. Eventually you learn to roll with it.

    SQL BAK Explorer - read SQL Server backup file details without SQL Server.
    Supports backup files created with SQL Server 2005 up to SQL Server 2017.

  • I believe you also need to do a DBCC FREEPROCCACHE before a DBCC DROPCLEANBUFFERS if you trully want to make sure the execution plan is cleared.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply