Forcing Query Plan.

  • Greetings.

    I have a stored procedure that contains a complex parameterized query. I pulled the complex query out, ran it on its own, and noted how long it took. The first run took 3 seconds: 2 seconds for compiling and 1 second for retrieving the data. The second and subsequent runs took less than 1 second.

    From what I have read, when the optimizer processes a query it searches for the best plan in terms of index usage, join mechanism, and so on. So how do I force SQL Server to use the query plan produced on that first run? I imagine I could save the plan from the first run to a file and then reuse it every time this query runs. Ultimately I want to apply this strategy to the stored procedure.

    Thanks in advance.

    -mp-

  • Hinting or forcing should be the absolute last resort when you know there is no other way and you know exactly what you are doing.

    Please post query, table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • permono.sam (5/26/2012)


    From what I have read, when the optimizer processes a query it searches for the best plan in terms of index usage, join mechanism, and so on. So how do I force SQL Server to use the query plan produced on that first run? I imagine I could save the plan from the first run to a file and then reuse it every time this query runs. Ultimately I want to apply this strategy to the stored procedure.

    SQL Server automatically caches plans for reuse, so unless your query is recompiling often (or there's some other factor you haven't mentioned) there's generally no need to worry about this. SQL Server also does a pretty good job of deciding when it is necessary to recompile your query because circumstances have changed (for example, the data distribution has changed). If you were to constrain SQL Server to using roughly the same plan, you would lose that advantage. Using plan freezing does not guarantee a reduction in compilation time; it can quite often have the opposite effect.
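
    To see whether the plan really is being cached and reused, you can check the plan cache directly. A rough sketch (the LIKE filter is just one assumed way to find this particular statement; the commented-out call shows what "plan freezing" amounts to on SQL Server 2008 and later, for completeness only):

    -- Is the statement's plan in cache, and is it being reused?
    SELECT
        cp.usecounts,        -- greater than 1 means the cached plan is being reused
        cp.objtype,          -- 'Adhoc', 'Prepared', 'Proc', ...
        cp.plan_handle,
        st.text
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE st.text LIKE N'%identitas_debitur%';   -- assumed filter for this query

    -- "Plan freezing" pins that cached plan with a plan guide. Rarely the right answer.
    -- EXEC sys.sp_create_plan_guide_from_handle
    --     @name        = N'PG_FreezeExample',   -- hypothetical name
    --     @plan_handle = 0x06...,               -- plan_handle from the query above
    --     @statement_start_offset = NULL;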

  • SQL Kiwi (5/26/2012)

    SQL Server automatically caches plans for reuse, so unless your query is recompiling often (or there's some other factor you haven't mentioned) there's generally no need to worry about this. SQL Server also does a pretty good job of deciding when it is necessary to recompile your query because circumstances have changed (for example, the data distribution has changed). If you were to constrain SQL Server to using roughly the same plan, you would lose that advantage. Using plan freezing does not guarantee a reduction in compilation time; it can quite often have the opposite effect.

    What other factors, for example? I did not mention them because I may not know what they are :-D. It looks like the cached plan is often replaced by another query plan, because if I run the same query the next day it takes 3 seconds again (the same as the first run). That is why I want to force it. Or is there another strategy to decrease compilation time? Minimizing the number of indexes on each table involved?

    -tia-

  • declare @IdPackage as decimal
    declare @IdFc as decimal

    set @IdPackage = 394
    set @IdFc = 15

    set statistics io on

    SELECT identitas_debitur
         , SUM(isnull(ang_deb, 0)) AS ang_deb
         , SUM(isnull(cost_bdi, 0)) AS cost_bdi
         , SUM(isnull(os_bdi, 0)) AS os_bdi
         , SUM(isnull(ang_bdi, 0)) AS ang_bdi
         , SUM(isnull(pendapatan_perbulan, 0)) AS pendapatan_perbulan
         , product_code
    --into #tmp_RelatedOneObligorNew
    FROM
    (
        SELECT cif.identitas_debitur
             , loan.ang_deb
             , loan.cost_bdi
             , loan.os_bdi
             , loan.ang_bdi
             , 1 AS pendapatan_perbulan
             , prd.PRODUCT_CODE
             , loan.jenis_kelamin AS jenis_kelamin
        FROM cif
        inner join loan
            on cif.identitas_debitur = loan.identitas_debitur
        inner join
        (
            select ID_PRODUCT
                 , PRODUCT_CODE
            from PRODUCT
            where ID_FC = @IdFc
              and APPROVAL_STATUS = 'APPROVED'
        ) prd
            on loan.id_product = prd.ID_PRODUCT

        UNION

        SELECT txt_cif.identitas_debitur
             , case
                   when isnumeric(txt_fin.ang_deb) = 1 then CONVERT(money, txt_fin.ang_deb)
                   else 0
               end as ang_deb
             , case
                   when isnumeric(txt_fin.cost_bdi) = 1 then CONVERT(money, txt_fin.cost_bdi)
                   else 0
               end as cost_bdi
             , NULL AS os_bdi
             , case
                   when isnumeric(txt_fin.ang_bdi) = 1 then CONVERT(money, txt_fin.ang_bdi)
                   else 0
               end as ang_bdi
             , case
                   when ISNUMERIC(txt_disburse.pendapatan_perbulan) = 1 then CONVERT(money, txt_disburse.pendapatan_perbulan)
                   else 0
               end as pendapatan_perbulan
             , prd.PRODUCT_CODE
             , txt_disburse.jenis_kelamin AS jenis_kelamin
        FROM txt_cif
        INNER JOIN txt_disburse
            ON txt_cif.identitas_debitur = txt_disburse.identitas_debitur
        INNER JOIN txt_fin
            ON txt_cif.identitas_debitur = txt_fin.identitas_debitur
        inner join PACKAGE pckg
            on pckg.ID_PACKAGE = TXT_CIF.ID_PACKAGE
        inner join
        (
            select ID_PRODUCT
                 , PRODUCT_CODE
            from PRODUCT
            where APPROVAL_STATUS = 'APPROVED'
        ) prd
            on prd.ID_PRODUCT = pckg.ID_PRODUCT
    ) derivedtbl
    GROUP BY identitas_debitur, product_code

    @IdPackage and @IdFc will be parameters passed from the application to the stored procedure.

  • You sure you're seeing a delay from plan compilation, not SQL having to pull the data from disk vs cache? That query's simple, it shouldn't take more than a few ms to compile. As Paul said, forcing a plan won't necessarily reduce compilation time; it does not bypass the optimiser.

    Once in cache, a plan remains unless the cache is flushed, SQL restarts or the plan is removed because it's not used. The plan will be recreated if the stats change, underlying objects change or any underlying object is recompiled.
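
    For example, you can see when the current plan for a statement was compiled and how often it has been reused (the LIKE filter is just an assumed way to find this particular statement):

    SELECT
        qs.creation_time,         -- when the current plan was compiled
        qs.last_execution_time,
        qs.execution_count,       -- how many times that plan has been reused
        st.text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE st.text LIKE N'%identitas_debitur%'   -- assumed filter for this query
    ORDER BY qs.creation_time DESC;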

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • permono.sam (5/26/2012)


    What other factors, for example? I did not mention them because I may not know what they are :-D. It looks like the cached plan is often replaced by another query plan, because if I run the same query the next day it takes 3 seconds again (the same as the first run). That is why I want to force it. Or is there another strategy to decrease compilation time? Minimizing the number of indexes on each table involved?

    Thanks for posting the execution plan, it's very useful. If you click on the green SELECT node in the plan, and look at its properties (hit F4 to open the properties page in SSMS) you can see the compilation time was 197ms. The delay you experience is probably due to the data and index pages needed to execute the query being no longer in the buffer pool - check the STATISTICS IO output to see how many physical reads and read-ahead reads were needed (feel free to share that output if you like).
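
    If it helps, a simple way to capture both numbers in one go (the query under test goes in the middle):

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    -- ... run the query under test here ...

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;

    -- In the Messages tab, "SQL Server parse and compile time" shows the compilation
    -- cost, while "physical reads" and "read-ahead reads" in the STATISTICS IO output
    -- show how much data had to come from disk rather than the buffer pool.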

    The plan itself is pretty good. I might be tempted to split it into three parts though: one query for each side of the UNION (each result stored in a temporary table), and a final query to do the final SUMs and GROUP BY on the result of UNIONing the two partial results. Check that you really need UNION rather than UNION ALL. I suspect you do (even though the operation only excludes 8 rows in the example you posted).

    -- Part 1
    SELECT
        cif.identitas_debitur,
        loan.ang_deb,
        loan.cost_bdi,
        loan.os_bdi,
        loan.ang_bdi,
        1 AS pendapatan_perbulan,
        prd.PRODUCT_CODE,
        loan.jenis_kelamin AS jenis_kelamin
    INTO #temp1
    FROM cif
    JOIN loan ON
        cif.identitas_debitur = loan.identitas_debitur
    JOIN
    (
        SELECT
            ID_PRODUCT,
            PRODUCT_CODE
        FROM PRODUCT
        WHERE
            ID_FC = @IdFc
            AND APPROVAL_STATUS = 'APPROVED'
    ) prd ON
        loan.id_product = prd.ID_PRODUCT;

    -- Part 2
    SELECT
        txt_cif.identitas_debitur,
        CASE WHEN ISNUMERIC(txt_fin.ang_deb) = 1
            THEN CONVERT(money, txt_fin.ang_deb)
            ELSE 0
        END AS ang_deb,
        CASE WHEN ISNUMERIC(txt_fin.cost_bdi) = 1
            THEN CONVERT(money, txt_fin.cost_bdi)
            ELSE 0
        END AS cost_bdi,
        NULL AS os_bdi,
        CASE WHEN ISNUMERIC(txt_fin.ang_bdi) = 1
            THEN CONVERT(money, txt_fin.ang_bdi)
            ELSE 0
        END AS ang_bdi,
        CASE WHEN ISNUMERIC(txt_disburse.pendapatan_perbulan) = 1
            THEN CONVERT(money, txt_disburse.pendapatan_perbulan)
            ELSE 0
        END AS pendapatan_perbulan,
        prd.PRODUCT_CODE,
        txt_disburse.jenis_kelamin AS jenis_kelamin
    INTO #temp2
    FROM txt_cif
    JOIN txt_disburse ON txt_cif.identitas_debitur = txt_disburse.identitas_debitur
    JOIN txt_fin ON txt_cif.identitas_debitur = txt_fin.identitas_debitur
    JOIN PACKAGE pckg ON pckg.ID_PACKAGE = TXT_CIF.ID_PACKAGE
    JOIN
    (
        SELECT
            ID_PRODUCT,
            PRODUCT_CODE
        FROM PRODUCT
        WHERE APPROVAL_STATUS = 'APPROVED'
    ) prd ON prd.ID_PRODUCT = pckg.ID_PRODUCT;

    -- Part 3
    SELECT
        identitas_debitur,
        SUM(ISNULL(ang_deb, 0)) AS ang_deb,
        SUM(ISNULL(cost_bdi, 0)) AS cost_bdi,
        SUM(ISNULL(os_bdi, 0)) AS os_bdi,
        SUM(ISNULL(ang_bdi, 0)) AS ang_bdi,
        SUM(ISNULL(pendapatan_perbulan, 0)) AS pendapatan_perbulan,
        product_code
    FROM
    (
        SELECT
            identitas_debitur,
            ang_deb,
            cost_bdi,
            os_bdi,
            ang_bdi,
            pendapatan_perbulan,
            PRODUCT_CODE,
            jenis_kelamin
        FROM #temp1 AS t
        UNION
        SELECT
            identitas_debitur,
            ang_deb,
            cost_bdi,
            os_bdi,
            ang_bdi,
            pendapatan_perbulan,
            PRODUCT_CODE,
            jenis_kelamin
        FROM #temp2 AS t2
    ) AS derivedtbl
    GROUP BY
        identitas_debitur,
        product_code;

    Splitting the code this way not only makes life easier for the optimizer (simpler is often better, and it can auto-create statistics on the temporary tables), it also gives you a chance to create a primary key and/or indexes on the temporary tables to give the optimizer better information. More to the point, the more accurate information may allow the optimizer to better estimate the number of rows entering the Distinct Sort in your original plan (this is important because the memory allocated for sorting is fixed at compile time, and cannot grow dynamically at execution time; if not enough memory is reserved, the sort will spill to tempdb).

    Of course I only use SELECT...INTO #temp there because I'm typing quickly; you might want to use CREATE TABLE for the temporary tables with an explicit structure and/or keys and indexes already, and INSERT rows to those. You might also get better results by splitting the query into just two parts: one for the derived table, and one for the final SUM and GROUP BY. Some ideas for you to explore.
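
    A rough sketch of that alternative for #temp1 (the column types here are assumed from the query; use the real definitions from your tables):

    CREATE TABLE #temp1
    (
        identitas_debitur    varchar(30) NOT NULL,   -- assumed type
        ang_deb              money       NULL,
        cost_bdi             money       NULL,
        os_bdi               money       NULL,
        ang_bdi              money       NULL,
        pendapatan_perbulan  money       NULL,
        PRODUCT_CODE         varchar(20) NOT NULL,   -- assumed type
        jenis_kelamin        char(1)     NULL        -- assumed type
    );

    -- A clustered index on the grouping columns gives the optimizer ordering
    -- and density information for the final GROUP BY.
    CREATE CLUSTERED INDEX cx_temp1
        ON #temp1 (identitas_debitur, PRODUCT_CODE);

    -- The SELECT...INTO from "Part 1" above then becomes an INSERT:
    -- INSERT #temp1 (identitas_debitur, ang_deb, cost_bdi, os_bdi, ang_bdi,
    --                pendapatan_perbulan, PRODUCT_CODE, jenis_kelamin)
    -- SELECT ... ;  -- same SELECT and joins as Part 1, without the INTO clause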

  • GilaMonster (5/26/2012)


    You sure you're seeing a delay from plan compilation, not SQL having to pull the data from disk vs cache? That query's simple, it shouldn't take more than a few ms to compile. As Paul said, forcing a plan won't necessarily reduce compilation time; it does not bypass the optimiser.

    Once in cache, a plan remains unless the cache is flushed, SQL restarts or the plan is removed because it's not used. The plan will be recreated if the stats change, underlying objects change or any underlying object is recompiled.

    I do not have strong evidence for this; I am just concluding from what I saw:

    1. The first run took a long time (3-5 secs); the second run took only 1 sec. The same thing happened the next day (the server was not restarted).

    2. I have seen the compile time (with STATISTICS TIME ON) at about 2000 ms, while the execution time is always constant at about 1000 ms.

    I am experiencing the same with my other SPs/queries. I want to avoid the factor I suspect (compile time) by always using the same plan, so the optimizer will not need more time to think :D.

  • Here is the result of statistics io:

    (60852 row(s) affected)

    Table 'PRODUCT'. Scan count 2, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'LOAN'. Scan count 1, logical reads 225, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'CIF'. Scan count 9, logical reads 752, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'PACKAGE'. Scan count 9, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TXT_CIF'. Scan count 9, logical reads 1378, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TXT_FIN'. Scan count 9, logical reads 731, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'TXT_DISBURSE'. Scan count 9, logical reads 1558, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    TXT_CIF, TXT_FIN, and TXT_DISBURSE each hold about 60,000 records.

    From what I have learned, I always look at the logical reads. Is this correct?

    -tia-

  • permono.sam (5/26/2012)


    1. The first run took a long time (3-5 secs); the second run took only 1 sec. The same thing happened the next day (the server was not restarted).

    Very likely data cache, not plan cache. As Paul mentioned, the compilation time was < 200ms.

    2. I have seen the compile time (with STATISTICS TIME ON) at about 2000 ms, while the execution time is always constant at about 1000 ms.

    I want to avoid the factor I suspect (compile time) by always using the same plan, so the optimizer will not need more time to think :D.

    Wrong approach (unless you're smarter than the people who wrote the optimiser). Let SQL do its job, make sure you write good solid queries, make sure there's enough memory on the server for the frequently used data and the frequently used plans, and make sure you're not writing code that can fill the plan cache with single-use ad-hoc plans. That should serve in 99% of cases.
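
    One way to see whether single-use ad-hoc plans are eating your plan cache (thresholds will vary; treat this as a rough check):

    -- Plan cache space consumed by ad-hoc plans that were used exactly once
    SELECT
        COUNT(*)                                     AS single_use_plans,
        SUM(CAST(size_in_bytes AS bigint)) / 1048576 AS size_mb
    FROM sys.dm_exec_cached_plans
    WHERE objtype   = 'Adhoc'
      AND usecounts = 1;

    -- If this is large, the 'optimize for ad hoc workloads' server option may help
    -- (it requires 'show advanced options' to be enabled):
    -- EXEC sp_configure 'optimize for ad hoc workloads', 1;
    -- RECONFIGURE;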

    Also note that forcing plans (via the USE PLAN hint) does not bypass the optimiser and hence will likely not reduce compilation time (if that was the cause).

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • permono.sam (5/26/2012)


    Here is the result of statistics io:

    Thank you. Capture the same information when the query takes 3 seconds and you might see that the cause of the slower performance is physical/read-ahead reads. Does the fact that the query sometimes takes 3 seconds to execute cause a specific problem?

    From what I have learned, I always look at the logical reads. Is this correct?

    The number of logical reads is one consideration, but you should not use it as your only metric. It can be misleading and in any case does not tell the whole story. Consider also things like elapsed time, processor usage, memory requirement, and best versus worst case performance characteristics. Query tuning often comes down to making a trade-off between one factor and another.

    SQL Server provides a wide range of performance information, including DMVs such as sys.dm_exec_procedure_stats and sys.dm_exec_query_stats.
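
    For example, per-execution averages for a cached procedure look like this (the procedure name below is hypothetical; the time columns are in microseconds):

    SELECT
        OBJECT_NAME(ps.object_id, ps.database_id)    AS procedure_name,
        ps.cached_time,
        ps.execution_count,
        ps.total_worker_time    / ps.execution_count AS avg_cpu_us,
        ps.total_elapsed_time   / ps.execution_count AS avg_elapsed_us,
        ps.total_logical_reads  / ps.execution_count AS avg_logical_reads,
        ps.total_physical_reads / ps.execution_count AS avg_physical_reads
    FROM sys.dm_exec_procedure_stats AS ps
    WHERE ps.object_id = OBJECT_ID(N'dbo.usp_YourProcedure');   -- hypothetical name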

    On a *test system*, you can check the performance of your procedure when no data is in cache by executing CHECKPOINT and DBCC DROPCLEANBUFFERS before executing the query. You do not want to do this on a live production system.
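
    Something like this, on a test system only:

    -- TEST SYSTEM ONLY: start from a cold data cache before measuring
    CHECKPOINT;               -- write dirty pages to disk so DROPCLEANBUFFERS can discard them
    DBCC DROPCLEANBUFFERS;    -- empty the buffer pool

    -- ... now run the query or procedure and capture the STATISTICS IO/TIME output ...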
