May 26, 2012 at 4:48 am
Greetings.
I have a stored procedure which had a parameterized complex query. I took out the complex query and run it then I take note how long it run. For the first time it took 3 seconds, consists of 2 seconds for compiling and 1 second for retrieving data. The second and next time, it took only less than 1 second.
From what I have read, when optimizer running a query, it searches best plan for using index, join mechanism, etc. So how do I force the SQL Server using one query plan produced after the first time run? I imagine I can save the query plan from the first time running to a file, and then use it always each this query run. Well, at final step I want to apply this strategy into stored procedure.
Thanks in advance.
-mp-
May 26, 2012 at 5:21 am
Hinting or forcing should be the absolute last resort when you know there is no other way and you know exactly what you are doing.
Please post query, table definitions, index definitions and execution plan, as per http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 26, 2012 at 6:27 am
permono.sam (5/26/2012)
From what I have read, when optimizer running a query, it searches best plan for using index, join mechanism, etc. So how do I force the SQL Server using one query plan produced after the first time run? I imagine I can save the query plan from the first time running to a file, and then use it always each this query run. Well, at final step I want to apply this strategy into stored procedure.
SQL Server automatically caches plans for reuse, so unless your query is recompiling often (or there's some other factor you haven't mentioned) there's generally no need to worry about this. SQL Server also does a pretty good job of deciding when it is necessary to recompile your query because circumstances have changed (for example, the data distribution has changed). If you were to constrain SQL Server to using roughly the same plan, you would lose that advantage. Using plan freezing does not guarantee a reduction in compilation time; it can quite often have the opposite effect.
May 26, 2012 at 10:37 am
SQL Kiwi (5/26/2012)
SQL Server automatically caches plans for reuse, so unless your query is recompiling often (or there's some other factor you haven't mentioned) there's generally no need to worry about this. SQL Server also does a pretty good job of deciding when it is necessary to recompile your query because circumstances have changed (for example, the data distribution has changed). If you were to constrain SQL Server to using roughly the same plan, you would lose that advantage. Using plan freezing does not guarantee a reduction in compilation time; it can quite often have the opposite effect.
What other factors for example? I do not mention because maybe I do not know that some things will be those :-D. Looks like the plan in caches often replaced with other query plan, because if I run the same query on next day, it took 3 seconds (equal to first time I run it). That is why I want to force it. Or is there any other strategy to decrease compilation time? Minimizing number of indexes for each table involved?
-tia-
May 26, 2012 at 10:48 am
declare @IdPackage as decimal
declare @IdFc as decimal
set @IdPackage = 394
set @IdFc = 15
set statistics io on
SELECT identitas_debitur
, SUM(isnull(ang_deb, 0)) AS ang_deb
, SUM(isnull(cost_bdi, 0)) AS cost_bdi
, SUM(isnull(os_bdi, 0)) AS os_bdi
, SUM(isnull(ang_bdi, 0)) AS ang_bdi
, SUM(isnull(pendapatan_perbulan, 0)) AS pendapatan_perbulan
, product_code
--into #tmp_RelatedOneObligorNew
FROM
(
SELECT cif.identitas_debitur
, loan.ang_deb
, loan.cost_bdi
, loan.os_bdi
, loan.ang_bdi
, 1 AS pendapatan_perbulan
, prd.PRODUCT_CODE
, loan.jenis_kelamin AS jenis_kelamin
FROM cif inner join loan
on cif.identitas_debitur = loan.identitas_debitur
inner join
(
select ID_PRODUCT
, PRODUCT_CODE
from PRODUCT
where ID_FC = @IdFc
and APPROVAL_STATUS = 'APPROVED'
) prd
on loan.id_product = prd.ID_PRODUCT
UNION
SELECT txt_cif.identitas_debitur
, case
when isnumeric(txt_fin.ang_deb) = 1 then CONVERT(money, txt_fin.ang_deb)
else 0
end as ang_deb
, case
when isnumeric(txt_fin.cost_bdi) = 1 then CONVERT(money, txt_fin.cost_bdi)
else 0
end as cost_bdi
, NULL AS os_bdi
, case
when isnumeric(txt_fin.ang_bdi) = 1 then CONVERT(money, txt_fin.ang_bdi)
else 0
end as ang_bdi
, case
when ISNUMERIC(txt_disburse.pendapatan_perbulan) = 1 then CONVERT(money, txt_disburse.pendapatan_perbulan)
else 0
end as pendapatan_perbulan
, prd.PRODUCT_CODE
, txt_disburse.jenis_kelamin AS jenis_kelamin
FROM txt_cif INNER JOIN txt_disburse
ON txt_cif.identitas_debitur = txt_disburse.identitas_debitur
INNER JOIN txt_fin
ON txt_cif.identitas_debitur = txt_fin.identitas_debitur
inner join PACKAGE pckg
on pckg.ID_PACKAGE = TXT_CIF.ID_PACKAGE
inner join
(
select ID_PRODUCT
, PRODUCT_CODE
from PRODUCT
where APPROVAL_STATUS = 'APPROVED'
) prd
on prd.ID_PRODUCT = pckg.ID_PRODUCT
) derivedtbl
GROUP BY identitas_debitur, product_code
@IdPackage and @IdFc will be parameters passed from application to stored procedure.
May 26, 2012 at 11:18 am
You sure you're seeing a delay from plan compilation, not SQL having to pull the data from disk vs cache? That query's simple, shouldn't take more than few ms to compile. As Paul said, forcing a plan won't necessarily reduce compilation time, it does not bypass the optimiser.
Once in cache, a plan remains unless the cache is flushed, SQL restarts or the plan is removed because it's not used. The plan will be recreated if the stats change, underlying objects change or any underlying object is recompiled.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 26, 2012 at 6:32 pm
permono.sam (5/26/2012)
What other factors for example? I do not mention because maybe I do not know that some things will be those :-D. Looks like the plan in caches often replaced with other query plan, because if I run the same query on next day, it took 3 seconds (equal to first time I run it). That is why I want to force it. Or is there any other strategy to decrease compilation time? Minimizing number of indexes for each table involved?
Thanks for posting the execution plan, it's very useful. If you click on the green SELECT node in the plan, and look at its properties (hit F4 to open the properties page in SSMS) you can see the compilation time was 197ms. The delay you experience is probably due to the data and index pages needed to execute the query being no longer in the buffer pool - check the STATISTICS IO output to see how many physical reads and read-ahead reads were needed (feel free to share that output if you like).
The plan itself is pretty good. I might be tempted to split it into three parts though: one query for side of the UNION (each result stored in a temporary table), and a final query to do the final SUMs and GROUP BY on the result of UNIONing the two partial results. Check that you really need UNION rather than UNION ALL. I suspect you do (even though the operation only excludes 8 rows in the example you posted).
-- Part 1
SELECT
cif.identitas_debitur,
loan.ang_deb,
loan.cost_bdi,
loan.os_bdi,
loan.ang_bdi,
1 AS pendapatan_perbulan,
prd.PRODUCT_CODE,
loan.jenis_kelamin AS jenis_kelamin
INTO #temp1
FROM cif
JOIN loan ON
cif.identitas_debitur = loan.identitas_debitur
JOIN
(
SELECT
ID_PRODUCT,
PRODUCT_CODE
FROM
PRODUCT
WHERE
ID_FC = @IdFc
AND APPROVAL_STATUS = 'APPROVED'
) prd ON
loan.id_product = prd.ID_PRODUCT;
-- Part 2
SELECT
txt_cif.identitas_debitur,
CASE WHEN ISNUMERIC(txt_fin.ang_deb) = 1
THEN CONVERT(money, txt_fin.ang_deb)
ELSE 0
END AS ang_deb,
CASE WHEN ISNUMERIC(txt_fin.cost_bdi) = 1
THEN CONVERT(money, txt_fin.cost_bdi)
ELSE 0
END AS cost_bdi,
NULL AS os_bdi,
CASE WHEN ISNUMERIC(txt_fin.ang_bdi) = 1
THEN CONVERT(money, txt_fin.ang_bdi)
ELSE 0
END AS ang_bdi,
CASE WHEN ISNUMERIC(txt_disburse.pendapatan_perbulan) = 1
THEN CONVERT(money, txt_disburse.pendapatan_perbulan)
ELSE 0
END AS pendapatan_perbulan,
prd.PRODUCT_CODE,
txt_disburse.jenis_kelamin AS jenis_kelamin
INTO #temp2
FROM txt_cif
JOIN txt_disburse ON txt_cif.identitas_debitur = txt_disburse.identitas_debitur
JOIN txt_fin ON txt_cif.identitas_debitur = txt_fin.identitas_debitur
JOIN PACKAGE pckg ON pckg.ID_PACKAGE = TXT_CIF.ID_PACKAGE
JOIN
(
SELECT
ID_PRODUCT,
PRODUCT_CODE
FROM
PRODUCT
WHERE
APPROVAL_STATUS = 'APPROVED'
) prd ON prd.ID_PRODUCT = pckg.ID_PRODUCT ;
-- Part 3
SELECT
identitas_debitur,
SUM(ISNULL(ang_deb, 0)) AS ang_deb,
SUM(ISNULL(cost_bdi, 0)) AS cost_bdi,
SUM(ISNULL(os_bdi, 0)) AS os_bdi,
SUM(ISNULL(ang_bdi, 0)) AS ang_bdi,
SUM(ISNULL(pendapatan_perbulan, 0)) AS pendapatan_perbulan,
product_code
FROM
(
SELECT
identitas_debitur,
ang_deb,
cost_bdi,
os_bdi,
ang_bdi,
pendapatan_perbulan,
PRODUCT_CODE,
jenis_kelamin
FROM #temp1 AS t
UNION
SELECT
identitas_debitur,
ang_deb,
cost_bdi,
os_bdi,
ang_bdi,
pendapatan_perbulan,
PRODUCT_CODE,
jenis_kelamin
FROM #temp2 AS t2
) AS derivedtbl
GROUP BY
identitas_debitur,
product_code;
Splitting the code this way not only makes life easier for the optimizer (simpler is often better, and it can auto-create statistics on the temporary tables), it also gives you a chance to create a primary key and/or indexes on the temporary tables to give the optimizer better information. More to the point, the more accurate information may allow the optimizer to better estimate the number of rows entering the Distinct Sort in your original plan (this is important because the memory allocated for sorting is fixed at compile time, and cannot grow dynamically at execution time; if not enough memory is reserved, the sort will spill to tempdb).
Of course I only use SELECT...INTO #temp there because I'm typing quickly; you might want to use CREATE TABLE for the temporary tables with an explicit structure and/or keys and indexes already, and INSERT rows to those. You might also get better results by splitting the query into just two parts: one for the derived table, and one for the final SUM and GROUP BY. Some ideas for you to explore.
May 26, 2012 at 7:52 pm
GilaMonster (5/26/2012)
You sure you're seeing a delay from plan compilation, not SQL having to pull the data from disk vs cache? That query's simple, shouldn't take more than few ms to compile. As Paul said, forcing a plan won't necessarily reduce compilation time, it does not bypass the optimiser.Once in cache, a plan remains unless the cache is flushed, SQL restarts or the plan is removed because it's not used. The plan will be recreated if the stats change, underlying objects change or any underlying object is recompiled.
I do not have strong evidence for this, I just conclude from what I saw:
1. first time run, it took so much time (3-5 secs). second time run, it took only 1 sec. It happened same on next day (server is not restarted).
2. I have seen compiled time (statistics time on) : about 2000 ms, and execution time is always constant on about 1000ms.
I am experiencing these to my other SPs/queries. I want to avoid the factor I suspect (compiling time) with always using the same plan, so the optimizer will not need more time to think :D.
May 26, 2012 at 7:56 pm
Here is the result of statistics io:
(60852 row(s) affected)
Table 'PRODUCT'. Scan count 2, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LOAN'. Scan count 1, logical reads 225, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CIF'. Scan count 9, logical reads 752, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PACKAGE'. Scan count 9, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TXT_CIF'. Scan count 9, logical reads 1378, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TXT_FIN'. Scan count 9, logical reads 731, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TXT_DISBURSE'. Scan count 9, logical reads 1558, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
TXT_CIF, TXT_FIN, and TXT_DISBURSE filled with about 60000 records.
From what I have known, I always observe the logical reads. Is this correct?
-tia-
May 27, 2012 at 3:07 am
permono.sam (5/26/2012)
1. first time run, it took so much time (3-5 secs). second time run, it took only 1 sec. It happened same on next day (server is not restarted).
Very likely data cache, not plan cache. As Paul mentioned, the compilation time was < 200ms,
2. I have seen compiled time (statistics time on) : about 2000 ms, and execution time is always constant on about 1000ms.
I want to avoid the factor I suspect (compiling time) with always using the same plan, so the optimizer will not need more time to think :D.
Wrong approach (unless you're smarter than the people who wrote the optimiser). Let SQL do it's job, make sure you write good solid queries, make sure there's enough memory on the server for the frequently used data and the frequently used plans, make sure you're not writing code that can fill the plan cache with single-use ad-hoc plans and that should serve in 99% of cases.
Also note that forcing plans (via the Use Plan hint) does not bypass the optimiser and hence will likely not reduce compilation time (if that was the cause)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 27, 2012 at 1:40 pm
permono.sam (5/26/2012)
Here is the result of statistics io:
Thank you. Capture the same information when the query takes 3 seconds and you might see that the cause of the slower performance is physical/read-ahead reads. Does the fact the query sometimes takes 3 seconds to execute cause a specific problem?
From what I have known, I always observe the logical reads. Is this correct?
The number of logical reads is one consideration, but you should not use that as your only metric. It can be misleading and in any case does not tell the whole story. Consider also things like elapsed time, processor usage, memory requirement and best versus worst case performance characteristics. Query tuning often comes down to making a trade-off between one factor or another.
SQL Server provides a wide range of performance information including in DMVs like sys.dm_exec_procedure_stats and sys.dm_exec_query_stats.
On a *test system*, you can check the performance of your procedure when no data is in cache by executing CHECKPOINT and DBCC DROPCLEANBUFFERS before executing the query. You do not want to do this on a live production system.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply