September 18, 2017 at 2:33 am
Hi all,
A query I've got isn't running particulary fast (a bout 2 and a bit minutes). I know, in the grand scheme of things that not really that long, but really I'm looking for something to run considerably faster (n the sigle digit seconds). I know what the problem with the query is, just not how to fix it.
The query is a Stored Procedure, using WITH RECOMPILE, and I've attached a copy query plan and the T-SQL at the end of the post. The problem (I think) is the part I've underlined in my query; this is causing the estimator to make very bad decisions, and utterly ruins it. I'm aware that the combination of NULL and OR statements on Keys is not a good idea, however, what I'm not sure on is how to get around using the syntax I have. Any insight anyone can give would be great.
(Apologies for the awful formatting of the SQL, SSC still doesn't let me paste without wiping out my tabs, so it's just going to look awful; sorry).ALTER PROC [dbo].[AgeasMR_sp] @MTLockdown DATE, @CSLockdown DATE, @ALockdown DATE WITH RECOMPILE AS
--DECLARE @MTLockdown DATE, @CSLockdown DATE, @ALockdown DATE;
--SELECT @MTLockdown = '03-Aug-2017', @CSLockdown = '03-Aug-2017', @ALockdown = '03-Aug-2017';
SELECT DATENAME(month, DATEADD(month, DATEDIFF(month, 0, @MTLockdown) -1, 0)) + ' ' + CAST(DATEPART(year, DATEADD(month, DATEDIFF(month, 0, @MTLockdown) -1, 0)) AS VARCHAR(4)) AS BDXMonth,
COALESCE(BCM.Name, BCM.Char5, BCM.Char6) AS Insured,
BTX.Dt_raised AS EffectiveDate,
BPY.Polno,
BPY.PolRef@ as Reference,
MTRA.Addr1 + ISNULL(', ' + MTRA.Addr2,'') + ISNULL(', ' + MTRA.Addr3,'') + ISNULL(', ' + MTRA.Addr4,'') AS CorreAddress,
MTRA.Pcode AS CorrePostcode,
MTBP.Addr1 + ISNULL(', ' + MTBP.Addr2,'') + ISNULL(', ' + MTBP.Addr3,'') + ISNULL(', ' + MTBP.Addr4,'') AS RiskAddress,
MTBP.Pcode AS RiskPostcode,
BTX.Trantype AS TransactionType,
ISNULL(MRT0.Tranrr, MRT0h.Tranrr) * CASE WHEN ISNULL(MRT0.Ldpp1,MRT0h.Ldpp1) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld1,MRT0h.Ld1) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ld1amt,MRT0h.Ld1amt) / 100)) ELSE 1 END AS RoadRisksGWP, ISNULL(MRT0.Tranrr - MRT0.rrcomm, MRT0h.Tranrr - MRT0h.rrcomm) * CASE WHEN ISNULL(MRT0.Ldpp1,MRT0h.Ldpp1) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld1,MRT0h.Ld1) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ld1amt,MRT0h.Ld1amt) / 100)) ELSE 1 END AS RoadRisksPayable,
ISNULL(MRT0.Tranmd, MRT0h.Tranmd) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS MaterialDamGWP, ISNULL(MRT0.Tranmd - MRT0.Mdcomm, MRT0h.Tranmd - MRT0h.Mdcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS MaterialDamPayable,
MTBP.Totalsc AS MaterialDamSI,
ISNULL(MRT0.Tranbi, MRT0h.Tranbi) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS LoiGWP, ISNULL(MRT0.Tranbi - MRT0.bicomm, MRT0h.Tranbi - MRT0h.bicomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS LoiPayable,
MTBI.Agp AS AnnualProfit, MTBI.Indperiod as IndemnityPeriod, MTBI.Si AS LOISI, MTBI.Odbsi AS DebitSI,
ISNULL(MRT0.Tranmy, MRT0h.Tranmy) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS MoneyGWP, ISNULL(MRT0.Tranmy - MRT0.Mycomm, MRT0h.Tranmy - MRT0h.Mycomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS MoneyPayable,
MTMY.Anncarr AS AnnualCarryings,
ISNULL(MRT0.Trangit, MRT0h.Trangit) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS InTransitGWP, ISNULL(MRT0.Trangit - MRT0.Gitcomm, MRT0h.Trangit - MRT0h.Gitcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS InTransitPayable,
MTIT.Maxlimit AS MaxConsignment, MTIT.Numveh AS MaxTransitVehicles,
ISNULL(MRT0.Tranel, MRT0h.Tranel) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS ELGWP, ISNULL(MRT0.Tranel - MRT0.Elcomm, MRT0h.Tranel - MRT0h.elcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS ELPayable,
MTLI.Total AS ELWageRoll, CASE LEFT(MTEL.ELYN,1) WHEN 'Y' THEN 10000000 ELSE 0 END AS ELIndem,
ISNULL(MRT0.Tranpl, MRT0h.Tranpl) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS PLGWP, ISNULL(MRT0.Tranpl - MRT0.Plcomm, MRT0h.Tranpl - MRT0h.Plcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS PLPayable,
MTBQ.Grossturnover AS Turnover, MTPL.Indlimitnum AS PLLimit,
ISNULL(MRT0.Transp,MRT0h.Transp) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS SpecifiedPropertyGWP, ISNULL(MRT0.Transp - MRT0.Spcomm, MRT0h.Transp - MRT0h.Spcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS SpecifiedPropertyPayable,
MTBI.Compawaysi AS SpecifiedPropertySI, MTBI.Terrlimit AS SpecifiedPropertyTerrLimit,
ISNULL(MRT0.Trandrs, MRT0h.Trandrs) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS RefrigeratedGWP, ISNULL(MRT0.Trandrs - MRT0.Drscomm, MRT0h.Trandrs - MRT0h.Drscomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END as RefrigeratedPayable,
MTBI.Detstocksi AS RefrigeratedSI,
ISNULL(MRT0.Traned, MRT0h.Traned) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS DishonestyGWP, ISNULL(MRT0.Traned - MRT0.Edcomm, MRT0h.Traned - MRT0.Edcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS DishonestyPayable,
MTBI.Edlimit AS DishonestySI,
ISNULL(MRT0.Tranlml, MRT0h.Tranlml) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS LoseMOTGWP, ISNULL(MRT0.Tranlml - MRT0.Lmlcomm, MRT0h.Tranlml - MRT0h.Lmlcomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS LoseMOTPayable,
MTBI.Annmot AS MOTIncome, MTBI.Motgp AS MOTGP,
ISNULL(MRT0.Tranglass, MRT0h.Tranglass) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS GlassGWP, ISNULL(MRT0.Tranglass - MRT0.Glasscomm, MRT0h.Tranglass - MRT0h.Glasscomm) * CASE WHEN ISNULL(MRT0.Ldpp,MRT0h.Ldpp) = '%' THEN 1 + (CASE ISNULL(MRT0.Ld,MRT0h.Ld) WHEN 'L' THEN 1 WHEN 'D' THEN -1 ELSE 0 END * (ISNULL(MRT0.Ldamt,MRT0h.Ldamt) / 100)) ELSE 1 END AS GlassPayable,
ISNULL(MRT0.Tranle, MRT0h.Tranle) AS LegalGWP,
ISNULL(MRT0.Tranle - MRT0.Lecomm, MRT0h.Tranle - MRT0h.Lecomm) AS LegalPayable,
ISNULL(BTXcm.Orig_debt - BTXcm.Ipt_amount,0) + ISNULL(BTXcs.Orig_debt - BTXcs.ipt_amount,0) + ISNULL(BTXca.Orig_debt - BTXca.ipt_amount,0) AS GWP,
BTX.Ipt_rate/100 AS IPTRate,
ISNULL(BTXcm.Ipt_amount,0) + ISNULL(BTXcs.Ipt_amount,0) + ISNULL(BTXca.Ipt_amount,0) AS IPT,
ISNULL(BTXcm.Orig_debt - BTXcm.Ipt_amount - BTXcm.Comm_amt,0) + ISNULL(BTXcs.Orig_debt - BTXcs.Ipt_amount - BTXcs.Comm_amt,0) + ISNULL(BTXca.Orig_debt - BTXca.Ipt_amount - BTXca.Comm_amt,0) AS NWP,
ISNULL(BTXcm.Comm_amt,0) + ISNULL(BTXcs.Comm_amt,0) + ISNULL(BTXca.Comm_amt,0) AS Commission,
MTEL.ERNEXEMPTYN AS ERNExempt,
MTEL.ERN,
CASE WHEN ISNULL(LEFT(MTEL.SUBCOYN,1),'N') = 'N' THEN NULL
ELSE MTEL.SUBNAME1 + ISNULL(', ' + MTEL.SUBNAME2,'') + ISNULL(', ' + MTEL.SUBNAME3,'')
END AS SUBName,
CASE WHEN ISNULL(LEFT(MTEL.SUBCOYN,1),'N') = 'N' THEN NULL
ELSE MTEL.SUBERN1 + ISNULL(', ' + MTEL.SUBERN2,'') + ISNULL(', ' + MTEL.SUBERN3,'')
END AS SUBERN,
BAD.[Name] AS AgentName,
AN.FCANumber
FROM OpenGI..ic_yyclient BCM
JOIN OpenGI..ic_brpolicy BPY ON BCM.B@ = BPY.B@ AND BCM.Ref@ = BPY.Ref@
JOIN OpenGI..ic_brcledger BTX ON BPY.B@ = BTX.B@ AND BPY.PolRef@ = BTX.PolRef@
JOIN OpenGI..ic_BD_MTRA MTRA ON BPY.B@ = MTRA.B@ AND BPY.PolRef@ = MTRA.PolRef@
JOIN OpenGI..ic_BD_MTBP MTBP ON BPY.B@ = MTBP.B@ AND BPY.PolRef@ = MTBP.PolRef@
JOIN OpenGI..ic_BD_MTBI MTBI ON BPY.B@ = MTBI.B@ AND BPY.PolRef@ = MTBI.PolRef@
JOIN OpenGI..ic_BD_MTMY MTMY ON BPY.B@ = MTMY.B@ AND BPY.PolRef@ = MTMY.PolRef@
JOIN OpenGi..ic_BD_MTIT MTIT ON BPY.B@ = MTIT.B@ AND BPY.PolRef@ = MTIT.PolRef@
JOIN OpenGI..ic_BD_MTLI MTLI ON BPY.B@ = MTLI.B@ AND BPY.PolRef@ = MTLI.PolRef@
JOIN OpenGI..ic_BD_MTEL MTEL ON BPY.B@ = MTEL.B@ AND BPY.PolRef@ = MTEL.PolRef@
JOIN OpenGI..ic_BD_MTBQ MTBQ ON BPY.B@ = MTBQ.B@ AND BPY.PolRef@ = MTBQ.PolRef@
JOIN OpenGI..ic_BD_MTPL MTPL ON BPY.B@ = MTPL.B@ AND BPY.PolRef@ = MTPL.PolRef@
LEFT JOIN OpenGi..ic_brcledger BTXcm ON BTX.B@ = BTXcm.B@ AND BTX.PolRef@ = BTXcm.PolRef@
AND BTX.Suffix@ = BTXcm.C_mast
AND BTXcm.C_ins = 'Ageas Motor Trade'
AND BTXcm.Lockdown = @MTLockdown
LEFT JOIN OpenGi..ic_brcledger BTXcs ON BTX.B@ = BTXcs.B@ AND BTX.PolRef@ = BTXcs.PolRef@
AND BTX.Suffix@ = BTXcs.C_mast
AND BTXcs.C_ins = 'Ageas MT Comm Split'
AND BTXcs.Lockdown = @CSLockdown
LEFT JOIN OpenGi..ic_brcledger BTXca ON BTX.B@ = BTXca.B@ AND BTX.PolRef@ = BTXca.PolRef@
AND BTX.Suffix@ = BTXca.C_mast
AND BTXca.C_ins = 'arag motor trade'
AND BTXca.Lockdown = @ALockdown
LEFT JOIN OpenGi..ic_BD_MRT0 MRT0 ON BTX.B@ = MRT0.B@ AND BTX.Polref@ = MRT0.Polref@
AND BTX.Ref = MRT0.Tranref
LEFT JOIN OpenGi..ic_BD_MRT0_History MRT0h ON BTX.B@ = MRT0h.B@ AND BTX.Polref@ = MRT0h.Polref@
AND BTX.Ref = MRT0h.Tranref
AND MRT0.Key@ IS NULL
LEFT JOIN OpenGI..ic_brAgentdtls BAD ON BPY.B@ = BAD.B@ AND BPY.Agent = BAD.Code
LEFT JOIN BDXMID..AgentFCANumber_tbl AN ON BPY.Agent = AN.SEIBAgentCode
WHERE BCM.B@ != 3 AND BCM.Ref@ != 'XXXX01' AND BPY.Status = 0
--AND ((BTXcm.Lockdown = @MTLockdown OR BTXcm.key@ IS NULL)
--AND (BTXcs.Lockdown = @CSLockdown OR BTXcs.key@ IS NULL)
--AND (BTXca.Lockdown = @ALockdown OR BTXca.key@ IS NULL))
--AND BTXcm.Lockdown = @MTLockdown AND BTXcs.Lockdown = @CSLockdown AND BTXca.Lockdown = @ALockdown
AND (BTXcm.Key@ IS NOT NULL OR BTXcs.Key@ IS NOT NULL OR BTXca.Key@ IS NOT NULL
OR (BTX.Lockdown = @MTLockdown AND BTX.Insurer IN ('Ageas Motor Trade','Ageas MT Comm Split','arag motor trade')))
AND (MRT0.Key@ IS NOT NULL OR MRT0h.Key@ IS NOT NULL);
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
September 18, 2017 at 7:31 am
Looks like a lack of a covering index, given that you're getting a Key Lookup. You might benefit from a SPARSE index that filters on the same things that the icp_yyclient table (BCM alias) has in the WHERE clause, and is indexed on how that table JOINs to the other tables in the query. Alternatively, you might be able to "pre-filter" that table using only it's needed fields to participate in the JOINs and SELECT, into a temp table, and you might even then index that temp table as well. Try something and let me know how you fare...
Edit: corrected grammar and spelling. Change in meaning occurred.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
September 18, 2017 at 7:40 am
You've got a catch-all type query. They're slow (and writing the WHERE clause in a more complicated form doesn't make that any better)
https://www.simple-talk.com/sql/t-sql-programming/how-to-confuse-the-sql-server-query-optimizer/
WITH RECOMPILE on the procedure? Why? That gets you very few of the benefits of the OPTION(RECOMPILE) hint, and more overhead (all statements get compiled on every execution)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 18, 2017 at 8:11 am
I understand where the responses posted by Steve and Gail are coming from, but I do not agree with them.
@steve-2: Yes there is a key lookup, and having a covering index would eliminate that and probably cause the execution time to be cut in half or so. But that would not tackle the real underlying issue, In the plan you see in the right-hand side a relatively low number of rows being joined. Then in the middle the rows from icp_yyclient are joined and suddenly the result set explodes from 253 rows to over 7.5 million. And then the next step is to join to icp_brpolicy as well, and the rowcount immediately drops back to 29.
The cost of the execution is in 7.5 million key lookups and 7.5 million index seeks in icp_brpolicy. A covering index would eliminate the key lookup but the index seek is still there. (And you do pay the price in overall performance of either having an extra index or having more columns in an existing index).
@Gail: The query has some elements that LOOK like catch-all queries (and they may contribute to the performance issue - though I am not convinced). But they are not. Here are the fragments I am talking about (reformatted by SQLPrompt, though I have no idea if that will persist in the forum interface): AND (
BTXcm.Key@ IS NOT NULL
OR BTXcs.Key@ IS NOT NULL
OR BTXca.Key@ IS NOT NULL
OR (
BTX.Lockdown = @MTLockdown
AND BTX.Insurer IN ( 'Ageas Motor Trade', 'Ageas MT Comm Split', 'arag motor trade' )
)
) AND (
MRT0.Key@ IS NOT NULL
OR MRT0h.Key@ IS NOT NULL
)
The reason why I do NOT call this a catch-all query is that these conditions do not refer to parameters that may be NULL, or even to table value that may be NULL - in this case they all refer to key columns of outer-joined tables. So the second fragment translates as "AND (at least one of MRT0 and MRT0h had a matching row in the join)". The first fragment is similar but a bit more complicated.
Now all of these *can* cause issues. But your execution plan shows that they don't. The plan first joins all the MT* tables, then joins that result to icp_yyclient, icp_brpolicy, and icp_brledger, and then finally does all the logic on the remaining 113 rows to do the outer joins and complex filtering. The rowcount remains low throughout the process - so low that we should not care about whether this part of the query can be written in a more effective way.
[[ That being said - there is always the option that a rewrite gives the optimizer a complete new idea, so you CAN experiment with a rewrite. The idea is that instead of doing Table1 LEFT OUTER JOIN Table2 LEFT OUTER JOIN Table3 WHERE (Table1 IS NOT NULL OR Table2 IS NOT NULL), you instead use Table1 INNER JOIN Table2 UNION ALL Table1 INNER JOIN Table3 WHERE NOT EXISTS Table2 ]]
I'll look a bit deeper into the exact join criteria in the part where 7.5 million rows and let you know what I find.
September 18, 2017 at 8:19 am
I've had to optimise many queries like this one and there is no "one size fits all", often I find that rewriting including redesigning the schema is less time consuming😉
😎
September 18, 2017 at 8:24 am
GilaMonster - Monday, September 18, 2017 7:40 AMWITH RECOMPILE on the procedure? Why? That gets you very few of the benefits of the OPTION(RECOMPILE) hint, and more overhead (all statements get compiled on every execution)
Not sure, in all honestly, now I think of it. Having an odd day where my brain is not engaged and 1 + 1 = 3.5 :hehe:
sgmunson - Monday, September 18, 2017 7:31 AMLooks like a lack of a covering index, given that you're getting a Key Lookup. You might benefit from a SPARSE index that filters on the same things that the icp_yyclient table (BCM alias) has in the WHERE clause, and is indexed on how that table JOINs to the other tables in the query. Alternatively, you might be able to "pre-filter" that table using only it's needed fields to participate in the JOINs and SELECT, into a temp table, and you might even then index that temp table as well. Try something and let me know how you fare...Edit: corrected grammar and spelling. Change in meaning occurred.
I think that the Catch-all (like Gail highlighted) is causing the Key Lookup. If I simply take out that OR statement section on the BTXxx.Key@ then the query runs in about 2 seconds. It doesn't return the right results but the query optimiser makes the decisions I want to make, and i imagine if it made those decisions with the above query, it would do "ok". Having a quick read through the link Gail provided, the below quote seems especially relevant:
The index scan on the
ProductID
index now returns every row of the table, because if we supplyNULL
for@ProductID
then the predicate “ProductID = @ProductID OR @ProductID IS NULL
†evaluates to true for every row.
What I'm struggling with, however, Gail is that it isn't the parameters that are optional, it's that the JOIN to the BTXcm, BTXcs or the BTXca is optional HOWEVER, at least one of them must return a result. If the 3 parameters were optional, then yes, I'd certainly be doing this with parametrised SQL. Is it possible to do this in the same way?
Although, this does make me think, maybe I should put the relevant transactions (which can easily be found) into a Temporary table first, and then do the respective joins on that. Although, the Key Lookup is on the yyclient, so I doubt that is going to help. 🙁
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
September 18, 2017 at 8:26 am
Eirikur Eiriksson - Monday, September 18, 2017 8:19 AMI've had to optimise many queries like this one and there is no "one size fits all", often I find that rewriting including redesigning the schema is less time consuming😉
😎
3rd party software unfortunately. Redesigning the data/schema is not an option, as it all comes from a flat file style mainframe running on Tripos.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
September 18, 2017 at 8:46 am
NB: In the explanation below, I use table and column names from the execution plan, which are not exactly the same as in the query you posted.
So as I said in the previous post, the optimizer first joins all the MT* tables, in a suitable order. (I have not checked out the exact order; the row count remains low throughout and it uses nested loops + seek strategies, so related to the total work done in the query this part is irrelevant). The end result of this is 253 rows (estimated 221 - close enough)
It then joins to icp_yyclient (alias: BCM) using these criteria:WHERE BCM.Branch@ = MTPL.Branch@
AND BCM.Branch@ <> 3
AND BCM.ClientRef@ <> 'XXXX01'
The optimizer estimates that this join returns 412 rows; in reality it gets 7,501,778. Oops!!
Next step is to join to icp_brpolicy (BPY) using:WHERE BPY.ClientRef@ = BCM.ClientRef@
AND BPY.Branch@ = MTPL.Branch@
AND BPY.PolicyRef@ = MTPL.PolicyRef@
After the 7.5 million lookups needed for this, the row count is back to just 29 (estimate: 1).
Questions to ask based on this:
1. Why is the estimate for the result after joining BCM so far off? (Probably either outdated or unrepresentative statistics, or a very skewed distribution - e.g. if there's a million branches and ten million rows, the estimater's mathematical model assumes 10 rows per branch; reality might be 2 rows per branch for 999,999 of them, and the remainng 8 million for the last branch and then a join result becomes impossible to predict)
2. Would performance improve if you manage to convince the optimzier that it needs to reverse the order of BCM and BCP in the join process? (I don't know your tables; I just *hope* that joining the intermediate results from the MT*-tables to BPY will not cause a similar severe row-explosion, and then joining those results to BCM would perhaps still keep the row count low. However, be aware that the join logic then changes as well - the MT* tables would be joined to BPY on only Branch@ and PolicyRef@ because ClientRef@ is not available yet. The join to BCM would then be on Branch@ and Client@. Filtering out the XX0001 value can of course be done in the first join, so BPY.
It can be hard to force this alternate order, given how much re-structuring the optimzier has already done by itself it will probably do the same for your attempts. But if you want to give it a try, you can follow this logic (pseudo-code)FROM MTPL
INNER JOIN MTRA ON MTRA.Branch@ = MTPL.Branch@ AND MTRA.PolicyRef@ = MTPL.PolicyRef@ -- Repeat for all other MT* tables
INNER JOIN BPY ON BPY.Branch@ = MTPL.Branch@ AND BPY.PolicyRef@ = MTPL.PolicyRef@
INNER JOIN BCM ON BCM.Branch@ = BPY.Branch@ AND BCM.ClientRef = BPY.ClientRef
LEFT OUTER JOIN (here comes the rest of the joins, unchanged)
WHERE MTPL.Branch@ <> 3
AND BPY.ClientRef <> 'XXXX01'
AND (here comes the rest of the filter, unchanged);
I cannot test this of course and the query is complex enough that errors creep in easily so do verify carefully that I haven't changed the meaning.
Note that in the rewrite, I start with MTRA and then join every other MT* table to MTRA (instead of to BPY like you did); I then join BPY to MTRA as well, and finally join BCM to BPY. I *hope* that this might convince the optimizer to not reorganize the join order as drastically as it does now.
If you decide to try this, then please do let me know what you find!
September 18, 2017 at 8:50 am
Hmmm, I didn't say this explicitly but based on the first of the two questions I asked, you should definitely try if rebuilding statistics (or rebuilding with fullscan) on all tables helps. (And yes, rebuilding on a large table such as BCM and -probably- BPY are is not fun, I know).
Depending on your SQL Server version, you can also try what happens if you run with the other cardinality estimator.
September 18, 2017 at 8:51 am
Hugo Kornelis - Monday, September 18, 2017 8:11 AMI understand where the responses posted by Steve and Gail are coming from, but I do not agree with them.@steve-2: Yes there is a key lookup, and having a covering index would eliminate that and probably cause the execution time to be cut in half or so. But that would not tackle the real underlying issue, In the plan you see in the right-hand side a relatively low number of rows being joined. Then in the middle the rows from icp_yyclient are joined and suddenly the result set explodes from 253 rows to over 7.5 million. And then the next step is to join to icp_brpolicy as well, and the rowcount immediately drops back to 29.
The cost of the execution is in 7.5 million key lookups and 7.5 million index seeks in icp_brpolicy. A covering index would eliminate the key lookup but the index seek is still there. (And you do pay the price in overall performance of either having an extra index or having more columns in an existing index).@Gail: The query has some elements that LOOK like catch-all queries (and they may contribute to the performance issue - though I am not convinced). But they are not. Here are the fragments I am talking about (reformatted by SQLPrompt, though I have no idea if that will persist in the forum interface):
AND (
BTXcm.Key@ IS NOT NULL
OR BTXcs.Key@ IS NOT NULL
OR BTXca.Key@ IS NOT NULL
OR (
BTX.Lockdown = @MTLockdown
AND BTX.Insurer IN ( 'Ageas Motor Trade', 'Ageas MT Comm Split', 'arag motor trade' )
)
)AND (
MRT0.Key@ IS NOT NULL
OR MRT0h.Key@ IS NOT NULL
)The reason why I do NOT call this a catch-all query is that these conditions do not refer to parameters that may be NULL, or even to table value that may be NULL - in this case they all refer to key columns of outer-joined tables. So the second fragment translates as "AND (at least one of MRT0 and MRT0h had a matching row in the join)". The first fragment is similar but a bit more complicated.
Now all of these *can* cause issues. But your execution plan shows that they don't. The plan first joins all the MT* tables, then joins that result to icp_yyclient, icp_brpolicy, and icp_brledger, and then finally does all the logic on the remaining 113 rows to do the outer joins and complex filtering. The rowcount remains low throughout the process - so low that we should not care about whether this part of the query can be written in a more effective way.
[[ That being said - there is always the option that a rewrite gives the optimizer a complete new idea, so you CAN experiment with a rewrite. The idea is that instead of doing Table1 LEFT OUTER JOIN Table2 LEFT OUTER JOIN Table3 WHERE (Table1 IS NOT NULL OR Table2 IS NOT NULL), you instead use Table1 INNER JOIN Table2 UNION ALL Table1 INNER JOIN Table3 WHERE NOT EXISTS Table2 ]]
I'll look a bit deeper into the exact join criteria in the part where 7.5 million rows and let you know what I find.
Thanks Hugo. I hadn't seen your post before replying to Gail, and you kind of the nail on the head with what I was thinking.
I had a very quick stab in the dark at something though, and was, well, very surprised at the result. The section BCM.B@ = 3 AND BCM.Ref@ != 'XXXX01' is very much a common theme for all our queries, as that's the test area of our live environment (that will be going away soon when our DEV server finally get's switched on after 10 years...). Normally, most of our stuff is "multi branch", however, in the case it's not (it's Branch 0 only). I just changed the start of the WHERE clause to BCM.B@ = 0 AND BCM.Ref@ != 'XXXX01' and the query ran like a dream (<0 Seconds). This time, the estimator made use of the Clustered indexes it should(?) have been doing in the first place (including that of the yyclient, which is on @Branch and @ClientRef). Surprised that it was such a simple thing though... I'm guessing that the estimator saw that != 3 was non deterministic, while = 0 was?
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
September 18, 2017 at 9:04 am
I shouldn't post when I'm exhausted. Saw the @ and brain interpreted as a parameter.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 18, 2017 at 11:15 am
GilaMonster - Monday, September 18, 2017 9:04 AMI shouldn't post when I'm exhausted. Saw the @ and brain interpreted as a parameter.
I know the feeling:blush:
😎
September 18, 2017 at 2:08 pm
Thom A - Monday, September 18, 2017 8:51 AMI'm guessing that the estimator saw that != 3 was non deterministic, while = 0 was?
They're both deterministic. The problem is that the optimiser knows absolutely that for =0 it can seek to the beginning of the range of rows with 0 and read to the end of the rows with 0, and that's it.
With !=3 however, it has to do two range scans, first for all values less than 3 and another for all rows greater than 3 (because, unless there's a constraint that forces only the values 0 and 3 to exist) there's absolutely no guarantee that there won't be a row with 2 on some execution, a row with -2232434 on another and a row with 258778788 on another. It's an inequality predicate.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 19, 2017 at 2:43 am
GilaMonster - Monday, September 18, 2017 2:08 PMThom A - Monday, September 18, 2017 8:51 AMI'm guessing that the estimator saw that != 3 was non deterministic, while = 0 was?They're both deterministic. The problem is that the optimiser knows absolutely that for =0 it can seek to the beginning of the range of rows with 0 and read to the end of the rows with 0, and that's it.
With !=3 however, it has to do two range scans, first for all values less than 3 and another for all rows greater than 3 (because, unless there's a constraint that forces only the values 0 and 3 to exist) there's absolutely no guarantee that there won't be a row with 2 on some execution, a row with -2232434 on another and a row with 258778788 on another. It's an inequality predicate.
Thanks Gail, although, if that was the case, I still would have expected a Clustered Index Scan, rather than a key lookup, no? The Key Lookup means it's checking every single row, even though SQL Server should know where there 3 is from the Clustered index.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
September 19, 2017 at 5:03 am
Thom A - Tuesday, September 19, 2017 2:43 AMGilaMonster - Monday, September 18, 2017 2:08 PMThom A - Monday, September 18, 2017 8:51 AMI'm guessing that the estimator saw that != 3 was non deterministic, while = 0 was?They're both deterministic. The problem is that the optimiser knows absolutely that for =0 it can seek to the beginning of the range of rows with 0 and read to the end of the rows with 0, and that's it.
With !=3 however, it has to do two range scans, first for all values less than 3 and another for all rows greater than 3 (because, unless there's a constraint that forces only the values 0 and 3 to exist) there's absolutely no guarantee that there won't be a row with 2 on some execution, a row with -2232434 on another and a row with 258778788 on another. It's an inequality predicate.Thanks Gail, although, if that was the case, I still would have expected a Clustered Index Scan, rather than a key lookup, no? The Key Lookup means it's checking every single row, even though SQL Server should know where there 3 is from the Clustered index.
The seek + lookup in your plan is not because of the <> '3' filter. That has been pushed down into the seek as an additional predicate, but it's not what drives the index.
The optimzer expects that the relatively low number of rows from the joined MT*-tables (estimated 220,892, actual 253) will each join to a relatively low number of rows from icp_yyclient. So it uses a Nested Loops join, For each of the 253 rows, it does a seek on yyclient.Branch@ = MTPL.Branch@, and it expects to find 1.867 rows on average per execution, for a total of (220.892 * 1.867 = ) 412.496 rows. The actual number of rows is MUCH higher (7.5 million).
As Gail indicates, telling the optimizer that every value except 3 is relevant is much different from telling it that only the value 0 is relevant. I think that after you changed the predicate from "<> 3" to "= 0", you not only got a clustered index scan (since the statistics tell the optimzer that a huge amount of rows will match the =0 predicate and hence a lookup is not smart); but the nested loops join also changed to a hash match join operator, because the optimzer now know it will join to the same set of yyclient-rows for every row from the MT*-join.
What I do not understand is why the optimizer, in the original plan, has not concluded that Branch@ is equal in all tables and that filtering on yyclient.Branch@ <> 3 therefor means that it can already throw out the Branch@ = 3 rows in the MT* join tree. However, that would not have made any difference to your problem. The number of rows from the MT*-join would have been lower, but the only rows removed are the ones that return no rows in the inner input of the nested loops join; you would still get the same 7.5 million matching rows on the non-3 values of Branch@.
I know that you've got this solved now, but I am still intrigued by the huge mismatch on the join to yyclient: an estimate of 1.867 matching rows on average, and in reality the average number of matching rows is 29651,3 for each row from MT*. I would be interested to know, if you do only the join between all those MT*-tables, which values of Branch@ occur, and how often each one occurs. And also, how many other values of Branch@ exist in one or more of the MT*-tables but are not matched by all the join criteria and hence not in the combined results. (I expect that, based on statistics on MT*-tables and the join criteria, the optimizer thinks that the expected 220.892 rows from the MT*-join will span a wide range of Branch@ values, with most of them not occurring or occurring only 1 or 2 times in yyclient; and that in reality most of the rows from the MT*-join have Branch@ 0, the single value in yyclient that occurs in a huge majority of the rows).
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply