inefficient query

  • Hi,

    I have the following query:

    SELECT

    lc.DownLoad

    , lc.AccountNum

    , lc.PolicyNum

    --, splitLC.SplitPolicyNumLmt

    , splitLC.SplitPolicyNumLmt

    , splitLC.SplitPolicyNumCA

    , splitLC.SplitPolicyNumReins

    , lc.LCDetailGK

    , splitLC.PolicyInsuredLocGK

    , lc.CbiId

    , lc.LCAmtUSD

    , lc.LCAplbtyCode

    , lc.LCTypeCode

    , lc.PerLocInd

    , lc.PolicyLevelInd

    , lc.LCQualifierTypeCode

    , lc.LCFrequencyCode

    , 0 AS ProcessInd

    FROM

    AIRGBS.vLCExtract AS lc

    INNER JOIN AIRGBS.SplitLCLimits AS splitLC

    ON splitLC.AccountNum = lc.AccountNum

    AND splitLC.PolicyNum = lc.PolicyNum

    AND splitLC.DownLoad = lc.DownLoad

    AND splitLC.CbiID = lc.CbiId

    AND splitLC.PolicyInsuredLocGK = lc.PolicyInsuredLocGK

    WHERE

    lc.PerLocInd = 'N'

    AND lc.LCTypeCode = 'LIM'

    AND

    (

    lc.LCDetailGK <> splitLC.PolicyLCDetailGK

    AND lc.LCDetailGK <> splitLC.PolicyTEDetailGK

    AND lc.LCDetailGK <> splitLC.PolicyPDDetailGK

    AND lc.LCDetailGK <> splitLC.LocLCDetailGK

    AND lc.LCDetailGK <> splitLC.LocPDDetailGK

    AND lc.LCDetailGK <> splitLC.LocTEDetailGK

    )

    AND

    1 =

    (

    SELECT

    COUNT(DISTINCT lc2.PolicyInsuredLocGK)

    FROM

    AIRGBS.vLCExtract AS lc2

    WHERE

    lc.AccountNum = lc2.AccountNum

    AND lc.PolicyNum = lc2.PolicyNum

    AND lc.DownLoad = lc2.DownLoad

    AND lc.LCDetailGK = lc2.LCDetailGK

    )

    Basically, this query chokes with the last piece where I have the 1 = ....

    I need to only bring back rows from AIRGBS.vLCExtract where this is only 1 distinct PolicyInsuredLocGK for each combination of AccountNum, PolicyNum, Download, and LCDetailGK.

    That view has about 2 million rows. The rest of the query runs in about 6 minutes, but takes 4 hours when I add the last piece of code with the 1 = ...

  • Let's start with a little formatting so this is more legible.

    SELECT lc.DownLoad

    ,lc.AccountNum

    ,lc.PolicyNum

    --, splitLC.SplitPolicyNumLmt

    ,splitLC.SplitPolicyNumLmt

    ,splitLC.SplitPolicyNumCA

    ,splitLC.SplitPolicyNumReins

    ,lc.LCDetailGK

    ,splitLC.PolicyInsuredLocGK

    ,lc.CbiId

    ,lc.LCAmtUSD

    ,lc.LCAplbtyCode

    ,lc.LCTypeCode

    ,lc.PerLocInd

    ,lc.PolicyLevelInd

    ,lc.LCQualifierTypeCode

    ,lc.LCFrequencyCode

    ,0 AS ProcessInd

    FROM AIRGBS.vLCExtract AS lc

    INNER JOIN AIRGBS.SplitLCLimits AS splitLC ON splitLC.AccountNum = lc.AccountNum

    AND splitLC.PolicyNum = lc.PolicyNum

    AND splitLC.DownLoad = lc.DownLoad

    AND splitLC.CbiID = lc.CbiId

    AND splitLC.PolicyInsuredLocGK = lc.PolicyInsuredLocGK

    WHERE lc.PerLocInd = 'N'

    AND lc.LCTypeCode = 'LIM'

    AND (

    lc.LCDetailGK <> splitLC.PolicyLCDetailGK

    AND lc.LCDetailGK <> splitLC.PolicyTEDetailGK

    AND lc.LCDetailGK <> splitLC.PolicyPDDetailGK

    AND lc.LCDetailGK <> splitLC.LocLCDetailGK

    AND lc.LCDetailGK <> splitLC.LocPDDetailGK

    AND lc.LCDetailGK <> splitLC.LocTEDetailGK

    )

    AND 1 = (

    SELECT COUNT(DISTINCT lc2.PolicyInsuredLocGK)

    FROM AIRGBS.vLCExtract AS lc2

    WHERE lc.AccountNum = lc2.AccountNum

    AND lc.PolicyNum = lc2.PolicyNum

    AND lc.DownLoad = lc2.DownLoad

    AND lc.LCDetailGK = lc2.LCDetailGK

    )

    All those inequality predicates are not helping here. I would argue that 6 minutes without the last part is way too slow...but 4 hours is absurd. We can help but we need a LOT more information. We would want to see an execution plan. We also would need to see the table structures including indexes.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Execution Plan attached.

  • scottcabral (8/3/2016)


    Execution Plan attached.

    The actual plan wold be better but at 4 hours you probably can't generate one. 😉

    It looks like tmpLCRollupList is a heap. That means there is no clustered index on that table which is one part of the problem. Also your tble LCExtract seems to likely have no indexes at all which is why we are seeing an index spool. This is likely the worst performance sink hole here but I suspect that given the lack of indexing in general there are LOTS of improvements that could be made with a decent indexing strategy.

    Please post the table definitions (and indexes) and we can help you get a handle on this. Also, you should probably spend some time reading this stairway. http://www.sqlservercentral.com/stairway/72399/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean's nailed the real problem - the indexing is a proper mess.

    You might get a little better mileage with this alternative query:

    ;WITH FirstQuery AS (

    SELECT

    lc.DownLoad

    , lc.AccountNum

    , lc.PolicyNum

    --, splitLC.SplitPolicyNumLmt

    , splitLC.SplitPolicyNumLmt

    , splitLC.SplitPolicyNumCA

    , splitLC.SplitPolicyNumReins

    , lc.LCDetailGK

    , splitLC.PolicyInsuredLocGK

    , lc.CbiId

    , lc.LCAmtUSD

    , lc.LCAplbtyCode

    , lc.LCTypeCode

    , lc.PerLocInd

    , lc.PolicyLevelInd

    , lc.LCQualifierTypeCode

    , lc.LCFrequencyCode

    , 0 AS ProcessInd

    , ExtraFilter = COUNT(DISTINCT lc2.PolicyInsuredLocGK) OVER (PARTITION BY lc.AccountNum, lc.PolicyNum, lc.DownLoad, lc.LCDetailGK)

    FROM AIRGBS.vLCExtract AS lc

    INNER JOIN AIRGBS.SplitLCLimits AS splitLC

    ON splitLC.AccountNum = lc.AccountNum

    AND splitLC.PolicyNum = lc.PolicyNum

    AND splitLC.DownLoad = lc.DownLoad

    AND splitLC.CbiID = lc.CbiId

    AND splitLC.PolicyInsuredLocGK = lc.PolicyInsuredLocGK

    WHERE lc.PerLocInd = 'N'

    AND lc.LCTypeCode = 'LIM'

    AND lc.LCDetailGK NOT IN (splitLC.PolicyLCDetailGK, splitLC.PolicyTEDetailGK, splitLC.PolicyPDDetailGK,

    splitLC.LocLCDetailGK, splitLC.LocPDDetailGK, splitLC.LocTEDetailGK)

    )

    SELECT

    lc.DownLoad

    , AccountNum

    , PolicyNum

    , SplitPolicyNumLmt

    , SplitPolicyNumCA

    , SplitPolicyNumReins

    , LCDetailGK

    , PolicyInsuredLocGK

    , CbiId

    , LCAmtUSD

    , LCAplbtyCode

    , LCTypeCode

    , PerLocInd

    , PolicyLevelInd

    , LCQualifierTypeCode

    , LCFrequencyCode

    , ProcessInd

    INTO #temp

    FROM FirstQuery

    WHERE ExtraFilter = 1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • i get this:

    Use of DISTINCT is not allowed with the OVER clause

  • 1) I am wondering if a CROSS APPLY for that COUNT(DISTINCT..) query would be better. Not sure if it would or not.

    2) Optimal indexing can easily result in 5-6 ORDERS OF MAGNITUDE performance difference. Magic-bullet stuff for sure, and in my 20 years of consulting on SQL Server I have rarely come across a client that was anywhere close to optimal. Most are HORRIBLE!!

    3) I am going to question the need for the COUNT(DISTINCT ..) = 1 thing. I can't count the number of times I have seen COUNT(DISTINCT..) constructs when it wasn't necessary. If you dig into the requirements and find that they are really "if there is one or more rows, undistincted" then an EXISTS clause could be MUCH faster, especially when properly indexed, because it can short-cut the operation when the first row is found.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • scottcabral (8/3/2016)


    i get this:

    Use of DISTINCT is not allowed with the OVER clause

    You can still use windowed functions, but it's a bit more complicated. I would start with something like the following:

    ExtraFilter = CASE WHEN MIN(lc2.PolicyInsuredLocGK) OVER( PARTITION BY lc.AccountNum, lc.PolicyNum, lc.DownLoad, lc.LCDetailGK ) = MAX(lc2.PolicyInsuredLocGK) OVER( PARTITION BY lc.AccountNum, lc.PolicyNum, lc.DownLoad, lc.LCDetailGK ) THEN 1 ELSE 0 END

    This assumes that PolicyInsuredLocGK cannot be NULL. There are other similar approaches that you could try, and I'm not sure which is the most efficient.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • And here is an option changing the last part to a NOT EXISTS. This also assumes that PolicyInsuredLocGK cannot be NULL.

    AND NOT EXISTS (

    SELECT 1

    FROM AIRGBS.vLCExtract AS lc2

    WHERE lc.AccountNum = lc2.AccountNum

    AND lc.PolicyNum = lc2.PolicyNum

    AND lc.DownLoad = lc2.DownLoad

    AND lc.LCDetailGK = lc2.LCDetailGK

    AND lc.PolicyInsuredLocGK <> lc2.PolicyInsuredLocGK

    )

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (8/3/2016)


    And here is an option changing the last part to a NOT EXISTS. This also assumes that PolicyInsuredLocGK cannot be NULL.

    AND NOT EXISTS (

    SELECT 1

    FROM AIRGBS.vLCExtract AS lc2

    WHERE lc.AccountNum = lc2.AccountNum

    AND lc.PolicyNum = lc2.PolicyNum

    AND lc.DownLoad = lc2.DownLoad

    AND lc.LCDetailGK = lc2.LCDetailGK

    AND lc.PolicyInsuredLocGK <> lc2.PolicyInsuredLocGK

    )

    Drew

    If this:

    SELECT AccountNum, PolicyNum, DownLoad, LCDetailGK, PolicyInsuredLocGK

    INTO #vLCExtract

    FROM AIRGBS.vLCExtract

    GROUP BY AccountNum, PolicyNum, DownLoad, LCDetailGK, PolicyInsuredLocGK

    provides a significant row reduction, then it will help too.

    AND NOT EXISTS (

    SELECT 1

    FROM #vLCExtract AS lc2

    WHERE lc.AccountNum = lc2.AccountNum

    AND lc.PolicyNum = lc2.PolicyNum

    AND lc.DownLoad = lc2.DownLoad

    AND lc.LCDetailGK = lc2.LCDetailGK

    AND lc.PolicyInsuredLocGK <> lc2.PolicyInsuredLocGK

    )

    One of these came up today, taking around 30 minutes to execute in a production environment. An implicit conversion made it worse. Easy to convert to the correct datatype in the #temp table. Post-tuning execution time 3s.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply