Any ideas on bad query plan?

  • I would appreciate review and opinions of below issue. I believe issue is data related, but confused on how a good plan is generated every time using RECOMPILE hint and not after being flushed from cache.

    VERSION: SQL 2012 SP3
    OS: Windows Server 2012 R2
    ISSUE: Bad query plan is produced by default on production server, whereas our Staging server produces an adequate plan. Both plans are almost identical with three braches. The good plan performs a Merge Inner Join with a Hash Match (Right Outer) Join (see attached img). The bad plan performs a Lazy Spool into a Nested Loops Inner Join with a Merge (Right Outer) Join. With the good plan, the query executes in under 20 seconds, but takes 1.5hrs with bad plan. Largest table, AgreementMaster has 28.55m records in Production, 28.67m records in Staging. The other two tables data differences is minor and they're not very large < 500k records. Query below is being executed as is against both servers in all testing. Comparison reveals zero differences in schema, indexing, instance version/patch, instance options, or database options. ANSI options are same since I'm executing query directly via SMSS. During testing, only 288 rows had changed in largest table since last stats update.

    ANALYSIS
    Bad plan is reproduced after flushing it from cache
    Using hints RECOMPILE and OPTIMIZE FOR produces good plan
    UPDATE STATS WITH FULLSCAN produces good plan
    Staging db continues to produce good plan on another server. It's not environmental

    NEXT STEP
    : Restore production db to another server. Drop statistics on all tables referenced by query. Allow SQL Server to recreate automatically. Test query.

    QUERY
    DECLARE @SourceSystemID INT = 1044;
    WITH prd AS (    SELECT a.PROD_GRP_CD ,
                                a.PlanCode ,
                                a.ProductKey ,
                                a.PROD_CD
                        FROM    ( SELECT    PlanCode ,
                                            PROD_GRP_CD ,
                                            ProductKey ,
                                            PROD_CD ,
                                            ROW_NUMBER() OVER ( PARTITION BY PlanCode ORDER BY PROD_CD ASC ) AS PC
                                  FROM      EDW.Product(NOLOCK)
                                  WHERE     SourceSystemKey = @SourceSystemID
                                            AND IsActive = 1
                                ) a
                        WHERE   a.PC = 1)
      SELECT
       AM.AgreementMasterKey AS AgreementMasterKey
       ,PC.ProductKey AS  ProductKey
       ,@SourceSystemID AS SourceSystemKey
       ,AM.SourceSystemIdentifier AS SourceSystemIdentifier
       ,CAST(ST.policy_number AS NVARCHAR(50)) AS AgreementNumber
       ,PC.PROD_CD AS ProductCode
       ,ST.BASE_COVERAGE_AMT AS Face_Value
       ,0 AS IndicatorCode
       ,CAST('Base' AS NVARCHAR(50)) AS IndicatorCodeDescription
       ,GetDate() AS LoadDate
       ,GetDate() AS ModifyDate
       ,1 AS IsActive
      FROM stage.GENEXT_TOPS_POLICY_Persist ST WITH(NOLOCK)
      INNER JOIN EDW.AgreementMaster AM WITH(NOLOCK) ON LTRIM(RTRIM(AM.AgreementNUmber)) = LTRIM(RTRIM(ST.policy_number))
       AND AM.Sourcesystemkey = @SourceSystemID
       AND AM.Isactive = 1
       AND AM.ProductTypeCode = 1
      LEFT JOIN prd PC WITH(NOLOCK) ON LTRIM(RTRIM(ST.PLAN_CODE)) = LTRIM(RTRIM(PC.PlanCode))

  • Instead of posting pictures of the execution plans, save then as files and attach them.  Also, that last outer join has no chance of being SARGable as both columns must first be converted by the formulas.

    See the second link under "Helpful Links" in my signature line below for what we need to help troubleshoot performance problems.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In addition to Jeff's comments about functions around columns in WHERE clause (DEVASTATINGLY BAD, BTW, and possibly not even needed here depending on the actual data - I have seen that type of thing more than a few times where it wasn't actually needed but was KILLING the system):

    1) This is not a sproc, it is ad hoc code. So I encourage you to look up how the optimizer uses variables in ad hoc code. Look at your estimated rows with the variable, with the code put in a sproc and the variable made a parameter instead, both  of those with OPTION (RECOMPILE) and lastly with dynamic SQL. You will learn a LOT about how things work and why choices are made with this exercise.

    2) Speaking of dynamic SQL, OPTION (RECOMPILE) can be "mostly" thought of as just that. Do the dynamic SQL and OPTION (RECOMPILE) estiamtes and query match up?

    3) I have only ONCE in TWENTY YEARS of consulting come across a client that had a stage/test/qa server that was EXACTLY THE SAME as production (and that was a hedge fund with a pair of STUNNINGLY-powerful machines). So you are almost certainly comparing bananas and pomegranates (not even apples and oranges). Probably worse still because your databases aren't in sync either.

    4) You REALLY need to get rid of those functions around columns in the WHERE clause. That is the single worst performance thing I have seen in 20+ years of consulting in aggregate. And believe me when I tell you that is really saying something!!! 😀

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Absolutely get rid of the LTRIM and RTRIM.  Also, review all indexes on the tables, particularly making sure that you have the best clustered index on every (significant) table.

    If these columns:
    AM.AgreementNUmber
    ST.policy_number
    ST.PLAN_CODE
    PC.PlanCode
    really do need TRIMmed, do it once when the row is INSERTed; an AFTER INSERT trigger is an excellent method of doing that.  If necessary, you can also do if those columns gets UPDATEd (but presumably those particular would never get UPDATEd anyway?!).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply