Solution looking for a problem!

  • I have a query that's been running fine with the following WHERE clause:

    WHERE P.PH1 IS NOT NULL

    It froze last week, so I took the query apart and found that if I excluded the above WHERE clause, it ran in 15 seconds.

    I then modified the WHERE clause as follows:

    WHERE ISNUMERIC(SUBSTRING(P.PH1,1,1)) = 1)

    Result: Query still froze.

    I then modified the WHERE clause as follows:

    WHERE SUBSTRING(P.PH1,1,1) IN ('1','2','3','4','5','6','7','8','9')

    Result: Query ran in 3 seconds

    What gives?????

  • Query optimizer is stumbling somewhere. If you have an index you may have out of date statistics causing this or the index is fragmented or any of serveral other issues.

    As a basic practice I do the foloowing once a week on most of my DBs

    DBCC DBREINDEX on 7/2000 or DBCC INDEXDEFRAG on 2000

    sp_updatestats

    DBCC UPDATEUSAGE

    ALso, even when things are in clean and smooth state the other queries may execute better based on the condition.

    Consider in the ISNUMERIC item you get the value of a particular position in the string, then you check if it is a number and you make sure that the return from ISNUMERIC is 1. In the second thou you get the value of the string position and compare to the a list of string values 1-9 which is much less complex and has a much lower cost per row porcessed. Not to mention the comppiler may build a temp table for the list of values 1-9 and actually perform a join so now you have an added performance enhancement you cannot get the other way.

    In the first case thou where you have IS NOT NULL then you may need to add an index or stats are out of wack.

  • I think indexes won't help here. IS NOT NULL (or <>) condition has to go through every row anyway. This is different from IS NULL (or =). Also, once you are using a function (ISNUMERIC, SUBSTRING< etc.) on a column, forget about index.

    On the other hand, try to replace SUBSTRING( x, 1, 1) with LEFT(x, 1). LEFT is much faster than SUBSTRING.

    I think Antares686 is right about temp table that might be created by the optimizer in case of IN('1', ...).

    Here is what I whould try in this situation to be able to use index:

    WHERE P.PH1 > 0

    There is no need to switch to the char types.

    Let us know what will happen at the end.

  • Thanks for the help from each of you. mromm seems to be on the right track about indexes - DBCC DEFRAG, sp_udatestats and DBCC updateusage had no effect on the query (but I think that stale indexes will be a problem in the future, so I appreciate your direction on this, antares686!).

    Your clues on eliminating NOT constructs were helpful as well as your alerts about the cost of functions in the WHERE clause.

    Hoever, subsequent to my posting and your kind replies, the query hung again, despite the syntax

    WHERE SUBSTRING(P.PH1,1,1) IN ('1','2','3','4','5','6','7','8','9')

    I finally postulated that I was running into an index size and complexity problem. Here's the complete code for the query:

    INSERT MOAT_Component

    (MxC_Price,

    MxC_Type,

    MxC_Qty,

    MxC_ProcessDate,

    MxC_HasReturnEnvelope,

    MxC_PrinterPlexType,

    MxC_Trailer,

    TSP_ID,

    MxC_LoadDate,

    CL_ID)

    SELECT DISTINCT

    CASE

    WHEN (M.M_PostageRateType = '1' AND

    M.M_Trailer = 'N') THEN MAX(TSP.TSP_DomesticPostageRate)

    WHEN (M.M_PostageRateType = '2' AND

    M.M_Trailer = 'N') THEN MAX(TSP.TSP_MexCanPostageRate)

    WHEN (M.M_PostageRateType = '3' AND

    M.M_Trailer = 'N') THEN MAX(TSP.TSP_OffContPostageRate)

    WHEN (M.M_PostageRateType = '1' AND

    M.M_Trailer = 'Y') THEN MAX(TSP.TSP_DomesticPostageRate) +

    MAX(TSP.TSP_TrailerSurcharge)

    WHEN (M.M_PostageRateType = '2' AND

    M.M_Trailer = 'Y') THEN MAX(TSP.TSP_MexCanPostageRate) +

    MAX(TSP.TSP_TrailerSurcharge)

    WHEN (M.M_PostageRateType = '3' AND

    M.M_Trailer = 'Y') THEN MAX(TSP.TSP_OffContPostageRate) +

    MAX(TSP.TSP_TrailerSurcharge)

    END,

    CASE

    WHEN M.M_PostageRateType = '1' THEN 'P' --

    Domestic

    WHEN M.M_PostageRateType = '2' THEN 'M' --

    Continental (Mexico/Canada)

    WHEN M.M_PostageRateType = '3' THEN 'O' --

    Off-Continent

    END,

    COUNT(P.CID),

    @EarliestDate,

    M.M_HasReturnEnvelope,

    M.M_PrinterPlexType,

    M.M_Trailer,

    TSP.TSP_ID,

    GETDATE(),

    M.CL_ID

    FROM PACLOADER P INNER JOIN MOAT M ON ((P.CID = M.M_CID) AND

    (M.M_RecordID = p.RID) AND (M.M_ProcessDate = @EarliestDate) )

    INNER JOIN WorkTransSpec TSP ON

    (TSP.CL_ID = M.CL_ID AND TSP.TSP_Name = P.Ltr AND TSP.PG_ID = M.PG_ID)

    WHERE (M.M_PricingCode = 'Y') AND

    (M.M_RecordType IN ('TrHd','LL','TJ','JL')) AND

    (M.M_ServiceLevel IN ('0','1','2','3','4','5')) AND

    (P.Prt > '0')

    GROUP BY M.CL_ID, M.M_PostageRateType, M.M_Trailer,

    M.M_HasReturnEnvelope, M.M_PrinterPlexType, TSP.TSP_ID

    The MOAT table is about 15 million rows and growing by about 300,000 records per day. I then use the above query to post records to MOAT_Component. M_ID is an identity column. I decided to change M_ID to a non-identity column and implement an identically structured WorkMOAT table with M_ID as the identity field. I do MAX(MID) on MOAT before the load, add 1 to it and reset the seed on WorkMOAT. I then load to WorkMOAT and use it (only 300,000 records) instead of MOAT to do the complex joins above. I finally INSERT WorkMOAT into MOAT.

    Result: The entire load (there are many more pieces)runs quickly and predictably. I assume that my hypothesis about index size is correct, but if you have any other explanations, I'm all ears.

  • 1) Migrate all the where clauses into the ON part of the statement if possible. Move the m.(MOAT) to be the first table in the FROM and make the M.M_ProcessDate = @EarliestDate the first in the ON part. Presumably there is a covering date index.

    2) Save a cycle or two by shorten the case statements to

    CASE M.M_PostageRateType

    WHEN '1' THEN MAX(TSP.TSP_DomesticPostageRate) +

    CASE When M.M_Trailer = 'Y' Then MAX(TSP.TSP_TrailerSurcharge) Else 0 End

    WHEN '2' THEN MAX(TSP.TSP_MexCanPostageRate) +

    Case When M.M_Trailer = 'Y' Then MAX(TSP.TSP_TrailerSurcharge) Else 0 End

    WHEN '3' THEN MAX(TSP.TSP_OffContPostageRate) +

    Case When M.M_Trailer = 'Y' Then MAX(TSP.TSP_TrailerSurcharge) Else 0 End

    END,

    CASE M.M_PostageRateType

    WHEN '1' THEN 'P' -- Domestic

    WHEN '2' THEN 'M' -- Continental (Mexico/Canada)

    WHEN '3' THEN 'O' -- Off-Continent

    END

    Interested to know what the final solution was!

  • 5409045121009, thanks for the ideas on query optimization. Yes, the date field is indexed. I can see the reasons for all your suggestions except placing MOAT as the first table in the FROM clause. I was under the impression that the optimal arrangement is ordering the tables in the FROM clause by ascending table size. Why are you recommending placing MOAT first?

  • AS Antares686 was saying "Query optimiser is stumbling somewhere." I thought it may just tickle it to generate a "better" / different plan. Agree in principal with your statement "ascending table sizes".

  • Actually order of tables does not always work like you would assume. I find the joins faster when you start from the least resultset size to the larger since you deal with the minimal data output that way. There still are no guarantees as the Query Engine makes so odd choices sometimes, just test different ways if any exist.

    ANd since you are doing INNER JOINS thru this you may try

    FROM PACLOADER P INNER JOIN MOAT M ON ((P.CID = M.M_CID) AND

    (M.M_RecordID = p.RID) AND (M.M_ProcessDate = @EarliestDate) )

    INNER JOIN WorkTransSpec TSP ON

    (TSP.CL_ID = M.CL_ID AND TSP.TSP_Name = P.Ltr AND TSP.PG_ID = M.PG_ID)

    WHERE (M.M_PricingCode = 'Y') AND

    (M.M_RecordType IN ('TrHd','LL','TJ','JL')) AND

    (M.M_ServiceLevel IN ('0','1','2','3','4','5')) AND

    (P.Prt > '0')

    as

    FROM

    PACLOADER P

    INNER JOIN

    MOAT M

    ON

    (P.CID = M.M_CID AND

    M.M_RecordID = p.RID AND

    M.M_ProcessDate = @EarliestDate AND

    (M.M_PricingCode = 'Y') AND

    (M.M_RecordType IN ('TrHd','LL','TJ','JL')) AND

    (M.M_ServiceLevel IN ('0','1','2','3','4','5')) AND

    (P.Prt > '0') )

    INNER JOIN

    WorkTransSpec TSP

    ON

    (TSP.CL_ID = M.CL_ID AND

    TSP.TSP_Name = P.Ltr AND

    TSP.PG_ID = M.PG_ID)

    Or

    FROM PACLOADER P INNER JOIN

    (SELECT columnsyouneedinyourfinaloutputandtocompletejoins FROM MOAT

    WHERE M_ProcessDate = @EarliestDate AND (M_PricingCode = 'Y') AND

    (M_RecordType IN ('TrHd','LL','TJ','JL')) AND

    (M_ServiceLevel IN ('0','1','2','3','4','5')) ) AS M

    ON ((P.CID = M.M_CID) AND

    (M.M_RecordID = p.RID))

    INNER JOIN WorkTransSpec TSP ON

    (TSP.CL_ID = M.CL_ID AND TSP.TSP_Name = P.Ltr AND TSP.PG_ID = M.PG_ID)

    WHERE (P.Prt > '0')

    And there are some other options, just try them and see what it does to the query output and time.

  • Thanks to all for your help. The most effective tweak by far was to avoid doing joins on the archive table, using instead a temp version of the archive table containing only the data currently being added. Seems that index size (as a function of table size) is a crucial determinant.

  • I Agree with Carl. Most of my speed improvemnets was moving the small resultset to a amemory table and then join that large resultset with a small memory table,

    or sometimes even better

    Count group etc... the bigger resultset into a mem table as well and join both mem tables.... (only if there ar not too many records)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply