Slow query when using specific date range = very frustrated

  • Ok. So I have a statement and a question.

    I tried the various suggestions offered. Swapping orders and track didn't have an effect, neither did adding the option to recompile or force sort. The one change that worked was adding the track table as a left join. The query ran in about 6 minutes.

    That's all fine and good, however no matter how much I reindex or create stats or update stats on those tables, it won't have any effect on the optimizer? I thought maybe creating statistics for those specific fields it uses in the join and where clause, it will make the optimizer choose a better plan. Is that wrong?

    Also, below is the result from sp_helpindex for the orders table. This is a reporting box so I don't think all these index's are necessary, however I put it here for any opinions. Thanks again!

    IX_order_inactive_datenonclustered located on PRIMARYo_inactive_date

    IX_orders_orefnonclustered located on PRIMARYo_ref

    ORDERS_PK_Cclustered, unique located on PRIMARYo_num

    ORDERS_XND10_NCnonclustered located on PRIMARYo_type, o_inbox, o_div

    ORDERS_XND11_NCnonclustered located on PRIMARYe_num

    ORDERS_XND12_NCnonclustered located on PRIMARYo_escrownumber

    ORDERS_XND13_NSnonclustered located on PRIMARYFastWebNumber

    ORDERS_XND14_NCnonclustered located on PRIMARYo_titleref

    ORDERS_XND15_NCnonclustered located on PRIMARYo_eofficer

    ORDERS_XND16_NCnonclustered located on PRIMARYo_close

    ORDERS_XND2_NCnonclustered located on PRIMARYo_refcnty

    ORDERS_XND3_NCnonclustered located on PRIMARYc_num

    ORDERS_XND4_NCnonclustered located on PRIMARYo_div, o_type

    ORDERS_XND5_NCnonclustered located on PRIMARYo_open

    ORDERS_XND6_NCnonclustered located on PRIMARYo_type, o_ref

    ORDERS_XND7_NCnonclustered located on PRIMARYo_ref, o_open

    ORDERS_XND8_NCnonclustered located on PRIMARYo_sales

    ORDERS_XND9_NCnonclustered located on PRIMARYo_status

  • Something like the following may be worth a try:

    SELECT o_num

    INTO #orders

    FROM orders

    WHERE o_type IN (9, 30)

    AND trn_id IN (24, 45, 25, 0)

    ALTER TABLE #orders

    ADD PRIMARY KEY (o_num)

    -- This assumes that a CANCEL always comes after a CANCEL REQUEST

    SELECT o_num, t_num

    FROM

    (

    SELECT T.o_num

    -- assuming t_num is a positive int

    ,MAX(CASE WHEN T.t_status = 'CANCEL REQUEST' AND T.t_out IS NOT NULL THEN T.t_num ELSE -1 END) AS t_num

    ,SUM(CASE WHEN T.t_status IN ('CANCEL/BILL', 'CANCEL/NO BILL') THEN 1 ELSE 0 END) AS Cancel

    FROM track T

    WHERE T.t_status IN ('CANCEL REQUEST', 'CANCEL/BILL', 'CANCEL/NO BILL')

    AND T.t_in >= GETDATE() - 169

    AND EXISTS

    (

    SELECT *

    FROM #orders O

    WHERE O.o_num = T.o_num

    )

    GROUP BY T.o_num

    ) D1

    WHERE t_num > -1 AND Cancel = 0

    Also a covering nonclustered index on orders of o_num, o_type, trn_id

    and a covering nonclustered index on track of o_num, t_in, t_status, t_num, t_out may help.

  • Frank Nadal (11/24/2010)


    Ok. So I have a statement and a question.

    I tried the various suggestions offered. Swapping orders and track didn't have an effect, neither did adding the option to recompile or force sort. The one change that worked was adding the track table as a left join. The query ran in about 6 minutes.

    That's all fine and good, however no matter how much I reindex or create stats or update stats on those tables, it won't have any effect on the optimizer? I thought maybe creating statistics for those specific fields it uses in the join and where clause, it will make the optimizer choose a better plan. Is that wrong?

    Also, below is the result from sp_helpindex for the orders table. This is a reporting box so I don't think all these index's are necessary, however I put it here for any opinions. Thanks again!

    IX_order_inactive_datenonclustered located on PRIMARYo_inactive_date

    IX_orders_orefnonclustered located on PRIMARYo_ref

    ORDERS_PK_Cclustered, unique located on PRIMARYo_num

    ORDERS_XND10_NCnonclustered located on PRIMARYo_type, o_inbox, o_div

    ORDERS_XND11_NCnonclustered located on PRIMARYe_num

    ORDERS_XND12_NCnonclustered located on PRIMARYo_escrownumber

    ORDERS_XND13_NSnonclustered located on PRIMARYFastWebNumber

    ORDERS_XND14_NCnonclustered located on PRIMARYo_titleref

    ORDERS_XND15_NCnonclustered located on PRIMARYo_eofficer

    ORDERS_XND16_NCnonclustered located on PRIMARYo_close

    ORDERS_XND2_NCnonclustered located on PRIMARYo_refcnty

    ORDERS_XND3_NCnonclustered located on PRIMARYc_num

    ORDERS_XND4_NCnonclustered located on PRIMARYo_div, o_type

    ORDERS_XND5_NCnonclustered located on PRIMARYo_open

    ORDERS_XND6_NCnonclustered located on PRIMARYo_type, o_ref

    ORDERS_XND7_NCnonclustered located on PRIMARYo_ref, o_open

    ORDERS_XND8_NCnonclustered located on PRIMARYo_sales

    ORDERS_XND9_NCnonclustered located on PRIMARYo_status

    Based on this code, which you said worked to run in minutes vs. hours:

    SELECT o.o_num,

    MAX(t.t_num)

    FROM orders o

    JOIN track t

    ON o.o_num = t.o_num

    AND t.t_status = 'CANCEL REQUEST'

    LEFT JOIN track tr

    ON tr.o_num = o.o_num

    AND tr.t_status IN ('CANCEL/BILL','CANCEL/NO BILL')

    WHERE t.t_out is not null

    AND t.t_in >= getdate() -169 --*********this statement right here ***************

    AND o.o_type IN (9, 30)

    AND o.trn_id IN (24, 45, 25, 0)

    AND tr.o_num IS NULL

    GROUP BY o.o_num;

    On the orders table, I would initially suggest an index on o_type, trn_id, o_num in that order.

    On the track table, I would initially suggest an index on t_status, t_in, t_out, o_num in that order.

    These will put the filtering columns (used in the where / join clauses to filter the records) first.

    I don't remember seeing how many rows are in these tables - can you give an idea of this?

    (I wish you were using 2005 or higher - then we could utilize the missing indexes dmv to get an idea of what the optimizer thinks is the best index to use for this query.)

    If the existing statistics are up-to-date, then re-indexing and updating statistics won't have an effect. Adding a missing statistic might help, but SQL usually does this behind the scenes if it thinks it would be helpful. However, choosing a proper index for the fields that are used in the where/join clauses, and referenced in the rest of the query, could be very beneficial.

    A reporting box is a good box to put extra indexes on - in fact, that's one of the reasons why things are split out to a reporting box. Have a database without them for faster insert/update/delete operations; have a database with them for faster querying.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Hi,

    I did face a similar problem, some years back. This has to do with missing indexes, missing statistics or statistics not updated.

    Using the management studio, run the stored prcoedure / SQL code...and using profiler, capture the trace.

    Feed this trace as an input to Index tuning advisor, it will recommend you some missing indexes & statistics

    Apply the ones which u think is logical...and i am 200% sure, things will work for you.

    Regards,

    Manish Kamath

  • An update:

    I bit the bullet and ran an 'update statistics fullscan' on all the tables in the database and that seemed to do the trick. Our report writers said their report queries were running much much better.

    I am going to have them all re-write their code to streamline their join statements and their where clauses.

    Thanks again for all your help!

  • Thanks for the feedback Frank. So, what is the performance like now on that query that used to run in hours?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • By altering the query to not use the (does not exist) and instead using it in the where clause to check for cancelled orders, that along with the stats update, it runs in a few minutes. Same with other procs that used to run in hours.

  • Good news indeed.

    Just to be sure I understand, you replaced the NOT EXISTS (...) with the LEFT JOIN WHERE column IS NULL, correct? I thought that was what you had used earlier that took it down to 6 minutes.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply