Query optimization help

  • GilaMonster (4/16/2014)


    The query is reading just over 14 million rows and aggregating that down to 1099. If it's reading 14 million rows from a table (the rows which satisfy the WHERE clause), the reads you're seeing are pretty much expected (you're reading the majority of the table), and hence you may not be able to reduce them much further.

    Also, using the index Gail recommended, you've replaced the hash aggregate with a stream aggregate which is likely to improve performance by a measurable amount. I don't think this query has further room for improvement.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • GilaMonster (4/16/2014)


    The query is reading just over 14 million rows and aggregating that down to 1099. If it's reading 14 million rows from a table (the rows which satisfy the WHERE clause), the reads you're seeing are pretty much expected (you're reading the majority of the table), and hence you may not be able to reduce them much further.

    I have used ROW_NUMBER with a CTE rather than MAX and GROUP BY, but the query does not seem to be improving at all.

    I am attaching the execution plan for your information.

    Query:

    DECLARE @Current smalldatetime;
    SET @Current = '6/30/2013';

    WITH cte AS (
        SELECT KeyInstn, DateEndedStandard,
               ROW_NUMBER() OVER (PARTITION BY KeyInstn ORDER BY KeyInstn DESC) AS row_num
        FROM ObjectViews..InternalOperCBGEstimate
        WHERE DateEndedStandard < @Current
          AND MostRecentSequence = 1
          AND UpdOperation < 2
    )
    SELECT KeyInstn, DateEndedStandard
    FROM cte
    WHERE row_num = 1;
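The pitfall in this ROW_NUMBER form (partitioning and ordering by the same column) is easy to reproduce on a toy table. Below is a minimal sketch using SQLite through Python as a stand-in for SQL Server; the table and column names follow the thread, but the data is invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # window functions need SQLite >= 3.25
con.execute("""CREATE TABLE InternalOperCBGEstimate (
    KeyInstn INTEGER, DateEndedStandard TEXT,
    MostRecentSequence INTEGER, UpdOperation INTEGER)""")
con.executemany("INSERT INTO InternalOperCBGEstimate VALUES (?,?,?,?)",
    [(1, "2013-03-31", 1, 0), (1, "2013-06-29", 1, 0),
     (2, "2013-01-31", 1, 0), (2, "2013-05-31", 1, 1)])

# As posted: the ORDER BY column equals the PARTITION BY column, so it is
# constant within each partition and row_num = 1 lands on an arbitrary row.
posted = con.execute("""
    WITH cte AS (
        SELECT KeyInstn, DateEndedStandard,
               ROW_NUMBER() OVER (PARTITION BY KeyInstn
                                  ORDER BY KeyInstn DESC) AS row_num
        FROM InternalOperCBGEstimate
        WHERE DateEndedStandard < '2013-06-30'
          AND MostRecentSequence = 1 AND UpdOperation < 2)
    SELECT KeyInstn, DateEndedStandard FROM cte WHERE row_num = 1
    ORDER BY KeyInstn""").fetchall()

# What the MAX/GROUP BY form (or ORDER BY DateEndedStandard DESC) returns:
intended = con.execute("""
    SELECT KeyInstn, MAX(DateEndedStandard)
    FROM InternalOperCBGEstimate
    WHERE DateEndedStandard < '2013-06-30'
      AND MostRecentSequence = 1 AND UpdOperation < 2
    GROUP BY KeyInstn ORDER BY KeyInstn""").fetchall()

print(intended)  # [(1, '2013-06-29'), (2, '2013-05-31')]
print(posted)    # one row per KeyInstn, but not necessarily the latest date
```

Because the ORDER BY is constant within each partition, which row gets row_num = 1 is a tie-break the engine is free to make arbitrarily.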

  • Try to obtain the same result set by whatever other means you can think of. Time the execution of each method and record those timings. Usually the fastest query is the "best" query; your metric may differ.

    To be honest I don't think you can arrive at this same result set without architectural changes - you're not going to get the reads down because you have to read almost the whole table to pick up all the rows to be aggregated.

    Take care to ensure that the results from each query are correct - your query using row_number() will not return the correct result set because you are ordering by your aggregate key instead of DateEndedStandard.


  • Looking at the latest execution plan, you are still reading over 14 million rows from the table that meet your search criteria. That is a lot of reads.

  • I suggest changing one of your existing indexes to also cover this query:

    CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
    ON OperatorCBGEstimate ( DateEndedStandard, UpdOperation )
    INCLUDE ( KeyInstn, MostRecentSequence )
    WITH ( DROP_EXISTING = ON );

    Edits:

    Added line breaks to CREATE INDEX command so it displays better.

    I'm thinking coming in by DateEndedStandard could help; would have to look at row counts for that date to be sure.

    I'm also assuming:

    (1) the view just does a simple SELECT on the main table; if it doesn't, that might affect things as well.

    (2) That I understood all the table and column names correctly, as there are some variations in the names you used.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • As it looks like you are using SQL Server 2008, perhaps a filtered index would work. Not sure if it will help, but it is worth testing.

    CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
    ON dbo.OperatorCBGEstimate (DateEndedStandard)
    INCLUDE (KeyInstn)
    WHERE MostRecentSequence = 1 AND UpdOperation < 2;
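SQLite's partial indexes behave much like SQL Server 2008's filtered indexes, so the idea can be sanity-checked in miniature. This is a sketch only: Python/SQLite stand in for SQL Server, the index name is invented, and so is the data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE OperatorCBGEstimate (
    KeyInstn INTEGER, DateEndedStandard TEXT,
    MostRecentSequence INTEGER, UpdOperation INTEGER)""")
con.executemany("INSERT INTO OperatorCBGEstimate VALUES (?,?,?,?)",
    [(1, "2013-03-31", 1, 0), (1, "2013-06-29", 1, 0),
     (1, "2013-06-29", 0, 0),   # excluded from index: not most recent sequence
     (2, "2013-05-31", 1, 5)])  # excluded from index: UpdOperation >= 2

# Partial (filtered) index: only qualifying rows are stored, so the
# predicate no longer has to be re-checked row by row at query time.
con.execute("""CREATE INDEX ix_filtered
    ON OperatorCBGEstimate (DateEndedStandard, KeyInstn)
    WHERE MostRecentSequence = 1 AND UpdOperation < 2""")

query = """
    SELECT KeyInstn, MAX(DateEndedStandard)
    FROM OperatorCBGEstimate
    WHERE DateEndedStandard < '2013-06-30'
      AND MostRecentSequence = 1 AND UpdOperation < 2
    GROUP BY KeyInstn ORDER BY KeyInstn"""
rows = con.execute(query).fetchall()
print(rows)  # [(1, '2013-06-29')]

# The planner can use the partial index because the query's WHERE clause
# repeats the index predicate verbatim.
plan = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(plan)
```

The same caveat from the thread applies: a query can only use a filtered index if its WHERE clause implies the index's filter, which is why narrowing an existing index needs checking against its other consumers.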

  • Thanks Scott and Lynn for your inputs.

    As suggested by Scott, I have changed the existing index into a filtered index (as suggested by Lynn), and I am able to bring the logical reads down from 70k to 30k.

    CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
    ON OperatorCBGEstimate ( DateEndedStandard, UpdOperation )
    INCLUDE ( KeyInstn, MostRecentSequence )
    WHERE MostRecentSequence = 1 AND UpdOperation < 2
    WITH ( DROP_EXISTING = ON );

  • EasyBoy (4/16/2014)


    Thanks Scott and Lynn for your inputs.

    As suggested by Scott, I have changed the existing index into a filtered index (as suggested by Lynn), and I am able to bring the logical reads down from 70k to 30k.

    CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14947]
    ON OperatorCBGEstimate ( DateEndedStandard, UpdOperation )
    INCLUDE ( KeyInstn, MostRecentSequence )
    WHERE MostRecentSequence = 1 AND UpdOperation < 2
    WITH ( DROP_EXISTING = ON );

    That's quite a lot better than I'd expect. Can you post the execution plan?


    Low-hanging fruit picker and defender of the moggies

    For better assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Please find the updated SQL plan attached.

  • EasyBoy (4/17/2014)


    Please find the updated SQL plan attached.

    Thanks! That's nice - the filtered index eliminates the need for an expensive residual predicate (the "filter" bit of the filtered seek Gail mentioned earlier).

    Good catch, Lynn - with 13M rows to filter, it makes quite a difference.


  • Except...the 70k read plan isn't from the index Gail suggested, and whatever index it is has several columns which aren't referenced by the query - hence the difference in reads.

    I'd be very interested to see the result of this, if you have the time. It's the same index with a query hint to ensure it is used:

    CREATE NONCLUSTERED INDEX [AI_OperatorCBGEstimate_14948]
    ON [dbo].[OperateEstimate] (MostRecentSequence, KeyInstn)
    INCLUDE (DateEndedStandard, UpdOperation);

    SELECT KeyInstn, MAX(DateEndedStandard) AS DateEndedStandard
    FROM ObjectViews..InternalOperCBGEstimate WITH (INDEX(AI_OperatorCBGEstimate_14948))
    WHERE DateEndedStandard < @Current
      AND MostRecentSequence = 1
      AND UpdOperation < 2
    GROUP BY KeyInstn;
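The point of keying on (MostRecentSequence, KeyInstn) is that once the equality predicate pins the leading column, the index returns rows already sorted by KeyInstn, so the aggregate can stream instead of hashing and sorting. A minimal sketch of that effect, with SQLite via Python standing in for SQL Server (SQLite has no INCLUDE, so the included columns become trailing key columns; INDEXED BY plays the role of the WITH (INDEX(...)) hint; data invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE OperateEstimate (
    KeyInstn INTEGER, DateEndedStandard TEXT,
    MostRecentSequence INTEGER, UpdOperation INTEGER)""")
con.executemany("INSERT INTO OperateEstimate VALUES (?,?,?,?)",
    [(2, "2013-01-31", 1, 0), (1, "2013-03-31", 1, 0),
     (1, "2013-06-29", 1, 0), (2, "2013-05-31", 1, 1)])

# Included columns become trailing key columns; with MostRecentSequence
# fixed by the equality predicate, the index yields KeyInstn order.
con.execute("""CREATE INDEX AI_OperatorCBGEstimate_14948
    ON OperateEstimate (MostRecentSequence, KeyInstn,
                        DateEndedStandard, UpdOperation)""")

# INDEXED BY is SQLite's rough analogue of the WITH (INDEX(...)) hint.
query = """
    SELECT KeyInstn, MAX(DateEndedStandard) AS DateEndedStandard
    FROM OperateEstimate INDEXED BY AI_OperatorCBGEstimate_14948
    WHERE DateEndedStandard < '2013-06-30'
      AND MostRecentSequence = 1
      AND UpdOperation < 2
    GROUP BY KeyInstn"""
rows = con.execute(query).fetchall()
print(rows)  # [(1, '2013-06-29'), (2, '2013-05-31')]

# Because the index order matches GROUP BY KeyInstn, the plan needs no
# sort step -- the analogue of the stream aggregate replacing the hash.
plan = [r[-1] for r in con.execute("EXPLAIN QUERY PLAN " + query)]
print(plan)
```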


  • Comparing the latest two execution plans, all the filtered index did was drop approximately 700,000 rows from the initial pull. It may have reduced the reads, but I'm not sure it really had much of an effect on the processing itself.

  • You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

    Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.


  • ScottPletcher (4/17/2014)


    You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

    Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.

    What date range? The query wants EVERYTHING prior to a specific date.

  • Lynn Pettis (4/17/2014)


    ScottPletcher (4/17/2014)


    You need to do further research before you change the existing index to a filtered index, because that index will no longer satisfy queries originally using it if you filter out the values it was using.

    Besides, I was hoping that the date range itself would eliminate enough rows to improve I/O significantly without impairing any existing uses of the index.

    What date range? The query wants EVERYTHING prior to a specific date.

    It also wants 90% of the rows in the table.

    EasyBoy, there are some interesting things going on in the plan Stream_plan.sqlplan. Could you please post the definition of [AI_OperatorCBGEstimate_HAP]? I'm unable to do anything with the xls you posted. Thanks.


Viewing 15 posts - 16 through 30 (of 49 total)
