Query Tuning Assistance - Max() and Group By

  • Good morning,

    I was wondering if there is a way to optimize a query that has a MAX() function. Here are the detailed information:

    Table:

    [ID] INT NOT NULL,

    [Date] SMALLDATETIME NOT NULL

    PK Clustered Index:

    [ID] ASC,

    [Date] ASC

    Query:

    SELECT [ID], MAX([Date]) AS [Max_Date]

    FROM [dbo].[Table]

    GROUP BY [ID]

    The query plan is doing a clustered index scan using the PK index mentioned above and is scanning the entire table (20 million total records, returns 400 records). I know that because the query is using the aggregate function, it has to execute that function on every record. Is there no other way(s) to optimize this query?

    Thank you in advance.

  • That's already as optimal as it will get.

    Since the table has 2 columns only, that clustered index scan is returning the minimum amount of information possible to execute this query. It's returning it in an order optimal for a stream aggregate, the fastest aggregate type.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you very much, Gail! I appreciate your quick response.

  • Assuming that you have a finite list of ID's used, you can use cross apply with TOP, as demonstrated at the end of this article http://www.sqlservercentral.com/articles/T-SQL/71571/

    As you want 400 out of the 20 million , it should improve performance.



    Clear Sky SQL
    My Blog[/url]

  • is this a partitioned table perchance? there is a very nasty bug for those for max/min.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Dave Ballantyne (3/27/2011)


    Assuming that you have a finite list of ID's used, you can use cross apply with TOP, as demonstrated at the end of this article http://www.sqlservercentral.com/articles/T-SQL/71571/

    As you want 400 out of the 20 million , it should improve performance.

    Dave,

    Even if this produces a Segment/Top type plan, each row would have to be touched upon to do it since there are only 2 columns in the table and they are both in the clustered index. I doubt that it would change the performance. There's really no way around hitting each row in the table.

    Todd Fifield

  • tfifield (3/29/2011)


    Dave Ballantyne (3/27/2011)


    Assuming that you have a finite list of ID's used, you can use cross apply with TOP, as demonstrated at the end of this article http://www.sqlservercentral.com/articles/T-SQL/71571/

    As you want 400 out of the 20 million , it should improve performance.

    Dave,

    Even if this produces a Segment/Top type plan, each row would have to be touched upon to do it since there are only 2 columns in the table and they are both in the clustered index. I doubt that it would change the performance. There's really no way around hitting each row in the table.

    Todd Fifield

    My assumption is that as this is looking like a child table, there should exist a table which has ID as a unique PK somewhere.

    If that exists then

    Select PK,MaxDate.Date

    from <Theoretical Table > TT Cross apply(Select top(1) Date

    from Table

    where Table.ID = TT.ID

    order by date desc) as MaxDate(Date)

    400 seeks Should be faster that the single scan through 20 million rows

    Take a look under 'the caveat' section right at the end of the article i referenced ( and wrote ).

    Same thing here..



    Clear Sky SQL
    My Blog[/url]

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply