Index hints and top operatror

  • Hi,

    This is more of a discussion that a question.

    I was reading the Enhancements for SQL Server 2005 and read the implementation of TOP clause and just out of curiosity tried to use that clause in SQL Server 2000 (I was not using the 2005 implementation but the simple old TOP clause).

    A simple query which returned some rows from a 5 column table was used, so the query was

    select row1,row2,row3,row4, row5 from <tablename> with (index(0)) where row2 = 'abc'

    This used the clustered index defined on row 1. An returns

    =================================================

    col1             col2                  col3                    col4             col5

    11                  abc                somedata11         somemore   some11 

    12                  abc                somedata21         somemore   some21

    13                  abc                somedata31        somemore   some31

     

    select row1,row2,row3,row4, row5 from <tablename> with (index(NXC_col2)) where row2 = 'abc'

    =================================================

    col1             col2                  col3                   col4           col5

    11                  abc                somedata11         somemore   some11 

    12                  abc                somedata21         somemore   some21

    13                  abc                somedata31         somemore   some31

     

    select row1,row2,row3,row4, row5 from <tablename> with (index(nxc_col3)) where row2 = 'abc'

    =================================================

    col1             col2                  col3                   col4            col5

    01                  abc                somedata01         somemore   some1 

    02                  abc                somedata02         somemore   some2

    03                  abc                somedata03         somemore   some3

    In case of the last query the records are ordered on the basis of col 3. Hence the result set is changed. And this is because I am forcing the query optimiser to use the index which is defined on a column which is not being referred to the where clause.

    Since BOL says that "The data rows are not sorted and stored in order based on their non-clustered keys." I was surprised to see the change in the result set. Query 1 used a clustered index scan, query 2 uses a non-clustered index seek (obviously) and query 3 used a NCX scan but on col 3. There’s an additional Filter clause which comes in play for query 3 execution plan. I think its this difference in the query plans that makes the difference in the result sets, though I think theres a lil more to it which I am not able to sort out.

    NOTE: The table in the query examples is not the actual table i used for my queries


    What I hear I forget, what I see I remember, what I do I understand

  • Without an ORDER BY, you cannot guarantee anything about a SELECT TOP query.

    Just because it seemed to happen a certain way every day for the last 2 years doesn't mean it won't blow up in your face tomorrow if you omit the ORDER BY. <-- Real world experience, scars still healing

    Or, IOW, you should not rely on the underlying physical database design, you should be explicit in your SQL queries, because the physical level is out of your control and service packs, security updates or major product upgrades can & will change the results of queries that make assumptions about physical storage & sort order.

  • I have a lot of respect for quotes coming from "real world" experiences.

    If the sort order requested is the same as the clustered sort order, would it now save processing time if the sorting on that column is avoided? The question is how would "they" (read developers) know that... hummmmmmm


    What I hear I forget, what I see I remember, what I do I understand

  • >>the same as the clustered sort order

    That's kinda the crux of the whole issue. "Clustering" is a vendor & version dependant physical implementation. There is really no such thing as a "clustered sort order". There is a set of data. And that set can & will come back in any random order if you don't specify an ORDER BY.

     

  • If the sort order requested is the same as the clustered sort order, would it now save processing time if the sorting on that column is avoided?

    If you do an order by on the cluster key, SQL can take advantage of the index sort order and do the sort very quickly. Covering indexes can also give you cheap sorts, if they're designed carefully.

    If you don't do an order by, then there is absolutely no guarentee of the order of returned records. Especially on a multiprocessor server that's running queries in parallel. Hash matches or hash aggregates can also 'reorder' the data in ways you don't expect.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the input guys. I guess one needs to sort it explicityly instead of leaving it to the indexes to "get it right" for you.


    What I hear I forget, what I see I remember, what I do I understand

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply