December 22, 2005 at 12:09 am
Hi,
This is more of a discussion that a question.
I was reading the Enhancements for SQL Server 2005 and read the implementation of TOP clause and just out of curiosity tried to use that clause in SQL Server 2000 (I was not using the 2005 implementation but the simple old TOP clause).
A simple query which returned some rows from a 5 column table was used, so the query was
select row1,row2,row3,row4, row5 from <tablename> with (index(0)) where row2 = 'abc'
This used the clustered index defined on row 1. An returns
=================================================
col1 col2 col3 col4 col5
11 abc somedata11 somemore some11
12 abc somedata21 somemore some21
13 abc somedata31 somemore some31
select row1,row2,row3,row4, row5 from <tablename> with (index(NXC_col2)) where row2 = 'abc'
=================================================
col1 col2 col3 col4 col5
11 abc somedata11 somemore some11
12 abc somedata21 somemore some21
13 abc somedata31 somemore some31
select row1,row2,row3,row4, row5 from <tablename> with (index(nxc_col3)) where row2 = 'abc'
=================================================
col1 col2 col3 col4 col5
01 abc somedata01 somemore some1
02 abc somedata02 somemore some2
03 abc somedata03 somemore some3
In case of the last query the records are ordered on the basis of col 3. Hence the result set is changed. And this is because I am forcing the query optimiser to use the index which is defined on a column which is not being referred to the where clause.
Since BOL says that "The data rows are not sorted and stored in order based on their non-clustered keys." I was surprised to see the change in the result set. Query 1 used a clustered index scan, query 2 uses a non-clustered index seek (obviously) and query 3 used a NCX scan but on col 3. There’s an additional Filter clause which comes in play for query 3 execution plan. I think its this difference in the query plans that makes the difference in the result sets, though I think theres a lil more to it which I am not able to sort out.
NOTE: The table in the query examples is not the actual table i used for my queries
December 22, 2005 at 12:52 am
Without an ORDER BY, you cannot guarantee anything about a SELECT TOP query.
Just because it seemed to happen a certain way every day for the last 2 years doesn't mean it won't blow up in your face tomorrow if you omit the ORDER BY. <-- Real world experience, scars still healing
Or, IOW, you should not rely on the underlying physical database design, you should be explicit in your SQL queries, because the physical level is out of your control and service packs, security updates or major product upgrades can & will change the results of queries that make assumptions about physical storage & sort order.
December 22, 2005 at 1:10 am
I have a lot of respect for quotes coming from "real world" experiences.
If the sort order requested is the same as the clustered sort order, would it now save processing time if the sorting on that column is avoided? The question is how would "they" (read developers) know that... hummmmmmm
December 22, 2005 at 1:47 am
>>the same as the clustered sort order
That's kinda the crux of the whole issue. "Clustering" is a vendor & version dependant physical implementation. There is really no such thing as a "clustered sort order". There is a set of data. And that set can & will come back in any random order if you don't specify an ORDER BY.
December 22, 2005 at 3:53 am
If the sort order requested is the same as the clustered sort order, would it now save processing time if the sorting on that column is avoided?
If you do an order by on the cluster key, SQL can take advantage of the index sort order and do the sort very quickly. Covering indexes can also give you cheap sorts, if they're designed carefully.
If you don't do an order by, then there is absolutely no guarentee of the order of returned records. Especially on a multiprocessor server that's running queries in parallel. Hash matches or hash aggregates can also 'reorder' the data in ways you don't expect.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 22, 2005 at 3:58 am
Thanks for the input guys. I guess one needs to sort it explicityly instead of leaving it to the indexes to "get it right" for you.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply