September 5, 2008 at 8:31 am
We have a client that claims their hosted app will not work correctly because the data is not returned in the correct order when they query it. Their DBA is not the sharpest knife in the drawer and refuses to consider the use of an ORDER BY clause to sort records. They simply want the records sorted correctly by default. I run the following query and view the execution plan:
SELECT * FROM ITEMCHG
WHERE RES = 001673
The index used is called: IX_ITEMCHG_RES and here is the create statement
CREATE INDEX [IX_ITEMCHG_RES] ON [dbo].[ITEMCHG]([RES]) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
There is also a clustered index on the table for their primary key column.
Now here is the kicker. On their production server when I run the above query the results are returned with no apparent order. The client claims their app is breaking because the results are not ordered by the primary key ASC. Again, using an order by clause is not an option for some reason.
However, when they restore the backup of this database to another server the results ARE ordered by primary key ASC. I don't understand why there is a difference and am curious as to why. The "DBA" claims the issue is related to collation or a bad SQL Server 2000 install or something. I say just use an ORDER BY clause and be done with it but am curious as to what could cause the difference in ordering. Any ideas?
Thanks,
Chris
September 5, 2008 at 2:12 pm
The order of a result set can only be guaranteed by using an ORDER BY. Although You might be able to get it to "work" again by rebuilding the indexes, indexes do not guarantee the results (see Jeff Moden's article for an example: http://www.sqlservercentral.com/articles/Advanced+Querying/61716/), and using index hints doesn't allow the optimizer to determine the best way to run the query (your index might not always be the best one to use).
If the results have to be in order, you need to either use an order by or sort the results on the client in whatever form of resultset receives the data. I don't know of anything else that will *guarantee* that the results are returned in order.
September 5, 2008 at 3:24 pm
I have to agree with Chad in saying that the only way to gauruntee that the results are returned in a specified order is to use an order by clause, and it seems that this is the correct way to do it.
However, I will point out that if you do a select into statement and then select from the newly created table, I have never seen a case where this fails to come back in the specified order. In others, what could be done behind the scenes is:
Select *
into table2
from table1
order by WhateverOrder
drop table1
sp_rename 'table2', 'table1'
I would not recommend that for a lot of reasons though. I agree that the order by clause is the way to go if that is an option.
---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/
September 5, 2008 at 8:46 pm
Sorry... I overlooked the part about the "client's hosted app"... does that mean it's a 3rd party app that the client cannot change?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 5, 2008 at 10:53 pm
Jeff, I must be missing something. Why would you want to do the sort in the gui instead of letting the server do that?
I do understand why you would not want to do it in a view, but I do not see why the gui is superior to doing it in a stored procedure?
---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/
September 5, 2008 at 10:58 pm
The client's DBA is a moron. He is free to recommend modifications that would require the application to use a SQL query using an ORDER BY clause or he is free to recommend that their in house application do client side sorting to address the problem. Yet because some server somewhere is returning the results in the order he and his application expects he claims that the problem with the query results not being returned in the order he expects is a problem with our hosted sql server instance.
September 6, 2008 at 12:15 am
here's what the server guarantees regarding ordering:
http://blogs.msdn.com/sqltips/archive/2005/07/20/441053.aspx%5B/url%5D
---------------------------------------
elsasoft.org
September 6, 2008 at 8:11 am
Sorry... I put some bum dope out :blush: and deleted this entry.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 6, 2008 at 8:46 am
However, even if the index hint is specified, there are ways to mess up the pseudo-ordering.
Parallelism (parallel scans and the subsequent repartition/merge streams return the data in no particular order)
Read uncommitted isolation level (as the storage engine has 2 options in this isolation level it doesn't have in any other - allocation order scans and merry-go-round scans)
Hash joins/hash aggregates (if applicable)
btw, Jeff, that's why the attempts to get a merry-go-round in your update article failed. Because it's an update it cannot be read uncommitted.
Also, while forcing an index does have the desired effect in this version of SQL (and it works for NC indexes too), there's no official guarantee that it will do so after the next service pack/next version. Essentially, it's an undocumented feature.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 6, 2008 at 8:53 am
I recently read an article where Itzik Ben-Gan showed that a clustered index does not guarantee the order of records for a SELECT. It had to do with an additional INSERT if I am not mistaken.
I'll try to find it again.
N 56°04'39.16"
E 12°55'05.25"
September 6, 2008 at 8:59 am
For proof of what read-uncommitted does to ordering...
(borrowed and slightly modified from WesleyB's old blog
CREATE TABLE tblClustered
(ID int,
MyDate smalldatetime,
TestField char(2000))
GO
CREATE CLUSTERED INDEX ixID ON tblClustered (ID, MyDate)
GO
DECLARE @i int
SET @i = 0
WHILE @i < 1000
BEGIN
INSERT INTO tblClustered (ID, MyDate, TestField) VALUES (RAND() * 1000, CONVERT(varchar, getdate(), 112), REPLICATE('T', 50))
SET @i = @i + 1
END
SELECT * FROM tblClustered with (index = 1, nolock)
SELECT * FROM tblClustered with (index = 1)
DROP TABLE tblClustered
And let me see if I can get a parallelism one to work... I usually struggle to get the optimiser to pick a parallel plan
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 6, 2008 at 9:01 am
http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html
Itzik Ben-Gan
Creating a clustered index on a table does not guarantee that the
data is stored in the file in index key order. Data movement caused
by page splits, changing index key values, and expanding dynamic
columns generate logical fragmentation. When you create or rebuild
an index on an existing table, SQL Server will make effort to create
it in a contiguous manner (least amount of fragmentation), but there
are no guarantees.
N 56°04'39.16"
E 12°55'05.25"
September 6, 2008 at 9:17 am
I have posted examples like this on SQLTeam several times that shows a select from a table with a clustered index that returns results not in cluster index order. If you want your query results in order, use an ORDER BY.
create table #t (number int primary key clustered)
insert into #t (number)
select
number
from
-- Number Table Function available here
--http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685
F_TABLE_NUMBER_RANGE(500,1000)
order by
number
insert into #t (number)
select
number
from
F_TABLE_NUMBER_RANGE(100,199)
order by
number
insert into #t (number)
select
number
from
F_TABLE_NUMBER_RANGE(1,50)
order by
number
insert into #t (number)
select
number
from
F_TABLE_NUMBER_RANGE(300,499)
order by
number
select * from #t
Out of order results returned:
(501 row(s) affected)
(100 row(s) affected)
(50 row(s) affected)
(200 row(s) affected)
number
-----------
500
501
502
... rows 503-995 omitted
996
997
998
999
1000
1
2
3
4
... rows 5-45 omitted
46
47
48
49
50
100
101
102
103
104
... rows 105-194 omitted
195
196
197
198
199
300
301
302
303
304
305
... rows 306-494 omitted
495
496
497
498
499
(851 row(s) affected)
September 6, 2008 at 9:19 am
http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html
Which is what the test with the read uncommitted shows.
If the isolation level is read committed or higher, and SQL has to scan the index it will scan in index page order. Essentially it will find the first page of the index leaf (the pages that stores the lowest index key value) and then will read along the index following the next-page pointer.
The result of that is that the data will be returned from that in order of the index key. What happens to the data later during the query processing is another matter.
In read uncommitted, the storage engine has an extra option. It can use an allocation order scan. That means it doesn't start with the page with the lowest index key, rather it starts with the page earliest in the data file (the one with the lowest PageID). SQL uses the IAM to find the pages that exist in the index and it reads them in the allocation order in the file, completely ignoring the index order.
This is why (in certain cases) a read-uncommitted index scan can miss or double read pages.
If the index is fragmented, then the order of pages in the file doesn't match the order order of page in the index and hence the data comes back 'unordered'
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 6, 2008 at 9:29 am
timothyawiseman (9/5/2008)
Jeff, I must be missing something. Why would you want to do the sort in the gui instead of letting the server do that?I do understand why you would not want to do it in a view, but I do not see why the gui is superior to doing it in a stored procedure?
Heh... I didn't say the GUI would be superior for sorting and, in fact, it frequently isn't superior. This type of question is always a very difficult question to answer on such limited information. My belief is that the DBA is trying to do the right thing, but may also not be taking other things into consideration...
The problem the DBA has implied is, how would you indicate which sort to do?
Hard coded - Very effective but is like adding ORDER BY to a View... it either limits what it can be used for to a rather specialized scope or the output has to be resorted to get a different sort or you have to have 1 sproc for sort order.
Dynamically defined - Also very effective, but either requires some dynamic SQL or some decision making in the Order By. Either way, it'll be anywhere from "almost as fast" as the hardcoded method to dreadfully slow with a full recompile every time it's called.
Regardless, both methods put an extra load on the server as sorting is a pretty expensive operation and that's probably why the DBA has dug his/her heels in. If the GUI or app makes a lot of calls to the server for sorted data, you end up with a lot of resource usage just sorting data. So, if you can off-load some of that to the client, then why not do it in the app?
Do I think the DBA or the OP is correct in their argument? Dunno... not enough information on scalable performance COST to the server or the COST in dollars to fix, test, and redeploy the app. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply