Order My Data

  • Mighty (12/2/2013)


    Guess that all has been said before. Only the "order by" answer is correct.

    Agree with some others to delete this question, before people might get a wrong idea.

    I totally agree. In fact, I didn't even bother to answer the question as there was only one correct answer. It is one of my "interview questions" for prospective hires. It is amazing how many people would "bet their life" on the order of rows returned without specifying "ORDER BY".


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • I could be wrong but if you pop the server into single user mode and an index that uses a as a lead column will return results in order every time if I recall. Damn useless if you need multiple users to access the server though 🙂

    Also I don't think that any edition of SQL Server other than Enterprise, datacentre or Developer does carousel scans does it?

    Yes order by is the only truly safe way to do this but there are other methods if I recall. Please set me straight if I'm wrong though.

  • rmunwin (12/2/2013)


    Yes order by is the only truly safe way to do this but there are other methods if I recall. Please set me straight if I'm wrong though.

    There are plenty of other methods which currently appear to work (almost) all of the time. Probably some that will definitely work all of the time in a particular version of SQLServer, though proving this could be a bit tricky 😉

    But without an Order By, there is absolutely no guarantee - it is entirely down to how the optimiser decides to process the query. The behaviour will not be documented (the only documentation tells you quite clearly that you must use Order By) and can change at any time, perhaps by adding a processor, perhaps by the MS programmers rewriting part of the optimiser.

  • rmunwin (12/2/2013)


    I could be wrong but if you pop the server into single user mode and an index that uses a as a lead column will return results in order every time if I recall. Damn useless if you need multiple users to access the server though 🙂

    Also I don't think that any edition of SQL Server other than Enterprise, datacentre or Developer does carousel scans does it?

    Yes order by is the only truly safe way to do this but there are other methods if I recall. Please set me straight if I'm wrong though.

    The following query was tested in both single and multi-user (SQL SERVER 2008 R2 SP 2) and in all cases, without using ORDER BY, the column indicated in the clustered index ('c') is sorted in ASC order.

    DBCC FreeProcCache

    IF EXISTS (SELECT * FROM Information_schema.Tables

    WHERE Table_name = 'aTest')

    DROP TABLE aTest;

    CREATE TABLE atest

    (

    a int NOT NULL,

    b int NOT NULL,

    c char(20) NULL

    );

    --returns a in ASC order

    --CREATE NONCLUSTERED INDEX ix_atest ON aTest(c, b, a);

    --returns c in ASC order

    CREATE CLUSTERED INDEX ix_atest1 ON aTest(c);

    --returns c in ASC order

    --CREATE UNIQUE CLUSTERED INDEX ix_atest ON aTest(c, b, a);

    DECLARE @i int;

    SET @i = 1;

    WHILE @i < 10000

    BEGIN

    declare @randString as char(20)

    set @randString = char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)

    +char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)

    INSERT INTO aTest (a, b, c)

    SELECT @i, RAND() * @i, @randString;

    SET @i = @i + 1;

    END;

    SELECT a, c, b

    FROM aTest

    WHERE a BETWEEN 1 AND 150

  • steve.jacobs (12/3/2013)


    rmunwin (12/2/2013)


    I could be wrong but if you pop the server into single user mode and an index that uses a as a lead column will return results in order every time if I recall. Damn useless if you need multiple users to access the server though 🙂

    Also I don't think that any edition of SQL Server other than Enterprise, datacentre or Developer does carousel scans does it?

    Yes order by is the only truly safe way to do this but there are other methods if I recall. Please set me straight if I'm wrong though.

    The following query was tested in both single and multi-user (SQL SERVER 2008 R2 SP 2) and in all cases, without using ORDER BY, the column indicated in the clustered index ('c') is sorted in ASC order.

    DBCC FreeProcCache

    IF EXISTS (SELECT * FROM Information_schema.Tables

    WHERE Table_name = 'aTest')

    DROP TABLE aTest;

    CREATE TABLE atest

    (

    a int NOT NULL,

    b int NOT NULL,

    c char(20) NULL

    );

    --returns a in ASC order

    --CREATE NONCLUSTERED INDEX ix_atest ON aTest(c, b, a);

    --returns c in ASC order

    CREATE CLUSTERED INDEX ix_atest1 ON aTest(c);

    --returns c in ASC order

    --CREATE UNIQUE CLUSTERED INDEX ix_atest ON aTest(c, b, a);

    DECLARE @i int;

    SET @i = 1;

    WHILE @i < 10000

    BEGIN

    declare @randString as char(20)

    set @randString = char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)

    +char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)

    INSERT INTO aTest (a, b, c)

    SELECT @i, RAND() * @i, @randString;

    SET @i = @i + 1;

    END;

    SELECT a, c, b

    FROM aTest

    WHERE a BETWEEN 1 AND 150

    Did you see the article I posted? You might want to read it. It demonstrates very clearly that not having an order by on a query can and will start to return results in a different order.

    http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (12/3/2013)


    steve.jacobs (12/3/2013)


    rmunwin (12/2/2013)


    I could be wrong but if you pop the server into single user mode and an index that uses a as a lead column will return results in order every time if I recall. Damn useless if you need multiple users to access the server though 🙂

    Also I don't think that any edition of SQL Server other than Enterprise, datacentre or Developer does carousel scans does it?

    Yes order by is the only truly safe way to do this but there are other methods if I recall. Please set me straight if I'm wrong though.

    The following query was tested in both single and multi-user (SQL SERVER 2008 R2 SP 2) and in all cases, without using ORDER BY, the column indicated in the clustered index ('c') is sorted in ASC order.

    DBCC FreeProcCache

    IF EXISTS (SELECT * FROM Information_schema.Tables

    WHERE Table_name = 'aTest')

    DROP TABLE aTest;

    CREATE TABLE atest

    (

    a int NOT NULL,

    b int NOT NULL,

    c char(20) NULL

    );

    --returns a in ASC order

    --CREATE NONCLUSTERED INDEX ix_atest ON aTest(c, b, a);

    --returns c in ASC order

    CREATE CLUSTERED INDEX ix_atest1 ON aTest(c);

    --returns c in ASC order

    --CREATE UNIQUE CLUSTERED INDEX ix_atest ON aTest(c, b, a);

    DECLARE @i int;

    SET @i = 1;

    WHILE @i < 10000

    BEGIN

    declare @randString as char(20)

    set @randString = char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)

    +char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)+char(rand()*26+65)

    INSERT INTO aTest (a, b, c)

    SELECT @i, RAND() * @i, @randString;

    SET @i = @i + 1;

    END;

    SELECT a, c, b

    FROM aTest

    WHERE a BETWEEN 1 AND 150

    Did you see the article I posted? You might want to read it. It demonstrates very clearly that not having an order by on a query can and will start to return results in a different order.

    http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx

    I read several articles, including the article you noted and I know the pitfalls of not using ORDER BY (for ordering). What I'm saying is what I'm seeing on my Development and QA DB systems. I'm not arguing the point of not using ORDER BY.

  • It IS NOT guaranteed! Your test cases are invalid! Case in point:

    I have a table with an IDENTITY column as the PK with a clustered index on it. The table is ever increasing. It has hundreds, if not thousands of rows.

    Now for all of you that think a "SELECT * FROM..." would return the rows in PK (IDENTITY) order, you are wrong!

    The SQL Server optimizer, in this case will return the rows in physical block (page) order. Which in this case, will actually be scrambled. Why, you ask?

    In this business case, the table has a number of NULLable columns that are filled in later. When the NULLable columns are updated with values, some rows can no longer fit in the page. Therefore, they are moved to a new page (page split). Those new pages are allocated "somewhere". Without an ORDER BY clause, SQL Server has no reason to use any index to get to the data.

    Additionally, if SQL Server performs a parallel query, the rows will be returned in all sorts of random order.

    For all of you developers who do not specify an ORDER BY clause, and assume, based upon your invalid testing, that the data will always be returned in the correct order, you are making a grave mistake! It is the cause of more bugs in computer systems.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Mauve (12/3/2013)


    It IS NOT guaranteed! Your test cases are invalid! Case in point:

    I have a table with an IDENTITY column as the PK with a clustered index on it. The table is ever increasing. It has hundreds, if not thousands of rows.

    Now for all of you that think a "SELECT * FROM..." would return the rows in PK (IDENTITY) order, you are wrong!

    The SQL Server optimizer, in this case will return the rows in physical block (page) order. Which in this case, will actually be scrambled. Why, you ask?

    In this business case, the table has a number of NULLable columns that are filled in later. When the NULLable columns are updated with values, some rows can no longer fit in the page. Therefore, they are moved to a new page (page split). Those new pages are allocated "somewhere". Without an ORDER BY clause, SQL Server has no reason to use any index to get to the data.

    Additionally, if SQL Server performs a parallel query, the rows will be returned in all sorts of random order.

    For all of you developers who do not specify an ORDER BY clause, and assume, based upon your invalid testing, that the data will always be returned in the correct order, you are making a grave mistake! It is the cause of more bugs in computer systems.

    mauve, to begin, CALM DOWN!!!!. Secondly, I never stated it was guaranteed. Finally, you are correct pertaining to the NULLs and page allocations.

    I agree that you must use ORDER BY.

  • steve.jacobs (12/3/2013)


    Mauve (12/3/2013)


    It IS NOT guaranteed! Your test cases are invalid! Case in point:

    I have a table with an IDENTITY column as the PK with a clustered index on it. The table is ever increasing. It has hundreds, if not thousands of rows.

    Now for all of you that think a "SELECT * FROM..." would return the rows in PK (IDENTITY) order, you are wrong!

    The SQL Server optimizer, in this case will return the rows in physical block (page) order. Which in this case, will actually be scrambled. Why, you ask?

    In this business case, the table has a number of NULLable columns that are filled in later. When the NULLable columns are updated with values, some rows can no longer fit in the page. Therefore, they are moved to a new page (page split). Those new pages are allocated "somewhere". Without an ORDER BY clause, SQL Server has no reason to use any index to get to the data.

    Additionally, if SQL Server performs a parallel query, the rows will be returned in all sorts of random order.

    For all of you developers who do not specify an ORDER BY clause, and assume, based upon your invalid testing, that the data will always be returned in the correct order, you are making a grave mistake! It is the cause of more bugs in computer systems.

    mauve, to begin, CALM DOWN!!!!. Secondly, I never stated it was guaranteed. Finally, you are correct pertaining to the NULLs and page allocations.

    I agree that you must use ORDER BY.

    But it is an interesting question, anyway, and interesting discussion, too.

    Thanks, Steve!

  • Mauve (12/3/2013)


    Now for all of you that think a "SELECT * FROM..." would return the rows in PK (IDENTITY) order, you are wrong!

    The SQL Server optimizer, in this case will return the rows in physical block (page) order. Which in this case, will actually be scrambled. Why, you ask?

    Not necessarily, this was my point. If two people run the same query and the second starts before the first finishes then it will pick up the scan where the first query is upto in some editions of SQL Server. I never meant to say, and I jolly well hope I didn't because I know better, that you can get away without an order by clause. All I'm trying to say is that this is a much much deeper subject than most people had given it credit for 🙂 Sorry if I wasn't clear.

  • Thanks Steve for the question update.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • but no points adjustment - I demand back the points I've had forcibly taken from me!

    😉

  • Toreador (12/6/2013)


    but no points adjustment - I demand back the points I've had forcibly taken from me!

    😉

    😀

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • 🙂 Nice question.

Viewing 14 posts - 31 through 43 (of 43 total)

You must be logged in to reply to this topic. Login to reply