The statistics affect the sort ?!

  • I met a case like that:

    when i run the following t-sql,I find the order of one column is not correct:

    select b.ID,b.Seq

    from IcTable a,IcStock b

    where b.ID='333295'

    and a.FID=b.FSourceID

    the result is:

    ID Seq

    ----------- -----------

    333295 1

    333295 2

    333295 4

    333295 3

    333295 5

    But there are a clustered index on table IcStock(id,seq)

    and I run the other t-sql :

    select a.ID,a.Seq

    fromIcStock a

    where a.ID='333295'

    ID Seq

    ----------- -----------

    333295 1

    333295 2

    333295 3

    333295 4

    333295 5

    the order on seq is right!

    So I look up the statistics about the clustered index on IcStock,I got the id=333295 was above the max value of RANGE_HI_KEY :

    RANGE_HI_KEY

    ---------------------

    ....

    303529

    333257

    333258

    I think it may be the reason why sorting is wrong.

    Then I update the statistics !

    When I run the t-sql following immediately,I got the same wrong result!

    select b.ID,b.Seq

    from IcTable a,IcStock b

    where b.ID='333295'

    and a.FID=b.FSourceID

    But after 1 min or 2 min later,I run again ,and got the right result.

    My question:

    1.Will the statistics affect the sorting ?

    2.After I update the statistics,does the Query Optimzer use the statistics immediate?

    Aha ..do not tell me to use the ORDER BY~because the ORDER BY can lead the right result~what i want is the fact~

    tks ,every one

    ---------------------------------------
    Thorn Bird...

  • The statistics might affect the chosen query plan.

    The query plan might affect the order of the returned rows.

    If the order of returned rows is important for you, you MUST specify ORDER BY. Without ORDER BY the SQL server is allowed to return data in any order it feels like.

    /SG

  • The fact is that data is neither stored nor retreived in precise order. Without the ORDER BY, that I'm not telling you to use, you have no guarantee how the data will be retrieved. It may be retrieved in the order in which pages are accessed. It may be retrieved in the order in which the linked list points through the data. It might just be random because of some other operation within the query. You can't possibly know. The only way to guarantee order is to use the, apparently unwanted, ORDER BY statement.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Very appreciate your answer~

    You say "The fact is that data is neither stored nor retreived in precise order" , but when there is a clustered index on one column,will the data on that column be sorted precisely?

    😀

    AND WHAT THE ANSWER FOR THIS QUESTION?

    2.After I update the statistics,does the Query Optimzer use the statistics immediate?

    ---------------------------------------
    Thorn Bird...

  • It's stored roughly in the correct order. Roughly - as in it may not be exactly sorted within the page.

    That said, like others have said - the physical order has no bearing on how things come back from a query. SQL Server is a set engine relying on performing operations on unordered sets. Ordering is one of the very last things that happens, and might not happen at all unless you specifically request it. So - as was mentioned by others - order is guaranteed only when you use an ORDER BY clause.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • kinzent (4/28/2010)


    You say "The fact is that data is neither stored nor retreived in precise order" , but when there is a clustered index on one column,will the data on that column be sorted precisely?

    Only logically. Physically - it might or might not be, as Matt said.

    AND WHAT THE ANSWER FOR THIS QUESTION? 2.After I update the statistics,does the Query Optimzer use the statistics immediate?

    Yes. Also, any stored execution plan that referenced the (updated) statistics when it was compiled will cause a optimality-related recompilation of that plan next time it is used.

  • Thanks for your help~:-P

    ---------------------------------------
    Thorn Bird...

  • Late getting back in, but Paul answered the questions same way I would have, only better.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Paul White NZ (4/28/2010)


    kinzent (4/28/2010)


    You say "The fact is that data is neither stored nor retreived in precise order" , but when there is a clustered index on one column,will the data on that column be sorted precisely?

    Only logically. Physically - it might or might not be, as Matt said.

    Paul, could you possibly expand a little on what you mean here ?

    What do you mean by logical and physical in this context ?

    /SG

  • Stefan_G (4/29/2010)


    Paul White NZ (4/28/2010)


    kinzent (4/28/2010)


    You say "The fact is that data is neither stored nor retreived in precise order" , but when there is a clustered index on one column,will the data on that column be sorted precisely?

    Only logically. Physically - it might or might not be, as Matt said.

    Paul, could you possibly expand a little on what you mean here ?

    What do you mean by logical and physical in this context ?

    /SG

    Logically the data is sorted in precise order, meaning that the index knows exactly which comes next in sequence. But the storage mechanism may have the data stored in an order that doesn't match the logical order of the index. Pretty much what Matt described up above.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (4/29/2010)


    Logically the data is sorted in precise order, meaning that the index knows exactly which comes next in sequence. But the storage mechanism may have the data stored in an order that doesn't match the logical order of the index. Pretty much what Matt described up above.

    But for a clustered index the data is as far as I know stored sorted within each page. Or am I mistaken ?

  • Rows are not necessarily sorted within a page, but there's a slot index that defines the order that they should be in. The pages however are not necessarily stored in the physical order of the clustering key. Logically ordered, yes.

    Besides, without an order by, there's no obligation for the query processor to read the data ordered by the cluster, nor is there any obligation for it to maintain that order, and there are several query operators that change row order.

    Bottom line, no ORDER BY, no guarantee of order of returned data.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It's down to page splitting as inserts and updates occur, the data may not fit on a given page. At that point the page is split. Half the data remains on the original page. Half the data moves to a new page. Because SQL Server is not going to rearrange all the pages every time a page split occurs (thank the gods), the allocation of the page can, and and probably will, go to somewhere else other than the last page in the list. In addition to worrying about page splits and their affect on the stored data, within the page, new data is added to the end of page, even though it may logically be higher in the hiearchy than the other data in the page. Again, SQL Server doesn't rearrange everything because of these inserts. Instead if moves around just the stuff it has to, the information on the page, the information between pages and the information on parent pages that point down to the appropriate locations.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • But... almost forgot this, part of why you use monotonically increasing values as clustered keys so often (or identities defaulted to the primary key) is because inserts always occur at the end of the chain, so you don't see page splits nearly as often.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (4/29/2010)


    But... almost forgot this, part of why you use monotonically increasing values as clustered keys so often (or identities defaulted to the primary key) is because inserts always occur at the end of the chain, so you don't see page splits nearly as often.

    However updates that grow the row can still cause splits if the fill factor was too high. There shouldn't be as many as say a cluster on a random guid, but there could still be lots of splits.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply