May 5, 2013 at 6:31 pm
I am having problems ordering records to go into another table. The following is my code:
USE
EURUSD
GO
SELECT *
INTO NEWTABLE
FROM OLDTABLE
ORDER BY RECORD
After processing into another table I still have a wrong starting record and gaps in the continuous records. When I process a SELECT TOP 1000 query I am expecting to see the first record proceeding through the 1000th. record. Instead I am seeing record 8625 forward. How do I create a query to create a new table with the proper sequential ordering?
May 5, 2013 at 7:03 pm
mcertini (5/5/2013)
I am having problems ordering records to go into another table. The following is my code:
USE
EURUSD
GO
SELECT *
INTO NEWTABLE
FROM OLDTABLE
ORDER BY RECORD
After processing into another table I still have a wrong starting record and gaps in the continuous records. When I process a SELECT TOP 1000 query I am expecting to see the first record proceeding through the 1000th. record. Instead I am seeing record 8625 forward. How do I create a query to create a new table with the proper sequential ordering?
This will create a new column using the special system IDENTITY function which works when doing SELECT INTO. Your rows will maintain their original keys but will now have a new sequentially numbered column.
SELECT
IDENTITY(INT,1,1) AS ID,
OLD.*
INTO NEWTABLE AS NEW
FROM OLDTABLE AS OLD
ORDER BY RECORD
Second option...this will also create a sequentially ordered column based on your sort criteria which gives you more flexibility on the sorting.
SELECT
ROW_NUMBER() OVER (ORDER BY OLD.RECORD, OLD.COL1, OLD.COL2 [etc]) AS ID,
OLD.*
INTO NEWTABLE AS NEW
FROM OLDTABLE AS OLD
There's no guarantee that when you insert large amounts of data into a table (especially an existing table) that the rows will be in physical order. If you create a pseudo-key like this for ordering you will still need to apply an ORDER BY to be sure of getting results back in your desired order.
If you really need the rows in physical order in the file you can drop all indexes and re-create a clustered index on the columns to be ordered. The index process will then by definition re-order the data physically based on whatever column definitions you define for the clustered index.
If for some reason your table is using uniqueidentifiers (GUIDs) as a sort key, there is a new SequentialGUID datatype available which can be used to keep your rows in sequential order. Otherwise, trying to sort based on a GUID's value is somewhat pointless.
May 5, 2013 at 7:23 pm
Steven,
Thank you for the reply it is much appreciated.
Mike
May 6, 2013 at 7:18 am
Just clarifying the post above, because we need to do this many times in our company...Isn't it true that the option above - an Identity column - only will work for ordering on small sets of data because of paging? I am not sure if this is what you are trying to say above. So, the only way to get a truly ordered data set is to use the clustered index? Will RowNumber() work on large sets also? Thanks.
Kris Robinett
May 6, 2013 at 7:40 am
I have to ask why order matters for the records stored in the table. SQL Server makes no guarantee about that. The only reliable way I am aware of to present records in a desired sequence is to use an order by clause when you select the records. I would never rely on the storage order of data in a SQL Server table.
This behavior is different in the AS400 world, where you can present data in arrival order, and influence that order with an order by in the table load process.
May 6, 2013 at 7:47 am
Many times we need to rely on records being next to each other to determine next Value ordered by a Date, let's say. So, if the identity column is called "Ref". We need to rely on Ref = Ref-1. It doesn't necessarily have to display in that order, but the Refs have to be in order based on the fields we determine. Here's a quick example of something we would do. The Clustered Index would be on Vendor, Item, and ChangeDate for this example.
Update tmpCost
set cur.NextCost = nxt.ListCost
from tmpCost cur
inner join tmpCost nxt
on cur.Vendor = nxt.Vendor
and cur.Item = nxt.Item
and cur.Ref + 1 = nxt.Ref
Thanks.
Kris Robinett
May 6, 2013 at 8:46 am
Ross McMicken (5/6/2013)
The only reliable way I am aware of to present records in a desired sequence is to use an order by clause when you select the records. I would never rely on the storage order of data in a SQL Server table.
That is absolutely correct 100%. Tables by their nature have no order. They are unordered sets. The only way to ensure order is with an order by clause. Storage order means absolutely nothing.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
May 6, 2013 at 9:12 am
Kricky (5/6/2013)
Many times we need to rely on records being next to each other to determine next Value ordered by a Date, let's say. So, if the identity column is called "Ref". We need to rely on Ref = Ref-1. It doesn't necessarily have to display in that order, but the Refs have to be in order based on the fields we determine. Here's a quick example of something we would do. The Clustered Index would be on Vendor, Item, and ChangeDate for this example.Update tmpCost
set cur.NextCost = nxt.ListCost
from tmpCost cur
inner join tmpCost nxt
on cur.Vendor = nxt.Vendor
and cur.Item = nxt.Item
and cur.Ref + 1 = nxt.Ref
Thanks.
This code breaks down if you have ever inserted rows out of sequence of the clustered key, though. For example, if you insert 10 rows for Vendor A/Item 1/ChangeDate 1-10, 10 rows for Vendor A/Item 2/ChangeDate 1-10, 10 rows for Vendor B/Item 1/ChangeDate 1-10, then 10 more rows for Vendor A/Item 1/ChangeDate 11-20, Vendor A/Item 1 will have rows with Ref 1-10 and 31-40 (the value of an IDENTITY column is incremented each time a row is inserted and doesn't depend on the clustered key order). Your update statement will update the rows with Ref 1-9 and Ref 31-39, but not Ref 10 because there is no row that satisfies the join condition cur.Ref + 1 = nxt.Ref.
Even with a clustered index, which imposes the logical order of the clustered key on the page chain holding the data for the table (pages in the page chain and rows on each page are ordered according to the clustered key), SQL Server can and will return rows in any order unless constrained by an ORDER BY clause. Writing DML that depends on row ordering to return the correct result without specifying an ORDER BY clause is a bad idea, period.
Jason Wolfkill
May 6, 2013 at 12:49 pm
If you create Clustered Index on RECORD column of new table. It sorts and stores data in the clustered index key.
May 6, 2013 at 1:17 pm
Neeraj Dwivedi (5/6/2013)
If you create Clustered Index on RECORD column of new table. It sorts and stores data in the clustered index key.
That is true for storage (well most of the time). But it is absolutely not true for retrieval. If you want rows in a certain order from a select there is ONLY 1 way do ensure the order, use an order by clause.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
May 6, 2013 at 1:20 pm
Neeraj Dwivedi (5/6/2013)
If you create Clustered Index on RECORD column of new table. It sorts and stores data in the clustered index key.
the clustered index, just because it is sorted, never, ever guarantees a specific, repeatable order of the data will be returned. it might work consistently coincidentally for a while, but a WHERE statement, other indexes, parallel processing, reindexing, row locks,and many other things can affect the ordering of the data in an unordered dataset.
ONLY an explicit Order By [ColumnName] will return consistent results in SQL server.
Lowell
May 6, 2013 at 1:28 pm
Thank you both Lowell & Sean for pointing my mistake out, I was thinking about the storing data & haven't read the post properly.
May 6, 2013 at 3:50 pm
Just to clarify: with ROW_NUMBER(), you don't have to ORDER BY all the columns you want to SELECT; just one ORDER BY column is fine if that's all you need. I don't see any real advantage of ROW_NUMBER() over identity in this case; seems just a matter of your own preference.
SELECT
ROW_NUMBER() OVER (ORDER BY OLD.[RECORD]) AS ID,
OLD.*
INTO NEWTABLE
FROM OLDTABLE AS OLD
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply