April 17, 2007 at 5:41 pm
I have a table that contains a SortOrder field that allows the user to re-arrange the order in which items are returned by the query.
tbl_Names
Name SortOrder
------ ---------
Jeff 1
Tom 2
Larry 5
Emily 11
The gaps in the Sort order are caused by records being deleted.
How can I write a query that renumbers the SortOrder column and preserves the original order. The result of the Update should be:
Name SortOrder
------ ---------
Jeff 1
Tom 2
Larry 3
Emily 4
I apologize if this is a trivial question. I cant seem to get my arms around it.
Thanks,
Steve
April 17, 2007 at 7:31 pm
UPDATE A
SET A.SortOrder =
(SELECT COUNT(*) FROM tbl_Names Z WHERE Z.SortOrder <= A.SortOrder)
FROM tbl_Names A
April 17, 2007 at 10:32 pm
Dont worry... it's not a trivial question...
There's a half dozen or more ways to do such a thing... Koji posted one of them... but, beware... Koji's code relies on a "triangular join" which is a bit more than half of a full cross-join... it will simply "die" after you get about 10,000 rows in the table.
So, my question would be, how many rows are in the table you need to do this on? Also, why do you need to do this? An INT column will handle over 2 billion numbers... why are you trying to recover the missing numbers?
--Jeff Moden
Change is inevitable... Change for the better is not.
April 17, 2007 at 10:50 pm
Make use of the new ROW_NUMBER() function available in SQL Server 2005.
And depending on your business rules, RANK() or DENSE_RANK() would be helpful too.
N 56°04'39.16"
E 12°55'05.25"
April 17, 2007 at 10:57 pm
Hi, Jeff,
I would like to study on "triangular join."
Would you point me to right direction?
Where can get some information?
April 18, 2007 at 9:05 am
'So, my question would be, how many rows are in the table you need to do this on? Also, why do you need to do this? An INT column will handle over 2 billion numbers... why are you trying to recover the missing numbers?'
There are over 100k rows in the table but the reordering just acts upon a subset of these at any one time. No more than 1000 would be in any subset. The reason for this renumbering is that the application allows a user to insert records before or after the specified record and the current application logic is all done based on ordinal position. So to insert a new record before record number 3 requires all records greater than 3 to be shifted down one and the new record to be inserted as SortOrder 3.
Can you give me an example of another type of solution to this so I can study the various approaches? Thanks!
Steve
April 18, 2007 at 11:03 pm
DECLARE
@Sample TABLE (Name VARCHAR(10), SortOrder INT)
INSERT
@Sample
SELECT 'Jeff', 1 UNION ALL
SELECT 'Tom', 2 UNION ALL
SELECT 'Larry', 5 UNION ALL
SELECT 'Emily', 11
SELECT
* FROM @Sample ORDER BY SortOrder
UPDATE s
SET s.SortOrder = x.RecID
FROM @Sample AS s
INNER JOIN (
SELECT Name,
SortOrder,
ROW_NUMBER() OVER (ORDER BY SortOrder) AS RecID
FROM @Sample
) AS x ON x.Name = s.Name AND x.SortOrder = s.SortOrder
SELECT
* FROM @Sample ORDER BY SortOrder
N 56°04'39.16"
E 12°55'05.25"
April 19, 2007 at 8:20 am
Not sure where to look, Koji... it's mostly personal experience. Like I said, it's really half a cross-join... I guess my suggestion would to be to readup on cross-joins and correlated sub-queries.
If you don't come up with anything worth while on those two searches, post back and I'll write something up...
--Jeff Moden
Change is inevitable... Change for the better is not.
April 19, 2007 at 8:22 am
heh... I guess I'm just gonna have to break down and upgrade one of these near days...
--Jeff Moden
Change is inevitable... Change for the better is not.
April 20, 2007 at 12:12 am
A simpler method!
DECLARE
@Sample TABLE (Name VARCHAR(10), SortOrder INT)
INSERT
@Sample
SELECT 'Jeff', 1 UNION ALL
SELECT 'Tom', 2 UNION ALL
SELECT 'Larry', 5 UNION ALL
SELECT 'Emily', 11
SELECT
* FROM @Sample ORDER BY SortOrder
UPDATE
s
SET s.SortOrder = s.RecID
FROM (
SELECT SortOrder,
ROW_NUMBER() OVER (ORDER BY SortOrder) AS RecID
FROM @Sample
) AS s
SELECT * FROM @Sample ORDER BY SortOrder
N 56°04'39.16"
E 12°55'05.25"
April 21, 2007 at 11:18 pm
Hi there, Koji,
Got your private message and thought I'd post my reply here so that others can see it, as well...
Koji,
There's not much to a Triangular Join (least, not up front)... like I said, it's a bit more than half a cross join...
Lets say you have two tables, A and B, each with 4 rows numbered 1 through 4 like this...
CREATE TABLE TableA (RowNum INT)
INSERT INTO TableA (RowNum)
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4
CREATE TABLE TableB (RowNum INT)
INSERT INTO TableB (RowNum)
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4
We can create an intentional cross-join between the two tables that will return every combination of A and B using either of the two follow code snippets...
--===== A cross-join that produces 16 combinations
SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B
FROM TableA a,
TableB b
--===== Same cross-join using new ANSI syntax produces 16 combinations
SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B
FROM TableA a
CROSS JOIN
TableB b
The results of both cross-joins form a result set that can be depicted much as you would a simple multiplication table (with 1 axis backwards from the other) like this... (blue area identifies all returns from the cross-join)... this is why the formula for the number of internally spawned rows for a cross-join is simply X2 for equal length tables or A*B for unequal length tables...
TableB |
4
3
2
1
TableA
1
1,4
1,3
1,2
1,1
2
2,4
2,3
2,2
2,1
3
3,4
3,3
3,2
3,1
4
4,4
4,3
4,2
4,1
A Triangular Join is nothing more than about half of a Cross-Join. An example using the two tables we created would be...
--===== A triangular-join that produces 10 combinations
SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B
FROM TableA a,
TableB b
WHERE a.RowNum <= b.RowNum
...and the result set can be represented as in the blue area of the following graphic... notice that it forms a triangle and that's why these are called "Triangular Joins". The formula for the number of interal rows spawned for this type ( <= ) of triangular join is (X2+X)/2 (the old "sum of the sequential numbers formula")...
TableB |
4
3
2
1
TableA
1
1,4
1,3
1,2
1,1
2
2,4
2,3
2,2
2,1
3
3,4
3,3
3,2
3,1
4
4,4
4,3
4,2
4,1
If your code snippet is sans an equality, like this...
--===== A triangular-join that produces 6 combinations
SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B
FROM TableA a,
TableB b
WHERE a.RowNum < b.RowNum
You still end up with a triangular join, but smaller. It's represented by the formula (if tables have equal row counts) of (X2-X)/2
TableB |
4
3
2
1
TableA
1
1,4
1,3
1,2
1,1
2
2,4
2,3
2,2
2,1
3
3,4
3,3
3,2
3,1
4
4,4
4,3
4,2
4,1
Last, but not least, you can do the "double" triangular join using a full inequality... this is great for making schedules of teams to play teams and is represented by the following code snippet...
--===== A triangular-join that produces 12 combinations
SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B
FROM TableA a,
TableB b
WHERE a.RowNum <> b.RowNum
And, from the following graphic, you can see why it's called a "double" triangular join. Because of the inequality, it makes good team play schedules because no team ever plays against itself... the formula to calculate rows is simply X2-X...
TableB |
4
3
2
1
TableA
1
1,4
1,3
1,2
1,1
2
2,4
2,3
2,2
2,1
3
3,4
3,3
3,2
3,1
4
4,4
4,3
4,2
4,1
Lot's of people use triangular joins to do things like make row numbers in their code. The problem is the number of internal joins the code does to produce the row numbers (some call this a running count)... take the following code, for example...
--===== Create a small demo table
CREATE TABLE TableC (RowNum INT)
INSERT INTO TableC (RowNum)
SELECT 84 UNION ALL
SELECT 5 UNION ALL
SELECT 99 UNION ALL
SELECT 12
... and, let's list all the values with a running count...
--===== List the items with a running count
SELECT RunCount = (SELECT COUNT(*) FROM TableC cs WHERE cs.SomeVal <= c.SomeVal),
c.SomeVal
FROM TableC c
ORDER BY RunCount
... notice the correlated subquery (a query that makes reference outside itself) and the big red triangular join it makes with the outer query? Although the result set that gets displayed only returns 4 rows, the number of INNER rows the query had to touch is (X2+X)/2. In this case, it had to touch 10 rows just to get the RunCount (X=#of rows in table which is 4 for this example). Check the ACTUAL (not estimated) execution plan... you'll see I'm right... then, it also has to touch the normal 4 rows to make the return. So, the real formula for the number of rows touched is actually ((X2+X)/2) + X.... or 14 rows just to return 4. Let's see what happens when we try that on a table of 10,000 rows...
((X2+X)/2)+X = ((10,0002+10,000)/2)+10,000 = 50,015,000 internal rows
Ya see that big ol' number of internal rows touched? That's right, it has to look at over 50 MILLION rows to generate a stupid little running count for 10,000 rows. Let's try it for 20,000 rows...
((X2+X)/2)+X = ((20,0002+20,000)/2)+20,000 = 200,030,000 internal rows
That's right... it's not twice as bad... it's almost 4 times worse!!!
THAT's why I say, "Be careful, it has a triangular join in it" and "Be careful, there's a correlated subquery in that" and "You've gotta be kidding me!!!".
So, with some restraint and the right criteria, triangular joins can be used for some pretty remarkable things... make finite schedules... do high speed dupe checks... etc. But, you've gotta be really careful... they can bring a CPU and Disk System right to it's knees if you're not.
What else would you like to know about trigangular joins? For more information, I recommend the following GOOGLE search... WHAT IS A "TRIANGULAR JOIN" SQL
You'll find that I'm not the only one that says "Be careful when using triangular joins".
--Jeff Moden
Change is inevitable... Change for the better is not.
April 22, 2007 at 9:58 am
The physically most efficient solution is to use a temporary table with an identity column to be new sort order.
IF OBJECT_ID('tempdb..#Sample') is not null drop table #Sample
IF OBJECT_ID('tempdb..#OrderNew') is not null drop table #OrderNew
go
-- Generate test data
select id as PK
, id as SortOrder
into #Sample
from master.dbo.sysobjects
create unique clustered index Sample_P on #Sample (PK)
go
-- Temporary table with new sortOrder
create table #OrderNew
( PK integer not null
, SortOrderOldinteger
, SortOrderNewinteger IDENTITY(1,1)
)
BEGIN TRAN
-- populate new sortOrder
-- Lock the source table to prevent any changes
insert into #OrderNew
( PK , SortOrderOld )
select PK , SortOrder
from #Sample with (TABLOCKX)
order by SortOrder
go
create unique clustered index OrderNew_P on #OrderNew (PK)
go
update #Sample
setSortOrder = ( #OrderNew.SortOrderNew * 2 ) -- allow gaps
from#OrderNew
where#OrderNew.PK = #Sample.Pk
go
commit
go
select * from #Sample
SQL = Scarcely Qualifies as a Language
April 22, 2007 at 10:22 am
That's probably right in SQL Server 2000... dunno about 2005 with that nice new RowNum and PARTITION OVER statement.
I do gotta ask though, why on Earth are you declaring an explicit transaction on a temporary table?
--Jeff Moden
Change is inevitable... Change for the better is not.
April 22, 2007 at 10:27 am
The #Sample is for demonstration purposes only but when it is a base table, ANY reordering process does not work should any other connection perform an insert, update or delete on the table.
SQL = Scarcely Qualifies as a Language
April 22, 2007 at 1:28 pm
Ok... let's say you do want an explicit transaction for that... where's the error check and rollback code that should go with it?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply