Sequential Ordering of a Column

Question

Post reply

Sequential Ordering of a Column

slserra

SSC Enthusiast

Points: 196
More actions
April 17, 2007 at 5:41 pm

#173724

I have a table that contains a SortOrder field that allows the user to re-arrange the order in which items are returned by the query.
tbl_Names
Name   SortOrder
------ ---------
Jeff          1
Tom         2
Larry        5
Emily       11
The gaps in the Sort order are caused by records being deleted.
How can I write a query that renumbers the SortOrder column and preserves the original order. The result of the Update should be:
Name   SortOrder
------ ---------
Jeff          1
Tom         2
Larry        3
Emily        4

I apologize if this is a trivial question. I cant seem to get my arms around it.

Thanks,
Steve

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Koji Matsumura SSCrazy Points: 2241 More actions · Answer 1

UPDATE A

SET A.SortOrder =

(SELECT COUNT(*) FROM tbl_Names Z WHERE Z.SortOrder <= A.SortOrder)

FROM tbl_Names A

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 2

Dont worry... it's not a trivial question...

There's a half dozen or more ways to do such a thing... Koji posted one of them... but, beware... Koji's code relies on a "triangular join" which is a bit more than half of a full cross-join... it will simply "die" after you get about 10,000 rows in the table.

So, my question would be, how many rows are in the table you need to do this on? Also, why do you need to do this? An INT column will handle over 2 billion numbers... why are you trying to recover the missing numbers?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 3

Make use of the new ROW_NUMBER() function available in SQL Server 2005.

And depending on your business rules, RANK() or DENSE_RANK() would be helpful too.

N 56°04'39.16"
E 12°55'05.25"

Koji Matsumura SSCrazy Points: 2241 More actions · Answer 4

Hi, Jeff,

I would like to study on "triangular join."

Would you point me to right direction?

Where can get some information?

slserra SSC Enthusiast Points: 196 More actions · Answer 5

'So, my question would be, how many rows are in the table you need to do this on? Also, why do you need to do this? An INT column will handle over 2 billion numbers... why are you trying to recover the missing numbers?'

There are over 100k rows in the table but the reordering just acts upon a subset of these at any one time. No more than 1000 would be in any subset. The reason for this renumbering is that the application allows a user to insert records before or after the specified record and the current application logic is all done based on ordinal position. So to insert a new record before record number 3 requires all records greater than 3 to be shifted down one and the new record to be inserted as SortOrder 3.

Can you give me an example of another type of solution to this so I can study the various approaches? Thanks!

Steve

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 6

DECLARE

@Sample TABLE (Name VARCHAR(10), SortOrder INT)

INSERT

@Sample

SELECT 'Jeff', 1 UNION ALL

SELECT 'Tom', 2 UNION ALL

SELECT 'Larry', 5 UNION ALL

SELECT 'Emily', 11

SELECT

* FROM @Sample ORDER BY SortOrder

UPDATE s

SET s.SortOrder = x.RecID

FROM @Sample AS s

INNER JOIN (

SELECT Name,

SortOrder,

ROW_NUMBER() OVER (ORDER BY SortOrder) AS RecID

FROM @Sample

) AS x ON x.Name = s.Name AND x.SortOrder = s.SortOrder

SELECT

* FROM @Sample ORDER BY SortOrder

N 56°04'39.16"
E 12°55'05.25"

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 7

Not sure where to look, Koji... it's mostly personal experience. Like I said, it's really half a cross-join... I guess my suggestion would to be to readup on cross-joins and correlated sub-queries.

If you don't come up with anything worth while on those two searches, post back and I'll write something up...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 8

heh... I guess I'm just gonna have to break down and upgrade one of these near days...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 9

A simpler method!

DECLARE

@Sample TABLE (Name VARCHAR(10), SortOrder INT)

INSERT

@Sample

SELECT 'Jeff', 1 UNION ALL

SELECT 'Tom', 2 UNION ALL

SELECT 'Larry', 5 UNION ALL

SELECT 'Emily', 11

SELECT

* FROM @Sample ORDER BY SortOrder

UPDATE

s

SET s.SortOrder = s.RecID

FROM (

SELECT SortOrder,

ROW_NUMBER() OVER (ORDER BY SortOrder) AS RecID

FROM @Sample

) AS s

SELECT * FROM @Sample ORDER BY SortOrder

N 56°04'39.16"
E 12°55'05.25"

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 10

Hi there, Koji,

Got your private message and thought I'd post my reply here so that others can see it, as well...

Koji,

There's not much to a Triangular Join (least, not up front)... like I said, it's a bit more than half a cross join...

Lets say you have two tables, A and B, each with 4 rows numbered 1 through 4 like this...

CREATE TABLE TableA (RowNum INT)

INSERT INTO TableA (RowNum)

SELECT 1 UNION ALL

SELECT 2 UNION ALL

SELECT 3 UNION ALL

SELECT 4

CREATE TABLE TableB (RowNum INT)

INSERT INTO TableB (RowNum)

SELECT 1 UNION ALL

SELECT 2 UNION ALL

SELECT 3 UNION ALL

SELECT 4

We can create an intentional cross-join between the two tables that will return every combination of A and B using either of the two follow code snippets...

--===== A cross-join that produces 16 combinations

SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

FROM TableA a,

TableB b

--===== Same cross-join using new ANSI syntax produces 16 combinations

SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

FROM TableA a

CROSS JOIN

TableB b

The results of both cross-joins form a result set that can be depicted much as you would a simple multiplication table (with 1 axis backwards from the other) like this... (blue area identifies all returns from the cross-join)... this is why the formula for the number of internally spawned rows for a cross-join is simply X²for equal length tables or A*B for unequal length tables...

TableB

4

3

2

1

TableA

1

1,4

1,3

1,2

1,1

2

2,4

2,3

2,2

2,1

3

3,4

3,3

3,2

3,1

4

4,4

4,3

4,2

4,1

A Triangular Join is nothing more than about half of a Cross-Join. An example using the two tables we created would be...

--===== A triangular-join that produces 10 combinations

SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

FROM TableA a,

TableB b

WHERE a.RowNum <= b.RowNum

...and the result set can be represented as in the blue area of the following graphic... notice that it forms a triangle and that's why these are called "Triangular Joins". The formula for the number of interal rows spawned for this type ( <= ) of triangular join is (X²+X)/2 (the old "sum of the sequential numbers formula")...

TableB

4

3

2

1

TableA

1

1,4

1,3

1,2

1,1

2

2,4

2,3

2,2

2,1

3

3,4

3,3

3,2

3,1

4

4,4

4,3

4,2

4,1

If your code snippet is sans an equality, like this...

--===== A triangular-join that produces 6 combinations

SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

FROM TableA a,

TableB b

WHERE a.RowNum < b.RowNum

You still end up with a triangular join, but smaller. It's represented by the formula (if tables have equal row counts) of (X²-X)/2

TableB

4

3

2

1

TableA

1

1,4

1,3

1,2

1,1

2

2,4

2,3

2,2

2,1

3

3,4

3,3

3,2

3,1

4

4,4

4,3

4,2

4,1

Last, but not least, you can do the "double" triangular join using a full inequality... this is great for making schedules of teams to play teams and is represented by the following code snippet...

--===== A triangular-join that produces 12 combinations

SELECT a.RowNum AS RowNum_A,b.RowNum AS RowNum_B

FROM TableA a,

TableB b

WHERE a.RowNum <> b.RowNum

And, from the following graphic, you can see why it's called a "double" triangular join. Because of the inequality, it makes good team play schedules because no team ever plays against itself... the formula to calculate rows is simply X²-X...

TableB

4

3

2

1

TableA

1

1,4

1,3

1,2

1,1

2

2,4

2,3

2,2

2,1

3

3,4

3,3

3,2

3,1

4

4,4

4,3

4,2

4,1

Lot's of people use triangular joins to do things like make row numbers in their code. The problem is the number of internal joins the code does to produce the row numbers (some call this a running count)... take the following code, for example...

--===== Create a small demo table

CREATE TABLE TableC (RowNum INT)

INSERT INTO TableC (RowNum)

SELECT 84 UNION ALL

SELECT 5 UNION ALL

SELECT 99 UNION ALL

SELECT 12

... and, let's list all the values with a running count...

--===== List the items with a running count

SELECT RunCount = (SELECT COUNT(*) FROM TableC cs WHERE cs.SomeVal <= c.SomeVal),

c.SomeVal

FROM TableC c

ORDER BY RunCount

... notice the correlated subquery (a query that makes reference outside itself) and the big red triangular join it makes with the outer query? Although the result set that gets displayed only returns 4 rows, the number of INNER rows the query had to touch is (X²+X)/2. In this case, it had to touch 10 rows just to get the RunCount (X=#of rows in table which is 4 for this example). Check the ACTUAL (not estimated) execution plan... you'll see I'm right... then, it also has to touch the normal 4 rows to make the return. So, the real formula for the number of rows touched is actually ((X²+X)/2) + X.... or 14 rows just to return 4. Let's see what happens when we try that on a table of 10,000 rows...

((X²+X)/2)+X = ((10,000²+10,000)/2)+10,000 = 50,015,000 internal rows

Ya see that big ol' number of internal rows touched? That's right, it has to look at over 50 MILLION rows to generate a stupid little running count for 10,000 rows. Let's try it for 20,000 rows...

((X²+X)/2)+X = ((20,000²+20,000)/2)+20,000 = 200,030,000 internal rows

That's right... it's not twice as bad... it's almost 4 times worse!!!

THAT's why I say, "Be careful, it has a triangular join in it" and "Be careful, there's a correlated subquery in that" and "You've gotta be kidding me!!!".

So, with some restraint and the right criteria, triangular joins can be used for some pretty remarkable things... make finite schedules... do high speed dupe checks... etc. But, you've gotta be really careful... they can bring a CPU and Disk System right to it's knees if you're not.

What else would you like to know about trigangular joins? For more information, I recommend the following GOOGLE search... WHAT IS A "TRIANGULAR JOIN" SQL

You'll find that I'm not the only one that says "Be careful when using triangular joins".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Carl Federl One Orange Chip Points: 25384 More actions · Answer 11

The physically most efficient solution is to use a temporary table with an identity column to be new sort order.

IF OBJECT_ID('tempdb..#Sample') is not null drop table #Sample

IF OBJECT_ID('tempdb..#OrderNew') is not null drop table #OrderNew

go

-- Generate test data

select id as PK

, id as SortOrder

into #Sample

from master.dbo.sysobjects

create unique clustered index Sample_P on #Sample (PK)

go

-- Temporary table with new sortOrder

create table #OrderNew

( PK integer not null

, SortOrderOldinteger

, SortOrderNewinteger IDENTITY(1,1)

)

BEGIN TRAN

-- populate new sortOrder

-- Lock the source table to prevent any changes

insert into #OrderNew

( PK , SortOrderOld )

select PK , SortOrder

from #Sample with (TABLOCKX)

order by SortOrder

go

create unique clustered index OrderNew_P on #OrderNew (PK)

go

update #Sample

setSortOrder = ( #OrderNew.SortOrderNew * 2 ) -- allow gaps

from#OrderNew

where#OrderNew.PK = #Sample.Pk

go

commit

go

select * from #Sample

SQL = Scarcely Qualifies as a Language

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 12

That's probably right in SQL Server 2000... dunno about 2005 with that nice new RowNum and PARTITION OVER statement.

I do gotta ask though, why on Earth are you declaring an explicit transaction on a temporary table?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Carl Federl One Orange Chip Points: 25384 More actions · Answer 13

The #Sample is for demonstration purposes only but when it is a base table, ANY reordering process does not work should any other connection perform an insert, update or delete on the table.

SQL = Scarcely Qualifies as a Language

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 14

Ok... let's say you do want an explicit transaction for that... where's the error check and rollback code that should go with it?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)