Multi Column Sort

Question

Multi Column Sort

Viewing 15 posts - 91 through 105 (of 180 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 1

Christian Buettner (4/7/2009)
Hi Chris,
Very fancy stuff that you posted. Wasn't even aware you could do such fancy stuff in an ORDER BY:-).
Actually I did want to tell you that your queries did not work, but then I realized that mine did not either. Reason: The data in table #test is flawed.
SELECT A.id, b.id, A.col2, B.col2, a.col3, B.col3 FROM #test A, #test b
WHERE A.col2 > b.col2 AND a.col3 < B.col3
No matter how you sort, either col2 or col3 will remain unsorted (could apply to other combinations as well).

Neither was I mate until I saw something really wacky in an order by here on the forum some months ago!

Here's a data set which passes your test:

USE Utilities

GO

DROP TABLE #qtest

DROP TABLE #test

CREATE TABLE #qtest (id int, col1 numeric(6,2), col2 numeric(6,2), col3 numeric(6,2), col4 numeric(6,2), col5 numeric(6,2), col6 numeric(6,2), col7 numeric(6,2), col8 numeric(6,2))

INSERT INTO #qtest (id, col1, col2, col3, col4, col5, col6, col7, col8)

SELECT 1, 2.35, 3.01, 3.49, 4.25, 4.79, 5.36, 5.82, 6.31

UNION SELECT 2, null, 3.59, 4.32, null, 5.21, null, null, 7.12

UNION SELECT 3, NULL, null, null, 5.23, 5.41, 6.33, 6.89, 7.99

UNION SELECT 4, null, 5.46, 5.31, 5.92, null, 6.87, 7.34, null

UNION SELECT 5, 5.12, 5.66, 6.31, 6.92, null, 7.87, 8.34, null

--- 25,000 rows

SELECT ROW_NUMBER() OVER (ORDER BY Batch, id) AS [id],

col1, col2, col3, col4, col5, col6, col7, col8

INTO #test

FROM (SELECT CAST(1 AS INT) AS [Batch], *

FROM #qtest

UNION ALL

SELECT 1+(number/4), id, col1+number, col2+number, col3+number, col4+number, col5+number, col6+number, col7+number, col8+number

FROM #qtest, Utilities.dbo.Numbers

WHERE number%4 = 0 AND number <= (20000-4) -- 25,000

) d

/****** Object: Index [seq] Script Date: 04/04/2009 14:02:11 ******/

CREATE CLUSTERED INDEX [seqm] ON #test

([col1] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO

SELECT A.id, b.id, A.col2, B.col2, a.col3, B.col3 FROM #test A, #test b

WHERE A.col2 > b.col2 AND a.col3 < B.col3

I'll run everything through later this morning.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

GSquared SSC Guru Points: 260824 More actions · Answer 2

You're right. I was in a rush yesterday, and forgot to take into account the change from null to zero would affect the join math.

So far as I can tell, this version gives correct sorting results:

update #test

set col1 = isnull(col1,0),

col2 = isnull(col2,0),

col3 = isnull(col3,0),

col4 = isnull(col4,0),

col5 = isnull(col5,0),

col6 = isnull(col6,0),

col7 = isnull(col7,0),

col8 = isnull(col8,0)

;with CTE (CTEID, Seq1) as

(select t1.id, count(*)

from #test t1

inner join #test t2

on t1.col1 >= t2.col1

and t1.col2 >= t2.col2

and t1.col3 >= t2.col3

and t1.col4 >= t2.col4

and t1.col5 >= t2.col5

and t1.col6 >= t2.col6

and t1.col7 >= t2.col7

and t1.col8 >= t2.col8

group by t1.id)

select #test.*

from #test

left outer join CTE

on id = cteid

order by seq1;

On 39968 rows (generated using 10k rows from the Numbers table), it took 4:27 to finish on my machine.

The original spec doesn't care about the null vs zero thing. As need be, you could add an insert into a temp table, do the update to zero on that, sort, then return the final result with a select that either uses the original data or a NullIf on the temp table. You'll get the right output with any of those.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

Looks like it needs a little more tweaking - first column is correct sequential order:

12.353.013.494.254.795.365.826.31

20.003.594.320.005.210.000.007.12

30.000.000.005.235.416.336.897.99

40.005.465.315.920.006.877.340.00

90.009.469.319.920.0010.8711.340.00

55.125.666.316.920.007.878.340.00

70.007.598.320.009.210.000.0011.12

80.000.000.009.239.4110.3310.8911.99

130.000.000.0013.2313.4114.3314.8915.99

It's mighty fast though, a little over one minute on this lappy.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

GSquared SSC Guru Points: 260824 More actions · Answer 4

On the same dataset, without updating to zero, the complex Case statement took 8:57 to run.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

GSquared (4/7/2009)
On the same dataset, without updating to zero, the complex Case statement took 8:57 to run.

I'm missing something and can't figure out what it is! To simplify, here's a dataset generator for 15 rows:

USE Utilities

GO

DROP TABLE #qtest

DROP TABLE #test

CREATE TABLE #qtest (id int, col1 numeric(6,2), col2 numeric(6,2), col3 numeric(6,2), col4 numeric(6,2), col5 numeric(6,2), col6 numeric(6,2), col7 numeric(6,2), col8 numeric(6,2))

INSERT INTO #qtest (id, col1, col2, col3, col4, col5, col6, col7, col8)

SELECT 1, 2.35, 3.01, 3.49, 4.25, 4.79, 5.36, 5.82, 6.31

UNION SELECT 2, null, 3.59, 4.32, null, 5.21, null, null, 7.12

UNION SELECT 3, NULL, null, null, 5.23, 5.41, 6.33, 6.89, 7.99

UNION SELECT 4, null, 5.31, 5.46, 5.92, null, 6.87, 7.34, null

UNION SELECT 5, 5.12, 5.66, 6.31, 6.92, null, 7.87, 8.34, null

--- 25,000 rows

SELECT ROW_NUMBER() OVER (ORDER BY Batch, id) AS [id],

col1, col2, col3, col4, col5, col6, col7, col8

INTO #test

FROM (SELECT CAST(1 AS INT) AS [Batch], *

FROM #qtest

UNION ALL

SELECT 1+(number/4), id, col1+number, col2+number, col3+number, col4+number, col5+number, col6+number, col7+number, col8+number

FROM #qtest, Utilities.dbo.Numbers

WHERE number%4 = 0 AND number < 12 --<= (20000-4) -- 25,000

) d

/****** Object: Index [seq] Script Date: 04/04/2009 14:02:11 ******/

CREATE CLUSTERED INDEX [seqm] ON #test

([col1] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO

Here's the data it generates, which looks ok by eye:

12.353.013.494.254.795.365.826.31

20.003.594.320.005.210.000.007.12

30.000.000.005.235.416.336.897.99

40.005.315.465.920.006.877.340.00

55.125.666.316.920.007.878.340.00

66.357.017.498.258.799.369.8210.31

70.007.598.320.009.210.000.0011.12

80.000.000.009.239.4110.3310.8911.99

90.009.319.469.920.0010.8711.340.00

109.129.6610.3110.920.0011.8712.340.00

1110.3511.0111.4912.2512.7913.3613.8214.31

120.0011.5912.320.0013.210.000.0015.12

130.000.000.0013.2313.4114.3314.8915.99

140.0013.3113.4613.920.0014.8715.340.00

1513.1213.6614.3114.920.0015.8716.340.00

Here's the result of GSquared's query:

12.353.013.494.254.795.365.826.31

20.003.594.320.005.210.000.007.12

30.000.000.005.235.416.336.897.99

40.005.315.465.920.006.877.340.00

55.125.666.316.920.007.878.340.00

70.007.598.320.009.210.000.0011.12

80.000.000.009.239.4110.3310.8911.99

90.009.319.469.920.0010.8711.340.00

120.0011.5912.320.0013.210.000.0015.12

130.000.000.0013.2313.4114.3314.8915.99

140.0013.3113.4613.920.0014.8715.340.00

109.129.6610.3110.920.0011.8712.340.00

66.357.017.498.258.799.369.8210.31

1513.1213.6614.3114.920.0015.8716.340.00

1110.3511.0111.4912.2512.7913.3613.8214.31

:ermm:

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

GSquared SSC Guru Points: 260824 More actions · Answer 6

No, you're not missing anything. That solution just plain doesn't work. I missed a bit when looking at it. That's what I get for rushing.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 7

Guys, It looks like an interesting problem for me and I would like to try my hands on it. But the problem is that I still not completely understood the requirement. What I understood is that the OP wants an ordering of columns from top to bottom moving from column 1 to column 8. Am I correct??

And I am assuming that column values in a row is incremental from left to right??

--Ramesh

Vijaya Kadiyala SSCrazy Eights Points: 9149 More actions · Answer 8

I am going in circles in circles in this thread...:blink:

Thanks -- Vijaya Kadiyala
www.dotnetvj.com
SQL Server Articles For Beginers

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 9

Ramesh (4/7/2009)
Guys, It looks like an interesting problem for me and I would like to try my hands on it. But the problem is that I still not completely understood the requirement. What I understood is that the OP wants an ordering of columns from top to bottom moving from column 1 to column 8. Am I correct??
And I am assuming that column values in a row is incremental from left to right??

That sounds about right, Ramesh. The sample data I posted earlier today will give you a visual.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 10

Chris Morris (4/7/2009)
Ramesh (4/7/2009)
Guys, It looks like an interesting problem for me and I would like to try my hands on it. But the problem is that I still not completely understood the requirement. What I understood is that the OP wants an ordering of columns from top to bottom moving from column 1 to column 8. Am I correct??
And I am assuming that column values in a row is incremental from left to right??
That sounds about right, Ramesh. The sample data I posted earlier today will give you a visual.

Thanks Chris for the sample data.

I've understood the complexity in it and now its my turn to post my first solution (and hopefully the last as well). It took me an hour and a hell lot of brain hammering to product this code.

IF ( OBJECT_ID( 'tempdb..#test' ) IS NOT NULL )

DROP TABLE #test

CREATE TABLE #test

(

id INT NOT NULL PRIMARY KEY CLUSTERED,

col1 NUMERIC(18,2) NULL,

col2 NUMERIC(18,2) NULL,

col3 NUMERIC(18,2) NULL,

col4 NUMERIC(18,2) NULL,

col5 NUMERIC(18,2) NULL,

col6 NUMERIC(18,2) NULL,

col7 NUMERIC(18,2) NULL,

col8 NUMERIC(18,2) NULL

)

; WITH qtest

AS

(

SELECT 1 AS id, 2.35 AS col1, 3.01 AS col2, 3.49 AS col3, 4.25 AS col4, 4.79 AS col5, 5.36 AS col6, 5.82 AS col7, 6.31 AS col8

UNION ALL

SELECT 2, NULL, 3.59, 4.32, NULL, 5.21, NULL, NULL, 7.12

UNION ALL

SELECT 3, NULL, NULL, NULL, 5.23, 5.41, 6.33, 6.89, 7.99

UNION ALL

SELECT 4, NULL, 5.31, 5.46, 5.92, NULL, 6.87, 7.34, NULL

UNION ALL

SELECT 5, 5.12, 5.66, 6.31, 6.92, NULL, 7.87, 8.34, NULL

)

INSERT #test( id, col1, col2, col3, col4, col5, col6, col7, col8 )

SELECT ROW_NUMBER() OVER( ORDER BY Batch, id ) AS [id], col1, col2, col3, col4, col5, col6, col7, col8

FROM (

SELECT CAST( 1 AS INT ) AS [Batch], *

FROM qtest

UNION ALL

SELECT 1 + ( Number / 4 ), id, col1 + Number, col2 + Number, col3 + Number, col4 + Number,

col5 + Number, col6 + Number, col7 + Number, col8 + Number

FROM qtest, dbo.Numbers

WHERE Number % 4 = 0 AND Number < 12 ---<= (20000-4) -- 25,000

) d

-- Expected order of output

SELECT * FROM #test ORDER BY id

-- Unpivot the columns into rows, so that each id will have value of each column in terms of rows

-- note the unpivot operation skips the null valued columns from the resultset

; WITH UnpivotTest

AS

(

SELECT *

FROM #test t

UNPIVOT

(

OrderVal FOR OrderCol IN( [col1], [col2], [col3], [col4], [col5], [col6], [col7], [col8] )

) P

),

-- set the row number of each row partition by each column, this will get you the position of an item (id)

-- in each column, after that we'll get these values for the first 6 rows

-- id, col1, col2, col3, col4

-- 1, 1, 1, 1, 1

-- 2, -, 2, 2, -

-- 3, -, -, -, 2

-- 4, -, 3, 3, 3

-- 5, 2, 4, 4, 4

-- 6, 3, 5, 5, 5

-- After that we are taking the max. of each id and count for each id

-- and ordering the final resultset by id and then by count

VerticalOrderedTest

AS

(

SELECT id, MAX( rn ) AS rn, COUNT( * ) AS cnt

FROM (

SELECT ROW_NUMBER() OVER( PARTITION BY OrderCol ORDER BY ISNULL( NULLIF( OrderVal, 0 ), 9999 ) ) AS rn, id

FROM UnpivotTest

) t

GROUP BY id

)

SELECT v.id, t.col1, t.col2, t.col3, t.col4, t.col5, t.col6, t.col7, t.col8

FROM VerticalOrderedTest v

INNER JOIN #test t ON v.id = t.id

ORDER BY v.rn, v.cnt

--Ramesh

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 11

Can I play too?

DECLARE@Sample TABLE

(

id int primary key clustered,

col1 numeric(6, 2),

col2 numeric(6, 2),

col3 numeric(6, 2),

col4 numeric(6, 2),

col5 numeric(6,2),

col6 numeric(6,2),

col7 numeric(6,2),

col8 numeric(6,2)

)

INSERT@Sample

SELECT1, 2.35, 3.01, 3.49, 4.25, 4.79, 5.36, 5.82, 6.31 UNION ALL

SELECT2, NULL, 3.59, 4.32, NULL, 5.21, NULL, NULL, 7.12 UNION ALL

SELECT3, NULL, NULL, NULL, 5.23, 5.41, 6.33, 6.89, 7.99 UNION ALL

SELECT4, NULL, 5.31, 5.46, 5.92, NULL, 6.87, 7.34, NULL UNION ALL

SELECT5, 5.12, 5.66, 6.31, 6.92, NULL, 7.87, 8.34, NULL

-- Before

SELECT*

FROM@Sample

-- After

SELECTp.ID,

s.col1,

s.col2,

s.col3,

s.col4,

s.col5,

s.col6,

s.col7,

s.col8

FROM(

SELECTu.ID,

u.theValue,

ROW_NUMBER() OVER (PARTITION BY u.ID ORDER BY u.theValue) AS recID

FROM@Sample AS s

UNPIVOT(

theValue

FOR theCol IN (s.col1, s.col2, s.col3, s.col4, s.col5, s.col6, s.col7, s.col8)

) AS u

) AS d

PIVOT(

MAX(d.theValue)

FOR d.recID IN ([1], [2], [3], [4], [5], [6], [7], [8])

) AS p

INNER JOIN@Sample AS s ON s.ID = p.ID

ORDER BYp.[1],

p.[2],

p.[3],

p.[4],

p.[5],

p.[6],

p.[7],

p.[8]

EDIT: Added SQL-script as attachment.

N 56°04'39.16"
E 12°55'05.25"

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

Very elegant! Makes my CASEy stuff look like a dustbin full of spaghetti!

Who's gonna test with 40,000 rows then? 😀

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 13

Peso,

I think the results of your query seems to be incorrect. Here are the results...

-- Expected Results

idcol1col2col3col4col5col6col7col8

12.353.013.494.254.795.365.826.31

2NULL3.594.32NULL5.21NULLNULL7.12

3NULLNULLNULL5.235.416.336.897.99

4NULL5.315.465.92NULL6.877.34NULL

55.125.666.316.92NULL7.878.34NULL

-- Actual Results

IDcol1col2col3col4col5col6col7col8

12.353.013.494.254.795.365.826.31

2NULL3.594.32NULL5.21NULLNULL7.12

55.125.666.316.92NULL7.878.34NULL

3NULLNULLNULL5.235.416.336.897.99

4NULL5.315.465.92NULL6.877.34NULL

--Ramesh

Ramesh Saive SSC-Insane Points: 24275 More actions · Answer 14

Chris Morris (4/8/2009)
Very elegant! Makes my CASEy stuff look like a dustbin full of spaghetti!
Who's gonna test with 40,000 rows then? 😀

Fortunately, I had enough time to test for 5000 rows, but not enough patience:w00t: to test for 40,000 rows.

Here are the results

--Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

--Table '#test'. Scan count 2, logical reads 136, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

--Ramesh' Version : 876 ms

--Warning: Null value is eliminated by an aggregate or other SET operation.

--Table '#test'. Scan count 3752, logical reads 255136, physical reads 0, read-ahead reads 10, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

--ChrisM' Version 3 ("condensed" version 2) : 36376 ms

Test code attached...

--Ramesh

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 15

Using Ramesh's test cases, I get these timings:

Ramesh' Version : 526 ms

ChrisM' Version 3 ("condensed" version 2) : 23480 ms

Peso : 236 ms

N 56°04'39.16"
E 12°55'05.25"