RANK/TOP WITHOUT TIES

Question

Post reply

RANK/TOP WITHOUT TIES

bkmooney

SSCrazy

Points: 2041
More actions
August 28, 2012 at 2:55 pm

#280488

Hopefully someone can set me straight here - I am looking for a way to select the top n items, so long as the n+1th item does NOT have the same value as the nth item.
Basically, I'm looking for the opposite of:
SELECT TOP 5 WITH TIES col1
FROM Mytable
ORDER BY col1
If sorting all values of col1 would give these values: 5,4,3,2,1,1,1
... then the WITH TIES query above would return 7 rows...
I'm looking for a query that would return only 4 rows: 5,4,3,2
Since including all of the 1's would push it over the limit of n rows, I want to exclude ALL rows with this value, not just the one that happens to get sorted first.
Basically, I want the TOP n, except for when the TOP WITH TIES n will have more than n rows.
I hope I'm making sense here - as I feel there's got to be a clever way to do this that I'm just not seeing immediately.
Thanks,
Brigid

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Dwain Camps SSC Guru Points: 86978 More actions · Answer 1

Look at the RANK() and DENSE_RANK() window functions.

http://msdn.microsoft.com/en-us/library/ms176102.aspx

http://msdn.microsoft.com/en-us/library/ms173825.aspx

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

bkmooney SSCrazy Points: 2041 More actions · Answer 2

dwain.c (8/28/2012)
Look at the RANK() and DENSE_RANK() window functions.

Not sure how RANK() OR DENSE_RANK() would help in this example. I thought a lot about the functions RANK, DENSE_RANK, and ROW_NUMBER and am hoping someone can see something clever that won't necessitate joining on the result of these functions...

These functions applied to the "col1" data I gave above would give:

col1rankdenserow_num

5111

4222

3333

2444

1555

1556

1557

So basically, I would like something that says something like:

select where rank <= 5 and the max row_num associated with that rank is also <= 5

Is there a way to set the assignment of the rank function? So that instead of setting all the tied values to the next integer, it would set it to the maximum row number for that value?

My ideal MYRANK function would give

col1myrank

51

42

33

24

17

Then I could just SELECT col1 WHERE myrank <= 5.

I feel like there's got to be a simple explanation that I'm just not seeing... all help is greatly appreciated!

Sean Lange SSC Guru Points: 286590 More actions · Answer 3

This is pretty simple using Row_Number. Please notice how I posted sample data to make this a lot easier for anybody else. You should do something similar in your future posts.

;with MyRank (col1)

as

(

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1 union all

select 1

)

select col1, MAX(RowNum) as MaxRowNum

from

(

select col1, ROW_NUMBER() over (order by col1 desc) as RowNum

from MyRank

) x

group by col1

having Max(RowNum) <= 5

order by col1 desc

Or for those who would prefer cacading ctes, you can modify it like this.

;with MyRank (col1)

as

(

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1 union all

select 1

)

, MaxRank as

(

select col1, ROW_NUMBER() over (order by col1 desc) as RowNum

from MyRank

)

select col1, MAX(RowNum) as MaxRowNum

from MaxRank

group by col1

having Max(RowNum) <= 5

order by col1 desc

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Luis Cazares SSC Guru Points: 183706 More actions · Answer 4

Another option:

;with MyRank (col1)

as

(

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1 union all

select 1

)

SELECT TOP 5 col1

FROM MyRank

GROUP BY col1

HAVING COUNT(*) = 1

ORDER BY col1

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Sean Lange SSC Guru Points: 286590 More actions · Answer 5

Luis Cazares (8/29/2012)
Another option:
;with MyRank (col1)
as
(
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1 union all
select 1 union all
select 1
)
SELECT TOP 5 col1
FROM MyRank
GROUP BY col1
HAVING COUNT(*) = 1
ORDER BY col1

That doesn't quite meet the requirements. If for example they wanted to top 7 this code would not work because it would not return a row where col1 = 1.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Luis Cazares SSC Guru Points: 183706 More actions · Answer 6

I tested my last solution and won't give the exact results.

Try this instead:

;with MyRank (col1)

as

(

select 6 union all

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1

), MyRank2

as

(

SELECT TOP 5 col1, COUNT(*) AS RCount

FROM MyRank

GROUP BY col1

)

SELECT col1

FROM MyRank2

WHERE RCount = 1

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Sean Lange SSC Guru Points: 286590 More actions · Answer 7

Luis Cazares (8/29/2012)
I tested my last solution and won't give the exact results.
Try this instead:
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
)
SELECT col1
FROM MyRank2
WHERE RCount = 1

Still not quite right. You have a TOP but not Order By. Which top 5 will it get?

Adding the order by seems to get it though.

;with MyRank (col1)

as

(

select 6 union all

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1

), MyRank2

as

(

SELECT TOP 5 col1, COUNT(*) AS RCount

FROM MyRank

GROUP BY col1

order by col1 desc --Need this order by to ensure which top 5

)

SELECT col1

FROM MyRank2

WHERE RCount = 1

This seems to have a slightly simpler execution plan than mine too. 😛 We have certainly proven there is more than 1 way to do this. I am sure somebody will come along with another one within a couple hours.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 8

Sean Lange (8/29/2012)
Luis Cazares (8/29/2012)
I tested my last solution and won't give the exact results.
Try this instead:
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
Still not quite right. You have a TOP but not Order By. Which top 5 will it get?
Adding the order by seems to get it though.
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
order by col1 desc --Need this order by to ensure which top 5
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
This seems to have a slightly simpler execution plan than mine too. 😛 We have certainly proven there is more than 1 way to do this. I am sure somebody will come along with another one within a couple hours.

What does the OP want with the following?

;with MyRank (col1)

as

(

select 6 union all

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1

), MyRank2

as

(

SELECT TOP 5 col1, COUNT(*) AS RCount

FROM MyRank

GROUP BY col1

order by col1 desc --Need this order by to ensure which top 5

)

SELECT col1

FROM MyRank2

WHERE RCount = 1

Luis Cazares SSC Guru Points: 183706 More actions · Answer 9

I had the idea that the group by would do the sort. However, it might not be the safest thing to do it.

Your query gave me some incorrect results when I change the values.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Sean Lange SSC Guru Points: 286590 More actions · Answer 10

Actually I don't think your works Luis.

I added some more data and it doesn't do what the OP asked for originally.

;with MyRank (col1)

as

(

select 6 union all

select 5 union all

select 4 union all

select 3 union all

select 2 union all

select 1

), MyRank2

as

(

SELECT TOP 5 col1, COUNT(*) AS RCount

FROM MyRank

GROUP BY col1

order by col1 desc --Need this order by to ensure which top 5

)

SELECT col1

FROM MyRank2

WHERE RCount = 1

If I understand the requirements this should only return 1 row. Trying the same with the Row_Number solution seems to still work as the OP stated in their original post and returns only col1 = 6.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

bkmooney SSCrazy Points: 2041 More actions · Answer 11

Thanks to both of you for your help and ideas - I thought I would be able to do this without subqueries with separate groups, but that may not be the case.

The only other additional chance I thought I had was to make a function combining RANK ASC and RANK DESC.

;with MyTable (col1)

as

(

select 1 union all

select 2 union all

select 3 union all

select 5 union all

select 9 union all

select 10 union all

select 10

)

SELECT col1

, RANK() OVER(ORDER BY col1 DESC) AS Rank_DESC

, RANK() OVER(ORDER BY col1 ASC) AS Rank_ASC

, RANK() OVER(ORDER BY col1 DESC) + RANK() OVER(ORDER BY col1 ASC) AS Rank_Sum

FROM MyTable

If you run this, you can see that the sum of RANK ASC + RANK DESC is greater than the number of rows only when that row is not part of a tie.

But, since you can't have RANK functions within a WHERE clause, this still leaves me as having a subquery of a sort. (As you can guess, my actual data is a good deal larger than this, so I was trying to avoid having to cache subquery results if it was possible.)

Thanks a bunch - and if my comments inspire any further ideas - I'd love to hear them!

Luis Cazares SSC Guru Points: 183706 More actions · Answer 12

You're right Sean, I missed something.

EDIT: I need more coffee or to read more carefully.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2