August 28, 2012 at 2:55 pm
Hopefully someone can set me straight here - I am looking for a way to select the top n items, so long as the n+1th item does NOT have the same value as the nth item.
Basically, I'm looking for the opposite of:
SELECT TOP 5 WITH TIES col1
FROM Mytable
ORDER BY col1
If sorting all values of col1 would give these values: 5,4,3,2,1,1,1
... then the WITH TIES query above would return 7 rows...
I'm looking for a query that would return only 4 rows: 5,4,3,2
Since including all of the 1's would push it over the limit of n rows, I want to exclude ALL rows with this value, not just the one that happens to get sorted first.
Basically, I want the TOP n, except for when the TOP WITH TIES n will have more than n rows.
I hope I'm making sense here - as I feel there's got to be a clever way to do this that I'm just not seeing immediately.
Thanks,
Brigid
August 28, 2012 at 7:21 pm
Look at the RANK() and DENSE_RANK() window functions.
http://msdn.microsoft.com/en-us/library/ms176102.aspx
http://msdn.microsoft.com/en-us/library/ms173825.aspx
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 29, 2012 at 7:19 am
dwain.c (8/28/2012)
Look at the RANK() and DENSE_RANK() window functions.
Not sure how RANK() OR DENSE_RANK() would help in this example. I thought a lot about the functions RANK, DENSE_RANK, and ROW_NUMBER and am hoping someone can see something clever that won't necessitate joining on the result of these functions...
These functions applied to the "col1" data I gave above would give:
col1rankdenserow_num
5111
4222
3333
2444
1555
1556
1557
So basically, I would like something that says something like:
select where rank <= 5 and the max row_num associated with that rank is also <= 5
Is there a way to set the assignment of the rank function? So that instead of setting all the tied values to the next integer, it would set it to the maximum row number for that value?
My ideal MYRANK function would give
col1myrank
51
42
33
24
17
17
17
Then I could just SELECT col1 WHERE myrank <= 5.
I feel like there's got to be a simple explanation that I'm just not seeing... all help is greatly appreciated!
August 29, 2012 at 7:43 am
This is pretty simple using Row_Number. Please notice how I posted sample data to make this a lot easier for anybody else. You should do something similar in your future posts.
;with MyRank (col1)
as
(
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1 union all
select 1 union all
select 1
)
select col1, MAX(RowNum) as MaxRowNum
from
(
select col1, ROW_NUMBER() over (order by col1 desc) as RowNum
from MyRank
) x
group by col1
having Max(RowNum) <= 5
order by col1 desc
Or for those who would prefer cacading ctes, you can modify it like this.
;with MyRank (col1)
as
(
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1 union all
select 1 union all
select 1
)
, MaxRank as
(
select col1, ROW_NUMBER() over (order by col1 desc) as RowNum
from MyRank
)
select col1, MAX(RowNum) as MaxRowNum
from MaxRank
group by col1
having Max(RowNum) <= 5
order by col1 desc
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
August 29, 2012 at 8:07 am
Another option:
;with MyRank (col1)
as
(
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1 union all
select 1 union all
select 1
)
SELECT TOP 5 col1
FROM MyRank
GROUP BY col1
HAVING COUNT(*) = 1
ORDER BY col1
August 29, 2012 at 8:10 am
Luis Cazares (8/29/2012)
Another option:
;with MyRank (col1)
as
(
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1 union all
select 1 union all
select 1
)
SELECT TOP 5 col1
FROM MyRank
GROUP BY col1
HAVING COUNT(*) = 1
ORDER BY col1
That doesn't quite meet the requirements. If for example they wanted to top 7 this code would not work because it would not return a row where col1 = 1.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
August 29, 2012 at 8:14 am
I tested my last solution and won't give the exact results.
Try this instead:
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
August 29, 2012 at 8:24 am
Luis Cazares (8/29/2012)
I tested my last solution and won't give the exact results.Try this instead:
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
Still not quite right. You have a TOP but not Order By. Which top 5 will it get?
Adding the order by seems to get it though.
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
order by col1 desc --Need this order by to ensure which top 5
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
This seems to have a slightly simpler execution plan than mine too. 😛 We have certainly proven there is more than 1 way to do this. I am sure somebody will come along with another one within a couple hours.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
August 29, 2012 at 8:29 am
Sean Lange (8/29/2012)
Luis Cazares (8/29/2012)
I tested my last solution and won't give the exact results.Try this instead:
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
Still not quite right. You have a TOP but not Order By. Which top 5 will it get?
Adding the order by seems to get it though.
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
order by col1 desc --Need this order by to ensure which top 5
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
This seems to have a slightly simpler execution plan than mine too. 😛 We have certainly proven there is more than 1 way to do this. I am sure somebody will come along with another one within a couple hours.
What does the OP want with the following?
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
order by col1 desc --Need this order by to ensure which top 5
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
August 29, 2012 at 8:31 am
I had the idea that the group by would do the sort. However, it might not be the safest thing to do it.
Your query gave me some incorrect results when I change the values.
August 29, 2012 at 8:35 am
Actually I don't think your works Luis.
I added some more data and it doesn't do what the OP asked for originally.
;with MyRank (col1)
as
(
select 6 union all
select 5 union all
select 5 union all
select 5 union all
select 5 union all
select 5 union all
select 5 union all
select 5 union all
select 4 union all
select 3 union all
select 2 union all
select 1
), MyRank2
as
(
SELECT TOP 5 col1, COUNT(*) AS RCount
FROM MyRank
GROUP BY col1
order by col1 desc --Need this order by to ensure which top 5
)
SELECT col1
FROM MyRank2
WHERE RCount = 1
If I understand the requirements this should only return 1 row. Trying the same with the Row_Number solution seems to still work as the OP stated in their original post and returns only col1 = 6.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
August 29, 2012 at 8:45 am
Thanks to both of you for your help and ideas - I thought I would be able to do this without subqueries with separate groups, but that may not be the case.
The only other additional chance I thought I had was to make a function combining RANK ASC and RANK DESC.
;with MyTable (col1)
as
(
select 1 union all
select 2 union all
select 2 union all
select 3 union all
select 5 union all
select 5 union all
select 9 union all
select 9 union all
select 10 union all
select 10
)
SELECT col1
, RANK() OVER(ORDER BY col1 DESC) AS Rank_DESC
, RANK() OVER(ORDER BY col1 ASC) AS Rank_ASC
, RANK() OVER(ORDER BY col1 DESC) + RANK() OVER(ORDER BY col1 ASC) AS Rank_Sum
FROM MyTable
If you run this, you can see that the sum of RANK ASC + RANK DESC is greater than the number of rows only when that row is not part of a tie.
But, since you can't have RANK functions within a WHERE clause, this still leaves me as having a subquery of a sort. (As you can guess, my actual data is a good deal larger than this, so I was trying to avoid having to cache subquery results if it was possible.)
Thanks a bunch - and if my comments inspire any further ideas - I'd love to hear them!
August 29, 2012 at 9:03 am
You're right Sean, I missed something.
EDIT: I need more coffee or to read more carefully.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply