top row number within partition

Question

Post reply

top row number within partition

Jules Bonnot

SSCertifiable

Points: 5488
More actions
May 6, 2009 at 5:53 am

#131217

i am ranking a select using row number with partition. Is there a way of adding another colum that will comtian the highest rank umber for each partitioned group.
so
1 5 val1
2 5 val1
3 5 val1
4 5 val1
5 5 val1
1 3 val2
2 3 val2
3 3 val2
1 2 val3
2 2 val3
thanks,
Jules
www.sql-library.com[/url]

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Jules Bonnot SSCertifiable Points: 5488 More actions · Answer 1

i hope its clear that its the second column i want to get. the first one is gerenerated by a rownumber function

www.sql-library.com[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 2

I assume you are partitioning on column 3, and that by "top-ranked" you mean the rows where row_number() = 1.

I'm changing up the data a bit in the example below to clarify which values are actually being returned. The trick is to take each distinct value in column 3 and cross apply a subquery to pull the first row for that value.

Please let me know if you have any questions.

Bob

declare @sample table (col1 int, col2 int, col3 varchar(5))

insert into @sample

select 1, 5, 'val1' union all

select 2, 6, 'val1' union all

select 3, 7, 'val1' union all

select 4, 8, 'val1' union all

select 5, 9, 'val1' union all

select 1, 3, 'val2' union all

select 2, 4, 'val2' union all

select 3, 5, 'val2' union all

select 1, 9, 'val3' union all

select 2, 3, 'val3'

--

select distinct s.col3,ca.col1,ca.col2

from @sample s

cross apply (select top 1 col1, col2

from @sample s2

where s2.col3 = s.col3

) ca

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Jules Bonnot SSCertifiable Points: 5488 More actions · Answer 3

but this means having another statement. I could use a max subquery to get these values in a second statement which i think would perform better than a cross apply. I was wondering if i could get those value repeating in the same select that generates the row numbers?

thanks for your help.

Jules

www.sql-library.com[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 4

I misunderstood, sorry about that. You are wanting to build the values you originally posted.

A summary query either before or after would give you those counts and a simple join should run quicker than a cross apply. You could use CTEs to do that and technically it's a single query (see below) but that's just semantics. The query plan is still going to involve two table scans. I don't think that's what you're asking for.

I can't think of a function that can "look ahead" to see your final values and then retroactively apply them to all rows. I'll stay subscribed to this in case someone else can give you a better suggestion.

May I ask why you want that max value included in every row?

declare @sample table (col1 varchar(5))

insert into @sample

select 'val1' union all

select 'val2' union all

select 'val3' union all

select 'val3'

--

;with cte1 as (select ROW_NUMBER() over (partition by col1 order by col1) as rowVal,col1 from @sample)

,cte2 as (select col1, max(rowVal)as maxVal from cte1 group by col1)

--

select c1.rowval,c2.maxVal,c2.col1

from cte1 c1

join cte2 c2 on c2.col1 = c1.col1

--

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Jules Bonnot SSCertifiable Points: 5488 More actions · Answer 5

for a large dataset would it not be better to insertt he results of the cte1 into a temp table and run the max query on that

www.sql-library.com[/url]

Mark Cowne One Orange Chip Points: 26969 More actions · Answer 6

Bob Hovious (5/6/2009)
I misunderstood, sorry about that. You are wanting to build the values you originally posted.
A summary query either before or after would give you those counts and a simple join should run quicker than a cross apply. You could use CTEs to do that and technically it's a single query (see below) but that's just semantics. The query plan is still going to involve two table scans. I don't think that's what you're asking for.
I can't think of a function that can "look ahead" to see your final values and then retroactively apply them to all rows. I'll stay subscribed to this in case someone else can give you a better suggestion.
May I ask why you want that max value included in every row?
declare @sample table (col1 varchar(5))
insert into @sample
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val2' union all
select 'val2' union all
select 'val2' union all
select 'val3' union all
select 'val3'
--
;with cte1 as (select ROW_NUMBER() over (partition by col1 order by col1) as rowVal,col1 from @sample)
,cte2 as (select col1, max(rowVal)as maxVal from cte1 group by col1)
--
select c1.rowval,c2.maxVal,c2.col1
from cte1 c1
join cte2 c2 on c2.col1 = c1.col1
--

More simply

select ROW_NUMBER() over (partition by col1 order by col1) as rowVal,

COUNT(*) over (partition by col1) as maxVal,

col1

from @sample

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Jules Bonnot SSCertifiable Points: 5488 More actions · Answer 7

nice one mark. Didnt know you could use partition with count.

www.sql-library.com[/url]

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 8

Obviously, neither did I. But that's worth remembering. Thanks, Mark!

However, it seems the two CTE approach runs a little faster with a larger number of rows. The culprit is I/O. While the two-CTE scheme scans the @sample table twice, the COUNT(*) OVER method scans it once, but creates a workfile that gets scanned three times, resulting in a relatively huge number of logical reads. Something to consider if you are going to be regularly applying this to tables with thousands of rows.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

RBarryYoung SSC Guru Points: 143329 More actions · Answer 9

Jules Bonnot (5/7/2009)
for a large dataset would it not be better to insertt he results of the cte1 into a temp table and run the max query on that

That's definitely an "It Depends". If you can get it to one or two scans like Bob has now, that's almost always going to be faster than any temp table shuffling, even (and especially) for large datasets.

However, in many case the query might try to hold too much in memory or in worktables and then you might be better off breaking it down with intermediate temp tables. I generally write it as one query to begin with and then break it down from there if it has problems.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]