May 6, 2009 at 5:53 am
May 6, 2009 at 6:04 am
May 6, 2009 at 7:09 am
I assume you are partitioning on column 3, and that by "top-ranked" you mean the rows where row_number() = 1.
I'm changing up the data a bit in the example below to clarify which values are actually being returned. The trick is to take each distinct value in column 3 and cross apply a subquery to pull the first row for that value.
Please let me know if you have any questions.
Bob
declare @sample table (col1 int, col2 int, col3 varchar(5))
insert into @sample
select 1, 5, 'val1' union all
select 2, 6, 'val1' union all
select 3, 7, 'val1' union all
select 4, 8, 'val1' union all
select 5, 9, 'val1' union all
select 1, 3, 'val2' union all
select 2, 4, 'val2' union all
select 3, 5, 'val2' union all
select 1, 9, 'val3' union all
select 2, 3, 'val3'
--
select distinct s.col3,ca.col1,ca.col2
from @sample s
cross apply (select top 1 col1, col2
from @sample s2
where s2.col3 = s.col3
) ca
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
May 6, 2009 at 8:00 am
but this means having another statement. I could use a max subquery to get these values in a second statement which i think would perform better than a cross apply. I was wondering if i could get those value repeating in the same select that generates the row numbers?
thanks for your help.
Jules
May 6, 2009 at 8:24 am
I misunderstood, sorry about that. You are wanting to build the values you originally posted.
A summary query either before or after would give you those counts and a simple join should run quicker than a cross apply. You could use CTEs to do that and technically it's a single query (see below) but that's just semantics. The query plan is still going to involve two table scans. I don't think that's what you're asking for.
I can't think of a function that can "look ahead" to see your final values and then retroactively apply them to all rows. I'll stay subscribed to this in case someone else can give you a better suggestion.
May I ask why you want that max value included in every row?
declare @sample table (col1 varchar(5))
insert into @sample
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val2' union all
select 'val2' union all
select 'val2' union all
select 'val3' union all
select 'val3'
--
;with cte1 as (select ROW_NUMBER() over (partition by col1 order by col1) as rowVal,col1 from @sample)
,cte2 as (select col1, max(rowVal)as maxVal from cte1 group by col1)
--
select c1.rowval,c2.maxVal,c2.col1
from cte1 c1
join cte2 c2 on c2.col1 = c1.col1
--
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
May 7, 2009 at 1:26 am
May 7, 2009 at 3:26 am
Bob Hovious (5/6/2009)
I misunderstood, sorry about that. You are wanting to build the values you originally posted.A summary query either before or after would give you those counts and a simple join should run quicker than a cross apply. You could use CTEs to do that and technically it's a single query (see below) but that's just semantics. The query plan is still going to involve two table scans. I don't think that's what you're asking for.
I can't think of a function that can "look ahead" to see your final values and then retroactively apply them to all rows. I'll stay subscribed to this in case someone else can give you a better suggestion.
May I ask why you want that max value included in every row?
declare @sample table (col1 varchar(5))
insert into @sample
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val1' union all
select 'val2' union all
select 'val2' union all
select 'val2' union all
select 'val3' union all
select 'val3'
--
;with cte1 as (select ROW_NUMBER() over (partition by col1 order by col1) as rowVal,col1 from @sample)
,cte2 as (select col1, max(rowVal)as maxVal from cte1 group by col1)
--
select c1.rowval,c2.maxVal,c2.col1
from cte1 c1
join cte2 c2 on c2.col1 = c1.col1
--
More simply
select ROW_NUMBER() over (partition by col1 order by col1) as rowVal,
COUNT(*) over (partition by col1) as maxVal,
col1
from @sample
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537May 7, 2009 at 6:09 am
May 7, 2009 at 1:28 pm
Obviously, neither did I. But that's worth remembering. Thanks, Mark!
However, it seems the two CTE approach runs a little faster with a larger number of rows. The culprit is I/O. While the two-CTE scheme scans the @sample table twice, the COUNT(*) OVER method scans it once, but creates a workfile that gets scanned three times, resulting in a relatively huge number of logical reads. Something to consider if you are going to be regularly applying this to tables with thousands of rows.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
May 7, 2009 at 2:09 pm
Jules Bonnot (5/7/2009)
for a large dataset would it not be better to insertt he results of the cte1 into a temp table and run the max query on that
That's definitely an "It Depends". If you can get it to one or two scans like Bob has now, that's almost always going to be faster than any temp table shuffling, even (and especially) for large datasets.
However, in many case the query might try to hold too much in memory or in worktables and then you might be better off breaking it down with intermediate temp tables. I generally write it as one query to begin with and then break it down from there if it has problems.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply