Top and Group By

Question

Post reply

Top and Group By

andegre

SSCertifiable

Points: 5875
More actions
September 16, 2011 at 12:40 pm

#390422

Here's my scenario, I have customers that can have 2 items assigned to them at the same time. Some of them, though, have more than 2, I need to delete/enddate the oldest ones so they only have 2 active at once.
I can do a:
Select CustomerNumber
From TableA
Group By CustomerNumber
Having Count(1) > 2
and that will give me the list of customers that have more than 2. But now here's my problem, how do I retain the top 2, for each customer, based on my IssuedDate [desc]? I can't throw in my IssuedDate into the above query because then it wouldn't do the grouping, but I need the IssuedDate to do something like Rank() or ROW_NUMBER(). Does anyone else have another suggestion? I realize it would be very easy if someone could just give me a hint, or point me in the right direction. Thanks!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

andegre SSCertifiable Points: 5875 More actions · Answer 1

Ok, I kind of figured out a way, but I don't like it.

I did this:

Select CustomerNumber, IssuedDate

From

(

Select Top 100000 CustomerNumber, IssuedDate, Rank() Over (Partition By CustomerNumber Order By IssuedDate Desc) As [Rank]

From TableA a

Where Exists

(

Select CustomerNumber

From TableA a1

Where a1.CustomerNumber = a.CustomerNumber

Group By CustomerNumber

Having Count(1) > 2

)

Order By CustomerNumber, Rank

) temp

Where temp.Rank <= 2

The part I don't like is having to put the Top 100000 in my middle sub-select. How do I get around that, even though I need my order by clause in that middle sub-select so that my Rank is correct?

BTW - SQL Server said I couldn't put an Order By clause in my sub-select statement unless it had "For XML", or "Top"

LutzM SSC Guru Points: 107049 More actions · Answer 2

Hi Gregory, it's been a while ;-):cool:

you could try something like

WITH cte as

(

SELECT

CustomerNumber,

ROW_NUMBER() OVER(PARTITION BY CustomerNumber ORDER BY IssuedDate desc ) AS row

FROM TableA

)

SELECT *

-- DELETE

FROM cte

WHERE row >2

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

andegre SSCertifiable Points: 5875 More actions · Answer 3

Yes, it has been, Lutz 🙂

Those darn CTE's just haven't clicked in my brain yet, but i'm getting better.

So it appears as though you don't need to have the Group By clause to do a group by as long as you use Rank() or Row_Number()....you guys keep showing me new stuff on here everytime I visit. Thanks again!

LutzM SSC Guru Points: 107049 More actions · Answer 4

The PARTITION BY argument of the ranking functions can be seen as an "implicit GROUP BY".

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

andegre SSCertifiable Points: 5875 More actions · Answer 5

Thanks again Lutz!

Another completely different question for you if you don't mind answering.

I have a stored procedure that needs to do a customer lookup based on some values, it's currently written as a CURSOR :w00t:. If it's possible to write the customer lookup part as a function, that would be much faster, right? (Because it would be set-based and there would be multiple function calls simultaneously instead of RBAR)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

New question, new thread please.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

LutzM SSC Guru Points: 107049 More actions · Answer 7

Under almost any circumstances, a function will be faster than a c.u.r.s.o.r. * cough*.

Especially, if it's an iTvF (inline-Table valued function).

Depending on the size of the sproc, you could open a new thread and post it (together with the tables involved and some sample data, usually).

Since a function can be easily called from a SELECT statement whereas a sproc would be "slightly more difficult", it might also help if you explain or demonstrate how you're planning to use the output of the function.

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]