selecting top N records by type

Question

selecting top N records by type

Paul Koanui

Say Hey Kid

Points: 693
More actions
January 31, 2008 at 1:50 pm

#186174

goal is simple but the structure of the query alludes me:
to keep it simple lets say i have a table with 3 columns: salePrice, saleType, and saleDate. the table contains 10 records. 5 of them have a saleType of "credit" and 5 of them have a saleType of "cash". what would my select statement look like if i wanted a result set that showed the top 2 records of each type ordered by saleDate desc?
i feel like the answer is so simple, yet i don't know how to approach it. btw: my realworld example could have a large number of types and many records related to a specific type.
thanks for your help

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

One of the articles on the front page of the site today, about running totals and ranking functions, has an answer that might work for you.

Insert into a temp table, rank based on the type and the desired sequence, then select from the table where the rank is <= the desired number of rows.

(The ranking function is mildly complex, so I'm not going to quote it here, but it's outlined quite well in the article.)

Of course, it's much easier in SQL 2005. Just use Row_Number() and a CTE. But in SQL 2000, it requires a bit more work.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

SQL ORACLE One Orange Chip Points: 27807 More actions · Answer 2

How about

SELECT top 1 * FROM T1 WHERE name ='cash'

UNION ALL

SELECT TOP 1 * FROM T1 WHERE name = 'credit'

Paul Koanui Say Hey Kid Points: 693 More actions · Answer 3

Thanks for the reply. i'm not sure the union will work because my query can't use literal values like that. it needs to somehow get all records of each specific type and then only return the top N of each in one result set. so if the data looked like this:

saleType salePrice

cash 10

cash 5

cash 4

cash 3

cash 2

credit 10

credit 5

credit 4

credit 3

credit 2

and i wanted to see the top 2 of each, the result set would look like:

saleType salePrice

cash 10

cash 5

credit 10

credit 5

the example i give is really oversimplified in terms of the saleType. imagine that there are 1000 different possible saleTypes and each sale type has 1000 different records and i wanted to see a list of the top 5 records for each 1000 saleType

toniupstny Hall of Fame Points: 3986 More actions · Answer 4

Is this what you mean?

declare @temptbl table

(salestype varchar(6), salesprice int)

insert into @temptbl (salestype, salesprice)

select 'cash', 5 union all

select 'cash', 3 union all

select 'cash', 2 union all

select 'cash', 1 union all

select 'credit', 5 union all

select 'credit', 2 union all

select 'credit', 1 union all

select 'credit', 3

select t1.salesprice, t1.salestype

from @temptbl t1 where t1.salesprice in

(select top 2 t2.salesprice from @temptbl t2 where t1.salestype = t2.salestype

order by t2.salestype, t2.salesprice desc)

order by t1.salestype,t1.salesprice desc

Toni

Paul Koanui Say Hey Kid Points: 693 More actions · Answer 5

Yes, this is exactly what i was looking for! Thanks so much!!!

GSquared SSC Guru Points: 260824 More actions · Answer 6

Keep in mind that the suggested solution (a correlated subquery) is very resource hungry and will be quite slow on a big table with a lot of different values in the column being joined against.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

toniupstny Hall of Fame Points: 3986 More actions · Answer 7

GSquared. Can you please provide the preferred alternate to a correlated subquery?

Thanks

Toni

GSquared SSC Guru Points: 260824 More actions · Answer 8

toniupstny (2/1/2008)
GSquared. Can you please provide the preferred alternate to a correlated subquery?
Thanks
Toni

Insert into a temp table, add rankings to it, select from that.

The correlated subquery will be fast if the column you join to has only a few distinct values, but will have to do a table/index scan for each distinct value. Thus, if there are only two (as in the sample data provided), it will probably be reasonably fast. If there are a dozen, or even hundreds, the query will crawl. I've seen correlated subqueries in such cases, and some of them could easily be rear-ended by a glacier, even on high-power hardware.

The temp table and rankings will be slower, if you only have a couple values, but will be much, much faster if you have more than that.

There was a very good article on the front page of SQL Server Central just a couple of days ago on how to add rankings to a temp table in SQL 2000. (In 2005, you just use the Row_Number function.) Rather than try to reproduce the whole article, I refered to it in my original answer on this same thread (cf).

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

GSquared SSC Guru Points: 260824 More actions · Answer 9

Also, since SQL 2000 doesn't allow the use of a variable in the TOP clause, a ranked temp table will add that functionality. The subquery has to have "top 2" hard-coded into it. Per the original poster, he needs the "top N rows", which sounds to me like a variable.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

toniupstny Hall of Fame Points: 3986 More actions · Answer 10

Thank you GSquared.

I have to confess on being a bit behind in reading the articles this week as I just completed, and passed, my certification test for SQL Server 2000 on Wednesday. I will definitely go back through and pay special attention to that article.

Toni

GSquared SSC Guru Points: 260824 More actions · Answer 11

Congrats on the cert! (I may one day get some certs myself. Haven't been able to afford it yet.)

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

SwePeso SSC-Dedicated Points: 39758 More actions · Answer 12

Here is another approach for Top N algorithm

-- Prepare sample data

DECLARE @Sample TABLE (SalesType VARCHAR(6), SalesPrice INT)

INSERT@Sample

SELECT'cash', 5 UNION ALL

SELECT'cash', 3 UNION ALL

SELECT'cash', 2 UNION ALL

SELECT'cash', 1 UNION ALL

SELECT'credit', 9 UNION ALL

SELECT'credit', 8 UNION ALL

SELECT'credit', 4 UNION ALL

SELECT'credit', 2

-- Initialize user supplied parameter

DECLARE@Items INT

SET@Items = 2

-- Prepare staging table

DECLARE@Stage TABLE (SalesType VARCHAR(6), SalesPrice INT, RecID INT)

INSERT@Stage

SELECT DISTINCTs.SalesType,

NULL,

v.Number

FROM@Sample AS s

INNER JOINmaster..spt_values AS v ON v.Type = 'p'

WHEREv.Number BETWEEN 1 AND @Items

-- Populate staging table

WHILE @Items > 0

BEGIN

UPDATEst

SETst.SalesPrice = x.SalesPrice

FROM@Stage AS st

INNER JOIN(

SELECTsa.SalesType,

MAX(sa.SalesPrice) AS SalesPrice

FROM@Sample AS sa

LEFT JOIN@Stage AS st ON st.SalesType = sa.SalesType

AND st.SalesPrice = sa.SalesPrice

WHEREst.SalesType IS NULL

GROUP BYsa.SalesType

) AS x ON x.SalesType = st.SalesType

WHEREst.RecID = @Items

SET@Items = @Items - 1

END

-- Show the expected result

SELECTSalestype,

SalesPrice

FROM@Stage

ORDER BYSalestype,

SalesPrice DESC

N 56°04'39.16"
E 12°55'05.25"