January 31, 2008 at 1:50 pm
goal is simple but the structure of the query alludes me:
to keep it simple lets say i have a table with 3 columns: salePrice, saleType, and saleDate. the table contains 10 records. 5 of them have a saleType of "credit" and 5 of them have a saleType of "cash". what would my select statement look like if i wanted a result set that showed the top 2 records of each type ordered by saleDate desc?
i feel like the answer is so simple, yet i don't know how to approach it. btw: my realworld example could have a large number of types and many records related to a specific type.
thanks for your help
January 31, 2008 at 2:00 pm
One of the articles on the front page of the site today, about running totals and ranking functions, has an answer that might work for you.
Insert into a temp table, rank based on the type and the desired sequence, then select from the table where the rank is <= the desired number of rows.
(The ranking function is mildly complex, so I'm not going to quote it here, but it's outlined quite well in the article.)
Of course, it's much easier in SQL 2005. Just use Row_Number() and a CTE. But in SQL 2000, it requires a bit more work.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 31, 2008 at 3:07 pm
How about
SELECT top 1 * FROM T1 WHERE name ='cash'
UNION ALL
SELECT TOP 1 * FROM T1 WHERE name = 'credit'
January 31, 2008 at 3:17 pm
Thanks for the reply. i'm not sure the union will work because my query can't use literal values like that. it needs to somehow get all records of each specific type and then only return the top N of each in one result set. so if the data looked like this:
saleType salePrice
cash 10
cash 5
cash 4
cash 3
cash 2
credit 10
credit 5
credit 4
credit 3
credit 2
and i wanted to see the top 2 of each, the result set would look like:
saleType salePrice
cash 10
cash 5
credit 10
credit 5
the example i give is really oversimplified in terms of the saleType. imagine that there are 1000 different possible saleTypes and each sale type has 1000 different records and i wanted to see a list of the top 5 records for each 1000 saleType
February 1, 2008 at 4:49 am
Is this what you mean?
declare @temptbl table
(salestype varchar(6), salesprice int)
insert into @temptbl (salestype, salesprice)
select 'cash', 5 union all
select 'cash', 3 union all
select 'cash', 2 union all
select 'cash', 1 union all
select 'credit', 5 union all
select 'credit', 2 union all
select 'credit', 1 union all
select 'credit', 3
select t1.salesprice, t1.salestype
from @temptbl t1 where t1.salesprice in
(select top 2 t2.salesprice from @temptbl t2 where t1.salestype = t2.salestype
order by t2.salestype, t2.salesprice desc)
order by t1.salestype,t1.salesprice desc
Toni
February 1, 2008 at 6:57 am
Yes, this is exactly what i was looking for! Thanks so much!!!
February 1, 2008 at 7:48 am
Keep in mind that the suggested solution (a correlated subquery) is very resource hungry and will be quite slow on a big table with a lot of different values in the column being joined against.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 1, 2008 at 7:56 am
GSquared. Can you please provide the preferred alternate to a correlated subquery?
Thanks
Toni
February 1, 2008 at 12:42 pm
toniupstny (2/1/2008)
GSquared. Can you please provide the preferred alternate to a correlated subquery?Thanks
Toni
Insert into a temp table, add rankings to it, select from that.
The correlated subquery will be fast if the column you join to has only a few distinct values, but will have to do a table/index scan for each distinct value. Thus, if there are only two (as in the sample data provided), it will probably be reasonably fast. If there are a dozen, or even hundreds, the query will crawl. I've seen correlated subqueries in such cases, and some of them could easily be rear-ended by a glacier, even on high-power hardware.
The temp table and rankings will be slower, if you only have a couple values, but will be much, much faster if you have more than that.
There was a very good article on the front page of SQL Server Central just a couple of days ago on how to add rankings to a temp table in SQL 2000. (In 2005, you just use the Row_Number function.) Rather than try to reproduce the whole article, I refered to it in my original answer on this same thread (cf).
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 1, 2008 at 12:44 pm
Also, since SQL 2000 doesn't allow the use of a variable in the TOP clause, a ranked temp table will add that functionality. The subquery has to have "top 2" hard-coded into it. Per the original poster, he needs the "top N rows", which sounds to me like a variable.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 1, 2008 at 6:31 pm
Thank you GSquared.
I have to confess on being a bit behind in reading the articles this week as I just completed, and passed, my certification test for SQL Server 2000 on Wednesday. I will definitely go back through and pay special attention to that article.
Toni
February 4, 2008 at 10:59 am
Congrats on the cert! (I may one day get some certs myself. Haven't been able to afford it yet.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 5, 2008 at 12:55 am
Here is another approach for Top N algorithm
-- Prepare sample data
DECLARE @Sample TABLE (SalesType VARCHAR(6), SalesPrice INT)
INSERT@Sample
SELECT'cash', 5 UNION ALL
SELECT'cash', 3 UNION ALL
SELECT'cash', 2 UNION ALL
SELECT'cash', 1 UNION ALL
SELECT'credit', 9 UNION ALL
SELECT'credit', 8 UNION ALL
SELECT'credit', 4 UNION ALL
SELECT'credit', 2
-- Initialize user supplied parameter
DECLARE@Items INT
SET@Items = 2
-- Prepare staging table
DECLARE@Stage TABLE (SalesType VARCHAR(6), SalesPrice INT, RecID INT)
INSERT@Stage
SELECT DISTINCTs.SalesType,
NULL,
v.Number
FROM@Sample AS s
INNER JOINmaster..spt_values AS v ON v.Type = 'p'
WHEREv.Number BETWEEN 1 AND @Items
-- Populate staging table
WHILE @Items > 0
BEGIN
UPDATEst
SETst.SalesPrice = x.SalesPrice
FROM@Stage AS st
INNER JOIN(
SELECTsa.SalesType,
MAX(sa.SalesPrice) AS SalesPrice
FROM@Sample AS sa
LEFT JOIN@Stage AS st ON st.SalesType = sa.SalesType
AND st.SalesPrice = sa.SalesPrice
WHEREst.SalesType IS NULL
GROUP BYsa.SalesType
) AS x ON x.SalesType = st.SalesType
WHEREst.RecID = @Items
SET@Items = @Items - 1
END
-- Show the expected result
SELECTSalestype,
SalesPrice
FROM@Stage
ORDER BYSalestype,
SalesPrice DESC
N 56°04'39.16"
E 12°55'05.25"
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply