TOP x within group

  • I repeatedly have a need to do something that falls into a general category of query that doesn't seem to exist. Or does it. Maybe you can tell me how.

    The problem is I want a query that is a join or similar that results in a sorted list which has groups within it. I want to return the top X of each group, not of the entire list.

    To illustrate using Northwind database in an overly simplified fashion, what I want might be expressed as

    select SupplierID, ProductID, ProductName

    from Products

    group by SupplierID showing top 3 each group

    order by SupplierID, ProductID, UnitPrice desc

    The hypothetical intent is to show the top 3 products (by price), in overall order of product ID, within each supplier

    Clearly there's no such syntax. I've occasionally done it like this:

    select SupplierID,

    (Select top 1 ProductID

    from Products p

    where s.SupplierID=p.SupplierID

    order by UnitPrice desc) as ProductID,

    (Select top 1 ProductName

    from Products p

    where s.SupplierID=p.SupplierID

    order by UnitPrice desc) as ProductName

    from Suppliers s

    Order by SupplierID

    This works, but only works for top 1, and is gross since the select has to be in there multiple times. Incidentally, it might be more meaningful if you think of this as the products sorted by price.

    I've tried doing various joins and temporary tables, I can usually find some gross way to get what I want. You can do this easily with cursors but that's horribly slow. But I bet I run across a similar need every week or two, and have this feeling I'm missing something obvious.

    Am I?

  • I'm guessing something like this might do the trick:

    select supplierid, productid,productname, unitprice

    from products a where productid in (select top 3 productid

    from Products

    where supplierid=a.supplierid

    order by SupplierID, UnitPrice desc )

    order by Supplierid, productid

    Gregory Larsen, DBA

    If you looking for SQL Server Examples check out my website at http://www.geocities.com/sqlserverexamples

    Gregory A. Larsen, MVP

  • Sorry for the delay, the first time I looked at it I thought that was all wrong, but I came back to it and it does seem to work, but makes my head hurt.

    I tested it on a customer file we have that might be more natural to think about. The test was to find the 3 customers with the alphabetically last city names within each state. I ran this on real data with about 60,000 customers and gave up after about 20 minutes (this is on a big, fast server). I ran it with about 300 and it took 13 seconds.

    Not that I have a better idea! If anyone does, one that runs a bit faster, let me know.

    Here's the code I last played with.

    select CustNm, ShipCityStr, ShipStateCd

    from Customer c

    where c.CustCd in (select top 3 CustCd

    from Customer c1

    where c.ShipStateCd=c1.ShipStateCd

    Order by ShipCityStr desc)

    order by 3, 2

  • Sometimes a cursor and a temp table is faster. Here is another stab at what you are looking for. Also note that I ran both these examples in SQL Server 7.0 and 2000. 2000 is much faster at returning results for the slow query. Although the cursor, and temp table are still considerable faster. I'm sure some TSQL guri, might even find a better way.

    SET NOCOUNT ON

    DECLARE @STATE INT

    DECLARE @city INT

    CREATE TABLE #T (STATECD INT,

    citycd INT,

    CUSTID int identity)

    SET @STATE = 0

    WHILE @STATE < 50

    BEGIN

    SET @STATE = @STATE + 1

    SET @City = 0

    WHILE @city < 1000

    BEGIN

    SET @city = @city + 1

    INSERT INTO #T VALUES (@STATE, @city)

    END

    END

    create table #y (statecd int,

    citycd int,

    custid int)

    select count(*) from #t

    declare str_list cursor for select statecd, citycd, custid from #t order by statecd, citycd desc

    declare @i int

    declare @x int

    declare @statecd int

    declare @sstatecd int

    declare @citycd int

    declare @custid int

    set @i = 0

    set @x = 3

    set @sstatecd = -1

    open str_list

    FETCH NEXT FROM str_list INTO @statecd, @citycd, @custid

    WHILE @@FETCH_STATUS = 0

    BEGIN

    if @sstatecd = @statecd

    set @i = @I + 1

    else

    begin

    set @i = 1

    set @sstatecd = @statecd

    end

    if @I < = @x

    begin

    insert into #y values(@statecd,@citycd,@custid)

    end

    FETCH NEXT FROM str_list INTO @statecd, @citycd, @custid

    END

    select * from #y

    close str_list

    deallocate str_list

    drop table #y

    -- slow method

    select Custid, Citycd, StateCd

    from #t c

    where c.Custid in (select top 3 Custid

    from #t c1

    where c.StateCd=c1.StateCd

    Order by Citycd desc)

    order by 3, 2

    drop table #t

    Gregory Larsen, DBA

    If you looking for SQL Server Examples check out my website at http://www.geocities.com/sqlserverexamples

    Gregory A. Larsen, MVP

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply