How to get top 10 records sum gropued by One colum (USERID)

  • Hi,

    I have a requirement to get the sum of top 10 score records grouped by user id.

    Here is the my table structure...

    UserID Score Event

    1 100 1

    1 200 2

    1 150 3

    .....

    .....

    .....

    2 120 1

    2 420 2

    2 320 3

    .....

    .....

    .....

    From the above table i need the to get the sum of top 10 score records group by user id.

    E.G:

    UserID Top10Sum

    1 5000

    2 4000

    How can i write query for this...

    Thanks,

    Vijay Bandari

  • Hi,

    try this

    declare @abc table ( UserID int,Score int, Event int)

    insert into @abc values (1,100,1)

    insert into @abc values (1,200,2)

    insert into @abc values (1,150,3)

    insert into @abc values (1,100,4)

    insert into @abc values (1,100,5)

    insert into @abc values (2,100,1)

    insert into @abc values (2,100,2)

    insert into @abc values (2,100,3)

    insert into @abc values (2,100,4)

    insert into @abc values (2,100,5)

    select UserID,sum(Score)Score

    from @abc

    where Event in(select top 2 Event from @abc)

    group by UserID

    --

    RESULT FOR TOP 2 ROWS

    UserIDScore

    1300

    2200

    ARUN SAS

  • Vijayd,

    You may want to see if you can more clearly state the requirements. I may be interpreting differently than Arun did. The way I took them:

    -- Each userID may have many, many scores.

    -- Although the scores are identified with an "even", your desired result does not depend on that column in any way

    -- You want the sum of the top ten scores for each UserID, regardless of the events in which they were earned.

    From this, you can see you'll need a SUM function and an ORDER BY somewhere to identify the top 10. Using Arun's test data (something you'll want to provide yourself in any future questions -- it makes it a lot easier for people to dive in and help), and the test case of top two scores (instead of ten), I would expect the results to be:

    [Code]

    UserID ScoreSum

    1 350 --------- Sum of 200 and 150, that user's top two scores

    2 200 --------- Sum of 100 and 100, that user's top two scores

    [/code]

    (It's a good idea to correlate your desired results to the given test data so people can better understand the requirements)

    So, the problem boils down to looking at the scores of each userid, putting them in descending order and adding the top two (or ten in your original problem). I would use the ROW_NUMBER() function to do this. The tricky part is using this function if you're not familiar with it. The basic operation is controlled in the "over" operation with an "order by" and an optional "partition by" that is analogous to group by.

    Neither ROW_NUMBER nor the other ranking functions, RANK(), DENSERANK() and NTILE() may be used in the WHERE clause, so you have to code an intermediate result set from which you simply select the rows in the range desired (top two or top ten or whatever). That may be a correlated subquery or a CTE. I've personally become enamored with the CTE form as a way to define this "tough" stuff first. In any case, here's a script that does it both ways. Again, thanks to Arun for the test data.

    declare @abc table ( UserID int,Score int, Event int)

    insert into @abc values (1,100,1)

    insert into @abc values (1,200,2)

    insert into @abc values (1,150,3)

    insert into @abc values (1,100,4)

    insert into @abc values (1,100,5)

    insert into @abc values (2,100,1)

    insert into @abc values (2,100,2)

    insert into @abc values (2,100,3)

    insert into @abc values (2,100,4)

    insert into @abc values (2,100,5)

    --

    -- Numbered result set defined in a CTE

    ;with numberedScores (Number, UserID, Score) as

    (Select Row_Number() over (partition by UserID order by score desc)

    ,UserID, Score

    from @abc)

    select UserID

    ,sum(Score) SumTopScores

    from numberedScores

    where Number <= 2

    group by UserID

    order by sum(score) desc

    --

    -- Same thing again. This time the numbered result set is in a subquery

    select UserID

    ,sum(Score) SumTopScores

    from (Select Number = Row_Number() over (partition by UserID order by score desc)

    ,UserID, Score

    from @abc) abc2

    where Number <= 2

    group by UserID

    order by sum(score) desc

  • John, it's all so cool, but nothing's gonna work in SQL2000.

    Sorry.

    _____________
    Code for TallyGenerator

  • Sergiy (3/20/2009)


    John, it's all so cool, but nothing's gonna work in SQL2000.

    Sorry.

    Arrrrggghhhh!!!!!

    My apologies... I saw the topic in the daily e-mail and lost track of which forum we're in.

    You're right that ROW_NUMBER() and CTE's are new with SQL 2005.

  • OK. Again, sorry for the SQL 2005 solution if you really need one for SQL 2000. I'm not real happy with this one either as it may not be the most efficient solution, but it does the job.

    Please do respond with your final results -- did this work for you?

    A new temp table (or table variable, as here) uses an IDENTITY field to provide a hook for limiting the scope of the SUM() function. Doing a Select min(rownbr) for each UserID is a bit RBAR*-ish, hence my concern about the efficiency of this code.

    = = = = = = = = = =

    declare @abc table ( UserID int,Score int, Event int)

    insert into @abc values (1,100,1)

    insert into @abc values (1,200,2)

    insert into @abc values (1,150,3)

    insert into @abc values (1,100,4)

    insert into @abc values (1,100,5)

    insert into @abc values (2,100,1)

    insert into @abc values (2,100,2)

    insert into @abc values (2,100,3)

    insert into @abc values (2,100,4)

    insert into @abc values (2,100,5)

    insert into @abc values (3,50,5)

    insert into @abc values (3,150,5)

    insert into @abc values (3,70,4)

    insert into @abc values (3,85,7)

    insert into @abc values (3,150,5)

    insert into @abc values (3,25,15)

    declare @range int

    set @range = 3

    declare @OrderedABC table (UserID int, Score int, RowNbr int identity)

    Insert into @OrderedABC

    select UserID, Score

    from @abc

    order by UserID, Score Desc

    Select * from @OrderedABC

    Select o.UserID, sum(o.score)

    from @OrderedABC o

    Where o.rownbr <= (Select min(rownbr) + (@range - 1)

    from @OrderedABC where userID = o.userid)

    group by o.UserID

    *RBAR is "Row By Agonizing Row", a term coined by stand-out SSC contributor Jeff Moden. If you can, do look for a set-based solution. Perhaps a bit of searching for Row_Number() emulation in SQL 2000 may turn something up.

  • john.arnott (3/25/2009)


    *RBAR is "Row By Agonizing Row", a term coined by stand-out SSC contributor Jeff Moden. If you can, do look for a set-based solution. Perhaps a bit of searching for Row_Number() emulation in SQL 2000 may turn something up.

    vijayd, speaking of RBAR, I believe this

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/ will do *exactly* what you need to do, and probably more. Be sure to read the caveats before you open the link under "Resources"

    Greg
    _________________________________________________________________________________________________
    The glass is at one half capacity: nothing more, nothing less.

  • Hey thanks to everyone for responding for my problem.

    john.arnott - Thanks for your solution.. after i digged into google for solution i myself found the solution using RowNumber (Sorry i am using sql 2005, i didn't observed that i am posting this thread in sql 2000 forum).

    I myself tried different approches using RowNumber and got the solutions same as John.Arnott 🙂

    I have created a view using following query

    SELECT

    UserID,

    ROW_NUMBER() OVER(ORDER BY SUM(TotalScore) DESC) AS Rank,

    SUM(TotalScore) Total10Score

    FROM Leaders L1 WHERE L1.ContestID IN (SELECT TOP 10 ContestID FROM Leaders WHERE UserID = L1.UserID ORDER BY TotalScore DESC)

    GROUP BY UserID

    Here Leader is the table which has user id, score and contestid

    Thanks,

    Vijay Bandari

  • Vijayd,

    Glad to have helped. (It is a bit ironic that you posted in the SQL2000 forum without noticing and that I responded here, equally oblivious to the greater context).

    Your solution looks pretty good. You've moved the Row_Number() function into the main query, which simplifies it a bit by no longer having a CTE or sub-query, but it now depends on the contestID being unique for each UserID. If someone has two scores for the same contestID, they will both be added into their reported total. This may be impossible for your situation, but you (and others who stumble upon this thread) should be aware of the possibility.

  • The Leaders is a view which holds score for a contest for particular user (UserID - ContestID - Unique).... 🙂

    I used ROW_NUMBER() in main query to get the rank of the user by top 10 total scores 🙂

    Admin - Please move this post to SQL 2005/2008 Category 🙂

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply