Ranking in tsql

  • Hi, Has anyone tried this before. I have a table with name and salary . Is it possible to rank the salary something similar as in the ex below

    AAA   50,000     2

    BBB    55,000     1

    CCC    35,000     3

    DDD     50,000    2

    The last column showing their ranking in terms of their salary. sql 2005 has rank() which solves this but is there anything similar that could be done in 2000. I would really appreciate the code if someone has tried this before. TIA

  • You could try something like

    SELECT EmployeeID, Salary, (SELECT COUNT(*)

                                           FROM Salaries AS S2

                                           WHERE S2.Salary < S1.Salary OR (

                                                     S2.Salary = S1.Salary AND S2.EmployeeID <= S1.EmployeeID) AS Rank

    FROM dbo.Salaries AS S1

    ORDER BY Salary, EmployeeID

    The issue here is that if the salaries are the same you need some way to prioritise one over the other I chose their EmployeeID.

    hth

     

    David

  • <<< edited to include the proper order by clause >>>

    This may do something like you are asking for...

    create

    table tblTestSalary (

    Name varchar(3),

    Salary

    int)

    go

    insert

    into tblTestSalary (Name,Salary)

    select

    'AAA','50000' UNION ALL

    select

    'BBB','55000' UNION ALL

    select

    'CCC','35000' UNION ALL

    select

    'DDD','50000'

    GO

    SELECT

    IDENTITY(int, 1,1) AS Rank,

    SELECT

    IDENTITY(int, 1,1) AS Rank,

    Salary

    INTO #RankedSalary

    FROM tblTestSalary

    GROUP BY Salary

    ORDER BY Salary DESC

    SELECT TS.Name,

    TS

    .Salary,

    RS

    .Rank

    FROM tblTestSalary TS

    INNER JOIN #RankedSalary RS ON RS.Salary = TS.Salary

     

    Enjoy

    Wayne

  • Be a bit leary of queries that have an inequality in the WHERE clause especially when a self-join is also present as it may form a "triangular join" which is a bit more than half of a full blown cross join... they tend to gag real bad on as little as 10,000 rows.

    The formula for the number of rows "touched" by <= and >= is (N2+N)/2 or, to see why they call it a "triangular join".... a 10 row self join <= needs to touch 55 rows...

       10  9  8  7  6  5  4  3  2  1

     1  x  x  x  x  x  x  x  x  x  x

     2  x  x  x  x  x  x  x  x  x

     3  x  x  x  x  x  x  x  x

     4  x  x  x  x  x  x  x

     5  x  x  x  x  x  x

     6  x  x  x  x  x

     7  x  x  x  x 

     8  x  x  x 

     9  x  x

    10  x

    The formula for the number of rows "touched" by < and > is almost as bad at (N2-N)/2 or 45 rows for a 10 row self join...

       10  9  8  7  6  5  4  3  2  1

     1  x  x  x  x  x  x  x  x  x 

     2  x  x  x  x  x  x  x  x 

     3  x  x  x  x  x  x  x  

     4  x  x  x  x  x  x 

     5  x  x  x  x  x 

     6  x  x  x  x 

     7  x  x  x   

     8  x  x  

     9  x 

    10 

    For a lousy 10,000 row <= self join?  50,005,000 rows will be "touched"... obviously, each row will be "touched" more than once.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My eyes pop up whenever my where clause uses an OR. It can be a real performance killer.

  • Hi, I don't know if there's a difference in the t-sql in 2005 and 2000, but I know this works in SQL Server 2005:

    select employee, salary,

    rank

    () over (partition by salary order by salary) as salary_rank

    from

    x_table

    order

    by salary

     

    if it doesn't I apologize for wasting your time.

  • >>Hi, I don't know if there's a difference in the t-sql in 2005 and 2000

    There is - the RANK OVER PARTITION is a T-SQL enhancement in 2005, which is why various other techniques need to be employed in 2000. Speaking of which ..

    SELECT IDENTITY(int, 1,1) AS Rank, Salary

    INTO #RankedSalary

    FROM tblTestSalary

    GROUP BY Salary

    ORDER BY Salary DESC

    This will not do what you expect. Use of IDENTITY() with an ORDER BY in conjunction with table creation does not guarantee that the identity values generated are in the same order as your ORDER BY.

    You need to create the temp table first, then use an INSERT with SELECT ... ORDER BY.

     

  • I would store this kind of ranking in separate column and update this column using trigger when data in the table is changed.

    This update not suppose to happen to often, so once someone's salary has been changed trigger recalculates the rank and it's ready for you any moment.

    If you afraid to affect existing functionality with adding extra column (or just not allowed to do this) you may create new table with 2 columns: SalaryID, Rank.

    Then you setup trigger on table Salary to update table SalaryRank.

    Every time you need to take Rank into consideration you just join this new table to your query.

    _____________
    Code for TallyGenerator

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply