DENSE_RANK problems

  • Hi Guys,

    I am currently making hard work of a dense_rank function, I am trying to calculate a new column that I can use as a sort column in Power BI. Here is some ddl:

    if exists (select OBJECT_ID('tempdb..#t'))
    drop table #t
    go

    create table #t (id int, rating numeric, rating_description nvarchar(10), expected_rank int)
    go

    insert into #t values
    (1, 50.00, '50 KVA', 1),
    (2, 50.00, '50 KVA', 1),
    (3, 50.00, '50 KVA', 1),
    (4, 75.00, '75 KVA', 2),
    (5, 1250.00, '1250 KVA', 3),
    (6, 50.00, '50 KW', 4),
    (7, 1200.00, '1200 KW', 5),
    (8, 25.00, '25 AMP', 6),
    (9, 125.00, '125 AMP', 7),
    (10, 225.00, '225 AMP', 8),
    (11, 225.00, '225 AMP', 8);

    select
    *,
    dense_rank() over(partition by rating_description order by rating) actual_rank
    from #t
    order by id

    The expected_rank column is what I am trying to achieve, I need actual_rank to output the same

    Please help

  • Can you explain the ordering logic, please?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I don't think SQL will be able to rank those the way you wanted them.  The reason why - if you are ordering it by "Rating", ID 8 has a rating of 25 which is lower than the 50's so, I would expect that one to be number 1.

    BUT as soon as you partition by "rating_description", your dense rank is going to give you "1" for every result as that is how the partitioning works (will be explained in more detail below).

    What I would try to do is order the data by a column other than ID to try to get your data in the order that you need for dense rank, then use your dense_rank() over (order by <order you determined in previous step>).  That will match things up how you like.  If you can't get it ordered the way you like by ignoring the ID column, dense_rank is not likely going to help you get those numbers you are looking for.  Unfortunately, I do not see a way to make that work with the data you have.

    The CLOSEST I am able to get with my testing is to order the dense_rank by "rating, rating_description" and remove the partition.  Something like:

    select 
    *,
    dense_rank() over(order by rating, rating_description) actual_rank
    from #t
    ORDER by id

    My numbers are different than yours (expected vs actual rank), but you will notice that where you have 3 1's, I have 3 2's; where you have 2 8's I have 2 6's... so the quantities for the rank match up, but the values do not.

    The way to think about how "dense_rank()" works (at least how I like to think about it) is if you run the query:

    select 
    *,
    dense_rank() over(order by rating, rating_description) actual_rank
    from #t
    ORDER by id

    the rank is determined by doing:

    SELECT distinct
    rating,
    rating_description
    from #t
    ORDER by rating, rating_description

    and if the columns returned are identical, it gets the same value for the rank.  For example, with the above, your first row is 25, 25 AMP, so it gets rating 1.  Next 3 rows are 50, 50 KVA, so they get rank 2 and so on.  BUT if you throw an extra column in for the ordering that is unique (such as the id), the ranking for the 50, 50 KVA rows becomes 1, 50, 50 KVA and 2, 50, 50 KVA and 3, 50, 50 KVA and those are 3 unique rows, so you get 3 distinct ranks.

    Now, when you throw "partition by" into the equation, you are now saying that the ranking number should restart at 1 every time that column value changes.  So using the above example again, if you were to partition by rating, 25, 25 AMP would get rank 1, 50, 50 KVA would get rank 1 as the rating changed from 25 to 50, but 50, 50 KW would get rank 2 because the rating didn't change BUT the unique row did.

    I hope that makes sense about dense rank, but in your scenario it will not work for how you need the output UNLESS my changes to it are acceptable.

    sorry for the lengthy reply, but I wanted to explain things.  I hope I didn't make it more complicated or confusing for you.

    NOTE - I do not work for Microsoft and have no back-end knowledge of how the dense_rank() function works.  The above is just based on my experience with it and from trial and error and the above helped me to rationalize how it is working.  Microsoft MAY (likely does) have a more optimized way of handling dense_rank() than my explanation above.  I just find that the above example shows how dense rank is determining the rank.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Mr. Brian Gale, thank you so much, this works perfectly.

    Also, thanks for the detailed reply, I will take some time to have a read over it.

  • I am glad I could help.  I also want to make a note, which is in my signature, but I want to make it very clear:

    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!

    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

    Now, my suggestions were ALL SELECT statements, so it is unlikely that the code would break anything, but I still don't trust running random code I find online unless I understand it.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply