Dense_Rank Challenge

  • I have a set of values in a temp table (inside a stored procedure) that I need to dense_rank, but I do not have a unique value to rank on. I need to dense_rank or group the rows somehow. The Dense_Rank column below is the desired output. The logic is based on the Position values, which range from 1 to 10. For example, the first and second rows (Positions 1 and 6) belong to the first group, so Dense_Rank = 1;

    the third and fourth rows (Positions 1 and 2) belong to the second group, so Dense_Rank = 2;

    and so on...

    Position  StartDateTime            Dense_Rank

    1         2005-08-14 01:18:00.000  1
    6         2005-08-14 01:48:00.000  1
    1         2005-08-14 02:48:00.000  2
    2         2005-08-14 02:54:00.000  2
    1         2005-08-14 04:36:00.000  3
    1         2005-08-14 05:48:00.000  4
    3         2005-08-14 06:00:00.000  4
    4         2005-08-14 06:06:00.000  4
    5         2005-08-14 06:12:00.000  4
    6         2005-08-14 06:18:00.000  4
    7         2005-08-14 06:24:00.000  4
    8         2005-08-14 06:30:00.000  4
    1         2005-08-14 06:54:00.000  5
    2         2005-08-14 07:00:00.000  5
    7         2005-08-14 07:30:00.000  5

    CREATE TABLE #FinalValues

    (

    Position int,

    StartDateTime datetime

    )

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 01:18:00.000')

    INSERT INTO #FinalValues

    VALUES(6,'2005-08-14 01:48:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 02:48:00.000')

    INSERT INTO #FinalValues

    VALUES(2,'2005-08-14 02:54:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 04:36:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 05:48:00.000')

    INSERT INTO #FinalValues

    VALUES(3,'2005-08-14 06:00:00.000')

    INSERT INTO #FinalValues

    VALUES(4,'2005-08-14 06:06:00.000')

    INSERT INTO #FinalValues

    VALUES(5,'2005-08-14 06:12:00.000')

    INSERT INTO #FinalValues

    VALUES(6,'2005-08-14 06:18:00.000')

    INSERT INTO #FinalValues

    VALUES(7,'2005-08-14 06:24:00.000')

    INSERT INTO #FinalValues

    VALUES(8,'2005-08-14 06:30:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 06:54:00.000')

    INSERT INTO #FinalValues

    VALUES(2,'2005-08-14 07:00:00.000')

    INSERT INTO #FinalValues

    VALUES(7,'2005-08-14 07:30:00.000')

    Thank you so much in advance!

  • I doubt I'm going to be the only one bewildered by this request.

    According to Wikipedia:

    Dense ranking ("1223" ranking)

    In dense ranking, items that compare equal receive the same ranking number, and the next item(s) receive the immediately following ranking number. Equivalently, each item's ranking number is 1 plus the number of items ranked above it that are distinct with respect to the ranking order.

    Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 3 ("third").

    That makes sense. What doesn't make sense (to me at least) is the correlation of dense ranking to the data that you posted.
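    For illustration, here is what a plain DENSE_RANK() would give on your sample rows - just a sketch that ranks by the Position value (the StandardDenseRank alias is mine); it does not match the Dense_Rank column you posted, which is the point:

    -- A plain DENSE_RANK() for comparison (sketch only).
    -- Every row with Position = 1 gets rank 1, which is not the 1..5 grouping shown above.
    SELECT Position,
           StartDateTime,
           DENSE_RANK() OVER (ORDER BY Position) AS StandardDenseRank
    FROM   #FinalValues
    ORDER BY StartDateTime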

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • Hi Garadin,

    Well, I am very stressed out too.

    What I really need to do is create groups, or some kind of value that I can group by, for the Position values that fall in a sequence from 1 to 10, even if not every position in the sequence is present; that is why I thought dense_rank could have been the solution. 🙁

  • What I still don't understand:

    What you are grouping by to establish those "dense ranks" that you've given in your original data. The positions are all over the board within the ranks, and the times don't follow any pattern that I can see (no 1-hour or 2-hour interval fits your ranks). If there is some other field that you are "grouping" by, it's fairly important to include it with your sample data.

    What your end goal is here. Is this ranking supposed to help you generate some other report? If it is, please explain your end goal, and give us a full sample of the data you are working with so that we can understand the obstacle in front of you. As things stand right now, you really haven't provided enough information for us to help you.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

    Garadin,

    I do not have more data; this is it. I do not have anything else to group by. You are right, the times do not follow any pattern. I need to make a group based on the Position: every time it starts over at 1, I need to begin a new group. My final output should look like this in Reporting Services:

    Position StartDateTime

    1 2005-08-14 01:18:00.000

    6 2005-08-14 01:48:00.000

    1 2005-08-14 02:48:00.000

    2 2005-08-14 02:54:00.000

    1 2005-08-14 04:36:00.000

    1 2005-08-14 05:48:00.000

    3 2005-08-14 06:00:00.000

    4 2005-08-14 06:06:00.000

    5 2005-08-14 06:12:00.000

    6 2005-08-14 06:18:00.000

    7 2005-08-14 06:24:00.000

    8 2005-08-14 06:30:00.000

    1 2005-08-14 06:54:00.000

    2 2005-08-14 07:00:00.000

    7 2005-08-14 07:30:00.000

    You're dense-ranking by hour? How about -

    select *,

    dense_rank() over (order by dateadd(hour,datediff(hour,0,startdatetime),0))

    from #FinalValues

    Hmm - never mind - sounds like something entirely different.

    You should look at the "running total" solution by Jeff Moden - it would work for this scenario.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • The running total solution is described here:

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt,

    I tried that, but it does not work because I need data that falls outside the hour, for example:

    1 2005-08-14 05:48:00.000

    3 2005-08-14 06:00:00.000

    4 2005-08-14 06:06:00.000

    5 2005-08-14 06:12:00.000

    6 2005-08-14 06:18:00.000

    7 2005-08-14 06:24:00.000

    8 2005-08-14 06:30:00.000

  • If you haven't read the article that Matt listed, you really need to go back and read it so you understand why the following works... and it works at a rate of over 7 million rows per minute...

    --drop table #finalvalues

    go

    --===== Your table with a new column and the right kind of index

    CREATE TABLE #FinalValues

    (

    Position int NOT NULL,

    StartDateTime datetime NOT NULL,

    jbmRank INT,

    PRIMARY KEY CLUSTERED (StartDateTime,Position)

    )

    --===== This just populates the data as before

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 01:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 01:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 02:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 02:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 04:36:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 05:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(3,'2005-08-14 06:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(4,'2005-08-14 06:06:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(5,'2005-08-14 06:12:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 06:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 06:24:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(8,'2005-08-14 06:30:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 06:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 07:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 07:30:00.000')

    --===== Declare a couple of necessary variables with obvious names

    DECLARE @PrevPosition INT,

    @PrevjbmRank INT

    SELECT @PrevPosition = 0,

    @PrevjbmRank = 0

    --===== Do the update for rank like in the running total article based on a decrease in position

    UPDATE #FinalValues

    SET @PrevjbmRank = jbmRank = CASE WHEN @PrevPosition < Position THEN @PrevjbmRank + 1 ELSE 1 END,

    @PrevPosition = Position

    FROM #FinalValues WITH (INDEX(0))

    --===== Show the final results...

    SELECT *

    FROM #FinalValues

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    This solution is close to what I need, but I need to create groups. I cannot create groups with jbmRank. 🙁

    Position  StartDateTime            jbmRank

    1         2005-08-14 01:18:00.000  1
    6         2005-08-14 01:48:00.000  2
    1         2005-08-14 02:48:00.000  1
    2         2005-08-14 02:54:00.000  2
    1         2005-08-14 04:36:00.000  1
    1         2005-08-14 05:48:00.000  1
    3         2005-08-14 06:00:00.000  2
    4         2005-08-14 06:06:00.000  3
    5         2005-08-14 06:12:00.000  4
    6         2005-08-14 06:18:00.000  5
    7         2005-08-14 06:24:00.000  6
    8         2005-08-14 06:30:00.000  7
    1         2005-08-14 06:54:00.000  1
    2         2005-08-14 07:00:00.000  2
    7         2005-08-14 07:30:00.000  3

    I need something like this:

    Position StartDateTime Group

    1 2005-08-14 01:18:00.000 1

    6 2005-08-14 01:48:00.000 1

    1 2005-08-14 02:48:00.000 2

    2 2005-08-14 02:54:00.000 2

    1 2005-08-14 04:36:00.000 3

    1 2005-08-14 05:48:00.000 4

    3 2005-08-14 06:00:00.000 4

    4 2005-08-14 06:06:00.000 4

    5 2005-08-14 06:12:00.000 4

    6 2005-08-14 06:18:00.000 4

    7 2005-08-14 06:24:00.000 4

    8 2005-08-14 06:30:00.000 4

    1 2005-08-14 06:54:00.000 5

    2 2005-08-14 07:00:00.000 5

    7 2005-08-14 07:30:00.000 5

    I will read that article. Thank you very much!

  • Ok, now I get what you're trying to do. Amazing how much more sense it made as soon as you put the lines in between your groups =). So the table is ordered by time, and then Position (which relates to who knows what) takes sequential values from 1-10. You want to make a group corresponding to each 1-10 restart. I haven't had time to look at the running total post in detail yet, but if you can't figure out how to apply that to your situation, maybe a piece of code I just wrote for another post would point you in the right direction. It's basically doing the same thing you're trying to do here. It's not going to be nearly as fast as the method already suggested here, but if you're only dealing with very small data sets, it may not have to be.

    By step 2 of this script (after slightly modifying it to fit your scenario), you should have your group field.

    http://www.sqlservercentral.com/Forums/Topic582878-338-1.aspx#bm582972
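    As another point of reference, here is a purely set-based sketch of the same grouping. It requires SQL Server 2012 or later (it relies on LAG and a windowed SUM, which your version won't have), and the IsRestart/Grp names are just mine: flag each row where Position fails to increase, then running-sum the flags:

    -- Set-based sketch (SQL Server 2012+ only): a new group starts whenever
    -- Position does not increase from the prior row, i.e. the 1-to-10 sequence restarted.
    ;WITH Flagged AS
    (
        SELECT Position,
               StartDateTime,
               CASE WHEN Position <= LAG(Position, 1, 0) OVER (ORDER BY StartDateTime)
                    THEN 1 ELSE 0
               END AS IsRestart
        FROM   #FinalValues
    )
    SELECT Position,
           StartDateTime,
           1 + SUM(IsRestart) OVER (ORDER BY StartDateTime
                                    ROWS UNBOUNDED PRECEDING) AS Grp
    FROM   Flagged
    ORDER BY StartDateTime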

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • JohnDBA (10/8/2008)


    Jeff,

    This solution is close to what I need, but I need to create groups. I cannot create groups with jbmRank. 🙁

    I will read that article. Thank you very much!

    You bet. Sorry for the wrong answer... heh... I shouldn't post late at night.

    Anyway, the solution below uses the same principle as my previous post and, as you read in the article, isn't classified as a RBAR solution. It will resolve a million rows in something less than 7 seconds... I changed the code below...

    drop table #finalvalues

    go

    --===== Your table with a new column and the right kind of index

    CREATE TABLE #FinalValues

    (

    Position int NOT NULL,

    StartDateTime datetime NOT NULL,

    jbmGroup INT,

    PRIMARY KEY CLUSTERED (StartDateTime,Position)

    )

    --===== This just populates the data as before

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 01:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 01:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 02:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 02:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 04:36:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 05:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(3,'2005-08-14 06:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(4,'2005-08-14 06:06:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(5,'2005-08-14 06:12:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 06:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 06:24:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(8,'2005-08-14 06:30:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 06:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 07:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 07:30:00.000')

    --===== Declare a couple of necessary variables with obvious names

    DECLARE @PrevPosition INT,

    @PrevjbmGroup INT

    SELECT @PrevPosition = 0,

    @PrevjbmGroup = 1

    --===== Do the update for the group (like in the running total article): start a new group whenever position decreases or repeats

    UPDATE #FinalValues

    SET @PrevjbmGroup = jbmGroup = CASE WHEN @PrevPosition >= Position THEN @PrevjbmGroup + 1 ELSE @PrevjbmGroup END,

    @PrevPosition = Position

    FROM #FinalValues WITH (INDEX(0))

    --===== Show the final results...

    SELECT *

    FROM #FinalValues
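    And since the end goal is to group, here is a quick sketch of putting the new column to work (the aggregates and aliases are just hypothetical examples of how jbmGroup can drive a GROUP BY):

    --===== Example only: grouping on the new column
    SELECT  jbmGroup,
            MIN(StartDateTime) AS GroupStart,
            MAX(StartDateTime) AS GroupEnd,
            COUNT(*)           AS RowsInGroup
    FROM    #FinalValues
    GROUP BY jbmGroup
    ORDER BY jbmGroup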

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Thank you so much! That is perfect! 🙂

  • You bet. Thanks for the feedback.

    Just remember... if you can't wedge a clustered index in there like I did, this solution will not work correctly.
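    As a rough sketch of that caveat (based on the usual guidance in the running-total article; the TABLOCKX and MAXDOP hints here are assumptions to double-check against it), the update wants to run in clustered-index order, single-threaded, with the table locked:

    -- Sketch: the same group update with the commonly recommended safeguards
    -- for this technique (clustered-index scan order, exclusive table lock,
    -- no parallelism). Verify these against the running-total article.
    DECLARE @PrevPosition INT,
            @PrevjbmGroup INT
    SELECT  @PrevPosition = 0,
            @PrevjbmGroup = 1

    UPDATE #FinalValues
       SET @PrevjbmGroup = jbmGroup = CASE WHEN @PrevPosition >= Position
                                           THEN @PrevjbmGroup + 1
                                           ELSE @PrevjbmGroup END,
           @PrevPosition = Position
      FROM #FinalValues WITH (INDEX(0), TABLOCKX)
    OPTION (MAXDOP 1)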

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
