Dense_Rank Challenge

  • I have a set of values in a temp table (inside a stored procedure) that I need to dense_rank, but I do not have a unique value to rank on. I need to dense_rank or group the rows somehow. The Dense_Rank column below is the desired output. The logic is based on the Position values, which range from 1 to 10. For example, the first and second rows (Positions 1 and 6) belong to the first group, so Dense_Rank = 1;

    the third and fourth rows (Positions 1 and 2) belong to the second group, so Dense_Rank = 2;

    and so on...

    Position  StartDateTime            Dense_Rank

    1         2005-08-14 01:18:00.000  1
    6         2005-08-14 01:48:00.000  1
    1         2005-08-14 02:48:00.000  2
    2         2005-08-14 02:54:00.000  2
    1         2005-08-14 04:36:00.000  3
    1         2005-08-14 05:48:00.000  4
    3         2005-08-14 06:00:00.000  4
    4         2005-08-14 06:06:00.000  4
    5         2005-08-14 06:12:00.000  4
    6         2005-08-14 06:18:00.000  4
    7         2005-08-14 06:24:00.000  4
    8         2005-08-14 06:30:00.000  4
    1         2005-08-14 06:54:00.000  5
    2         2005-08-14 07:00:00.000  5
    7         2005-08-14 07:30:00.000  5

    CREATE TABLE #FinalValues

    (

    Position int,

    StartDateTime datetime

    )

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 01:18:00.000')

    INSERT INTO #FinalValues

    VALUES(6,'2005-08-14 01:48:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 02:48:00.000')

    INSERT INTO #FinalValues

    VALUES(2,'2005-08-14 02:54:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 04:36:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 05:48:00.000')

    INSERT INTO #FinalValues

    VALUES(3,'2005-08-14 06:00:00.000')

    INSERT INTO #FinalValues

    VALUES(4,'2005-08-14 06:06:00.000')

    INSERT INTO #FinalValues

    VALUES(5,'2005-08-14 06:12:00.000')

    INSERT INTO #FinalValues

    VALUES(6,'2005-08-14 06:18:00.000')

    INSERT INTO #FinalValues

    VALUES(7,'2005-08-14 06:24:00.000')

    INSERT INTO #FinalValues

    VALUES(8,'2005-08-14 06:30:00.000')

    INSERT INTO #FinalValues

    VALUES(1,'2005-08-14 06:54:00.000')

    INSERT INTO #FinalValues

    VALUES(2,'2005-08-14 07:00:00.000')

    INSERT INTO #FinalValues

    VALUES(7,'2005-08-14 07:30:00.000')

    Thank you so much in advance!

  • I doubt I'm going to be the only one bewildered by this request.

    According to Wikipedia:

    Dense ranking ("1223" ranking)

    In dense ranking, items that compare equal receive the same ranking number, and the next item(s) receive the immediately following ranking number. Equivalently, each item's ranking number is 1 plus the number of items ranked above it that are distinct with respect to the ranking order.

    Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 3 ("third").

    That makes sense. What doesn't make sense (to me at least) is the correlation of dense ranking to the data that you posted.
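    For illustration, here is what a plain DENSE_RANK() would give on your sample rows - just a sketch that ranks by the Position value (the StandardDenseRank alias is mine); it does not match the Dense_Rank column you posted, which is the point:

    -- A plain DENSE_RANK() for comparison (sketch only).
    -- Every row with Position = 1 gets rank 1, which is not the 1..5 grouping shown above.
    SELECT Position,
           StartDateTime,
           DENSE_RANK() OVER (ORDER BY Position) AS StandardDenseRank
    FROM   #FinalValues
    ORDER BY StartDateTime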

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • Hi Garadin,

    Well, I am very stressed out too.

    What I really need to do is create groups, or some kind of value that I can group by, for the Position values that fall in a sequence from 1 to 10, even if not every position in the sequence is present; that is why I thought dense_rank could have been the solution. 🙁

  • What I still don't understand:

    What you are grouping by to establish those "dense ranks" that you've given in your original data. The positions are all over the board within the ranks, and the times don't follow any pattern that I can see (no 1-hour or 2-hour interval fits your ranks). If there is some other field that you are "grouping" by, it's fairly important to include it with your sample data.

    What your end goal is here. Is this ranking supposed to help you generate some other report? If it is, please explain your end goal, and give us a full sample of the data you are working with so that we can understand the obstacle in front of you. As things stand right now, you really haven't provided enough information for us to help you.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

    Garadin,

    I do not have more data; this is it. I do not have anything else to group by. You are right, the times do not follow any pattern. I need to make a group based on the Position: every time it starts over at 1, I need to begin a new group. My final output should look like this in Reporting Services:

    Position StartDateTime

    1 2005-08-14 01:18:00.000

    6 2005-08-14 01:48:00.000

    1 2005-08-14 02:48:00.000

    2 2005-08-14 02:54:00.000

    1 2005-08-14 04:36:00.000

    1 2005-08-14 05:48:00.000

    3 2005-08-14 06:00:00.000

    4 2005-08-14 06:06:00.000

    5 2005-08-14 06:12:00.000

    6 2005-08-14 06:18:00.000

    7 2005-08-14 06:24:00.000

    8 2005-08-14 06:30:00.000

    1 2005-08-14 06:54:00.000

    2 2005-08-14 07:00:00.000

    7 2005-08-14 07:30:00.000

    You're dense-ranking by hour? How about -

    select *,

    dense_rank() over (order by dateadd(hour,datediff(hour,0,startdatetime),0))

    from #FinalValues

    Hmm - never mind - sounds like something entirely different.

    You should look at the "running total" solution by Jeff Moden - it would work for this scenario.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • The running total solution is described here:

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt,

    I tried that, but it does not work because I need data that falls outside the hour, for example:

    1 2005-08-14 05:48:00.000

    3 2005-08-14 06:00:00.000

    4 2005-08-14 06:06:00.000

    5 2005-08-14 06:12:00.000

    6 2005-08-14 06:18:00.000

    7 2005-08-14 06:24:00.000

    8 2005-08-14 06:30:00.000

  • If you haven't read the article that Matt listed, you really need to go back and read it so you understand why the following works... and it works at a rate of over 7 million rows per minute...

    --drop table #finalvalues

    go

    --===== Your table with a new column and the right kind of index

    CREATE TABLE #FinalValues

    (

    Position int NOT NULL,

    StartDateTime datetime NOT NULL,

    jbmRank INT,

    PRIMARY KEY CLUSTERED (StartDateTime,Position)

    )

    --===== This just populates the data as before

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 01:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 01:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 02:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 02:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 04:36:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 05:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(3,'2005-08-14 06:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(4,'2005-08-14 06:06:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(5,'2005-08-14 06:12:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 06:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 06:24:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(8,'2005-08-14 06:30:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 06:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 07:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 07:30:00.000')

    --===== Declare a couple of necessary variables with obvious names

    DECLARE @PrevPosition INT,

    @PrevjbmRank INT

    SELECT @PrevPosition = 0,

    @PrevjbmRank = 0

    --===== Do the update for rank like in the running total article based on a decrease in position

    UPDATE #FinalValues

    SET @PrevjbmRank = jbmRank = CASE WHEN @PrevPosition < Position THEN @PrevjbmRank + 1 ELSE 1 END,

    @PrevPosition = Position

    FROM #FinalValues WITH (INDEX(0))

    --===== Show the final results...

    SELECT *

    FROM #FinalValues

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    This solution is close to what I need, but I need to create groups. I cannot create groups with jbmRank. 🙁

    Position  StartDateTime            jbmRank

    1         2005-08-14 01:18:00.000  1
    6         2005-08-14 01:48:00.000  2
    1         2005-08-14 02:48:00.000  1
    2         2005-08-14 02:54:00.000  2
    1         2005-08-14 04:36:00.000  1
    1         2005-08-14 05:48:00.000  1
    3         2005-08-14 06:00:00.000  2
    4         2005-08-14 06:06:00.000  3
    5         2005-08-14 06:12:00.000  4
    6         2005-08-14 06:18:00.000  5
    7         2005-08-14 06:24:00.000  6
    8         2005-08-14 06:30:00.000  7
    1         2005-08-14 06:54:00.000  1
    2         2005-08-14 07:00:00.000  2
    7         2005-08-14 07:30:00.000  3

    I need something like this:

    Position StartDateTime Group

    1 2005-08-14 01:18:00.000 1

    6 2005-08-14 01:48:00.000 1

    1 2005-08-14 02:48:00.000 2

    2 2005-08-14 02:54:00.000 2

    1 2005-08-14 04:36:00.000 3

    1 2005-08-14 05:48:00.000 4

    3 2005-08-14 06:00:00.000 4

    4 2005-08-14 06:06:00.000 4

    5 2005-08-14 06:12:00.000 4

    6 2005-08-14 06:18:00.000 4

    7 2005-08-14 06:24:00.000 4

    8 2005-08-14 06:30:00.000 4

    1 2005-08-14 06:54:00.000 5

    2 2005-08-14 07:00:00.000 5

    7 2005-08-14 07:30:00.000 5

    I will read that article. Thank you very much!

  • Ok, now I get what you're trying to do. Amazing how much more sense it made as soon as you put the lines in between your groups =). So the table is ordered by time, and then Position (which relates to who knows what) takes sequential values from 1-10. You want to make a group corresponding to each 1-10 restart. I haven't had time to look at the running total post in detail yet, but if you can't figure out how to apply that to your situation, maybe a piece of code I just wrote for another post would point you in the right direction. It's basically doing the same thing you're trying to do here. It's not going to be nearly as fast as the method already suggested here, but if you're only dealing with very small data sets, it may not have to be.

    By step 2 of this script (after slightly modifying it to fit your scenario), you should have your group field.

    http://www.sqlservercentral.com/Forums/Topic582878-338-1.aspx#bm582972
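    As another point of reference, here is a purely set-based sketch of the same grouping. It requires SQL Server 2012 or later (it relies on LAG and a windowed SUM, which your version won't have), and the IsRestart/Grp names are just mine: flag each row where Position fails to increase, then running-sum the flags:

    -- Set-based sketch (SQL Server 2012+ only): a new group starts whenever
    -- Position does not increase from the prior row, i.e. the 1-to-10 sequence restarted.
    ;WITH Flagged AS
    (
        SELECT Position,
               StartDateTime,
               CASE WHEN Position <= LAG(Position, 1, 0) OVER (ORDER BY StartDateTime)
                    THEN 1 ELSE 0
               END AS IsRestart
        FROM   #FinalValues
    )
    SELECT Position,
           StartDateTime,
           1 + SUM(IsRestart) OVER (ORDER BY StartDateTime
                                    ROWS UNBOUNDED PRECEDING) AS Grp
    FROM   Flagged
    ORDER BY StartDateTime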

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • JohnDBA (10/8/2008)


    Jeff,

    This solution is close to what I need, but I need to create groups. I cannot create groups with jbmRank. 🙁

    I will read that article. Thank you very much!

    You bet. Sorry for the wrong answer... heh... I shouldn't post late at night.

    Anyway, the solution below uses the same principle as my previous post and, as you read in the article, isn't classified as a RBAR solution. It will resolve a million rows in something less than 7 seconds... I changed the code below...

    drop table #finalvalues

    go

    --===== Your table with a new column and the right kind of index

    CREATE TABLE #FinalValues

    (

    Position int NOT NULL,

    StartDateTime datetime NOT NULL,

    jbmGroup INT,

    PRIMARY KEY CLUSTERED (StartDateTime,Position)

    )

    --===== This just populates the data as before

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 01:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 01:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 02:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 02:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 04:36:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 05:48:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(3,'2005-08-14 06:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(4,'2005-08-14 06:06:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(5,'2005-08-14 06:12:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(6,'2005-08-14 06:18:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 06:24:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(8,'2005-08-14 06:30:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(1,'2005-08-14 06:54:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(2,'2005-08-14 07:00:00.000')

    INSERT INTO #FinalValues (Position,StartDateTime)

    VALUES(7,'2005-08-14 07:30:00.000')

    --===== Declare a couple of necessary variables with obvious names

    DECLARE @PrevPosition INT,

    @PrevjbmGroup INT

    SELECT @PrevPosition = 0,

    @PrevjbmGroup = 1

    --===== Do the update for the group (like in the running total article): start a new group whenever position decreases or repeats

    UPDATE #FinalValues

    SET @PrevjbmGroup = jbmGroup = CASE WHEN @PrevPosition >= Position THEN @PrevjbmGroup + 1 ELSE @PrevjbmGroup END,

    @PrevPosition = Position

    FROM #FinalValues WITH (INDEX(0))

    --===== Show the final results...

    SELECT *

    FROM #FinalValues
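    And since the end goal is to group, here is a quick sketch of putting the new column to work (the aggregates and aliases are just hypothetical examples of how jbmGroup can drive a GROUP BY):

    --===== Example only: grouping on the new column
    SELECT  jbmGroup,
            MIN(StartDateTime) AS GroupStart,
            MAX(StartDateTime) AS GroupEnd,
            COUNT(*)           AS RowsInGroup
    FROM    #FinalValues
    GROUP BY jbmGroup
    ORDER BY jbmGroup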

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Thank you so much! That is perfect! 🙂

  • You bet. Thanks for the feedback.

    Just remember... if you can't wedge a clustered index in there like I did, this solution will not work correctly.
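    As a rough sketch of that caveat (based on the usual guidance in the running-total article; the TABLOCKX and MAXDOP hints here are assumptions to double-check against it), the update wants to run in clustered-index order, single-threaded, with the table locked:

    -- Sketch: the same group update with the commonly recommended safeguards
    -- for this technique (clustered-index scan order, exclusive table lock,
    -- no parallelism). Verify these against the running-total article.
    DECLARE @PrevPosition INT,
            @PrevjbmGroup INT
    SELECT  @PrevPosition = 0,
            @PrevjbmGroup = 1

    UPDATE #FinalValues
       SET @PrevjbmGroup = jbmGroup = CASE WHEN @PrevPosition >= Position
                                           THEN @PrevjbmGroup + 1
                                           ELSE @PrevjbmGroup END,
           @PrevPosition = Position
      FROM #FinalValues WITH (INDEX(0), TABLOCKX)
    OPTION (MAXDOP 1)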

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
