how to get consecutive records grouped

Question

Post reply

how to get consecutive records grouped

the_justin

SSC Enthusiast

Points: 175
More actions
November 22, 2010 at 10:36 pm

#241577

hi all,
i'm trying to group consecutive record, anyone knows how to do this ?
this is my sample data:
/* what i have */
select dt, cat
from
(
select '2009-09-09 14:50:02.000' as dt,'OT' as cat
union all
select '2009-09-09 14:55:44.000' as dt,'OT' as cat
union all
select '2009-09-09 15:10:55.000' as dt,'OT' as cat
union all
select '2009-09-09 15:30:00.000' as dt,'OT' as cat
union all
select '2009-09-09 16:16:41.000' as dt,'OD' as cat
union all
select '2009-09-09 16:26:37.000' as dt,'OD' as cat
union all
select '2009-09-09 16:32:43.000' as dt,'OS' as cat
union all
select '2009-09-09 17:56:37.000' as dt,'OT' as cat
union all
select '2009-09-09 17:56:48.000' as dt,'OT' as cat
)
sub
/* what i want */
select dt, cat, gr
from
(
select '2009-09-09 14:50:02.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 14:55:44.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 15:10:55.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 15:30:00.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 16:16:41.000' as dt,'OD' as cat,2 as gr
union all
select '2009-09-09 16:26:37.000' as dt,'OD' as cat,2 as gr
union all
select '2009-09-09 16:32:43.000' as dt,'OS' as cat,3 as gr
union all
select '2009-09-09 17:56:37.000' as dt,'OT' as cat,4 as gr
union all
select '2009-09-09 17:56:48.000' as dt,'OT' as cat,4 as gr
)
sub
so basically, table 1 is my data, and table 2 is what i want table 1 to produce.
so far, the best i can come up is
with table1 as
(
select dt, cat
from
(
select '2009-09-09 14:50:02.000' as dt,'OT' as cat
union all
select '2009-09-09 14:55:44.000' as dt,'OT' as cat
union all
select '2009-09-09 15:10:55.000' as dt,'OT' as cat
union all
select '2009-09-09 15:30:00.000' as dt,'OT' as cat
union all
select '2009-09-09 16:16:41.000' as dt,'OD' as cat
union all
select '2009-09-09 16:26:37.000' as dt,'OD' as cat
union all
select '2009-09-09 16:32:43.000' as dt,'OS' as cat
union all
select '2009-09-09 17:56:37.000' as dt,'OT' as cat
union all
select '2009-09-09 17:56:48.000' as dt,'OT' as cat
)
sub
)
select
row_number() over (partition by cat order by dt) as rn1,
row_number() over (order by dt) as rn2,
row_number() over (order by dt) - row_number() over (partition by cat order by dt) as rn,
dt,
cat
from
table1
then group by rn. it works fine with small sample data, but when data gets bigger, the rn can get wrong result.
anybody has come across this problem before ? or has a solution for this ?
thank you
Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

Eralper SSCarpal Tunnel Points: 4438 More actions · Answer 1

Hello,

I guess what your are looking for is t-sql dense_rank() function

select

dense_rank() over (order by cat),

dt,

cat

from

table1

Eralper
SQL Server and T-SQL Tutorials and Articles
Microsoft Certification and Certification Exams

Sachin Nandanwar SSCertifiable Points: 7699 More actions · Answer 2

It's quite difficult to give a solution without actually looking at the records which are causing the problem.

My wild guess would be use a dense_rank instead of row_number.

select

dense_rank() over (partition by cat order by dt) as rn1,

row_number() over (order by dt) as rn2,

row_number() over (order by dt) - dense_rank() over (partition by cat order by dt) as rn,

dt,

cat

from

table1

--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server

Sachin Nandanwar SSCertifiable Points: 7699 More actions · Answer 3

Eralper (11/22/2010)
Hello,
I guess what your are looking for is t-sql dense_rank() function
select
dense_rank() over (order by cat),
dt,
cat
from
table1

Oops sorry didnt know you already had posted the solution.

--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

Sachin Nandanwar (11/22/2010)
It's quite difficult to give a solution without actually looking at the records which are causing the problem.

What on Earth are you talking about? The OP gave both the sample data and the desired result. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Eralper SSCarpal Tunnel Points: 4438 More actions · Answer 5

Sometimes happens to me too, but I believe it is better to mention the correct solution more than once

Of course if it is really correct solution

Eralper
SQL Server and T-SQL Tutorials and Articles
Microsoft Certification and Certification Exams

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 6

Justin,

Sachin's code almost did it. It just needed a push over the edge to get it where you wanted it. Here's the push...

WITH

cteGroups AS

(

SELECT dt,

cat,

Sequence = ROW_NUMBER() OVER (ORDER BY dt, cat),

TheGroup = ROW_NUMBER() OVER (ORDER BY dt, cat)

- ROW_NUMBER() OVER (PARTITION BY cat ORDER BY dt, cat)

FROM dbo.Table1

)

,

cteGrouped AS

(

SELECT dt,

cat,

OrderedGroup = Sequence - ROW_NUMBER() OVER (PARTITION BY TheGroup ORDER BY Sequence)

FROM cteGroups

)

SELECT dt,

cat,

gr = DENSE_RANK() OVER (ORDER BY OrderedGroup)

FROM cteGrouped

ORDER BY dt, cat

And very well done on your original post. It's real handy having BOTH the test data and the expected result in such a readily usable format. Thanks for taking the time.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

the_justin SSC Enthusiast Points: 175 More actions · Answer 7

thank you for the reply guys, however dense_rank does not solve it. i have a better data that shows my problem:

with table1 as

(

select dt, cat

from

(

select '2009-09-09 06:18:16' as dt,'OD' as cat

union all

select '2009-09-09 06:18:51' as dt,'OD' as cat

union all

select '2009-09-09 06:18:55' as dt,'OD' as cat

union all

select '2009-09-09 06:35:31' as dt,'OT' as cat

union all

select '2009-09-09 06:45:25' as dt,'OD' as cat

union all

select '2009-09-09 06:45:37' as dt,'OT' as cat

union all

select '2009-09-09 06:46:35' as dt,'OD' as cat

union all

select '2009-09-09 06:46:38' as dt,'OT' as cat

union all

select '2009-09-09 06:47:16' as dt,'OD' as cat

union all

select '2009-09-09 06:47:20' as dt,'OT' as cat /*rn 6 1st occurrence*/

union all

select '2009-09-09 06:57:27' as dt,'OD' as cat

union all

select '2009-09-09 06:57:36' as dt,'OT' as cat

union all

select '2009-09-09 06:58:53' as dt,'OD' as cat

union all

select '2009-09-09 06:58:56' as dt,'OT' as cat

union all

select '2009-09-09 06:59:30' as dt,'OD' as cat /*rn 6 2nd occurrence*/

union all

select '2009-09-09 06:59:39' as dt,'OT' as cat

union all

select '2009-09-09 07:11:10' as dt,'OD' as cat

union all

select '2009-09-09 07:11:17' as dt,'OT' as cat

union all

select '2009-09-09 07:11:49' as dt,'OD' as cat

union all

select '2009-09-09 07:11:53' as dt,'OT' as cat /*rn 11 1st occurrence */

union all

select '2009-09-09 07:12:27' as dt,'OD' as cat

union all

select '2009-09-09 07:14:31' as dt,'OS' as cat

union all

select '2009-09-09 07:26:29' as dt,'OT' as cat

union all

select '2009-09-09 07:26:39' as dt,'OD' as cat /*rn 11 2nd occurrence*/

union all

select '2009-09-09 07:28:21' as dt,'OT' as cat

union all

select '2009-09-09 07:40:17' as dt,'OD' as cat

union all

select '2009-09-09 07:40:25' as dt,'OT' as cat

) sub

)

select

dense_rank() over (order by cat) as rcat,

row_number() over (partition by cat order by dt) as rn1,

row_number() over (order by dt) as rn2,

row_number() over (order by dt) - row_number() over (partition by cat order by dt) as rn,

dt,

cat

from

table1

order by

dt

rn 6 and 11 occurs more than once because of my flaw logic of grouping.

what i want is

with table1 as

(

select dt, cat, ds

from

(

select '2009-09-09 06:18:16' as dt,'OD' as cat, 1 as ds

union all

select '2009-09-09 06:18:51' as dt,'OD' as cat, 1 as ds

union all

select '2009-09-09 06:18:55' as dt,'OD' as cat, 1 as ds

union all

select '2009-09-09 06:35:31' as dt,'OT' as cat, 2 as ds

union all

select '2009-09-09 06:45:25' as dt,'OD' as cat, 3 as ds

union all

select '2009-09-09 06:45:37' as dt,'OT' as cat, 4 as ds

union all

select '2009-09-09 06:46:35' as dt,'OD' as cat, 5 as ds

union all

select '2009-09-09 06:46:38' as dt,'OT' as cat, 6 as ds

union all

select '2009-09-09 06:47:16' as dt,'OD' as cat, 7 as ds

union all

select '2009-09-09 06:47:20' as dt,'OT' as cat, 8 as ds

union all

select '2009-09-09 06:57:27' as dt,'OD' as cat, 9 as ds

union all

select '2009-09-09 06:57:36' as dt,'OT' as cat, 10 as ds

union all

select '2009-09-09 06:58:53' as dt,'OD' as cat, 11 as ds

union all

select '2009-09-09 06:58:56' as dt,'OT' as cat, 12 as ds

union all

select '2009-09-09 06:59:30' as dt,'OD' as cat, 13 as ds

union all

select '2009-09-09 06:59:39' as dt,'OT' as cat, 14 as ds

union all

select '2009-09-09 07:11:10' as dt,'OD' as cat, 15 as ds

union all

select '2009-09-09 07:11:17' as dt,'OT' as cat, 16 as ds

union all

select '2009-09-09 07:11:49' as dt,'OD' as cat, 17 as ds

union all

select '2009-09-09 07:11:53' as dt,'OT' as cat, 18 as ds

union all

select '2009-09-09 07:12:27' as dt,'OD' as cat, 19 as ds

union all

select '2009-09-09 07:14:31' as dt,'OS' as cat, 20 as ds

union all

select '2009-09-09 07:26:29' as dt,'OT' as cat, 21 as ds

union all

select '2009-09-09 07:26:39' as dt,'OD' as cat, 22 as ds

union all

select '2009-09-09 07:28:21' as dt,'OT' as cat, 23 as ds

union all

select '2009-09-09 07:40:17' as dt,'OD' as cat, 24 as ds

union all

select '2009-09-09 07:40:25' as dt,'OT' as cat, 25 as ds

) sub

)

select

ds,

dt,

cat

from

table1

order by

dt

basically what i want is (when thinking like a cursor), create a new group if current category <> prev category

thank you

the_justin SSC Enthusiast Points: 175 More actions · Answer 8

thanks Jeff for your script, however, it gives same problem as my first grouping logic.

Brigadur SSCrazy Points: 2097 More actions · Answer 9

Hi the_justin!

Maybe I am getting this wrong, but from your second output it seems that you want to group dt and cat with minutes precision on dt.

So these are in the same group:

|| 2009-09-09 06:18:16 || OD ||

|| 2009-09-09 06:18:51 || OD ||

|| 2009-09-09 06:18:55 || OD ||

but not these:

|| 2009-09-09 06:46:35 || OD ||

|| 2009-09-09 06:46:38 || OT ||

Is that right?

Cheers

Mark Cowne One Orange Chip Points: 26954 More actions · Answer 10

A minor mod to Jeff's code fixes it

WITH

cteGroups AS

(

SELECT dt,

cat,

Sequence = ROW_NUMBER() OVER (ORDER BY dt, cat),

TheGroup = ROW_NUMBER() OVER (ORDER BY dt, cat)

- ROW_NUMBER() OVER (PARTITION BY cat ORDER BY dt, cat)

FROM dbo.Table1

)

,

cteGrouped AS

(

SELECT dt,

cat,

OrderedGroup = Sequence - ROW_NUMBER() OVER (PARTITION BY TheGroup[highlight=""],cat[/highlight] ORDER BY Sequence)

FROM cteGroups

)

SELECT dt,

cat,

gr = DENSE_RANK() OVER (ORDER BY OrderedGroup)

FROM cteGrouped

ORDER BY dt, cat

Change PARTITION BY TheGroup

to PARTITION BY TheGroup,cat

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 11

Hmmm , possibly not the most efficient but does the job i think 😉

with table1 as

(

select dt, cat

from

(

select '2009-09-09 06:18:16' as dt,'OD' as cat

union all

select '2009-09-09 06:18:51' as dt,'OD' as cat

union all

select '2009-09-09 06:18:55' as dt,'OD' as cat

union all

select '2009-09-09 06:35:31' as dt,'OT' as cat

union all

select '2009-09-09 06:45:25' as dt,'OD' as cat

union all

select '2009-09-09 06:45:37' as dt,'OT' as cat

union all

select '2009-09-09 06:46:35' as dt,'OD' as cat

union all

select '2009-09-09 06:46:38' as dt,'OT' as cat

union all

select '2009-09-09 06:47:16' as dt,'OD' as cat

union all

select '2009-09-09 06:47:20' as dt,'OT' as cat /*rn 6 1st occurrence*/

union all

select '2009-09-09 06:57:27' as dt,'OD' as cat

union all

select '2009-09-09 06:57:36' as dt,'OT' as cat

union all

select '2009-09-09 06:58:53' as dt,'OD' as cat

union all

select '2009-09-09 06:58:56' as dt,'OT' as cat

union all

select '2009-09-09 06:59:30' as dt,'OD' as cat /*rn 6 2nd occurrence*/

union all

select '2009-09-09 06:59:39' as dt,'OT' as cat

union all

select '2009-09-09 07:11:10' as dt,'OD' as cat

union all

select '2009-09-09 07:11:17' as dt,'OT' as cat

union all

select '2009-09-09 07:11:49' as dt,'OD' as cat

union all

select '2009-09-09 07:11:53' as dt,'OT' as cat /*rn 11 1st occurrence */

union all

select '2009-09-09 07:12:27' as dt,'OD' as cat

union all

select '2009-09-09 07:14:31' as dt,'OS' as cat

union all

select '2009-09-09 07:26:29' as dt,'OT' as cat

union all

select '2009-09-09 07:26:39' as dt,'OD' as cat /*rn 11 2nd occurrence*/

union all

select '2009-09-09 07:28:21' as dt,'OT' as cat

union all

select '2009-09-09 07:40:17' as dt,'OD' as cat

union all

select '2009-09-09 07:40:25' as dt,'OT' as cat

) sub

),

withRown

as

(

select *, row_number() over (order by dt) as rown,

row_number() over (order by cat,dt desc) as grp

from table1

),

withGrouper

as(

select *,grp+rown as Grouper from withRown

),

withmin

as

(

select cat,Grouper,min(dt) as mindt,max(dt) as maxdt

from withGrouper

group by cat,Grouper

),

withgrouping

as

(

select *,Row_number() over(order by mindt) as Grouping from withmin

)

Select table1.cat,table1.dt,Grouping

from withgrouping

join table1

on table1.dt between withgrouping.mindt and withgrouping.maxdt

and table1.cat = withgrouping.cat

order by grouping

Clear Sky SQL
My Blog[/url]

mister.magoo SSC-Forever Points: 47174 More actions · Answer 12

This sounds like it could be solved by the "Quirky Update" method to me - as you need to compare each row to the previous row.

See Jeff's article about it here : http://www.sqlservercentral.com/articles/T-SQL/68467

And also see this post about how to make it a touch safer : http://www.sqlservercentral.com/Forums/FindPost981258.aspx

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 13

mister.magoo (11/23/2010)
This sounds like it could be solved by the "Quirky Update" method to me - as you need to compare each row to the previous row.
See Jeff's article about it here : http://www.sqlservercentral.com/articles/T-SQL/68467
And also see this post about how to make it a touch safer : http://www.sqlservercentral.com/Forums/FindPost981258.aspx

Heh... Ironic... I'm usually the one to push my own articles. I was actually tempted to use the quirky update but thought I'd take a different route because I thought I'd run into problems with the OP on using a Temp Table or modifying the original table. 😀

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 14

@justin,

Are you all set now?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)