November 22, 2010 at 10:36 pm
hi all,
i'm trying to group consecutive record, anyone knows how to do this ?
this is my sample data:
/* what i have */
select dt, cat
from
(
select '2009-09-09 14:50:02.000' as dt,'OT' as cat
union all
select '2009-09-09 14:55:44.000' as dt,'OT' as cat
union all
select '2009-09-09 15:10:55.000' as dt,'OT' as cat
union all
select '2009-09-09 15:30:00.000' as dt,'OT' as cat
union all
select '2009-09-09 16:16:41.000' as dt,'OD' as cat
union all
select '2009-09-09 16:26:37.000' as dt,'OD' as cat
union all
select '2009-09-09 16:32:43.000' as dt,'OS' as cat
union all
select '2009-09-09 17:56:37.000' as dt,'OT' as cat
union all
select '2009-09-09 17:56:48.000' as dt,'OT' as cat
)
sub
/* what i want */
select dt, cat, gr
from
(
select '2009-09-09 14:50:02.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 14:55:44.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 15:10:55.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 15:30:00.000' as dt,'OT' as cat,1 as gr
union all
select '2009-09-09 16:16:41.000' as dt,'OD' as cat,2 as gr
union all
select '2009-09-09 16:26:37.000' as dt,'OD' as cat,2 as gr
union all
select '2009-09-09 16:32:43.000' as dt,'OS' as cat,3 as gr
union all
select '2009-09-09 17:56:37.000' as dt,'OT' as cat,4 as gr
union all
select '2009-09-09 17:56:48.000' as dt,'OT' as cat,4 as gr
)
sub
so basically, table 1 is my data, and table 2 is what i want table 1 to produce.
so far, the best i can come up is
with table1 as
(
select dt, cat
from
(
select '2009-09-09 14:50:02.000' as dt,'OT' as cat
union all
select '2009-09-09 14:55:44.000' as dt,'OT' as cat
union all
select '2009-09-09 15:10:55.000' as dt,'OT' as cat
union all
select '2009-09-09 15:30:00.000' as dt,'OT' as cat
union all
select '2009-09-09 16:16:41.000' as dt,'OD' as cat
union all
select '2009-09-09 16:26:37.000' as dt,'OD' as cat
union all
select '2009-09-09 16:32:43.000' as dt,'OS' as cat
union all
select '2009-09-09 17:56:37.000' as dt,'OT' as cat
union all
select '2009-09-09 17:56:48.000' as dt,'OT' as cat
)
sub
)
select
row_number() over (partition by cat order by dt) as rn1,
row_number() over (order by dt) as rn2,
row_number() over (order by dt) - row_number() over (partition by cat order by dt) as rn,
dt,
cat
from
table1
then group by rn. it works fine with small sample data, but when data gets bigger, the rn can get wrong result.
anybody has come across this problem before ? or has a solution for this ?
thank you
November 22, 2010 at 11:22 pm
Hello,
I guess what your are looking for is t-sql dense_rank() function
select
dense_rank() over (order by cat),
dt,
cat
from
table1
November 22, 2010 at 11:28 pm
It's quite difficult to give a solution without actually looking at the records which are causing the problem.
My wild guess would be use a dense_rank instead of row_number.
select
dense_rank() over (partition by cat order by dt) as rn1,
row_number() over (order by dt) as rn2,
row_number() over (order by dt) - dense_rank() over (partition by cat order by dt) as rn,
dt,
cat
from
table1
--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server
November 22, 2010 at 11:34 pm
Eralper (11/22/2010)
Hello,I guess what your are looking for is t-sql dense_rank() function
select
dense_rank() over (order by cat),
dt,
cat
from
table1
Oops sorry didnt know you already had posted the solution.
--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server
November 22, 2010 at 11:37 pm
Sachin Nandanwar (11/22/2010)
It's quite difficult to give a solution without actually looking at the records which are causing the problem.
What on Earth are you talking about? The OP gave both the sample data and the desired result. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
November 22, 2010 at 11:37 pm
Sometimes happens to me too, but I believe it is better to mention the correct solution more than once
Of course if it is really correct solution
November 22, 2010 at 11:41 pm
Justin,
Sachin's code almost did it. It just needed a push over the edge to get it where you wanted it. Here's the push...
WITH
cteGroups AS
(
SELECT dt,
cat,
Sequence = ROW_NUMBER() OVER (ORDER BY dt, cat),
TheGroup = ROW_NUMBER() OVER (ORDER BY dt, cat)
- ROW_NUMBER() OVER (PARTITION BY cat ORDER BY dt, cat)
FROM dbo.Table1
)
,
cteGrouped AS
(
SELECT dt,
cat,
OrderedGroup = Sequence - ROW_NUMBER() OVER (PARTITION BY TheGroup ORDER BY Sequence)
FROM cteGroups
)
SELECT dt,
cat,
gr = DENSE_RANK() OVER (ORDER BY OrderedGroup)
FROM cteGrouped
ORDER BY dt, cat
And very well done on your original post. It's real handy having BOTH the test data and the expected result in such a readily usable format. Thanks for taking the time.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 22, 2010 at 11:57 pm
thank you for the reply guys, however dense_rank does not solve it. i have a better data that shows my problem:
with table1 as
(
select dt, cat
from
(
select '2009-09-09 06:18:16' as dt,'OD' as cat
union all
select '2009-09-09 06:18:51' as dt,'OD' as cat
union all
select '2009-09-09 06:18:55' as dt,'OD' as cat
union all
select '2009-09-09 06:35:31' as dt,'OT' as cat
union all
select '2009-09-09 06:45:25' as dt,'OD' as cat
union all
select '2009-09-09 06:45:37' as dt,'OT' as cat
union all
select '2009-09-09 06:46:35' as dt,'OD' as cat
union all
select '2009-09-09 06:46:38' as dt,'OT' as cat
union all
select '2009-09-09 06:47:16' as dt,'OD' as cat
union all
select '2009-09-09 06:47:20' as dt,'OT' as cat /*rn 6 1st occurrence*/
union all
select '2009-09-09 06:57:27' as dt,'OD' as cat
union all
select '2009-09-09 06:57:36' as dt,'OT' as cat
union all
select '2009-09-09 06:58:53' as dt,'OD' as cat
union all
select '2009-09-09 06:58:56' as dt,'OT' as cat
union all
select '2009-09-09 06:59:30' as dt,'OD' as cat /*rn 6 2nd occurrence*/
union all
select '2009-09-09 06:59:39' as dt,'OT' as cat
union all
select '2009-09-09 07:11:10' as dt,'OD' as cat
union all
select '2009-09-09 07:11:17' as dt,'OT' as cat
union all
select '2009-09-09 07:11:49' as dt,'OD' as cat
union all
select '2009-09-09 07:11:53' as dt,'OT' as cat /*rn 11 1st occurrence */
union all
select '2009-09-09 07:12:27' as dt,'OD' as cat
union all
select '2009-09-09 07:14:31' as dt,'OS' as cat
union all
select '2009-09-09 07:26:29' as dt,'OT' as cat
union all
select '2009-09-09 07:26:39' as dt,'OD' as cat /*rn 11 2nd occurrence*/
union all
select '2009-09-09 07:28:21' as dt,'OT' as cat
union all
select '2009-09-09 07:40:17' as dt,'OD' as cat
union all
select '2009-09-09 07:40:25' as dt,'OT' as cat
) sub
)
select
dense_rank() over (order by cat) as rcat,
row_number() over (partition by cat order by dt) as rn1,
row_number() over (order by dt) as rn2,
row_number() over (order by dt) - row_number() over (partition by cat order by dt) as rn,
dt,
cat
from
table1
order by
dt
rn 6 and 11 occurs more than once because of my flaw logic of grouping.
what i want is
with table1 as
(
select dt, cat, ds
from
(
select '2009-09-09 06:18:16' as dt,'OD' as cat, 1 as ds
union all
select '2009-09-09 06:18:51' as dt,'OD' as cat, 1 as ds
union all
select '2009-09-09 06:18:55' as dt,'OD' as cat, 1 as ds
union all
select '2009-09-09 06:35:31' as dt,'OT' as cat, 2 as ds
union all
select '2009-09-09 06:45:25' as dt,'OD' as cat, 3 as ds
union all
select '2009-09-09 06:45:37' as dt,'OT' as cat, 4 as ds
union all
select '2009-09-09 06:46:35' as dt,'OD' as cat, 5 as ds
union all
select '2009-09-09 06:46:38' as dt,'OT' as cat, 6 as ds
union all
select '2009-09-09 06:47:16' as dt,'OD' as cat, 7 as ds
union all
select '2009-09-09 06:47:20' as dt,'OT' as cat, 8 as ds
union all
select '2009-09-09 06:57:27' as dt,'OD' as cat, 9 as ds
union all
select '2009-09-09 06:57:36' as dt,'OT' as cat, 10 as ds
union all
select '2009-09-09 06:58:53' as dt,'OD' as cat, 11 as ds
union all
select '2009-09-09 06:58:56' as dt,'OT' as cat, 12 as ds
union all
select '2009-09-09 06:59:30' as dt,'OD' as cat, 13 as ds
union all
select '2009-09-09 06:59:39' as dt,'OT' as cat, 14 as ds
union all
select '2009-09-09 07:11:10' as dt,'OD' as cat, 15 as ds
union all
select '2009-09-09 07:11:17' as dt,'OT' as cat, 16 as ds
union all
select '2009-09-09 07:11:49' as dt,'OD' as cat, 17 as ds
union all
select '2009-09-09 07:11:53' as dt,'OT' as cat, 18 as ds
union all
select '2009-09-09 07:12:27' as dt,'OD' as cat, 19 as ds
union all
select '2009-09-09 07:14:31' as dt,'OS' as cat, 20 as ds
union all
select '2009-09-09 07:26:29' as dt,'OT' as cat, 21 as ds
union all
select '2009-09-09 07:26:39' as dt,'OD' as cat, 22 as ds
union all
select '2009-09-09 07:28:21' as dt,'OT' as cat, 23 as ds
union all
select '2009-09-09 07:40:17' as dt,'OD' as cat, 24 as ds
union all
select '2009-09-09 07:40:25' as dt,'OT' as cat, 25 as ds
) sub
)
select
ds,
dt,
cat
from
table1
order by
dt
basically what i want is (when thinking like a cursor), create a new group if current category <> prev category
thank you
November 23, 2010 at 12:07 am
thanks Jeff for your script, however, it gives same problem as my first grouping logic.
November 23, 2010 at 2:10 am
Hi the_justin!
Maybe I am getting this wrong, but from your second output it seems that you want to group dt and cat with minutes precision on dt.
So these are in the same group:
|| 2009-09-09 06:18:16 || OD ||
|| 2009-09-09 06:18:51 || OD ||
|| 2009-09-09 06:18:55 || OD ||
but not these:
|| 2009-09-09 06:46:35 || OD ||
|| 2009-09-09 06:46:38 || OT ||
Is that right?
Cheers
November 23, 2010 at 2:38 am
A minor mod to Jeff's code fixes it
WITH
cteGroups AS
(
SELECT dt,
cat,
Sequence = ROW_NUMBER() OVER (ORDER BY dt, cat),
TheGroup = ROW_NUMBER() OVER (ORDER BY dt, cat)
- ROW_NUMBER() OVER (PARTITION BY cat ORDER BY dt, cat)
FROM dbo.Table1
)
,
cteGrouped AS
(
SELECT dt,
cat,
OrderedGroup = Sequence - ROW_NUMBER() OVER (PARTITION BY TheGroup[highlight=""],cat[/highlight] ORDER BY Sequence)
FROM cteGroups
)
SELECT dt,
cat,
gr = DENSE_RANK() OVER (ORDER BY OrderedGroup)
FROM cteGrouped
ORDER BY dt, cat
Change PARTITION BY TheGroup
to PARTITION BY TheGroup,cat
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537November 23, 2010 at 2:40 am
Hmmm , possibly not the most efficient but does the job i think 😉
with table1 as
(
select dt, cat
from
(
select '2009-09-09 06:18:16' as dt,'OD' as cat
union all
select '2009-09-09 06:18:51' as dt,'OD' as cat
union all
select '2009-09-09 06:18:55' as dt,'OD' as cat
union all
select '2009-09-09 06:35:31' as dt,'OT' as cat
union all
select '2009-09-09 06:45:25' as dt,'OD' as cat
union all
select '2009-09-09 06:45:37' as dt,'OT' as cat
union all
select '2009-09-09 06:46:35' as dt,'OD' as cat
union all
select '2009-09-09 06:46:38' as dt,'OT' as cat
union all
select '2009-09-09 06:47:16' as dt,'OD' as cat
union all
select '2009-09-09 06:47:20' as dt,'OT' as cat /*rn 6 1st occurrence*/
union all
select '2009-09-09 06:57:27' as dt,'OD' as cat
union all
select '2009-09-09 06:57:36' as dt,'OT' as cat
union all
select '2009-09-09 06:58:53' as dt,'OD' as cat
union all
select '2009-09-09 06:58:56' as dt,'OT' as cat
union all
select '2009-09-09 06:59:30' as dt,'OD' as cat /*rn 6 2nd occurrence*/
union all
select '2009-09-09 06:59:39' as dt,'OT' as cat
union all
select '2009-09-09 07:11:10' as dt,'OD' as cat
union all
select '2009-09-09 07:11:17' as dt,'OT' as cat
union all
select '2009-09-09 07:11:49' as dt,'OD' as cat
union all
select '2009-09-09 07:11:53' as dt,'OT' as cat /*rn 11 1st occurrence */
union all
select '2009-09-09 07:12:27' as dt,'OD' as cat
union all
select '2009-09-09 07:14:31' as dt,'OS' as cat
union all
select '2009-09-09 07:26:29' as dt,'OT' as cat
union all
select '2009-09-09 07:26:39' as dt,'OD' as cat /*rn 11 2nd occurrence*/
union all
select '2009-09-09 07:28:21' as dt,'OT' as cat
union all
select '2009-09-09 07:40:17' as dt,'OD' as cat
union all
select '2009-09-09 07:40:25' as dt,'OT' as cat
) sub
),
withRown
as
(
select *, row_number() over (order by dt) as rown,
row_number() over (order by cat,dt desc) as grp
from table1
),
withGrouper
as(
select *,grp+rown as Grouper from withRown
),
withmin
as
(
select cat,Grouper,min(dt) as mindt,max(dt) as maxdt
from withGrouper
group by cat,Grouper
),
withgrouping
as
(
select *,Row_number() over(order by mindt) as Grouping from withmin
)
Select table1.cat,table1.dt,Grouping
from withgrouping
join table1
on table1.dt between withgrouping.mindt and withgrouping.maxdt
and table1.cat = withgrouping.cat
order by grouping
November 23, 2010 at 3:01 am
This sounds like it could be solved by the "Quirky Update" method to me - as you need to compare each row to the previous row.
See Jeff's article about it here : http://www.sqlservercentral.com/articles/T-SQL/68467
And also see this post about how to make it a touch safer : http://www.sqlservercentral.com/Forums/FindPost981258.aspx
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
November 23, 2010 at 7:41 am
mister.magoo (11/23/2010)
This sounds like it could be solved by the "Quirky Update" method to me - as you need to compare each row to the previous row.See Jeff's article about it here : http://www.sqlservercentral.com/articles/T-SQL/68467
And also see this post about how to make it a touch safer : http://www.sqlservercentral.com/Forums/FindPost981258.aspx
Heh... Ironic... I'm usually the one to push my own articles. I was actually tempted to use the quirky update but thought I'd take a different route because I thought I'd run into problems with the OP on using a Temp Table or modifying the original table. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
November 23, 2010 at 7:42 am
Are you all set now?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply