May 29, 2012 at 1:01 am
Lets say I have a table as
MIDTime (Hh:Mm)
M110:22
M112:15
M113:22
M116:00
M117:50
I need to aggregate the counts for MID(M1) based on the condition "to aggregate values that are beyond 2 hrs window"
So in above case , the valid ones are row1, row 3, row 4 .
So total aggregated value for M1 =3
I know I can write a cursor/loop to do this , but considering that I have a huge volume of data (~20M), is there a better way to solve this ?
Thanks
May 29, 2012 at 2:38 am
hi im not 100 % sure on what your asking and what you expect in results, looking at your example table there is nothing to filter down on you can group by the mid or time but from your explanation im not sure how you come to rows 1 3 etc are relevant any way if you wanted to calculate the time difference in hours by each mid then you can use something as simple as
select MID, datediff(hh,min(ttm),max (ttm))
from table_name
group by mid
***The first step is always the hardest *******
May 29, 2012 at 2:42 am
sridhar_kola (5/29/2012)
Lets say I have a table asMIDTime (Hh:Mm)
M110:22
M112:15
M113:22
M116:00
M117:50
I need to aggregate the counts for MID(M1) based on the condition "to aggregate values that are beyond 2 hrs window"
So in above case , the valid ones are row1, row 3, row 4 .
So total aggregated value for M1 =3
I know I can write a cursor/loop to do this , but considering that I have a huge volume of data (~20M), is there a better way to solve this ?
Thanks
What do you mean by "to aggregate values that are beyond 2 hrs window"??
I am assuming that you want to select successive rows with time difference >= 2(in hours).
Going by this assumption....only Row 4 satisfies the requirement.
How do Row1 and Row3 satisfy this requirement??....I'm might be wrong in assuming what your requirement is.
Please explain the business logic a little more.
May 29, 2012 at 2:48 am
@sridhar_kola i must say im in agreement about how to work out where the 2 hours comes in:w00t:
***The first step is always the hardest *******
May 29, 2012 at 3:26 am
The logic seems to be to always select the first row (10:22 here), skip the second row because 12:15 is within 2 hours of 10:22, select 13:22 because it is over 2 hours from 10:22, select 16:00 because it is over two hours from 13:22, and skip 17:50 because it is within two hours of 16:00.
May 29, 2012 at 3:37 am
Can you paste in your logic and we may be in a better position to help you However,
have a look at this CTE from your explanation of the table this may help
;with CTE
as
(
select mid,row_number() over(partition by MID order by ttm) as id,ttm
from t1)
select * from cte a inner join cte b on a.id=b.id+1
where datediff(hh,a.ttm,b.ttm)>=2
***The first step is always the hardest *******
May 29, 2012 at 4:00 am
This is the first solution that springs to mind. Better ones are surely possible, I'll have another think about it shortly.
DECLARE @Example AS TABLE
(
MID character(2) NOT NULL,
Time time NOT NULL,
PRIMARY KEY (MID, Time)
);
INSERT @Example
(MID, Time)
VALUES
('M1', '10:22'),
('M1', '12:15'),
('M1', '13:22'),
('M1', '16:00'),
('M1', '17:50'),
('M2', '09:00'),
('M2', '10:59'),
('M2', '13:00'),
('M2', '15:30'),
('M2', '17:30');
WITH rCTE AS
(
-- First record per group
SELECT
e.MID,
MIN(e.Time) AS Time
FROM @Example AS e
GROUP BY
e.MID
UNION ALL
-- Recursive search
SELECT
x.MID,
x.Time
FROM
(
-- Find rows at least 2 hours away from the current one
-- And number (cannot use TOP in recursive CTE)
SELECT
e.MID,
e.Time,
rn = ROW_NUMBER() OVER (
ORDER BY e.Time)
FROM @Example AS e
JOIN rCTE ON
rCTE.MID = e.MID
AND rCTE.Time <= e.Time
AND rCTE.Time <= DATEADD(HOUR, -2, e.Time)
) AS x
WHERE
-- First row (translates to a TOP)
x.rn = 1
)
SELECT
rCTE.MID,
row_count = COUNT_BIG(*)
FROM rCTE
GROUP BY
rCTE.MID;
May 29, 2012 at 4:07 am
I think you have not understood the requirement. Your CTE will give the difference between previous and current time. But the requirement is to get the time difference from the last successful record.
Please see my example above
May 29, 2012 at 4:10 am
i think you need to define what is a successful record as i understand what you are asking for is what we have all completed and that is to check the time between the last record and the next.
***The first step is always the hardest *******
May 29, 2012 at 4:10 am
sridhar_kola (5/29/2012)
I think you have not understood the requirement. Your CTE will give the difference between previous and current time. But the requirement is to get the time difference from the last successful record.Please see my example above
That's what it does - it's a recursive CTE. Please run the code - it returns 3 for M1 and 4 for M2. I hope I did understand the requirement, because it's my explanation (from the original thread) that you posted here 🙂
May 29, 2012 at 4:16 am
sridhar_kola (5/29/2012)
I think you have not understood the requirement. Your CTE will give the difference between previous and current time. But the requirement is to get the time difference from the last successful record.Please see my example above
No, the OP has made it clear enough.
The requirement here is to:
1. Select the first record.(Last Successful Record)
2. Check the next record with the Last Successful Value.
3. Select the Second record only if it is 2 or more hours from the Last Successful Record Else Move to
the Next Record and repeat step 2.
Edit: Chris's method works fine. I just tested it.
May 29, 2012 at 4:58 am
Cheers Paul ! This is what I expected !
May 29, 2012 at 4:59 am
Cheers Paul !! This is what I expected !
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply