October 7, 2016 at 8:45 am
I have this test table:
row 1: start_date='2016-1-1 10:00:00' end_date='2016-1-1 15:00:00'
row 2: start_date='2016-1-1 13:00:00' end_date='2016-1-1 20:00:00'
In need to know the SUM of DATEDIFF of all rows, but WITHOUT the overlaps.
In this case the result should be 10 hours, and not 12 if you should use a SUM+DATEDIFF.
Anyone know this is possible in SQL?
October 7, 2016 at 9:21 am
This is based on the packing interval query by Itzik Ben-Gan, I don't have the URL to hand though
DECLARE @TestTable TABLE(start_date DATETIME, end_date DATETIME);
INSERT INTO @TestTable(start_date, end_date)
VALUES('2016-1-1 10:00:00','2016-1-1 15:00:00'),
('2016-1-1 13:00:00','2016-1-1 20:00:00');
--('2016-1-2 13:00:00','2016-1-2 14:00:00'),
--('2016-1-2 14:00:00','2016-1-2 15:00:00'),
--('2016-1-2 14:00:00','2016-1-2 20:00:00');
WITH CTE1 AS (
SELECT ts=start_date, Type = 1
,e = NULL
,s = (2*ROW_NUMBER() OVER (ORDER BY start_date))-1
FROM @TestTable
UNION ALL
SELECT ts=end_date, Type = -1
,e = (2*ROW_NUMBER() OVER (ORDER BY end_date))
,s = NULL
FROM @TestTable
),
CTE2(minval,maxval) AS (
SELECT MIN(ts),MAX(ts)
FROM (
SELECT ts,
diff = (ROW_NUMBER() OVER(ORDER BY ts)+1)/2
FROM (
SELECT ts, s, e,
se = ROW_NUMBER() OVER (ORDER BY ts, Type DESC)
FROM CTE1
) c1
WHERE se IN (e,s)
) C3
GROUP BY diff)
SELECT totalhours = SUM(DATEDIFF(Hour,minval,maxval))
FROM CTE2;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537October 7, 2016 at 9:29 am
This works just fine, thank you!
October 7, 2016 at 9:41 am
Here is an updated version that only requires a single scan of the table. There is a complete description of the logic at New Solution to the Packing Intervals Problem.
WITH C1 AS (
SELECT start_date, end_date, MAX(end_date) OVER(ORDER BY start_date, end_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_end_date
FROM @TestTable
)
, C2 AS (
SELECT start_date, end_date, SUM(is_start) OVER(ORDER BY start_date, end_date ROWS UNBOUNDED PRECEDING) AS grp
FROM C1
CROSS APPLY ( VALUES( CASE WHEN C1.start_date <= C1.prev_end_date THEN NULL ELSE 1 END )) AS a(is_start)
)
SELECT DATEDIFF(HOUR, MIN(C2.start_date), MAX(C2.end_date))
FROM C2
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 7, 2016 at 11:42 am
drew.allen (10/7/2016)
Here is an updated version that only requires a single scan of the table. There is a complete description of the logic at New Solution to the Packing Intervals Problem.
WITH C1 AS (
SELECT start_date, end_date, MAX(end_date) OVER(ORDER BY start_date, end_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_end_date
FROM @TestTable
)
, C2 AS (
SELECT start_date, end_date, SUM(is_start) OVER(ORDER BY start_date, end_date ROWS UNBOUNDED PRECEDING) AS grp
FROM C1
CROSS APPLY ( VALUES( CASE WHEN C1.start_date <= C1.prev_end_date THEN NULL ELSE 1 END )) AS a(is_start)
)
SELECT DATEDIFF(HOUR, MIN(C2.start_date), MAX(C2.end_date))
FROM C2
Drew
Based on the OP requirements that solution doesn't work. Test it out by including all of the values that were commented out. You'll notice Mark's solution correctly returns 17 hours. while yours returns 34.
October 7, 2016 at 2:11 pm
Y.B. (10/7/2016)
drew.allen (10/7/2016)
Here is an updated version that only requires a single scan of the table. There is a complete description of the logic at New Solution to the Packing Intervals Problem.
WITH C1 AS (
SELECT start_date, end_date, MAX(end_date) OVER(ORDER BY start_date, end_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_end_date
FROM @TestTable
)
, C2 AS (
SELECT start_date, end_date, SUM(is_start) OVER(ORDER BY start_date, end_date ROWS UNBOUNDED PRECEDING) AS grp
FROM C1
CROSS APPLY ( VALUES( CASE WHEN C1.start_date <= C1.prev_end_date THEN NULL ELSE 1 END )) AS a(is_start)
)
SELECT DATEDIFF(HOUR, MIN(C2.start_date), MAX(C2.end_date))
FROM C2
Drew
Based on the OP requirements that solution doesn't work. Test it out by including all of the values that were commented out. You'll notice Mark's solution correctly returns 17 hours. while yours returns 34.
Yeah, I would have caught that if there had been a larger sample size. Here's the fix.
WITH C1 AS (
SELECT start_date, end_date, MAX(end_date) OVER(ORDER BY start_date, end_date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_end_date
FROM @TestTable
)
, C2 AS (
SELECT start_date, end_date, SUM(is_start) OVER(ORDER BY start_date, end_date ROWS UNBOUNDED PRECEDING) AS grp
FROM C1
CROSS APPLY ( VALUES( CASE WHEN C1.start_date <= C1.prev_end_date THEN NULL ELSE 1 END )) AS a(is_start)
)
, C3 AS (
SELECT DATEDIFF(HOUR, MIN(C2.start_date), MAX(C2.end_date)) AS grp_total
FROM C2
GROUP BY grp
)
SELECT SUM(grp_total)
FROM C3
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 8, 2016 at 2:58 am
The overlap problem can be solved in a much simpler manner, simply check if the current end is greater than the next start and if so use the next start otherwise the current end.
😎
USE TEEST;
GO
SET NOCOUNT ON;
DECLARE @TestTable TABLE(start_date DATETIME NOT NULL PRIMARY KEY CLUSTERED, end_date DATETIME);
INSERT INTO @TestTable(start_date, end_date)
VALUES('2016-1-1 10:00:00','2016-1-1 15:00:00'),
('2016-1-1 13:00:00','2016-1-1 20:00:00'),
('2016-1-2 13:00:00','2016-1-2 14:00:00'),
('2016-1-2 14:00:00','2016-1-2 15:00:00'),
('2016-1-2 15:00:00','2016-1-2 20:00:00');
;WITH BASE_DATA AS
(
SELECT
TT.start_date
,TT.end_date
,LEAD(TT.start_date,1,TT.end_date) OVER
(
ORDER BY TT.start_date ASC
) AS NEXT_START
FROM @TestTable TT
)
SELECT
SUM(DATEDIFF(HOUR,BD.start_date
,CASE
WHEN BD.end_date > BD.NEXT_START THEN BD.NEXT_START
ELSE BD.end_date
END)) AS SUM_HOUR
FROM BASE_DATA BD;
Pretty simple execution plan
October 8, 2016 at 3:25 am
This is not correct. If I insert the line
('2016-1-2 14:10:00','2016-1-2 14:20:00'),
into your sample rows the result changes, and this should not be the case as this timespan is already in the other rows...
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply