August 2, 2012 at 11:29 am
Hi,
Today I was asked what the fastest way was to find out if a set of date ranges intersect.
To clarify, for date ranges (DR1, DR2 , DR3);
Where DR1 INTERSECTS DR2 AND DR2 INTERSECTS DR3
But DR1 doen't INTERSECT DR3
RETURN False.
So, all DRn need to intersect/overlap for a positive.
Posted below is what I came up with. I've not done any serious testing on it (so it might even lie to you :w00t:)
But if you know of a better way, please let me know.
And it goes without saying that you should use it freely if it works for you... or works at all 😀
--1.Create table variable type
CREATE TYPE DateRanges AS TABLE
(
FromDateDATE,
ToDateDATE
)
GO
--2.CREATE function
CREATE FUNCTION dbo.isDatesOverlap
(
@DateRanges DateRanges READONLY
)
RETURNS BIT
AS
BEGIN
DECLARE @OverLap BIT = 1
;WITH Ranges AS
(
SELECT RangeID= ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,FromDate
,ToDate
FROM @DateRanges
)
SELECT
@OverLap =
CASE
WHEN
SUM(
CASE
WHENT2.FromDate BETWEEN T1.FromDate AND T1.ToDate OR
T2.ToDate BETWEEN T1.FromDate AND T1.ToDate
THEN 1 ELSE 0
END
) = COUNT(*)
THEN 1 ELSE 0
END
FROMRangesT1
CROSS JOIN
RangesT2
WHERE T1.RangeID = 1 AND T2.RangeID > 1
RETURN @OverLap
END
GO
--3. Call example
DECLARE @DateRanges DateRanges
INSERT INTO @DateRanges (FromDate, ToDate)
SELECT '2000-01-01', '2000-01-15' UNION ALL
SELECT '2000-01-10', '2000-02-10' UNION ALL
SELECT '2000-01-02', '2000-01-03'
SELECT dbo.isDatesOverlap(@DateRanges)
Looking forward to your feedback!
Thank you.
August 2, 2012 at 12:42 pm
i like the cross apply however you can turn it into an Inline Scalar Function (an iTVF that returns 1 value, Thanks Jeff http://www.sqlservercentral.com/articles/T-SQL/91724/) by using the following function.
CREATE FUNCTION dbo.isDatesOverlap
(
@DateRanges DateRanges READONLY
)
RETURNS TABLE
AS
RETURN
WITH Ranges AS
(
SELECT RangeID= ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,FromDate
,ToDate
FROM @DateRanges
)
SELECT
CASE
WHEN
SUM(
CASE
WHENT2.FromDate BETWEEN T1.FromDate AND T1.ToDate OR
T2.ToDate BETWEEN T1.FromDate AND T1.ToDate
THEN 1 ELSE 0
END
) = COUNT(*)
THEN 1 ELSE 0
END AS IsContig
FROMRangesT1
CROSS JOIN
RangesT2
WHERE T1.RangeID = 1 AND T2.RangeID > 1
GO
the difference is that i just return the result of the select instead of setting your @Overlap. you may pick up a decent speed boost if you are running any sort of large numbers.
as far as the actual code, the cross apply is how i would have gone as now you can pass in an unlimited amount of date pairs.
the usage becomes SELECT * FROM dbo.isDatesOverlap(@DateRanges)
instead of SELECT dbo.isDatesOverlap(@DateRanges)
however i have a question on the requirements. Given the date ranges:
INSERT INTO @DateRanges (FromDate, ToDate)
SELECT '2000-01-01', '2000-01-15' UNION ALL
SELECT '2000-01-10', '2000-02-10' UNION ALL
SELECT '2000-01-17', '2000-01-19' -- does only overlaps one range not both
should the results be a 0
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
August 2, 2012 at 1:29 pm
Very cool, thank you for the pointers and the link to Jeff's article capn.hector!
Will run some tests in the morning 🙂
August 3, 2012 at 12:30 am
diamondgm (8/2/2012)
Very cool, thank you for the pointers and the link to Jeff's article capn.hector!Will run some tests in the morning 🙂
@Capn; Sorry I missed your question in the post.
All ranges need a common intersect. If DR1 overlaps DR2 and DR3 overlaps DR2 but not DR1 then 0.
Also, I've run some (probably very primitive) tests and the results seem pretty much neck and neck between the functions you and I have posted.
When I have more time, I'd like to see what can be done with CROSS APPLY
I've got to get going to work so I'm just going to post the mess;
INSERT INTO [Sandbox].[dbo].[DateRanges]
([Date1]
,[Date2])
SELECT CAST(CAST(RAND(CAST(CAST(NEWID() AS VARBINARY)AS INT)) * 10000 AS DATETIME) AS DATE)
,CAST(CAST(RAND(CAST(CAST(NEWID() AS VARBINARY)AS INT)) * 10000 AS DATETIME) AS DATE)
GO 300000
DELETE FROM dbo.DateRanges
WHERE Date1>= Date2
SELECT COUNT(*)
FROM DateRanges
DECLARE @DateRanges DateRanges,
@Start DATETIME
INSERT INTO @DateRanges (FromDate, ToDate)
SELECT * FROM dbo.DateRanges --ORDER BY NEWID()
SET @Start = GETDATE()
SELECT dbo.isDatesOverlap(@DateRanges)
SELECT DATEDIFF(MS, @Start, GETDATE())
SET @Start = GETDATE()
SELECT * FROM dbo.isDatesOverlap2(@DateRanges)
SELECT DATEDIFF(MS, @Start, GETDATE())
August 3, 2012 at 7:53 am
diamondgm (8/3/2012)
diamondgm (8/2/2012)
Very cool, thank you for the pointers and the link to Jeff's article capn.hector!Will run some tests in the morning 🙂
@Capn; Sorry I missed your question in the post.
All ranges need a common intersect. If DR1 overlaps DR2 and DR3 overlaps DR2 but not DR1 then 0.
Also, I've run some (probably very primitive) tests and the results seem pretty much neck and neck between the functions you and I have posted.
When I have more time, I'd like to see what can be done with CROSS APPLY
I've got to get going to work so I'm just going to post the mess;
the problem with testing this is that in order to really get a feel you need large volumes of data and good random data may always yield a 0. to really test you need to set up a test bed as similar to your production environment as possible.
if only 3 or 4 values are ever going to be passed to this function you will have a very hard time finding any difference between the two and then it becomes a matter of which way you want to call the function (Select dbo.function() or SELECT * FROM dbo.function()) in the Jeff's write up he was running over 1 million rows of data the differences become clear but if you run the same tests on a small record set (a size of 101 lets say) here is what happens to the results:
-------------------------------------------------
Direct N*2 - Time: 0
--------------------------------------------------------------
Standard Scalar UDF N*2 - Time: 0
-----------------------------------------------------------------
Inline Scalar Function N*2 - Time: 0
As you can see if the input set is small enough even a very slow method that will not scale well will not have enough iterations to actually show the difference. its also the reason to test any thing on production size data to make sure that it will scale.
Here is the test bed i used (All of the function code is from Jeff Moden's Spackle article table and columns changed to match the test bed)
WITH e1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(N)),
e2(N) AS (Select 0 UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e1 a, e1 b)
SELECT N INTO #TempTally FROM e2
DECLARE @Result INT
DECLARE @Start DATETIME
SELECT @Start = GETDATE()
SELECT @Result = N * 2
FROM #TempTally
SELECT 'Direct N*2 - Time: ' + CAST(DATEDIFF(ms,@Start,GETDATE()) AS VARCHAR)
SELECT @Start = GETDATE()
SELECT @Result = dbo.TimesTwo(N)
FROM #TempTally
SELECT 'Standard Scalar UDF N*2 - Time: ' + CAST(DATEDIFF(ms,@Start,GETDATE()) AS VARCHAR)
SELECT @Start = GETDATE()
SELECT @Result = ca.Doubled
FROM #TempTally
CROSS APPLY dbo.TimesTwoiSF(N) ca
SELECT 'Inline Scalar Function N*2 - Time: ' + CAST(DATEDIFF(ms,@Start,GETDATE()) AS VARCHAR)
References: http://www.sqlservercentral.com/articles/T-SQL/91724/
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
August 3, 2012 at 8:18 am
As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.
SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting
FROM @DateRanges
Here is the comparison data.
CROSS JOIN
Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
GROUP BY
Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
August 3, 2012 at 12:38 pm
drew.allen (8/3/2012)
As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.
SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting
FROM @DateRanges
Perhaps I'm being obtuse, but how would this help with the problem as posed in my code?
CROSS JOIN
Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
GROUP BY
Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.
Would you mind showing me how you would use this? I'm at a bit of a loss.
August 3, 2012 at 3:13 pm
FromDate ToDate
'2000-01-01' '2000-01-15'
'2000-01-10' '2000-02-10'
'2000-01-02' '2000-01-03'
Looking at the values above, you see the the Bolded value is greater than the value in italics. All three of these ranges do not overlap each other.
FromDate ToDate
'2000-01-01' '2000-01-15'
'2000-01-10' '2000-02-10'
'2000-01-02' '2000-01-10'
Now looking at them, you see that there is an overlap.
August 4, 2012 at 8:58 am
drew.allen (8/3/2012)
As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.
SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting
FROM @DateRanges
Here is the comparison data.
CROSS JOIN
Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
GROUP BY
Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.
Drew
Gosh, Drew. I'm just not seeing how that works. In the following example where all of the dates are overlapping, we still get a zero.
-- drop table #testtable
SELECT FromDate = CAST(d.FromDate AS DATETIME),
ToDate = CAST(d.ToDate AS DATETIME)
INTO #TestTable
FROM (
SELECT '2002-01-01', '2001-01-01' UNION ALL
SELECT '2002-01-01', '2003-01-01' UNION ALL
SELECT '2002-01-01', '2005-01-01'
) d (FromDate,ToDate)
;
SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting
FROM #TestTable
--Jeff Moden
Change is inevitable... Change for the better is not.
August 4, 2012 at 12:20 pm
Jeff Moden (8/4/2012)
drew.allen (8/3/2012)
As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.
SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting
FROM @DateRanges
Here is the comparison data.
CROSS JOIN
Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
GROUP BY
Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.
Drew
Gosh, Drew. I'm just not seeing how that works. In the following example where all of the dates are overlapping, we still get a zero.
-- drop table #testtable
SELECT FromDate = CAST(d.FromDate AS DATETIME),
ToDate = CAST(d.ToDate AS DATETIME)
INTO #TestTable
FROM (
SELECT '2002-01-01', '2001-01-01' UNION ALL
SELECT '2002-01-01', '2003-01-01' UNION ALL
SELECT '2002-01-01', '2005-01-01'
) d (FromDate,ToDate)
;
SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting
FROM #TestTable
Jeff, look at your first date range, the to date is less than the from date.
August 6, 2012 at 11:40 am
Lynn Pettis (8/4/2012)
Jeff, look at your first date range, the to date is less than the from date.
{facepalm} :blush:
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply