November 13, 2004 at 6:47 am
Good morning and weekend all, I'm at an internship for a new position, and really trying to give it all I've got and make a great impression. Want to help? 😉
Here's the columns in question, with sample date of the table in question:
ID | member_ID | start_dt | end_dt
1 000000001 10/01/2002 10/31/2002
2 000000001 11/01/2002 10/01/2003
3 000000001 12/01/2002 11/01/2003
4 000000001 11/30/2003 01/01/2004
There are multiple member_ID's, whereas ID is the tables Identity column.
The problem occurs when an end_dt is between a set of start_dt/end_dt's FOR that member_ID. i.e. each row's end_dt & start_dt must remain a separate entity from other date-ranges for that member_ID.
(in the sample data, this record to identify would be ID-2, because it is within the date range of ID-3)
I simply want to identify these records by returning the member_ID for any of these occurences.
The process I've written is bulky and uses nested while loops, and simply isn't delivering the results.
Funny after thinking on a fresh head..I'd focused on sorting by end_dt desc, and looking at current_end_dt between next_start_dt AND next_end_dt...but actually any end_dt that is between any OTHER start_dt AND end_dt would do the trick.
I've found some good info on overlapping dates on the web, but I'm having trouble incorporating these findings with my member_ID inclusion...
Any takers?
- Ben
November 13, 2004 at 11:31 pm
This will return both rows 2 and 3... Both are technically overlapping, so IMO that's correct behavior. If you want only one or the other it might be possible but you'll have to define the logic for which to pick.
SELECT *
FROM YourTable
WHERE EXISTS
(SELECT *
FROM YourTable Y1
WHERE Y1.member_id = YourTable.member_id
AND (Y1.start_dt BETWEEN YourTable.start_dt AND YourTable.end_dt
OR Y1.end_dt BETWEEN YourTable.start_dt AND YourTable.end_dt)
Good luck on your internship!
--
Adam Machanic
whoisactive
November 14, 2004 at 9:22 am
Fantastic thank you very much adam!
November 14, 2004 at 1:36 pm
Ben and Adam,
I ran into a problem with Adam's script returning all records instead of just the overlap records. I don't know if I did something wrong in changing the 'yourtable' to '#MyTemp' or maybe I added the missing ")" in the wrong spot.
Here's Adam's script as modified for my testing...
SELECT *
FROM #MyTemp
WHERE exists
(SELECT *
FROM #MyTemp Y1
WHERE Y1.member_id = #MyTemp.member_id
AND (Y1.start_dt BETWEEN #MyTemp.start_dt AND #MyTemp.end_dt
OR Y1.end_dt BETWEEN #MyTemp.start_dt AND #MyTemp.end_dt)) --<--<< added missing ")" here
Here's the setup for the data I used...
--===== If temporary table exists, drop it
IF OBJECT_ID('tempdb..#MyTemp') IS NOT NULL
DROP TABLE #MyTemp
--===== Create a place to hold the test data
CREATE TABLE #MyTemp
(
ID INT IDENTITY(1,1),
Member_ID VARCHAR(10),
Start_DT DATETIME,
End_DT DATETIME
)
--===== Insert test data into temp table
INSERT INTO #MyTemp
(Member_ID,Start_DT,End_DT)
SELECT '0000000001','01/01/2004','06/30/2004' UNION ALL --Part of an overlap
SELECT '0000000001','05/01/2004','12/30/2004' UNION ALL --Part of an overlap
SELECT '0000000001','06/01/2003','07/30/2004' UNION ALL --Part of an overlap
SELECT '0000000001','01/01/2005','02/01/2005' UNION ALL --No overlap
SELECT '0000000002','01/01/2004','05/30/2004' UNION ALL --No overlap
SELECT '0000000002','07/01/2004','12/31/2004' UNION ALL --No overlap
SELECT '0000000003','05/01/2004','05/30/2004' UNION ALL --Part of an overlap
SELECT '0000000003','01/01/2004','12/31/2005' UNION ALL --Part of an overlap
SELECT '0000000003','01/01/2003','02/01/2003' UNION ALL --No overlap
SELECT '0000000003','01/01/2006','02/01/2006' --No overlap
Here's the results I get when I run Adam's script (all records are returned instead of just the overlaps)...
2 0000000001 2004-05-01 00:00:00.000 2004-12-30 00:00:00.000
3 0000000001 2003-06-01 00:00:00.000 2004-07-30 00:00:00.000
4 0000000001 2005-01-01 00:00:00.000 2005-02-01 00:00:00.000
5 0000000002 2004-01-01 00:00:00.000 2004-05-30 00:00:00.000
6 0000000002 2004-07-01 00:00:00.000 2004-12-31 00:00:00.000
7 0000000003 2004-05-01 00:00:00.000 2004-05-30 00:00:00.000
8 0000000003 2004-01-01 00:00:00.000 2005-12-31 00:00:00.000
9 0000000003 2003-01-01 00:00:00.000 2003-02-01 00:00:00.000
10 0000000003 2006-01-01 00:00:00.000 2006-02-01 00:00:00.000
So, I wrote the following script (includes logic for date ranges that "straddle" one another) to return Member_ID's that had overlapping dates as well as the start and end dates of the entire overlapping range...
--===== Find everything that overlaps and display range
SELECT t1.Member_ID AS Member_ID,
MIN(t1.Start_DT) AS Start_DT,
MAX(t1.End_DT) AS End_DT
FROM #MyTemp t1,
#MyTemp t2
WHERE (
( t2.Start_DT >= t1.Start_DT --Start_DT in date range
AND t2.Start_DT <= t1.End_DT)
OR ( t2.End_DT >= t1.Start_DT --End_DT in date range
AND t2.End_DT <= t1.End_DT)
OR ( t2.Start_DT <= t1.Start_DT --One range "straddles" another
AND t2.End_DT >= t1.End_DT)
)
AND t1.Member_ID = t2.Member_ID
AND t1.ID <> t2.ID
GROUP BY t1.Member_ID
... which gives the following results (admittedly, the ID column is not included but I didn't need it or want it for what I was trying to do. Good thing because it would have been a bit more complicated to list the ID)...
0000000003 2004-01-01 00:00:00.000 2005-12-31 00:00:00.000
... then, I got a bit carried away and wrote the following because I saw a use for it at work...
SELECT t1.Member_ID AS Member_ID,
MIN(t1.Start_DT) AS Start_DT,
MAX(t1.End_DT) AS End_DT,
'Yes' AS OverLaps
FROM #MyTemp t1,
#MyTemp t2
WHERE (
( t2.Start_DT >= t1.Start_DT --Start_DT in date range
AND t2.Start_DT <= t1.End_DT)
AND t2.End_DT <= t1.End_DT)
AND t2.End_DT >= t1.End_DT)
)
AND t1.Member_ID = t2.Member_ID
AND t1.ID <> t2.ID
GROUP BY t1.Member_ID
SELECT t3.Member_ID AS Member_ID,
t3.Start_DT AS Start_DT,
t3.End_DT AS End_DT,
'No' AS Overlaps
FROM #MyTemp t3
WHERE t3.ID NOT IN
(SELECT t1.ID --Same select to find overlaps as above
FROM #MyTemp t1,
#MyTemp t2
WHERE (
( t2.Start_DT >= t1.Start_DT --Start_DT in date range
AND t2.Start_DT <= t1.End_DT)
AND t2.End_DT <= t1.End_DT)
AND t2.End_DT >= t1.End_DT)
)
AND t1.Member_ID = t2.Member_ID
AND t1.ID <> t2.ID
)
ORDER BY Member_ID, Start_DT, End_DT
... which returned the following...
0000000001 2005-01-01 00:00:00.000 2005-02-01 00:00:00.000 No
0000000002 2004-01-01 00:00:00.000 2004-05-30 00:00:00.000 No
0000000002 2004-07-01 00:00:00.000 2004-12-31 00:00:00.000 No
0000000003 2003-01-01 00:00:00.000 2003-02-01 00:00:00.000 No
0000000003 2004-01-01 00:00:00.000 2005-12-31 00:00:00.000 Yes
0000000003 2006-01-01 00:00:00.000 2006-02-01 00:00:00.000 No
--Jeff Moden
Change is inevitable... Change for the better is not.
November 15, 2004 at 1:22 pm
Joe,
on a sidenote. Have you ever tried to assign 23:59.59.99999 to a datetime?
declare @foolme datetime
set @foolme = '23:59.59.99999'
select @foolme
Server: Nachr.-Nr. 241, Schweregrad 16, Status 1, Zeile 2
Syntaxfehler beim Konvertieren einer Zeichenfolge in eine datetime-Zeichenfolge.
Now, even if you correct it to
declare @foolme datetime
set @foolme = '23:59:59.999'
select @foolme
it look suprising
------------------------------------------------------
1900-01-02 00:00:00.000
(1 row(s) affected)
So I guess the max value you can assign to the time portion of a datetime is something like:
declare @foolme datetime
set @foolme = '23:59:59.998'
select @foolme
------------------------------------------------------
1900-01-01 23:59:59.997
(1 row(s) affected)
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 15, 2004 at 2:53 pm
Thanks for explanation. I'll keep it in mind for future reference.
...and I always thought this fora here are kind of SQL Server's own country.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 15, 2004 at 3:17 pm
Thanks for the work, all, Jeff's research did the trick in the end.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply