Find overlapping date ranges for more than 2 ranges

Question

Find overlapping date ranges for more than 2 ranges

coalesceuk

SSC Eights!

Points: 863
More actions
March 9, 2015 at 10:56 pm

#151408

Hi! Long time reader, but I think this is my first time posting.
I have a hypothetical scenario that I am struggling with and I'd really appreciate some ideas of what to try next.
Imagine I have a number of people assigned to a project for a set period of time. Those people get invited to meetings lasting a set period of time.
I want to work out the date ranges when all the people on a project are invited to meetings at the same time.
This is modelled like this - along with some sample data (that is unrealistic but tests the scenario):
create table #EmployeeProject
(
EmployeeId int
,ProjectId int
,ProjectStart datetime
,ProjectEnd datetime
)
create table #EmployeeMeeting
(
EmployeeId int
,ProjectId int
,MeetingStart datetime
,MeetingEnd datetime
)
insert into #EmployeeProject values (1, 1, '2014-01-01 09:00', '2014-04-30 17:00')
insert into #EmployeeProject values (2, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (3, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (4, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeMeeting values (1, 1, '2014-02-01 09:00', '2014-03-31 17:00')
insert into #EmployeeMeeting values (2, 1, '2014-06-01 09:00', '2014-10-31 17:00')
insert into #EmployeeMeeting values (3, 1, '2014-07-01 09:00', '2014-09-30 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-31 17:00')
What I'd like to get as a result set is:
ProjectId, RangeStart, RangeEnd
1, 2014-02-01 09:00, 2014-03-31 17:00
1, 2014-08-01 09:00, 2014-08-31 17:00
There feels like there should be a relatively simple way to do this, but I can't come up with a working solution that isn't either restricted by the number of overlaps it can deal with, or that uses a time of day" dimension table that has a row for each minute of each day. Using my real-world data, rather than the example above, this does not scale well!
Please help!

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Dwain Camps SSC Guru Points: 86948 More actions · Answer 1

I'm not sure I really understand your requirement, nor do I see exactly how your input relates to your desired output. Nice job of providing DDL and sample data by the way.

But your title suggests to me that something here might be of use to you.

Calculating Gaps Between Overlapping Time Intervals in SQL[/url]

Note that the really heavy lifting is the code I used from Itzik Ben-Gan in that article. Unfortunately his original article (linked to in mine) has disappeared and along with it his excellent explanation.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mark Cowne One Orange Chip Points: 26952 More actions · Answer 2

This works with your sample data

WITH StartsAndEnds(StartEnd,ProjectId) AS (

-- Start points which overlap the same number of meetings as projects

SELECT s.MeetingStart,s.ProjectId

FROM #EmployeeMeeting s

LEFT OUTER JOIN #EmployeeMeeting m ON m.ProjectId = s.ProjectId

AND s.MeetingStart BETWEEN m.MeetingStart AND m.MeetingEnd

LEFT OUTER JOIN #EmployeeProject p ON p.ProjectId = s.ProjectId

AND s.MeetingStart BETWEEN p.ProjectStart AND p.ProjectEnd

GROUP BY s.MeetingStart,s.ProjectId

HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)

UNION ALL

-- End points which overlap the same number of meetings as projects

SELECT s.MeetingEnd,s.ProjectId

FROM #EmployeeMeeting s

LEFT OUTER JOIN #EmployeeMeeting m ON m.ProjectId = s.ProjectId

AND s.MeetingEnd BETWEEN m.MeetingStart AND m.MeetingEnd

LEFT OUTER JOIN #EmployeeProject p ON p.ProjectId = s.ProjectId

AND s.MeetingEnd BETWEEN p.ProjectStart AND p.ProjectEnd

GROUP BY s.MeetingEnd,s.ProjectId

HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)

),

OrderedStartsAndEnds AS (

SELECT StartEnd,ProjectId,

ROW_NUMBER() OVER(PARTITION BY ProjectId ORDER BY StartEnd) AS rn

FROM StartsAndEnds

)

SELECT s.ProjectId,

s.StartEnd AS RangeStart,

e.StartEnd AS RangeEnd

FROM OrderedStartsAndEnds s

INNER JOIN OrderedStartsAndEnds e ON e.ProjectId = s.ProjectId

AND e.rn = s.rn + 1

WHERE EXISTS(SELECT * FROM #EmployeeProject p WHERE p.ProjectId = s.ProjectId

AND s.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd

AND e.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd);

Simpler version

WITH StartsAndEnds(ProjectId,StartEnd,rn) AS (

SELECT t.ProjectId,

ca.StartEnd,

ROW_NUMBER() OVER(PARTITION BY t.ProjectId ORDER BY ca.StartEnd)

FROM #EmployeeMeeting t

CROSS APPLY(VALUES(t.MeetingStart),(t.MeetingEnd)) AS ca(StartEnd)

LEFT OUTER JOIN #EmployeeMeeting m ON m.ProjectId = t.ProjectId

AND ca.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd

LEFT OUTER JOIN #EmployeeProject p ON p.ProjectId = t.ProjectId

AND ca.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd

GROUP BY ca.StartEnd,t.ProjectId

HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)

)

SELECT s.ProjectId,

s.StartEnd AS RangeStart,

e.StartEnd AS RangeEnd

FROM StartsAndEnds s

INNER JOIN StartsAndEnds e ON e.ProjectId = s.ProjectId

AND e.rn = s.rn + 1

WHERE EXISTS(SELECT * FROM #EmployeeProject p WHERE p.ProjectId = s.ProjectId

AND s.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd

AND e.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd);

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

coalesceuk SSC Eights! Points: 863 More actions · Answer 3

Many thanks for your responses - I've got some things to try now!

dwain.c, I probably didn't explain it too well, and my attempt to anonymise/simplify what I'm really trying to do into something that made sense probably failed!

I found Itzik Ben-Gan's "Intervals and Counts" articles that are linked at the bottom of your link after I posted this yesterday, and along with your article, they have helped me along a bit.

Using that technique, I'm now trying to create a dataset that has the following columns:

ProjectId, Timestamp, MeetingCounter, EmployeeCounter

With a row for each time something happens for a project, i.e. a meeting starts/ends, or an employee joins/leaves the project.

I then want the rows where MeetingCounter = EmployeeCounter to give me the Start Timestamp, and the next row to give me the End Timestamp.

When I've cracked it, I'll post a follow-up in case it helps anyone else.

Mark Cowne, Thanks also for your reply.

Unfortunately, I can't use the "simpler" version as I am using SQL 2005 because of VALUES in the CROSS APPLY (which also prevents me using LEAD in what I describe above).

As you point out, your solution works with my example data, but I think it may have problems with my real-world data, where employees may join/leave different meetings multiple times. I will try and create some better example data to illustrate/test.

Dwain Camps SSC Guru Points: 86948 More actions · Answer 4

coalesceuk (3/10/2015)
Many thanks for your responses - I've got some things to try now!
dwain.c, I probably didn't explain it too well, and my attempt to anonymise/simplify what I'm really trying to do into something that made sense probably failed!
I found Itzik Ben-Gan's "Intervals and Counts" articles that are linked at the bottom of your link after I posted this yesterday, and along with your article, they have helped me along a bit.
Using that technique, I'm now trying to create a dataset that has the following columns:
ProjectId, Timestamp, MeetingCounter, EmployeeCounter
With a row for each time something happens for a project, i.e. a meeting starts/ends, or an employee joins/leaves the project.
I then want the rows where MeetingCounter = EmployeeCounter to give me the Start Timestamp, and the next row to give me the End Timestamp.
When I've cracked it, I'll post a follow-up in case it helps anyone else.
Mark Cowne, Thanks also for your reply.
Unfortunately, I can't use the "simpler" version as I am using SQL 2005 because of VALUES in the CROSS APPLY (which also prevents me using LEAD in what I describe above).
As you point out, your solution works with my example data, but I think it may have problems with my real-world data, where employees may join/leave different meetings multiple times. I will try and create some better example data to illustrate/test.

If you're reading up on IBG's contributions that's quite likely to put you on the right track. He's the master in this domain, with the caveat that he does employ a lot of SQL 2008/2012 stuff in his solutions.

Definitely post back what you end up with. Amazing how some will take that as bait and try to improve on it.

My apologies for not having more time yesterday and today to study what you need carefully and offer a direct solution.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

coalesceuk SSC Eights! Points: 863 More actions · Answer 5

I think I have a solution, and by changing my example data to split one meeting into 2, I can break Mark Cowne's suggestion

IF OBJECT_ID('tempdb.dbo.#EmployeeProject') IS NOT NULL DROP TABLE #EmployeeProject

create table #EmployeeProject

(

EmployeeId int

,ProjectId int

,ProjectStart datetime

,ProjectEnd datetime

)

IF OBJECT_ID('tempdb.dbo.#EmployeeMeeting') IS NOT NULL DROP TABLE #EmployeeMeeting

create table #EmployeeMeeting

(

EmployeeId int

,ProjectId int

,MeetingStart datetime

,MeetingEnd datetime

)

insert into #EmployeeProject values (1, 1, '2014-01-01 09:00', '2014-04-30 17:00')

insert into #EmployeeProject values (2, 1, '2014-05-01 09:00', '2014-12-31 17:00')

insert into #EmployeeProject values (3, 1, '2014-05-01 09:00', '2014-12-31 17:00')

insert into #EmployeeProject values (4, 1, '2014-05-01 09:00', '2014-12-31 17:00')

insert into #EmployeeMeeting values (1, 1, '2014-02-01 09:00', '2014-03-31 17:00')

insert into #EmployeeMeeting values (2, 1, '2014-06-01 09:00', '2014-10-31 17:00')

insert into #EmployeeMeeting values (3, 1, '2014-07-01 09:00', '2014-09-30 17:00')

--insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-31 17:00')

insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-21 17:00')

insert into #EmployeeMeeting values (4, 1, '2014-08-21 17:00', '2014-08-31 17:00')

;with c1 as --Get all the events in a row

(

select --Start Date of Employees on Projects

p.ProjectId

,p.ProjectStart as t

,1 as ProjectIncrement

,0 as MeetingIncrement

from #EmployeeProject p

union all

select --End Date of Employees on Projects

p.ProjectId

,p.ProjectEnd

,-1

,0

from #EmployeeProject p

union all

select --Start Date of Employees in Meetings

m.ProjectId

,m.MeetingStart

,0

,1

from #EmployeeMeeting m

union all

select --End Date of Employees in Meetings

m.ProjectId

,m.MeetingEnd

,0

,-1

from #EmployeeMeeting m

)

,c2 as --Add a row number column for easier joining

(

select *

,row_number() over (partition by ProjectId ORDER BY t, ProjectIncrement, MeetingIncrement) as rn

from c1

)

,c3 as --Calculate running totals of Projects and Meetings

(

select c2_now.*

,sum(c2_prev.ProjectIncrement) as ProjectCount

,sum(c2_prev.MeetingIncrement) as MeetingCount

from c2 c2_now

left join c2 c2_prev

on c2_now.ProjectId = c2_prev.ProjectId

and c2_now.rn >= c2_prev.rn

group by

c2_now.ProjectId

,c2_now.t

,c2_now.ProjectIncrement

,c2_now.MeetingIncrement

,c2_now.rn

)

select

c3_now.ProjectId

,c3_now.t as RangeStart

,c3_next.t as RangeEnd

,c3_now.ProjectCount

,c3_now.MeetingCount

from c3 c3_now

left join c3 c3_next

on c3_now.ProjectId = c3_next.ProjectId

and c3_now.rn = c3_next.rn - 1

where c3_now.ProjectCount = c3_now.MeetingCount

and c3_now.ProjectCount > 0

order by c3_now.rn

If I had SQL 2012, this could be simplified greatly with SUM(ProjectCount) OVER(ProjectId)

I haven't stress-tested this to see how it performs, but it is quicker than the code I wrote that uses a time-of-day dimension.

Any suggestions for improvements would be very welcome. Thanks again for the pointers - I was stuck before I posted here!

Dwain Camps SSC Guru Points: 86948 More actions · Answer 6

You can probably save yourself a couple of scans by changing CTE c1 as follows:

with c1 as --Get all the events in a row

(

select --Start Date of Employees on Projects

p.ProjectId

,t

,ProjectIncrement

,MeetingIncrement

from #EmployeeProject p

CROSS APPLY

(

SELECT p.ProjectStart, 1, 0

UNION ALL

SELECT p.ProjectEnd, -1, 0

) x (t, ProjectIncrement, MeetingIncrement)

union all

select --Start Date of Employees in Meetings

m.ProjectId

,t

,ProjectIncrement

,MeetingIncrement

from #EmployeeMeeting m

CROSS APPLY

(

SELECT m.MeetingStart, 0, 1

UNION ALL

SELECT m.MeetingEnd, 0, -1

) x (t, ProjectIncrement, MeetingIncrement)

)

Note that this is not tested but I believe CROSS APPLY was introduced in SQL 2005.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mark Cowne One Orange Chip Points: 26952 More actions · Answer 7

coalesceuk (3/10/2015)
I think I have a solution, and by changing my example data to split one meeting into 2, I can break Mark Cowne's suggestion
IF OBJECT_ID('tempdb.dbo.#EmployeeProject') IS NOT NULL DROP TABLE #EmployeeProject
create table #EmployeeProject
(
EmployeeId int
,ProjectId int
,ProjectStart datetime
,ProjectEnd datetime
)
IF OBJECT_ID('tempdb.dbo.#EmployeeMeeting') IS NOT NULL DROP TABLE #EmployeeMeeting
create table #EmployeeMeeting
(
EmployeeId int
,ProjectId int
,MeetingStart datetime
,MeetingEnd datetime
)
insert into #EmployeeProject values (1, 1, '2014-01-01 09:00', '2014-04-30 17:00')
insert into #EmployeeProject values (2, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (3, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (4, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeMeeting values (1, 1, '2014-02-01 09:00', '2014-03-31 17:00')
insert into #EmployeeMeeting values (2, 1, '2014-06-01 09:00', '2014-10-31 17:00')
insert into #EmployeeMeeting values (3, 1, '2014-07-01 09:00', '2014-09-30 17:00')
--insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-31 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-21 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-21 17:00', '2014-08-31 17:00')
;with c1 as --Get all the events in a row
(
select --Start Date of Employees on Projects
p.ProjectId
,p.ProjectStart as t
,1 as ProjectIncrement
,0 as MeetingIncrement
from #EmployeeProject p
union all
select --End Date of Employees on Projects
p.ProjectId
,p.ProjectEnd
,-1
,0
from #EmployeeProject p
union all
select --Start Date of Employees in Meetings
m.ProjectId
,m.MeetingStart
,0
,1
from #EmployeeMeeting m
union all
select --End Date of Employees in Meetings
m.ProjectId
,m.MeetingEnd
,0
,-1
from #EmployeeMeeting m
)
,c2 as --Add a row number column for easier joining
(
select *
,row_number() over (partition by ProjectId ORDER BY t, ProjectIncrement, MeetingIncrement) as rn
from c1
)
,c3 as --Calculate running totals of Projects and Meetings
(
select c2_now.*
,sum(c2_prev.ProjectIncrement) as ProjectCount
,sum(c2_prev.MeetingIncrement) as MeetingCount
from c2 c2_now
left join c2 c2_prev
on c2_now.ProjectId = c2_prev.ProjectId
and c2_now.rn >= c2_prev.rn
group by
c2_now.ProjectId
,c2_now.t
,c2_now.ProjectIncrement
,c2_now.MeetingIncrement
,c2_now.rn
)
select
c3_now.ProjectId
,c3_now.t as RangeStart
,c3_next.t as RangeEnd
,c3_now.ProjectCount
,c3_now.MeetingCount
from c3 c3_now
left join c3 c3_next
on c3_now.ProjectId = c3_next.ProjectId
and c3_now.rn = c3_next.rn - 1
where c3_now.ProjectCount = c3_now.MeetingCount
and c3_now.ProjectCount > 0
order by c3_now.rn
If I had SQL 2012, this could be simplified greatly with SUM(ProjectCount) OVER(ProjectId)
I haven't stress-tested this to see how it performs, but it is quicker than the code I wrote that uses a time-of-day dimension.
Any suggestions for improvements would be very welcome. Thanks again for the pointers - I was stuck before I posted here!

Mine breaks due to a zero length interval which can be fixed by simply adding

AND s.StartEnd < e.StartEnd

to the end of the query.

Just for fun, here's a SQL Server 2012 version that runs quicker

WITH StartsAndEnds(ProjectId,RangeStart,RangeEnd) AS (

SELECT t.ProjectId,

ca.StartEnd,

LEAD(ca.StartEnd) OVER(PARTITION BY t.ProjectId ORDER BY ca.StartEnd)

FROM #EmployeeMeeting t

CROSS APPLY(VALUES(t.MeetingStart),(t.MeetingEnd)) AS ca(StartEnd)

INNER JOIN #EmployeeMeeting m ON m.ProjectId = t.ProjectId

AND ca.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd

INNER JOIN #EmployeeProject p ON p.ProjectId = t.ProjectId

AND ca.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd

GROUP BY ca.StartEnd,t.ProjectId

HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)

)

SELECT s.ProjectId,

s.RangeStart,

s.RangeEnd

FROM StartsAndEnds s

WHERE EXISTS(SELECT * FROM #EmployeeProject p WHERE p.ProjectId = s.ProjectId

AND s.RangeStart BETWEEN p.ProjectStart AND p.ProjectEnd

AND s.RangeEnd BETWEEN p.ProjectStart AND p.ProjectEnd);

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

coalesceuk SSC Eights! Points: 863 More actions · Answer 8

dwain.c, the cross apply does remove a lot of the table scans and works in 2005. I find it harder to understand/read what a cross apply is doing just by looking at the query, but in this case, I follow what it does and why it is better.

Mark, I have some other "real" data that I've tried with your 2005 friendly solution.

insert into #EmployeeProject values (1, 1, '2014-09-26 13:40:08.000', '2015-01-30 20:35:36.000')

insert into #EmployeeProject values (2, 1, '2014-10-02 16:13:01.000', '2015-02-04 18:52:58.000')

insert into #EmployeeMeeting values (1, 1, '2014-09-29 15:56:41.000', '2014-09-29 17:08:20.000')

insert into #EmployeeMeeting values (1, 1, '2014-12-03 09:51:54.000', '2015-01-30 15:43:30.000')

insert into #EmployeeMeeting values (2, 1, '2014-10-02 21:25:28.000', '2014-10-07 09:21:09.000')

insert into #EmployeeMeeting values (2, 1, '2014-11-14 12:09:50.000', '2014-12-03 09:45:33.000')

insert into #EmployeeMeeting values (2, 1, '2014-12-05 09:12:21.000', '2015-01-30 15:34:43.000')

I am expecting the following results:

ProjectIdRangeStartRangeEnd

12014-09-29 15:56:41.0002014-09-29 17:08:20.000

12014-12-05 09:12:21.0002015-01-30 15:34:43.000

However, I am getting an extra row returned (that isn't a result of a zero interval). What is going wrong in this case?

Thanks!

Dwain Camps SSC Guru Points: 86948 More actions · Answer 9

coalesceuk (3/11/2015)
dwain.c, the cross apply does remove a lot of the table scans and works in 2005. I find it harder to understand/read what a cross apply is doing just by looking at the query, but in this case, I follow what it does and why it is better.

Check my signature links about CROSS APPLY VALUES to UNPIVOT. Using the SELECT/UNION ALL SELECT like I did is the SQL 2005 version of VALUES (table row constructor) shown in that article.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mark Cowne One Orange Chip Points: 26952 More actions · Answer 10

coalesceuk (3/11/2015)
dwain.c, the cross apply does remove a lot of the table scans and works in 2005. I find it harder to understand/read what a cross apply is doing just by looking at the query, but in this case, I follow what it does and why it is better.
Mark, I have some other "real" data that I've tried with your 2005 friendly solution.
insert into #EmployeeProject values (1, 1, '2014-09-26 13:40:08.000', '2015-01-30 20:35:36.000')
insert into #EmployeeProject values (2, 1, '2014-10-02 16:13:01.000', '2015-02-04 18:52:58.000')
insert into #EmployeeMeeting values (1, 1, '2014-09-29 15:56:41.000', '2014-09-29 17:08:20.000')
insert into #EmployeeMeeting values (1, 1, '2014-12-03 09:51:54.000', '2015-01-30 15:43:30.000')
insert into #EmployeeMeeting values (2, 1, '2014-10-02 21:25:28.000', '2014-10-07 09:21:09.000')
insert into #EmployeeMeeting values (2, 1, '2014-11-14 12:09:50.000', '2014-12-03 09:45:33.000')
insert into #EmployeeMeeting values (2, 1, '2014-12-05 09:12:21.000', '2015-01-30 15:34:43.000')
I am expecting the following results:
ProjectIdRangeStartRangeEnd
12014-09-29 15:56:41.0002014-09-29 17:08:20.000
12014-12-05 09:12:21.0002015-01-30 15:34:43.000
However, I am getting an extra row returned (that isn't a result of a zero interval). What is going wrong in this case?
Thanks!

My bad, needs this at the end of the query

AND EXISTS(SELECT * FROM #EmployeeMeeting m WHERE m.ProjectId = s.ProjectId

AND s.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd

AND e.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd)

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

coalesceuk SSC Eights! Points: 863 More actions · Answer 11

dwain.c Thanks - I will have a read!

Mark Thanks for the fix, which works with my example data.

I've been testing both solutions with my real-world data. Both take about the same time to execute based on about 200000 Projects and 3000 Meetings.

However, I've found another scenario in that data which breaks Mark's solution (which is probably because I've incompletely explained the scenario in my attempt to "simplify" it!).

insert into #EmployeeProject values (1, 3, '2011-05-06 12:17:38.000', '2012-05-21 15:06:44.000')

insert into #EmployeeProject values (2, 3, '2012-05-16 15:55:45.000', '2012-05-16 16:30:25.000')

insert into #EmployeeMeeting values (1, 3, '2011-05-10 13:59:56.000', '2012-05-16 17:44:12.000'

In this case I would be expecting the period to be returned where the employee is in a meeting on all the projects they are working on at the time, i.e.

ProjectIdRangeStartRangeEnd

32011-05-10 13:59:56.0002012-05-16 15:55:45.000

32012-05-16 16:30:25.0002012-05-16 17:44:12.000

What I'm getting with that solution is a missing part in the middle where the employee is allocated to the second project, but not in a meeting for it, i.e.

ProjectIdRangeStartRangeEnd

32011-05-10 13:59:56.0002012-05-16 17:44:12.000

I'm guessing this is because we are purely looking at Start and End points of meetings, and missing what might happen in the middle?

Mark Cowne One Orange Chip Points: 26952 More actions · Answer 12

coalesceuk (3/12/2015)
dwain.c Thanks - I will have a read!
Mark Thanks for the fix, which works with my example data.
I've been testing both solutions with my real-world data. Both take about the same time to execute based on about 200000 Projects and 3000 Meetings.
However, I've found another scenario in that data which breaks Mark's solution (which is probably because I've incompletely explained the scenario in my attempt to "simplify" it!).
insert into #EmployeeProject values (1, 3, '2011-05-06 12:17:38.000', '2012-05-21 15:06:44.000')
insert into #EmployeeProject values (2, 3, '2012-05-16 15:55:45.000', '2012-05-16 16:30:25.000')
insert into #EmployeeMeeting values (1, 3, '2011-05-10 13:59:56.000', '2012-05-16 17:44:12.000'
In this case I would be expecting the period to be returned where the employee is in a meeting on all the projects they are working on at the time, i.e.
ProjectIdRangeStartRangeEnd
32011-05-10 13:59:56.0002012-05-16 15:55:45.000
32012-05-16 16:30:25.0002012-05-16 17:44:12.000
What I'm getting with that solution is a missing part in the middle where the employee is allocated to the second project, but not in a meeting for it, i.e.
ProjectIdRangeStartRangeEnd
32011-05-10 13:59:56.0002012-05-16 17:44:12.000
I'm guessing this is because we are purely looking at Start and End points of meetings, and missing what might happen in the middle?

I suspect this means I've misunderstood the requirements, would probably need a re-write.

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537