March 9, 2015 at 10:56 pm
Hi! Long time reader, but I think this is my first time posting.
I have a hypothetical scenario that I am struggling with and I'd really appreciate some ideas of what to try next.
Imagine I have a number of people assigned to a project for a set period of time. Those people get invited to meetings lasting a set period of time.
I want to work out the date ranges when all the people on a project are invited to meetings at the same time.
This is modelled like this - along with some sample data (that is unrealistic but tests the scenario):
create table #EmployeeProject
(
EmployeeId int
,ProjectId int
,ProjectStart datetime
,ProjectEnd datetime
)
create table #EmployeeMeeting
(
EmployeeId int
,ProjectId int
,MeetingStart datetime
,MeetingEnd datetime
)
insert into #EmployeeProject values (1, 1, '2014-01-01 09:00', '2014-04-30 17:00')
insert into #EmployeeProject values (2, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (3, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (4, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeMeeting values (1, 1, '2014-02-01 09:00', '2014-03-31 17:00')
insert into #EmployeeMeeting values (2, 1, '2014-06-01 09:00', '2014-10-31 17:00')
insert into #EmployeeMeeting values (3, 1, '2014-07-01 09:00', '2014-09-30 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-31 17:00')
What I'd like to get as a result set is:
ProjectId, RangeStart, RangeEnd
1, 2014-02-01 09:00, 2014-03-31 17:00
1, 2014-08-01 09:00, 2014-08-31 17:00
There feels like there should be a relatively simple way to do this, but I can't come up with a working solution that isn't either restricted by the number of overlaps it can deal with, or that uses a time of day" dimension table that has a row for each minute of each day. Using my real-world data, rather than the example above, this does not scale well!
Please help!
March 10, 2015 at 12:54 am
I'm not sure I really understand your requirement, nor do I see exactly how your input relates to your desired output. Nice job of providing DDL and sample data by the way.
But your title suggests to me that something here might be of use to you.
Calculating Gaps Between Overlapping Time Intervals in SQL[/url]
Note that the really heavy lifting is the code I used from Itzik Ben-Gan in that article. Unfortunately his original article (linked to in mine) has disappeared and along with it his excellent explanation.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 10, 2015 at 3:55 am
This works with your sample data
WITH StartsAndEnds(StartEnd,ProjectId) AS (
-- Start points which overlap the same number of meetings as projects
SELECT s.MeetingStart,s.ProjectId
FROM #EmployeeMeeting s
LEFT OUTER JOIN #EmployeeMeeting m ON m.ProjectId = s.ProjectId
AND s.MeetingStart BETWEEN m.MeetingStart AND m.MeetingEnd
LEFT OUTER JOIN #EmployeeProject p ON p.ProjectId = s.ProjectId
AND s.MeetingStart BETWEEN p.ProjectStart AND p.ProjectEnd
GROUP BY s.MeetingStart,s.ProjectId
HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)
UNION ALL
-- End points which overlap the same number of meetings as projects
SELECT s.MeetingEnd,s.ProjectId
FROM #EmployeeMeeting s
LEFT OUTER JOIN #EmployeeMeeting m ON m.ProjectId = s.ProjectId
AND s.MeetingEnd BETWEEN m.MeetingStart AND m.MeetingEnd
LEFT OUTER JOIN #EmployeeProject p ON p.ProjectId = s.ProjectId
AND s.MeetingEnd BETWEEN p.ProjectStart AND p.ProjectEnd
GROUP BY s.MeetingEnd,s.ProjectId
HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)
),
OrderedStartsAndEnds AS (
SELECT StartEnd,ProjectId,
ROW_NUMBER() OVER(PARTITION BY ProjectId ORDER BY StartEnd) AS rn
FROM StartsAndEnds
)
SELECT s.ProjectId,
s.StartEnd AS RangeStart,
e.StartEnd AS RangeEnd
FROM OrderedStartsAndEnds s
INNER JOIN OrderedStartsAndEnds e ON e.ProjectId = s.ProjectId
AND e.rn = s.rn + 1
WHERE EXISTS(SELECT * FROM #EmployeeProject p WHERE p.ProjectId = s.ProjectId
AND s.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd
AND e.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd);
Simpler version
WITH StartsAndEnds(ProjectId,StartEnd,rn) AS (
SELECT t.ProjectId,
ca.StartEnd,
ROW_NUMBER() OVER(PARTITION BY t.ProjectId ORDER BY ca.StartEnd)
FROM #EmployeeMeeting t
CROSS APPLY(VALUES(t.MeetingStart),(t.MeetingEnd)) AS ca(StartEnd)
LEFT OUTER JOIN #EmployeeMeeting m ON m.ProjectId = t.ProjectId
AND ca.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd
LEFT OUTER JOIN #EmployeeProject p ON p.ProjectId = t.ProjectId
AND ca.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd
GROUP BY ca.StartEnd,t.ProjectId
HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)
)
SELECT s.ProjectId,
s.StartEnd AS RangeStart,
e.StartEnd AS RangeEnd
FROM StartsAndEnds s
INNER JOIN StartsAndEnds e ON e.ProjectId = s.ProjectId
AND e.rn = s.rn + 1
WHERE EXISTS(SELECT * FROM #EmployeeProject p WHERE p.ProjectId = s.ProjectId
AND s.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd
AND e.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd);
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537March 10, 2015 at 7:31 pm
Many thanks for your responses - I've got some things to try now!
dwain.c, I probably didn't explain it too well, and my attempt to anonymise/simplify what I'm really trying to do into something that made sense probably failed!
I found Itzik Ben-Gan's "Intervals and Counts" articles that are linked at the bottom of your link after I posted this yesterday, and along with your article, they have helped me along a bit.
Using that technique, I'm now trying to create a dataset that has the following columns:
ProjectId, Timestamp, MeetingCounter, EmployeeCounter
With a row for each time something happens for a project, i.e. a meeting starts/ends, or an employee joins/leaves the project.
I then want the rows where MeetingCounter = EmployeeCounter to give me the Start Timestamp, and the next row to give me the End Timestamp.
When I've cracked it, I'll post a follow-up in case it helps anyone else.
Mark Cowne, Thanks also for your reply.
Unfortunately, I can't use the "simpler" version as I am using SQL 2005 because of VALUES in the CROSS APPLY (which also prevents me using LEAD in what I describe above).
As you point out, your solution works with my example data, but I think it may have problems with my real-world data, where employees may join/leave different meetings multiple times. I will try and create some better example data to illustrate/test.
March 10, 2015 at 7:40 pm
coalesceuk (3/10/2015)
Many thanks for your responses - I've got some things to try now!dwain.c, I probably didn't explain it too well, and my attempt to anonymise/simplify what I'm really trying to do into something that made sense probably failed!
I found Itzik Ben-Gan's "Intervals and Counts" articles that are linked at the bottom of your link after I posted this yesterday, and along with your article, they have helped me along a bit.
Using that technique, I'm now trying to create a dataset that has the following columns:
ProjectId, Timestamp, MeetingCounter, EmployeeCounter
With a row for each time something happens for a project, i.e. a meeting starts/ends, or an employee joins/leaves the project.
I then want the rows where MeetingCounter = EmployeeCounter to give me the Start Timestamp, and the next row to give me the End Timestamp.
When I've cracked it, I'll post a follow-up in case it helps anyone else.
Mark Cowne, Thanks also for your reply.
Unfortunately, I can't use the "simpler" version as I am using SQL 2005 because of VALUES in the CROSS APPLY (which also prevents me using LEAD in what I describe above).
As you point out, your solution works with my example data, but I think it may have problems with my real-world data, where employees may join/leave different meetings multiple times. I will try and create some better example data to illustrate/test.
If you're reading up on IBG's contributions that's quite likely to put you on the right track. He's the master in this domain, with the caveat that he does employ a lot of SQL 2008/2012 stuff in his solutions.
Definitely post back what you end up with. Amazing how some will take that as bait and try to improve on it.
My apologies for not having more time yesterday and today to study what you need carefully and offer a direct solution.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 10, 2015 at 11:53 pm
I think I have a solution, and by changing my example data to split one meeting into 2, I can break Mark Cowne's suggestion
IF OBJECT_ID('tempdb.dbo.#EmployeeProject') IS NOT NULL DROP TABLE #EmployeeProject
create table #EmployeeProject
(
EmployeeId int
,ProjectId int
,ProjectStart datetime
,ProjectEnd datetime
)
IF OBJECT_ID('tempdb.dbo.#EmployeeMeeting') IS NOT NULL DROP TABLE #EmployeeMeeting
create table #EmployeeMeeting
(
EmployeeId int
,ProjectId int
,MeetingStart datetime
,MeetingEnd datetime
)
insert into #EmployeeProject values (1, 1, '2014-01-01 09:00', '2014-04-30 17:00')
insert into #EmployeeProject values (2, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (3, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (4, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeMeeting values (1, 1, '2014-02-01 09:00', '2014-03-31 17:00')
insert into #EmployeeMeeting values (2, 1, '2014-06-01 09:00', '2014-10-31 17:00')
insert into #EmployeeMeeting values (3, 1, '2014-07-01 09:00', '2014-09-30 17:00')
--insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-31 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-21 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-21 17:00', '2014-08-31 17:00')
;with c1 as --Get all the events in a row
(
select --Start Date of Employees on Projects
p.ProjectId
,p.ProjectStart as t
,1 as ProjectIncrement
,0 as MeetingIncrement
from #EmployeeProject p
union all
select --End Date of Employees on Projects
p.ProjectId
,p.ProjectEnd
,-1
,0
from #EmployeeProject p
union all
select --Start Date of Employees in Meetings
m.ProjectId
,m.MeetingStart
,0
,1
from #EmployeeMeeting m
union all
select --End Date of Employees in Meetings
m.ProjectId
,m.MeetingEnd
,0
,-1
from #EmployeeMeeting m
)
,c2 as --Add a row number column for easier joining
(
select *
,row_number() over (partition by ProjectId ORDER BY t, ProjectIncrement, MeetingIncrement) as rn
from c1
)
,c3 as --Calculate running totals of Projects and Meetings
(
select c2_now.*
,sum(c2_prev.ProjectIncrement) as ProjectCount
,sum(c2_prev.MeetingIncrement) as MeetingCount
from c2 c2_now
left join c2 c2_prev
on c2_now.ProjectId = c2_prev.ProjectId
and c2_now.rn >= c2_prev.rn
group by
c2_now.ProjectId
,c2_now.t
,c2_now.ProjectIncrement
,c2_now.MeetingIncrement
,c2_now.rn
)
select
c3_now.ProjectId
,c3_now.t as RangeStart
,c3_next.t as RangeEnd
,c3_now.ProjectCount
,c3_now.MeetingCount
from c3 c3_now
left join c3 c3_next
on c3_now.ProjectId = c3_next.ProjectId
and c3_now.rn = c3_next.rn - 1
where c3_now.ProjectCount = c3_now.MeetingCount
and c3_now.ProjectCount > 0
order by c3_now.rn
If I had SQL 2012, this could be simplified greatly with SUM(ProjectCount) OVER(ProjectId)
I haven't stress-tested this to see how it performs, but it is quicker than the code I wrote that uses a time-of-day dimension.
Any suggestions for improvements would be very welcome. Thanks again for the pointers - I was stuck before I posted here!
March 11, 2015 at 12:41 am
You can probably save yourself a couple of scans by changing CTE c1 as follows:
with c1 as --Get all the events in a row
(
select --Start Date of Employees on Projects
p.ProjectId
,t
,ProjectIncrement
,MeetingIncrement
from #EmployeeProject p
CROSS APPLY
(
SELECT p.ProjectStart, 1, 0
UNION ALL
SELECT p.ProjectEnd, -1, 0
) x (t, ProjectIncrement, MeetingIncrement)
union all
select --Start Date of Employees in Meetings
m.ProjectId
,t
,ProjectIncrement
,MeetingIncrement
from #EmployeeMeeting m
CROSS APPLY
(
SELECT m.MeetingStart, 0, 1
UNION ALL
SELECT m.MeetingEnd, 0, -1
) x (t, ProjectIncrement, MeetingIncrement)
)
Note that this is not tested but I believe CROSS APPLY was introduced in SQL 2005.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 11, 2015 at 3:54 am
coalesceuk (3/10/2015)
I think I have a solution, and by changing my example data to split one meeting into 2, I can break Mark Cowne's suggestion
IF OBJECT_ID('tempdb.dbo.#EmployeeProject') IS NOT NULL DROP TABLE #EmployeeProject
create table #EmployeeProject
(
EmployeeId int
,ProjectId int
,ProjectStart datetime
,ProjectEnd datetime
)
IF OBJECT_ID('tempdb.dbo.#EmployeeMeeting') IS NOT NULL DROP TABLE #EmployeeMeeting
create table #EmployeeMeeting
(
EmployeeId int
,ProjectId int
,MeetingStart datetime
,MeetingEnd datetime
)
insert into #EmployeeProject values (1, 1, '2014-01-01 09:00', '2014-04-30 17:00')
insert into #EmployeeProject values (2, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (3, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeProject values (4, 1, '2014-05-01 09:00', '2014-12-31 17:00')
insert into #EmployeeMeeting values (1, 1, '2014-02-01 09:00', '2014-03-31 17:00')
insert into #EmployeeMeeting values (2, 1, '2014-06-01 09:00', '2014-10-31 17:00')
insert into #EmployeeMeeting values (3, 1, '2014-07-01 09:00', '2014-09-30 17:00')
--insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-31 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-01 09:00', '2014-08-21 17:00')
insert into #EmployeeMeeting values (4, 1, '2014-08-21 17:00', '2014-08-31 17:00')
;with c1 as --Get all the events in a row
(
select --Start Date of Employees on Projects
p.ProjectId
,p.ProjectStart as t
,1 as ProjectIncrement
,0 as MeetingIncrement
from #EmployeeProject p
union all
select --End Date of Employees on Projects
p.ProjectId
,p.ProjectEnd
,-1
,0
from #EmployeeProject p
union all
select --Start Date of Employees in Meetings
m.ProjectId
,m.MeetingStart
,0
,1
from #EmployeeMeeting m
union all
select --End Date of Employees in Meetings
m.ProjectId
,m.MeetingEnd
,0
,-1
from #EmployeeMeeting m
)
,c2 as --Add a row number column for easier joining
(
select *
,row_number() over (partition by ProjectId ORDER BY t, ProjectIncrement, MeetingIncrement) as rn
from c1
)
,c3 as --Calculate running totals of Projects and Meetings
(
select c2_now.*
,sum(c2_prev.ProjectIncrement) as ProjectCount
,sum(c2_prev.MeetingIncrement) as MeetingCount
from c2 c2_now
left join c2 c2_prev
on c2_now.ProjectId = c2_prev.ProjectId
and c2_now.rn >= c2_prev.rn
group by
c2_now.ProjectId
,c2_now.t
,c2_now.ProjectIncrement
,c2_now.MeetingIncrement
,c2_now.rn
)
select
c3_now.ProjectId
,c3_now.t as RangeStart
,c3_next.t as RangeEnd
,c3_now.ProjectCount
,c3_now.MeetingCount
from c3 c3_now
left join c3 c3_next
on c3_now.ProjectId = c3_next.ProjectId
and c3_now.rn = c3_next.rn - 1
where c3_now.ProjectCount = c3_now.MeetingCount
and c3_now.ProjectCount > 0
order by c3_now.rn
If I had SQL 2012, this could be simplified greatly with SUM(ProjectCount) OVER(ProjectId)
I haven't stress-tested this to see how it performs, but it is quicker than the code I wrote that uses a time-of-day dimension.
Any suggestions for improvements would be very welcome. Thanks again for the pointers - I was stuck before I posted here!
Mine breaks due to a zero length interval which can be fixed by simply adding
AND s.StartEnd < e.StartEnd
to the end of the query.
Just for fun, here's a SQL Server 2012 version that runs quicker
WITH StartsAndEnds(ProjectId,RangeStart,RangeEnd) AS (
SELECT t.ProjectId,
ca.StartEnd,
LEAD(ca.StartEnd) OVER(PARTITION BY t.ProjectId ORDER BY ca.StartEnd)
FROM #EmployeeMeeting t
CROSS APPLY(VALUES(t.MeetingStart),(t.MeetingEnd)) AS ca(StartEnd)
INNER JOIN #EmployeeMeeting m ON m.ProjectId = t.ProjectId
AND ca.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd
INNER JOIN #EmployeeProject p ON p.ProjectId = t.ProjectId
AND ca.StartEnd BETWEEN p.ProjectStart AND p.ProjectEnd
GROUP BY ca.StartEnd,t.ProjectId
HAVING COUNT(DISTINCT p.EmployeeId) = COUNT(DISTINCT m.EmployeeId)
)
SELECT s.ProjectId,
s.RangeStart,
s.RangeEnd
FROM StartsAndEnds s
WHERE EXISTS(SELECT * FROM #EmployeeProject p WHERE p.ProjectId = s.ProjectId
AND s.RangeStart BETWEEN p.ProjectStart AND p.ProjectEnd
AND s.RangeEnd BETWEEN p.ProjectStart AND p.ProjectEnd);
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537March 11, 2015 at 10:29 pm
dwain.c, the cross apply does remove a lot of the table scans and works in 2005. I find it harder to understand/read what a cross apply is doing just by looking at the query, but in this case, I follow what it does and why it is better.
Mark, I have some other "real" data that I've tried with your 2005 friendly solution.
insert into #EmployeeProject values (1, 1, '2014-09-26 13:40:08.000', '2015-01-30 20:35:36.000')
insert into #EmployeeProject values (2, 1, '2014-10-02 16:13:01.000', '2015-02-04 18:52:58.000')
insert into #EmployeeMeeting values (1, 1, '2014-09-29 15:56:41.000', '2014-09-29 17:08:20.000')
insert into #EmployeeMeeting values (1, 1, '2014-12-03 09:51:54.000', '2015-01-30 15:43:30.000')
insert into #EmployeeMeeting values (2, 1, '2014-10-02 21:25:28.000', '2014-10-07 09:21:09.000')
insert into #EmployeeMeeting values (2, 1, '2014-11-14 12:09:50.000', '2014-12-03 09:45:33.000')
insert into #EmployeeMeeting values (2, 1, '2014-12-05 09:12:21.000', '2015-01-30 15:34:43.000')
I am expecting the following results:
ProjectIdRangeStartRangeEnd
12014-09-29 15:56:41.0002014-09-29 17:08:20.000
12014-12-05 09:12:21.0002015-01-30 15:34:43.000
However, I am getting an extra row returned (that isn't a result of a zero interval). What is going wrong in this case?
Thanks!
March 11, 2015 at 10:33 pm
coalesceuk (3/11/2015)
dwain.c, the cross apply does remove a lot of the table scans and works in 2005. I find it harder to understand/read what a cross apply is doing just by looking at the query, but in this case, I follow what it does and why it is better.
Check my signature links about CROSS APPLY VALUES to UNPIVOT. Using the SELECT/UNION ALL SELECT like I did is the SQL 2005 version of VALUES (table row constructor) shown in that article.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 12, 2015 at 5:29 am
coalesceuk (3/11/2015)
dwain.c, the cross apply does remove a lot of the table scans and works in 2005. I find it harder to understand/read what a cross apply is doing just by looking at the query, but in this case, I follow what it does and why it is better.Mark, I have some other "real" data that I've tried with your 2005 friendly solution.
insert into #EmployeeProject values (1, 1, '2014-09-26 13:40:08.000', '2015-01-30 20:35:36.000')
insert into #EmployeeProject values (2, 1, '2014-10-02 16:13:01.000', '2015-02-04 18:52:58.000')
insert into #EmployeeMeeting values (1, 1, '2014-09-29 15:56:41.000', '2014-09-29 17:08:20.000')
insert into #EmployeeMeeting values (1, 1, '2014-12-03 09:51:54.000', '2015-01-30 15:43:30.000')
insert into #EmployeeMeeting values (2, 1, '2014-10-02 21:25:28.000', '2014-10-07 09:21:09.000')
insert into #EmployeeMeeting values (2, 1, '2014-11-14 12:09:50.000', '2014-12-03 09:45:33.000')
insert into #EmployeeMeeting values (2, 1, '2014-12-05 09:12:21.000', '2015-01-30 15:34:43.000')
I am expecting the following results:
ProjectIdRangeStartRangeEnd
12014-09-29 15:56:41.0002014-09-29 17:08:20.000
12014-12-05 09:12:21.0002015-01-30 15:34:43.000
However, I am getting an extra row returned (that isn't a result of a zero interval). What is going wrong in this case?
Thanks!
My bad, needs this at the end of the query
AND EXISTS(SELECT * FROM #EmployeeMeeting m WHERE m.ProjectId = s.ProjectId
AND s.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd
AND e.StartEnd BETWEEN m.MeetingStart AND m.MeetingEnd)
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537March 12, 2015 at 7:56 pm
dwain.c Thanks - I will have a read!
Mark Thanks for the fix, which works with my example data.
I've been testing both solutions with my real-world data. Both take about the same time to execute based on about 200000 Projects and 3000 Meetings.
However, I've found another scenario in that data which breaks Mark's solution (which is probably because I've incompletely explained the scenario in my attempt to "simplify" it!).
insert into #EmployeeProject values (1, 3, '2011-05-06 12:17:38.000', '2012-05-21 15:06:44.000')
insert into #EmployeeProject values (2, 3, '2012-05-16 15:55:45.000', '2012-05-16 16:30:25.000')
insert into #EmployeeMeeting values (1, 3, '2011-05-10 13:59:56.000', '2012-05-16 17:44:12.000'
In this case I would be expecting the period to be returned where the employee is in a meeting on all the projects they are working on at the time, i.e.
ProjectIdRangeStartRangeEnd
32011-05-10 13:59:56.0002012-05-16 15:55:45.000
32012-05-16 16:30:25.0002012-05-16 17:44:12.000
What I'm getting with that solution is a missing part in the middle where the employee is allocated to the second project, but not in a meeting for it, i.e.
ProjectIdRangeStartRangeEnd
32011-05-10 13:59:56.0002012-05-16 17:44:12.000
I'm guessing this is because we are purely looking at Start and End points of meetings, and missing what might happen in the middle?
March 13, 2015 at 7:40 am
coalesceuk (3/12/2015)
dwain.c Thanks - I will have a read!Mark Thanks for the fix, which works with my example data.
I've been testing both solutions with my real-world data. Both take about the same time to execute based on about 200000 Projects and 3000 Meetings.
However, I've found another scenario in that data which breaks Mark's solution (which is probably because I've incompletely explained the scenario in my attempt to "simplify" it!).
insert into #EmployeeProject values (1, 3, '2011-05-06 12:17:38.000', '2012-05-21 15:06:44.000')
insert into #EmployeeProject values (2, 3, '2012-05-16 15:55:45.000', '2012-05-16 16:30:25.000')
insert into #EmployeeMeeting values (1, 3, '2011-05-10 13:59:56.000', '2012-05-16 17:44:12.000'
In this case I would be expecting the period to be returned where the employee is in a meeting on all the projects they are working on at the time, i.e.
ProjectIdRangeStartRangeEnd
32011-05-10 13:59:56.0002012-05-16 15:55:45.000
32012-05-16 16:30:25.0002012-05-16 17:44:12.000
What I'm getting with that solution is a missing part in the middle where the employee is allocated to the second project, but not in a meeting for it, i.e.
ProjectIdRangeStartRangeEnd
32011-05-10 13:59:56.0002012-05-16 17:44:12.000
I'm guessing this is because we are purely looking at Start and End points of meetings, and missing what might happen in the middle?
I suspect this means I've misunderstood the requirements, would probably need a re-write.
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply