October 5, 2016 at 8:54 am
For a report I'm trying to write I ideally need a recordset that shows me how many people were on a training course, by month. I have the day they started and the day they left the course, and need to show the monthly total of people actively on the course.
I'm having trouble getting my head around how I should construct this and suspect the answer isn't as difficult as I'm trying to make it.
Extremely simplified example data and a summary of what I expect the result to be:
CREATE TABLE #SampleTrainingData(
IDINT IDENTITY(1,1),
NameNVARCHAR(10),
StartDateDATE,
LeavingDateDATE
);
INSERT INTO #SampleTrainingData
( Name, StartDate, LeavingDate )
VALUES ( N'Fred', '01 Jan 2016', '05 Jan 2016'),
( N'John', '01 Jan 2016', NULL),
( N'Steve','01 Feb 2016', '03 Mar 2016'),
( N'Allan','03 Feb 2016', NULL),
( N'Mike', '01 Mar 2016', '05 Apr 2016');
SELECT * FROM #SampleTrainingData
--Required Results
--2016 Jan - 1 --(Fred started and left in the same month. John started in that month)
--2016 Feb - 3 --(John is still in training in this month. Steve started training in this month. Allan started training in this month)
--2016 Mar - 3 --(Steve left in this month. John is still in training for this month. Allan is still in training this month. Mike started training this month)
I would appreciate some help with this.
October 5, 2016 at 9:15 am
Does this meet your needs? You'll need to change the values of the @StartDate and @EndDate Variables to suit your needs though.
DECLARE @StartDate DATE = '01-Jan-2016',
@EndDate DATE = '30-Apr-2016';
WITH Months (MonthDate)
AS
(
SELECT @StartDate
UNION ALL
SELECT DATEADD(MONTH, 1, MonthDate)
FROM Months
WHERE DATEADD(MONTH, 1, MonthDate) <= @EndDate
)
SELECT FORMAT(M.MonthDate, 'yyyy MMM') as MonthDate,
(SELECT COUNT([Name])
FROM #SampleTrainingData sq
WHERE DATEADD(MONTH, DATEDIFF(MONTH, 0,sq.StartDate),0) <= M.MonthDate
AND ISNULL(sq.LeavingDate,DATEADD(Month, 1, M.MonthDate)) >= DATEADD(Month, 1, M.MonthDate)) AS Trainees
FROM Months M
ORDER BY M.MonthDate;
Edit:Forgot the variable decs! Whoops!
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
October 5, 2016 at 9:56 am
Something like this...
DECLARE
@BegDate DATE = '2016-01-05',
@EndDate DATE = '2016-06-22';
WITH
n (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_MonthStart (MonthBeg) AS (
SELECT TOP (DATEDIFF(mm, @BegDate, @EndDate) +1)
CAST(DATEADD(mm, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1, DATEADD(mm, DATEDIFF(mm, '1900-01-01', @BegDate), '1900-01-01')) AS DATE)
FROM n n1, n n2, n n3, n n4, n n5, n n6
),
cte_MonthBegEnd AS (
SELECT
ms.MonthBeg,
MonthEnd = EOMONTH(ms.MonthBeg)
FROM
cte_MonthStart ms
)
SELECT
TheYear = YEAR(mbe.MonthBeg),
TheMonth = MONTH(mbe.MonthBeg),
Attended = SUM(stdx.Cnt)
FROM
cte_MonthBegEnd mbe
CROSS APPLY (
SELECT
Cnt = 1
FROM
#SampleTrainingData std
WHERE
mbe.MonthBeg BETWEEN std.StartDate AND ISNULL(std.LeavingDate, '9999-12-31')
OR
mbe.MonthEnd BETWEEN std.StartDate AND ISNULL(std.LeavingDate, '9999-12-31')
OR
std.StartDate BETWEEN mbe.MonthBeg AND mbe.MonthEnd
OR
std.LeavingDate BETWEEN mbe.MonthBeg AND mbe.MonthEnd
) stdx
GROUP BY
mbe.MonthBeg
ORDER BY
mbe.MonthBeg;
Results...
TheYear TheMonth Attended
----------- ----------- -----------
2016 1 2
2016 2 3
2016 3 4
2016 4 3
2016 5 2
2016 6 2
October 5, 2016 at 10:11 am
Why are some people with partial months included and others excluded? Is there a minimum number of days required to be counted? Or are you only counting people who ended the month in training? Would Mal be included in February if he was in training starting FEB 27?
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 5, 2016 at 10:15 am
Thom A (10/5/2016)
Does this meet your needs? You'll need to change the values of the @StartDate and @EndDate Variables to suit your needs though.
DECLARE @StartDate DATE = '01-Jan-2016',
@EndDate DATE = '30-Apr-2016';
WITH Months (MonthDate)
AS
(
SELECT @StartDate
UNION ALL
SELECT DATEADD(MONTH, 1, MonthDate)
FROM Months
WHERE DATEADD(MONTH, 1, MonthDate) <= @EndDate
)
SELECT FORMAT(M.MonthDate, 'yyyy MMM') as MonthDate,
(SELECT COUNT([Name])
FROM #SampleTrainingData sq
WHERE DATEADD(MONTH, DATEDIFF(MONTH, 0,sq.StartDate),0) <= M.MonthDate
AND ISNULL(sq.LeavingDate,DATEADD(Month, 1, M.MonthDate)) >= DATEADD(Month, 1, M.MonthDate)) AS Trainees
FROM Months M
ORDER BY M.MonthDate;
Edit:Forgot the variable decs! Whoops!
This is horribly, horribly inefficient.
1) Using a CTE to create a sequence is horribly inefficient. Use a Tally table instead.
2) Using FORMAT is horribly inefficient. Just send a representative date and let the presentation layer worry about the formatting.
3) Using ISNULL in a WHERE clause is not SARGable. It's much better to use an OR or UNION.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 5, 2016 at 10:32 am
This one is purely for entertainment, it's not very efficient with 4 table scans and an expensive sort in the execution plan. It does on the other hand demonstrate some nice techniques using inline tally and calendar CTEs.
😎
USE TEEST;
GO
SET NOCOUNT ON;
IF OBJECT_ID(N'tempdb..#SampleTrainingData') IS NULL
BEGIN
CREATE TABLE #SampleTrainingData(
IDINT IDENTITY(1,1),
NameNVARCHAR(10),
StartDateDATE,
LeavingDateDATE
);
INSERT INTO #SampleTrainingData
( Name, StartDate, LeavingDate )
VALUES ( N'Fred', '01 Jan 2016', '05 Jan 2016'),
( N'John', '01 Jan 2016', NULL),
( N'Steve','01 Feb 2016', '03 Mar 2016'),
( N'Allan','03 Feb 2016', NULL),
( N'Mike', '01 Mar 2016', '05 Apr 2016');
END
;WITH CALENDAR_CONFIG(MIN_DATE,MAX_DATE,MONTH_COUNT) AS
(
SELECT
MIN(STD.StartDate) AS MIN_DATE
,COALESCE(MAX(STD.LeavingDate),MAX(STD.StartDate)) AS MAX_DATE
,DATEDIFF(MONTH,MIN(STD.StartDate),COALESCE(MAX(STD.LeavingDate),MAX(STD.StartDate))) + 1 AS MONTH_COUNT
FROM #SampleTrainingData STD
)
,T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
,NUMS(N) AS (SELECT TOP(SELECT MONTH_COUNT FROM CALENDAR_CONFIG) ROW_NUMBER() OVER (ORDER BY @@VERSION) - 1 AS N FROM T T1,T T2,T T3)
,CALENDAR AS
(
SELECT
(YEAR(DATEADD(MONTH,NM.N,CC.MIN_DATE)) * 100) + MONTH(DATEADD(MONTH,NM.N,CC.MIN_DATE)) MONTH_START
FROM CALENDAR_CONFIG CC
CROSS APPLY NUMS NM
)
,BASE_DATA AS
(
SELECT
STD.ID
,STD.Name
,(YEAR(STD.StartDate) * 100) + MONTH(STD.StartDate) AS StartDate
,(YEAR(ISNULL(STD.LeavingDate,CC.MAX_DATE)) * 100) + MONTH(ISNULL(STD.LeavingDate,CC.MAX_DATE)) AS LeavingDate
FROM #SampleTrainingData STD
CROSS APPLY CALENDAR_CONFIG CC
)
SELECT
CAL.MONTH_START
,COUNT(*) TRAINEE_COUNT
FROM BASE_DATA BD
CROSS APPLY CALENDAR CAL
WHERE BD.StartDate < BD.LeavingDate
AND CAL.MONTH_START >= BD.StartDate
AND CAL.MONTH_START < BD.LeavingDate
GROUP BY CAL.MONTH_START;
Output
MONTH_START TRAINEE_COUNT
----------- -------------
201601 1
201602 3
201603 3
October 5, 2016 at 10:49 am
My sick and tired (literally, on both of those 🙁 ) brain is telling me that there is a very efficient single-pass solution here using SUM(CASE...). But sadly I don't have time to pursue it at the moment. Perhaps someone else can bring it to life in the mean time?
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
October 5, 2016 at 10:50 am
Thom A (10/5/2016)
Does this meet your needs? You'll need to change the values of the @StartDate and @EndDate Variables to suit your needs though.
DECLARE @StartDate DATE = '01-Jan-2016',
@EndDate DATE = '30-Apr-2016';
WITH Months (MonthDate)
AS
(
SELECT @StartDate
UNION ALL
SELECT DATEADD(MONTH, 1, MonthDate)
FROM Months
WHERE DATEADD(MONTH, 1, MonthDate) <= @EndDate
)
SELECT FORMAT(M.MonthDate, 'yyyy MMM') as MonthDate,
(SELECT COUNT([Name])
FROM #SampleTrainingData sq
WHERE DATEADD(MONTH, DATEDIFF(MONTH, 0,sq.StartDate),0) <= M.MonthDate
AND ISNULL(sq.LeavingDate,DATEADD(Month, 1, M.MonthDate)) >= DATEADD(Month, 1, M.MonthDate)) AS Trainees
FROM Months M
ORDER BY M.MonthDate;
Edit:Forgot the variable decs! Whoops!
The criteria for overlapping intervals is MUCH simpler than most people assume. It's easier to see this when you start from the criteria for when they do not overlap. They don't overlap when one ends before the other starts.
-- I'm ignoring NULL values here in order to keep the formulas simpler.
-- The logic is still the same.
-- two ranges that do not intersect
SELECT *
FROM a
INNER JOIN b
WHERE a.end_dt < b.begin_dt OR a.begin_dt > b.end_dt
-- Two ranges intersecting is just the inverse of the previous query.
SELECT *
FROM a
INNER JOIN b
WHERE NOT(a.end_dt < b.begin_dt OR a.begin_dt > b.end_dt)
-- We can then distribute the NOT.
-- Note that the OR becomes an AND in doing so.
SELECT *
FROM a
INNER JOIN b
WHERE NOT(a.end_dt < b.begin_dt) AND NOT(a.begin_dt > b.end_dt)
-- Then use the fact that NOT(x < y) is equivalent to x >= y.
SELECT *
FROM a
INNER JOIN b
WHERE a.end_dt >= b.begin_dt
AND a.begin_dt <= b.end_dt
Drew
PS: I am assuming closed intervals here. The logic is the same for open and half-closed intervals, it's just that the comparison operators are different.
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 5, 2016 at 10:52 am
Sorry... I missed the expected output...
A simple modification gets the desired results...
DECLARE
@BegDate DATE = '2016-01-05',
@EndDate DATE = '2016-06-22';
WITH
n (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_MonthStart (MonthBeg) AS (
SELECT TOP (DATEDIFF(mm, @BegDate, @EndDate) +1)
CAST(DATEADD(mm, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1, DATEADD(mm, DATEDIFF(mm, '1900-01-01', @BegDate), '1900-01-01')) AS DATE)
FROM n n1, n n2, n n3, n n4, n n5, n n6
),
cte_MonthBegEnd AS (
SELECT
ms.MonthBeg,
MonthEnd = EOMONTH(ms.MonthBeg)
FROM
cte_MonthStart ms
)
SELECT
TheYear = YEAR(mbe.MonthBeg),
TheMonth = MONTH(mbe.MonthBeg),
SUM(stdx.Cnt)
FROM
cte_MonthBegEnd mbe
CROSS APPLY (
SELECT
Cnt = 1
FROM
#SampleTrainingData std
WHERE
mbe.MonthEnd BETWEEN std.StartDate AND ISNULL(std.LeavingDate, '9999-12-31')
) stdx
GROUP BY
mbe.MonthBeg
ORDER BY
mbe.MonthBeg
Results...
TheYear TheMonth
----------- ----------- -----------
2016 1 1
2016 2 3
2016 3 3
2016 4 2
2016 5 2
2016 6 2
October 5, 2016 at 3:26 pm
I assumed a physical tally table, column name "N" (ugh!, but most common I guess), to avoid having to use an inline CTE:
DECLARE @start_date datetime
DECLARE @end_date datetime
SET @start_date = '20160101'
SET @end_date = '20160630'
SELECT DATEADD(MONTH, t.N, StartMonth) AS TrainingMonth,
COUNT(*) AS People_Count
FROM #SampleTrainingData std
CROSS APPLY (
SELECT CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, std.StartDate), 0) AS date) AS StartMonth
) AS assign_alias_names
INNER JOIN dbo.tally t ON t.N BETWEEN 0 AND
DATEDIFF(MONTH, std.StartDate,
CASE WHEN std.LeavingDate IS NULL OR std.LeavingDate > @end_date
THEN @end_date
ELSE DATEADD(MONTH, -1, std.LeavingDate) END)
WHERE std.StartDate <= @end_date AND
(std.LeavingDate IS NULL OR (std.LeavingDate >= @start_date))
GROUP BY DATEADD(MONTH, t.N, StartMonth)
ORDER BY DATEADD(MONTH, t.N, StartMonth)
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 6, 2016 at 2:07 am
drew.allen (10/5/2016)
Why are some people with partial months included and others excluded? Is there a minimum number of days required to be counted? Or are you only counting people who ended the month in training? Would Mal be included in February if he was in training starting FEB 27?Drew
Hi Drew,
That is a very good question and one I'll have to look into. As a bit of background, I'm working at a place where they've never had a team of DBAs and Develoipers before - having always bought in software and paid for some customisation. So now they're discovering the delights of running a team to produce their own software. Man, oh man I miss having a good BA around.
Little things like this get missed and it can lead to the delights of rework.
October 6, 2016 at 2:10 am
Thank you for the replies, people. They have helped enormously and shown me that what I was trying to do was ridiculously complex - which is what I thought.
And there are some nice little bits in the examples I can look at and learn from for other things too. Always a bonus.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply