I was recently asked by a charity for some help with a problem they had. They needed to add together periods of time such as one month and five days, and two months and twenty seven days. The answer needed to be expressed in years, months and days.
Working with dates is always tricky, but of course I am an expert and can handle such things – as you can, too, no doubt. However, this problem very soon becomes severely bogged down. How many days do you add to twenty seven before it becomes a month? The answer to this is regrettably somewhat variable, and if you base this answer on the month in which you ask the question, the answer will of course change over time. It became obvious that this problem needed to have some fixed aspects to it, so that it did not matter when you looked at it, the answers were always the same.
I had a look at the data that the query worked on. This was based on a RecordOfService table, and when I looked at the data, my eyes opened very wide and nearly popped out of my head. As I mentioned, the organisation was a charity, and there were folk who volunteered for periods of duty. So they might do perhaps two months volunteering, go away for a while, and then come back for another few months, and repeat this cycle over several years.
No problems with the concept. When the role that they performed changed, they would get a new record. So although they may be doing a three month stretch of volunteering, there may be several records making up those three months as they changed role during that time. Still no conceptual issues. Sometimes, one of these records ended on a particular day, and the next record started on the following day, but sometimes it was the same day. Not good. And then it appears that where a person was standing in for somebody else temporarily, there were two records for that period where they were filling a substantive role as well as an acting role. This is bad. Without trying to quantify how bad it was, it was easy to see that simply adding up the days would not work, as for a month’s service, you could easily get an answer like thirty five days, or some other impossible number.
The technical term for such a situation is Dog’s Breakfast, which may be conveniently abbreviated to DB. And perhaps the philosophers amongst us could provide some insights as to how it is that we work with DBs that so frequently resemble DBs - the only insight that I can offer is that there must have been somebody being kept pretty busy with capturing all of that data!
Back to the problem. I thought that I would produce a table of dates, and then join the RecordOfServiceTable to the DateTable, Group By the day to remove the duplicates, and then I would get an honest representation of how many days’ service were involved. Whilst there was no immediate requirement from this charity for any added value to a plain old date table, a crystal ball generally does not need much polishing to see the potential benefit from adding things like Public Holidays, and expanding the table to resemble a rather natty Date Dimension table.
And many apologies to the Normalisation Police – having some of these derived values stored in a table taking up space just makes later querying so much easier.
Calculating when Christmas comes is not too difficult, but here in the UK there are further complications. Should either Christmas or Boxing Day fall on a Saturday or Sunday, then we get the following Monday, and possibly Tuesday as Public Holidays. Trickier, but still not overly difficult to calculate. New Year’s Day, May Day, Spring Bank Holiday, and Summer Bank Holiday are likewise not too difficult. And periodically, we in the UK get extra Public Holidays when there are significant events in the Royal Family – many thanks to the Duke and Duchess of Cambridge for 29th April 2011, to celebrate their marriage. Predicting these dates in advance is somewhat short of a sure science, so we would have to wait for the legislation to pass, and then hard code any relevant dates into our calculation.
Still all pretty straight forward, but then we think about Easter, which is, as they say, something of a moveable feast. Fortunately, there are giants who have gone before us, and they have shared their code with us, enabling those of us who are of lesser stature to stand on their shoulders, and enjoy the view without the headache.
The script for my function to calculate UK Public Holidays is attached. And of course you would get the UK Public Holidays for 2012 by typing in this expression:
Select * from dbo.fnGetUKHolidays(2012)
The script to create my DimDate table is also attached.
With the DimDate table in place, I am now able to get an honest number of days’ service. But I still need to be able to express that not only honestly but consistently as well, in terms of years, months and days. The method that I chose to use was to take the first start date for the individual, and then count off the total number of day’s service from there, and express that period in terms of years months and days. To display the period in the correct format, I created the function fnDateDiffsYMD, the script for which is attached.
And finally, I have the query that the charity will use to display the lengths of service in the format they want:
--VolunteerLengthOfService --Written By PJR --4th February, 2017 Select T.VolunteerID , Convert(nvarchar(12), T.FromWhen, 105) As FromWhen , Convert(nvarchar(12), T.ToWhen, 105) As ToWhen , T2.NoOfWorkingDays , DateAdd(D, T2.NoOfWorkingDays - 1, T.FromWhen) As VirtualEndDate , dbo.fnDateDiffsYMD(T.FromWhen, DateAdd(D, T2.NoOfWorkingDays - 1, T.FromWhen)) As DateDiffAsYMD From (Select ROS.VolunteerID As VolunteerID , Min(Cast(ROS.DateFrom As Date)) As FromWhen , Max(Cast(IsNull(ROS.DateTo , GetDate()) As Date)) As ToWhen From RecordOfService As ROS Where DateFrom <= GetDate() Group by ROS.VolunteerID) As T Inner Join (Select T3.VolunteerID, Count(*) As NoOfWorkingDays From (Select ROS.VolunteerID, D.[Date] From dbo.DimDate As D Inner Join RecordOfService As ROS On D.[Date] Between Cast(ROS.DateFrom As Date) And Cast(IsNull(ROS.DateTo , GetDate()) As Date) Group By ROS.VolunteerID, D.[Date]) As T3 Group By T3.VolunteerID) As T2 On T.VolunteerID = T2.VolunteerID Order By T.VolunteerID
This is an elegant solution - the complexities are conveniently hidden away from the users, and the queries that they use should not by overly confusing, and would allow them to fit in any Where clauses etc. that were appropriate, without breaking the whole thing.