Normalizing Dates
Typically, in past projects, I have used datetime columns to correlate some date/time value with a transaction or entity's creation and modification.
For instance, if a customer where to place an order, I would insert a record into an Order table with the order number and a date/time for when the order was generated.
In another example, if I needed to dynamically schedule sending emails to customers on given dates, I would insert a date/time along with a customer id
into an EmailCustomer table to identify when to send the emails.
In a current project, I decided to experiment with how I worked with dates a bit. I had the idea: 'What if I normalized my dates?' Essentially, using some ideas from OLAP and
implementing them in an OLTP environment. What was I thinking?! Why would I do something like that?! Read on...
This project, I'll call 'project X' for the sake of this article is a heavy data collection and reporting application. Initially, I was looking at OLAP and Data Warehousing as an option for the reporting. In the end, for many non-technical reasons, OLAP was not the direction we took, but it did provide an introductory glance at cubes, dimensions and fact tables. As an OLAP newbie, I got started by stepping through the Analysis Services tutorial. (As an aside, the Analysis Services tutorial from a beginners standpoint is quite good. The only area I found missing was the methodology and best practices for creating fact tables.)
In an effort to get up to speed on the world of OLAP, I wrote a stored procedure to build a Time dimension for a cube. My time dimension would include, the date, hour of day, day of year, month of year, year, fiscal year, quarter, and several additional date/time groupings. After testing, I created a SQL job to automated the stored procedure to run daily and append the new day's date/time information to my dimension.
Once I learned that OLAP was not going to be an option, I put down the Analysis Services tutorial and continued in another direction. But soon after, I realized that I could leverage my Time dimension within my normalized, OLTP database. I added a few additional date/time attributes to the Time dimension and modified the stored procedure to pick up were it left off, even if it failed to run for some amount of time. In addition, I added the ability for the dates to forecast a year into the future.
Here's a look at the stored procedure:
CREATE PROCEDURE dbo.UpdateTimeTable AS Declare @LastDate smalldatetime Declare @StartDate smalldatetime Declare @CurrentDate smalldatetime Declare @Day varchar(20) Declare @Month varchar(20) Declare @Year char(4) Declare @MinuteOfHour tinyint Declare @HourOfDay tinyint Declare @DayOfMonth tinyint Declare @DayOfYear smallint Declare @MonthOfYear tinyint Declare @WeekOfYear tinyint Declare @Quarter tinyint Declare @FiscalYear varchar(20) set nocount on --see if there are any records in the table if exists(select top 1 thedate from timetable with (nolock) order by thedate desc) begin --get last date from table select top 1 @LastDate = thedate from timetable with (nolock) order by thedate desc --set start date set @StartDate = dateadd(dd, 1, @LastDate) end else begin --set default start date set @StartDate = cast('1/1/1985' as smalldatetime) end --set current date equal to the start date set @CurrentDate = @StartDate --while the current date is less than or equal to today's date + 1 year while (datediff(dd,@CurrentDate, dateadd(yy, 1, getdate())) >= 0) begin --set minute default set @MinuteOfHour = 15 --set hour default set @HourOfDay = 1 --get day set @Day = datename(weekday,@CurrentDate) --get month set @Month = datename(month,@CurrentDate) --get year set @Year = datepart(yyyy,@CurrentDate) --get day of month set @DayOfMonth = datepart(dd, @CurrentDate) --get day of year set @DayOfYear = datepart(dy,@CurrentDate) --get week of year set @WeekOfYear = datepart(wk,@CurrentDate) --get month of year set @MonthOfYear = datepart(mm,@CurrentDate) --get quarter set @Quarter = datepart(qq,@CurrentDate) --get fiscal year if @MonthOfYear > 9 begin set @FiscalYear = @Year + 1 end else begin set @FiscalYear = @Year end --generate hours for day while (@HourOfDay < 25)
begin
--generate 15 minute increments for each hour
while @MinuteOfHour < 61
begin
--insert record into timetable
insert into timetable
(thedate, theday, themonth, theyear, minuteofhour, hourofday, dayofmonth, dayofyear, values (@CurrentDate, @Day, @Month, @Year, @MinuteOfHour, @HourOfDay, @DayOfMonth, @DayOfYear, @WeekOfYear, @MonthOfYear, @Quarter, @FiscalYear) --increment by 15 minutes set @MinuteOfHour = @MinuteOfHour + 15 end --reset minutes for next hour set @MinuteOfHour = 15 --increment hour by 1 set @HourOfDay = @HourOfDay + 1 end --increment current date by 1 set @CurrentDate = dateadd(dd,1,@CurrentDate) end GO |
Here's a look at the Time table:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
table continued… | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Currently, the Time table has 689,088 rows. This spans 15 minute intervals from 1/1/1985 - 8/26/2004. Daily, the table would increase by 96 rows. As you may guess, this is not ideal for performance when doing joins to determine when something has happened. To resolve this, I created several indexed views to allow me the ease of querying and additional reporting leverage. By doing so, I can see one of many snapshots in time.
Here are a couple of the views:
|
Now that I had my Time table and views, I could easily normalize out certain actions based in time in my database. For example:
In Project X, electric meters are read on certain intervals (15 minute, monthly, etc.). If a meter was read monthly, the meter reading dates could span from 1/25/2003 - 2/20/2003. For reporting and accounting purposes, the readings for the above date range need to be applied to a single month, in this instance; it would be February, 2003. Using a month/year view, I simply join the table where meter readings reside to my Time table. This linked my meter reading to a specific month and year. It also linked my meter reading to a specific fiscal year and quarter. I used this same logic to associate monthly electric invoices with a specific month and year.
Because the Time table contained dates up to one year in the future, it also allowed me to forecast energy consumption in the same way as meter readings and invoices. It also allowed me to normalize future electric rates that were posted to us prior to their effective dates.
For reporting purposes, I could now easily query based on fiscal year, quarter or even 15 minute intervals. It allowed me large amounts of flexibility without having to write long winded stored procedures to group and aggregate data by certain date ranges, etc. My time table has done all of the work for me up front.
This project has yet to reach completion, so my perspective and scheme for attacking date/time issues within the application may change. Tell me what you think.
For articles and information on SQL’s Analysis Services and OLAP click here:
For articles and information on SQL’s date parsing functions click here: