Often times it is necessary to display or join to a range of dates. There are many ways to do this. You could use a while loop to generate the dates every time you need one. But while loops are often slow and consume valuable resources. You could use a numbers table (see util_nums article) to generate dates on the fly which will be faster, but still requires using SQL Server functions to do date calculations.
What is a dates table?
A dates table is just that, a table that contains dates. But what kind. We'll go over some of the possible uses of a dates table, but first let's talk about the structure. And just like with the util_nums table, we will create the table, populate it, then apply the indexes.
createtable [dbo].[util_time_coordinate] (date_id int not null identity ,theDimension varchar(7) not null ,theCoordinate datetime not null ) go alter table [dbo].[util_time_coordinate] add constraint PK_util_dates primary key nonclustered (date_id)with fillfactor = 100 create unique clustered index uidx_util_dates_dimension_theDate on [dbo].[util_time_coordinate] (theDimension,theCoordinate) with fillfactor = 100
Pretty simple huh! But let's break it down.
- date_id: This is just a surrogate key in case you find it useful to reference this date table later for aggregation. Believe me, there are uses for it.
- theDimension: This is the datepart. Later we will be populating this with the following values: Year, Quarter, Month, Day and Hour.
- theCoordinate: This is where the actual time coordinate will be stored. You will be able to see sample data below.
Populating the table
Ok. Before we get into HOW to use it, we better get it populated. We will be using the util_nums table to generate this data. But we will need more than 1 million records of we want to create a large date coordinate span. For our example, we are going to create all coordinates for each of the dimensions between January 1, 1900 to December 31, 2099. Seems like overkill right? Perhaps. You will need to determine what your @BaseDate and @StopDate values should be. As for storage, this one is going to be a little more costly than the util_nums table. It requires approximately 500KB per year, for every year you wish to create in this table. So the 200 years we will be creating will require 100MB (including index and data).
The below code will generate the following record counts per dimension.
theDimension Count ------------ ----------- HOUR 1753176 DAY 73049 MONTH 2400 QUARTER 800 YEAR 200
Different population options were explored for the util_nums table. But there really is no point here since we this entire process takes only 45 seconds *1.
set nocount on declare @BaseDate datetime ,@StopDate datetime set@BaseDate = converT(datetime,'1900.01.01',102) set@StopDate = converT(datetime,'2100.01.01',102) declare @dateParts TABLE (Id int ,theDatePart varchar(7) ) insert into @DateParts values(1,'YEAR') insert into @DateParts values(2,'QUARTER') insert into @DateParts values(3,'MONTH') insert into @DateParts values(4,'DAY') insert into @DateParts values(5,'HOUR') declare @ID int ,@theDatePart varchar(7) ,@parmlist nvarchar(1000) ,@SQL nvarchar(1000) select@ID = min(ID) from@DateParts while@ID is not null begin select @theDatePart = theDatePart from @dateparts where id = @id set@parmlist = N'@basedate datetime , @StopDate datetime , @thedatepart varchar(7)' set @SQL = N'select @theDatePart, @baseDate union all select @theDatePart, dateadd(' + @theDatePart +', n, @baseDate) from [dbo].[util_nums] where n < datediff(' + @theDatePart + ', @baseDate,@StopDate) ' insert into [dbo].[util_time_coordinate] (theDimension,theCoordinate) exec sp_executeSQL @sql, @parmlist, @basedate=@basedate, @theDatePart=@theDatePart, @StopDate = @StopDate select top 1 @id = id from @dateparts where id > @id if @@rowcount=0 break end GO
Now let's go over some of the possible uses.
Time tracking:
If you have an employee time table, how would you find a list of dates the employee did NOT work? You can loop through, distinct date by date, OR:
create table #employees (employee_id int ,employee_name varchar(32) ) go insert into #employees values(1,'Brandon Galderisi') insert into #employees values(2,'Donald Duck') insert into #employees values(3,'Mickey Mouse') go create table #EmployeeTime (employee_id int ,work_date datetime ,StartDatetime datetime ,endDateTime datetime ) GO insert into #EmployeeTime select 1,convert(datetime,convert(char(10),getdate(),101),101),getdate(),dateadd(hh,8,getdate()) insert into #EmployeeTime select 1,convert(datetime,convert(char(10),getdate()-1,101),101),getdate()-1,dateadd(hh,8,getdate()-1) insert into #EmployeeTime select 1,convert(datetime,convert(char(10),getdate()-3,101),101),getdate()-3,dateadd(hh,8,getdate()-3) insert into #EmployeeTime select 2,convert(datetime,convert(char(10),getdate(),101),101),getdate(),dateadd(hh,8,getdate()) insert into #EmployeeTime select 2,convert(datetime,convert(char(10),getdate()-1,101),101),getdate()-1,dateadd(hh,8,getdate()-1) insert into #EmployeeTime select 3,convert(datetime,convert(char(10),getdate()-3,101),101),getdate()-3,dateadd(hh,8,getdate()-3) GO ;with Employee_Days as (select e.employee_id,t.thecoordinate from #employees e cross join [dbo].[util_time_coordinate] t where t.theDimension = 'DAY' and theCoordinate >= getdate()-14 and theCoordinate <= getdate()) select e.* ,convert(char(5),dateadd(n,isnull(datediff(n, startdatetime,enddatetime),0),0),108) Time_Worked fromEmployee_Days e left outerjoin #EmployeeTime t on e.employee_id = t.employee_id and e.theCoordinate = t.work_date order by e.employee_id,e.thecoordinate GO drop table #EmployeeTime drop table #employees
Let's break it down!
The first 4 sections are simple. We are creating some sample tables and populating some data.
Next: We use a cross join to map all employees to all dates for the range we want to check for. Notice the use of the t.theDimension. Since certain coordinates will have multiple dimensions, we need to specify what dimension we want. I've represented it as a common table expression for making it's reference simpler in the following section.
;with Employee_Days as (select e.employee_id,t.thecoordinate from #employees e cross join [dbo].[util_time_coordinate] t where t.theDimension = 'DAY' and theCoordinate >= getdate()-14 and theCoordinate <= getdate())
As I just explained, the Employee_Days is a mapping of all employees to all days for the time specified. This will ensure that we can represent days where data does not exist.
select e.* ,convert(char(5),dateadd(n,isnull(datediff(n, startdatetime,enddatetime),0),0),108) Time_Worked fromEmployee_Days e left outerjoin #EmployeeTime t on e.employee_id = t.employee_id and e.theCoordinate = t.work_date order by e.employee_id,e.thecoordinate
The output:
employee_id thecoordinate Time_Worked ----------- ----------------------- ----------- 1 2008-09-25 00:00:00.000 00:00 ..edited 1 2008-10-04 00:00:00.000 00:00 1 2008-10-05 00:00:00.000 08:00 1 2008-10-06 00:00:00.000 00:00 1 2008-10-07 00:00:00.000 08:00 1 2008-10-08 00:00:00.000 08:00 2 2008-09-25 00:00:00.000 00:00 ..edited 2 2008-10-06 00:00:00.000 00:00 2 2008-10-07 00:00:00.000 08:00 2 2008-10-08 00:00:00.000 08:00 3 2008-09-25 00:00:00.000 00:00 3 2008-09-26 00:00:00.000 00:00 ..edited 3 2008-10-05 00:00:00.000 08:00 3 2008-10-06 00:00:00.000 00:00 3 2008-10-07 00:00:00.000 00:00 3 2008-10-08 00:00:00.000 00:00
I removed the output, but the sample set included 42 records. Which is one for each employee for each of the 14 days requested.
Time Scheduling:
The same principal to time tracking can be applied to time scheduling. You can count the number of people scheduled to work on a given day, and represent those days that have 0 people scheduled.
What's next:
The next topic I will cover is string parsing and manipulation.
If you would like any additional information about this or to request a future topic, please feel free to email me at Brandon@Galderisi.com.
*1 – All times may vary.
Test Machine:
XEON 3060
4GB Ram (500MB allocted to SQL)
SQL Server 2005 Express Edition