When all you have is a hammer, everything looks like a nail. When someone hands you a screwdriver, suddenly you want everything to be a screw. This probably explains my recent desire to do everything with a Tally table. Until I read Jeff Moden's excellent article on the Tally table right here on SQL Server Central.com, I'd never heard of this technique. I owe Jeff a huge thank you as this technique has helped me out a few times since then. Thank You, Mr. Moden.
In his article, he states there are "Dozens of Other Uses" for the Tally table. Here are two I've found pretty useful so far. They're variations on uses he showed, but I'd like to share these somewhat more complex uses. The first is a series of date calculations to create a fiscal year calendar, the second is string parsing. The string parsing example allowed me to see what was actually happening in a problem I was having instead of just blindly hacking until I had the right result, but not knowing why. Since each explanation is lengthy, I'll be splitting this into two parts, one for each use. This article covers the Fiscal data calculations.
Fiscal Year, Period, Week and Day
I need to create a fiscal year that not only starts on a date other than the start of the calendar year, but also only has 52 weeks of 7 days per week. That's 364 days a year, not 365 and we make no exception for leap years. I need 12 Periods in the year, but they don't equate to months. There are 4 quarters in the year and each has 3 periods. The first period in each quarter is 5 weeks, the second and third have 4 weeks. Finally, our week starts on a Friday.
First, let's set up our Tally table. I'm going to use the method shown by Mr. Moden in the article I referenced earlier.
SELECT Top 20000 IDENTITY(INT,1,1) as N INTO Tally FROM master.dbo.syscolumns SC1, master.dbo.syscolumns SC2
This uses a SELECT INTO with an implied CROSS JOIN to rapidly create 20,000 integers in a new table. Setting STATISTICS TIME ON gives us the following times for creating and populating the table: CPU time = 16 ms, elapsed time = 648 ms. 20,000 entries is probably overkill, as the article points out 11,000 gives you over 30 years worth of dates. For some reason, I like the number, but feel free to change it to whatever you like. I'd suggest not going below a few thousand, at least for the purposes of these exercises.
Then add a primary key to the table
ALTER TABLE Tally ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR=100
Next, we'll set up the table where we'll store the Fiscal Year information
CREATE TABLE FDate (FiscalDateID int NOT NULL IDENTITY(1,1), FiscalDate date NOT NULL, FiscWeekDay tinyint NOT NULL, FiscWeek tinyint NOT NULL, FiscPeriod tinyint NOT NULL, FiscQuarter tinyint NOT NULL, FiscYear smallint NOT NULL )
Any time you start dealing with dates you start making columns that are reserved words. I prefer to just prepend or append an identifying abbreviation and not have to worry about it.
FiscalDateID is just an identity column, useful if you want to start joining this table to other tables. If you'd like, you can just insert the Tally table values here instead of using an Identity, but the method I'm using doesn't return non-modified values, so we'll use the IDENTITY function. FiscalDate is a date field, one of my favorite new data types in SQL Server 2008. There's no time aspect muddying it up and it goes back to the year one. This will hold the actual dates. FiscWeekDay is a tinyint and will tell us where we are in the week, numbering from 1 to 7. FiscWeek is also a tinyint and will tell us the week of the fiscal year from 1 to 52. FiscPeriod is another tinyint and tells us the Fiscal Period from 1 to 12. Periods 1, 4, 7 and 10 will be 5 weeks each, the rest will be 4. FiscQuarter is our last tinyint and will tell us the fiscal quarter we're in. FiscYear is a smallint and tells us the fiscal year. This will frequently not match the calendar year.
Since we want to see recurring series in most of our fields and we have a long, non-recurring series to use to populate these fields, the best method is math! We'll use the modulo operator and populate these fields with remainders from division.
Let's start with our base FROM section.
SELECT TOP 3640 CASE N%364 WHEN 0 THEN 364 ELSE N%364 END as YearDay, N FROM Tally
This will return enough integers in column N to generate 10 fiscal years of dates. Yearday is our first modulo column. We get the remainder of dividing by 364 and use a case statement to replace the 0 results with 364 as the last day of the year. We're doing this in our innermost FROM because we'll be using that value in later calculations. Each step will use the previous step as a subquery.
The next step is
SELECT Cast(Ceiling(Cast(YearDay as real)/Cast(7 as real))as int) as FiscWeek, Ceiling(Cast(YearDay as real)/Cast(91 as real)) as FiscQuarter, N FROM (SELECT TOP 3640 CASE N%364 WHEN 0 THEN 364 ELSE N%364 END as YearDay, N FROM Tally) Y
This level uses another method of generating ranges, straight division with a CEILING function to round up. This requires more Casts, but avoids the need for a CASE statement. We use the YearDay column from the subquery to make sure we only get ranges that fit within each year. It wouldn't do to have a 5th or 56th quarter or weeks 100 through 200. To get the Fiscal Week, we divide by 7, the number of days in a week. To get the Fiscal Quarter, we divide by 91, the number of days in a fiscal quarter. One of the problems with straight division and ceilings is that we have to CAST everything as reals to get the proper decimal results.
The next step is to determine the periods.
SELECT DATEADD(DD, N, '8/31/2006') as FiscDate, DATEPART(DW, DATEADD(dd, N, '8/31/2006')) as FiscWeekDay, CASE WHEN FiscWeek%13 BETWEEN 1 and 5 THEN ((FiscQuarter - 1) * 4) + 1 - (FiscQuarter - 1) WHEN FiscWeek %13 BETWEEN 6 and 9 THEN ((FiscQuarter - 1) * 4) + 2 - (FiscQuarter - 1) WHEN FiscWeek %13 > 9 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) WHEN FiscWeek %13 = 0 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) END as Period, FiscWeek, FiscQuarter FROM (SELECT Cast(Ceiling(Cast(YearDay as real)/Cast(7 as real))as int) as BusWeek, Ceiling(Cast(YearDay as real)/Cast(91 as real)) as BusQuarter, N FROM (SELECT TOP 3640 CASE N%364 WHEN 0 THEN 364 ELSE N%364 END as YearDay, N FROM Tally) Y) W
First there is some nice, simple date math. To get the Fiscal Date, just a generation of a standard date range, we use a DATEADD, pick a start date and add N to it through the entire set. To get the day of the week or Fiscal Weekday, we do the same date calculation, but run it through a DATEPART. If you wanted to keep from processing the date math twice, you could move this calculation to the top level (there's only one more after this) and run the DATEPART calculation on FiscDate. I believe I included it at this part because when I wrote the query, I hadn't realized I'd need one more level.
I'd mentioned in the requirements for this table that our week starts on a Friday. To make sure we get the right information for the FiscWeekDay column, we could do convoluted addition and subtraction of the normal day of the week to get the results we want or we could run a statement before we start the main query. A nice, simple
SET DATEFIRST 5
I'll include this in its proper place when I post the full query.
The other column obtained at this level is the most complicated part. Determining the Period. Since they're irregular, they require a CASE statement. We use a modulo 13 because that's how many weeks are in a quarter and we're slicing up the quarter to figure out our periods. We use BETWEEN 1 and 5 to make the first period of each quarter 5 weeks long, BETWEEN 6 and 9 grabs the middle 4 weeks for the second period and a combination of >9 and =0 gives us our last 4 weeks for the final period of each quarter. We use the combination because modulo will return 0 when there's no remainder and there won't be one for the last week.
The math for the THEN portion is a bit odd, but it works. First we take the quarter and subtract 1 from it. Multiply that result by 4 and add 1. For the first quarter, that give us 1 - 1 = 0 * 4 = 0 + 1 = 1 Great! For the second quarter, that gives us 2 - 1 = 1 * 4 = 4 + 1 = 5 Not great. So we subtract 1 from the quarter and subtract that result from our total. 2 - 1 = 1 5 - 1 = 4. In the first quarter it zeroes out, so we don't lose the correct result, does it work for the third quarter? 3-1 = 2 * 4 = 8 + 1 = 9. 3 - 1 = 2. 9 - 2 = 7. It works! If someone has better math to fit in here, I'd love to have it.
All that's left is making sure we have the year right. We want to start the example at 9/1/2006, that's the beginning of our fiscal calendar. We use 8/31/2006 in the DATEADD functions because we're using a 1 based Tally table. When displaying the year, we want it to be the same number for the entire fiscal year, so we bump it up to 2007 for the last 4 months of the calendar year 2006.
INSERT INTO FDate (FiscalDate, FiscWeekDay, FiscWeek, FiscPeriod, FiscQuarter, FiscYear) SELECT FiscDate, FiscWeekDay, FiscWeek, Period, FiscQuarter, CASE WHEN DATEPART(MM, FiscDate) > Period THEN DATEPART(YYYY, FiscDate) + 1 ELSE DATEPART(YYYY, FiscDate) END as FiscYear FROM (SELECT DATEADD(DD, N, '8/31/2006') as FiscDate, DATEPART(DW, DATEADD(dd, N, '8/31/2006')) as FiscWeekDay, CASE WHEN FiscWeek%13 BETWEEN 1 and 5 THEN ((FiscQuarter - 1) * 4) + 1 - (FiscQuarter - 1) WHEN FiscWeek%13 BETWEEN 5 and 9 THEN ((FiscQuarter - 1) * 4) + 2 - (FiscQuarter - 1) WHEN FiscWeek%13 > 9 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) WHEN FiscWeek%13 = 0 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) END as Period, FiscWeek, FiscQuarter FROM (SELECT Cast(Ceiling(Cast(YearDay as real)/Cast(7 as real))as int) as FiscWeek, Ceiling(Cast(YearDay as real)/Cast(91 as real)) as FiscQuarter, N FROM (SELECT TOP 3640 CASE N%364 WHEN 0 THEN 364 ELSE N%364 END as YearDay, N FROM Tally) Y) W) D
Since this is our last level, this is also where we'll do our INSERT INTO. The calculation to "round up" the year is pretty simple, we just check if the current month is greater than the period. If it is, add 1 to the year. If it isn't, don't.
There are a few other little quirks to the data. For reporting, we use an integer that is just yearmonth. So Fiscal Year 2009, 11th period would be 200911. To create this we use the simple formula of FiscYear * 10 + FiscPeriod.
Another quirk is that this fiscal calendar has a tendency to get more and more out of synch with the actual calendar. We lose 5 days every 4 years, thanks to leap year. As a result, the last period of the year occasionally gets a week added to it. That's part of the full query I'll post now. The problem here is that though we've generated dates well into the future, we're likely going to have to blow them away at some point, add a week to a period and generate again from that point. If you have a similar situation, remember to save your query, document it well and make it obvious where you're storing it for future use.
The full code is
SET STATISTICS TIME ON SET DATEFIRST 5 INSERT INTO FDate (FiscalDate, FiscWeekDay, FiscWeek, FiscPeriod, FiscQuarter, FiscYear) SELECT FiscDate, FiscWeekDay, FiscWeek, Period, FiscQuarter, CASE WHEN DATEPART(MM, FiscDate) > Period THEN DATEPART(YYYY, FiscDate) + 1 ELSE DATEPART(YYYY, FiscDate) END as FiscYear FROM (SELECT DATEADD(DD, N, '8/26/2004') as FiscDate, DATEPART(DW, DATEADD(dd, N, '8/26/2004')) as FiscWeekDay, CASE WHEN FiscWeek%13 BETWEEN 1 and 5 THEN ((FiscQuarter - 1) * 4) + 1 - (FiscQuarter - 1) WHEN FiscWeek%13 BETWEEN 5 and 9 THEN ((FiscQuarter - 1) * 4) + 2 - (FiscQuarter - 1) WHEN FiscWeek%13 > 9 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) WHEN FiscWeek%13 = 0 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) END as Period, FiscWeek, FiscQuarter FROM (SELECT Cast(Ceiling(Cast(YearDay as real)/Cast(7 as real))as int) as FiscWeek, Ceiling(Cast(YearDay as real)/Cast(91 as real)) as FiscQuarter, N FROM (SELECT TOP 728 CASE N%364 WHEN 0 THEN 364 ELSE N%364 END as YearDay, N FROM Tally) Y) W) D INSERT INTO FDate (FiscalDate, FiscWeekDay, FiscWeek, FiscPeriod, FiscQuarter, FiscYear) VALUES('2006-08-25', 1, 53, 12, 4, 2006), ('2006-08-26', 2, 53, 12, 4, 2006), ('2006-08-27', 3, 53, 12, 4, 2006), ('2006-08-28', 4, 53, 12, 4, 2006), ('2006-08-29', 5, 53, 12, 4, 2006), ('2006-08-30', 6, 53, 12, 4, 2006), ('2006-08-31', 7, 53, 12, 4, 2006) INSERT INTO FDate (FiscalDate, FiscWeekDay, FiscWeek, FiscPeriod, FiscQuarter, FiscYear) SELECT FiscDate, FiscWeekDay, FiscWeek, Period, FiscQuarter, CASE WHEN DATEPART(MM, FiscDate) > Period THEN DATEPART(YYYY, FiscDate) + 1 ELSE DATEPART(YYYY, FiscDate) END as FiscYear FROM (SELECT DATEADD(DD, N, '8/31/2006') as FiscDate, DATEPART(DW, DATEADD(dd, N, '8/31/2006')) as FiscWeekDay, CASE WHEN FiscWeek%13 BETWEEN 1 and 5 THEN ((FiscQuarter - 1) * 4) + 1 - (FiscQuarter - 1) WHEN FiscWeek%13 BETWEEN 5 and 9 THEN ((FiscQuarter - 1) * 4) + 2 - (FiscQuarter - 1) WHEN FiscWeek%13 > 9 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) WHEN FiscWeek%13 = 0 THEN ((FiscQuarter - 1) * 4) + 3 - (FiscQuarter - 1) END as Period, FiscWeek, FiscQuarter FROM (SELECT Cast(Ceiling(Cast(YearDay as real)/Cast(7 as real))as int) as FiscWeek, Ceiling(Cast(YearDay as real)/Cast(91 as real)) as FiscQuarter, N FROM (SELECT TOP 3640 CASE N%364 WHEN 0 THEN 364 ELSE N%364 END as YearDay, N FROM Tally) Y) W) D SET STATISTICS TIME OFF
As you can see, making an adjustment in the middle just means figuring out how many years you want before the adjustment and modifying the SELECT TOP in the innermost subquery to that number of years *364. Then you do a few manual INSERTs. Here's another great feature of SQL Server 2008, just separate your arrays with commas and you can keep inserting with one statement. No more need for UNION after UNION or repeated INSERT statements. After you've done your manual inserts, start your automatic insert over again at the next date. The only things you need to change are the seed date in your DATEADD and your SELECT TOP number, if you want a different number of years, that is.
So how long did it take to create this table?
47ms total cpu time and 261 ms elapsed time. I think that's a pretty good time, certainly fast enough that you can rerun it whenever you need without worrying about it. When they tell you they've added a week to a period so their calendar doesn't slip too much, you can just do a "DELETE where FiscDate >", add the extra days and run this from your new start, getting back up and running in very little time.
In Part II we will look at text parsing.