February 24, 2012 at 2:12 am
I have been doing this for few years now and was wondering if I was the only person doing it
It is good to see that many have come to the same logical reasoning
I also import and set the Public holidays of every year in to the system which becomes usefull for payroll type of applications
February 24, 2012 at 2:20 am
Worth mentioning that this technique works well on any data subject area where there is a defined set of answers.
For example, permutation/combination figures, significance tables.
I was quite proud of coming up with efficient formulae for handling nPr, nCr stuff in http://www.sqlservercentral.com/articles/Troubleshooting/75990/ but suffered the DOH! moment when someone pointed out I could actually store the values in a table instead!
February 25, 2012 at 8:14 am
Excellent article! Thank you.
Now I have a question for you more experienced guys and gals. 😀
If I use a tally table like the one below would that be alright? I'm
trying to use a single tally table for numbers and dates as well.
IF OBJECT_ID('dbo.Tally', 'U') IS NOT NULL
DROP TABLE dbo.Tally;
GO
DECLARE @StartDate DATETIME = '18501231';
SELECT Tab.Number,
@StartDate + Tab.Number AS Date,
MONTH(@StartDate + Tab.Number) AS Month,
YEAR(@StartDate + Tab.Number) AS Year
INTO dbo.Tally
FROM (SELECT TOP(100000)
ROW_NUMBER() OVER (ORDER BY C1.column_id)
FROM model.sys.columns AS C1
CROSS JOIN model.sys.columns AS C2
CROSS JOIN model.sys.columns AS C3) AS Tab(Number);
ALTER TABLE dbo.Tally
ALTER COLUMN Number INT NOT NULL;
GO
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally PRIMARY KEY CLUSTERED(Number)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
FILLFACTOR = 100);
GO
CREATE NONCLUSTERED INDEX IX_Tally_Date ON dbo.Tally(Date)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
FILLFACTOR = 100);
GO
CREATE NONCLUSTERED INDEX IX_Tally_Year ON dbo.Tally(Year)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
FILLFACTOR = 100);
GO
EXEC sp_spaceused 'dbo.Tally';
-- Tally100000 8512 KB3328 KB4432 KB752 KB
Or should I just remove the Number column and calculate the ROW_NUMBER everytime I use the table? I'm trying to think performance-wise. Have a great weekend y'all!
Best regards,
Andre Guerreiro Neto
Database Analyst
http://www.softplan.com.br
MCITPx1/MCTSx2/MCSE/MCSA
February 27, 2012 at 11:50 am
Yes, I commonly use the yearmonth (YYYYMM) scenario. However, I use the fieldname TimeKey so as not to argue with the system about any reserved words. Additionally, I use an integer field in the table to allow queries easily to span segments that include differing years. The integer opens the door to arithmetic operations for query construction.
Example:
SequenceNumber TimeKey
11 201111
12 201112
13 201201
14 201212 & C.
In addition, I use the fields StartKey and StopKey (with a default 999999 on the second) as a handy way of identifying SCD lookups for historical data. This system is very useful when working with accounting systems where discreet monthly data is used.
February 27, 2012 at 4:08 pm
codebyo (2/25/2012)
Excellent article! Thank you.Now I have a question for you more experienced guys and gals. 😀
If I use a tally table like the one below would that be alright? I'm
trying to use a single tally table for numbers and dates as well.
IF OBJECT_ID('dbo.Tally', 'U') IS NOT NULL
DROP TABLE dbo.Tally;
GO
DECLARE @StartDate DATETIME = '18501231';
SELECT Tab.Number,
@StartDate + Tab.Number AS Date,
MONTH(@StartDate + Tab.Number) AS Month,
YEAR(@StartDate + Tab.Number) AS Year
INTO dbo.Tally
FROM (SELECT TOP(100000)
ROW_NUMBER() OVER (ORDER BY C1.column_id)
FROM model.sys.columns AS C1
CROSS JOIN model.sys.columns AS C2
CROSS JOIN model.sys.columns AS C3) AS Tab(Number);
ALTER TABLE dbo.Tally
ALTER COLUMN Number INT NOT NULL;
GO
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally PRIMARY KEY CLUSTERED(Number)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
FILLFACTOR = 100);
GO
CREATE NONCLUSTERED INDEX IX_Tally_Date ON dbo.Tally(Date)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
FILLFACTOR = 100);
GO
CREATE NONCLUSTERED INDEX IX_Tally_Year ON dbo.Tally(Year)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
FILLFACTOR = 100);
GO
EXEC sp_spaceused 'dbo.Tally';
-- Tally100000 8512 KB3328 KB4432 KB752 KB
Or should I just remove the Number column and calculate the ROW_NUMBER everytime I use the table? I'm trying to think performance-wise. Have a great weekend y'all!
Part of the "speed" factor of a Tally table is that it's densely packed with only the information it needs. Adding the columns you have will make it necessary to read at least twice as many pages to do any job that uses it. Further, you've also just increased the "in memory" size of the table when it does cache.
My recommendation is to never mix the Tally Table with a Calendar table of any type. It's not good for the Tally side of it all.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 27, 2012 at 5:09 pm
Jeff Moden (2/27/2012)
Part of the "speed" factor of a Tally table is that it's densely packed with only the information it needs. Adding the columns you have will make it necessary to read at least twice as many pages to do any job that uses it. Further, you've also just increased the "in memory" size of the table when it does cache.My recommendation is to never mix the Tally Table with a Calendar table of any type. It's not good for the Tally side of it all.
I see what you mean. My tally/calendar table would behave just like an ordinary table being joined, right? No real gain there. Lesson learned. Thank you again, Jeff.
Best regards,
Andre Guerreiro Neto
Database Analyst
http://www.softplan.com.br
MCITPx1/MCTSx2/MCSE/MCSA
March 25, 2012 at 1:14 am
Great article and technique.
How about dates stored in UTC? How do you configure the date table then? Do you create a date table that is also in UTC? Does this even make a difference? UTC would be the same as the dates then, right?
March 25, 2012 at 9:09 am
michaelm-746562 (2/27/2012)
Additionally, I use an integer field in the table to allow queries easily to span segments that include differing years. The integer opens the door to arithmetic operations for query construction.Example:
SequenceNumber TimeKey
11 201111
12 201112
13 201201
14 201212 & C.
Sorry for the late response but I lost track of this thread. A recent post to it "found" it for me.
What kind of "arithmetic operations" have you actually used against that column and what do they do?
--Jeff Moden
Change is inevitable... Change for the better is not.
June 23, 2012 at 9:31 pm
Sorry for the late response... I'm just getting caught up on some newsletters where I've been too busy to read.
I think that you did a great job in this article, except in one place (and I'm somewhat surprised that Jeff didn't mention this).
So, in all of this work to build a calendar table to create faster, more efficient set-based queries, you end up using a loop instead of a set-based method to generate your table... here's a set-based method to generate the calendar table:
WITH Tens (N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
Thousands (N) AS (SELECT t1.N FROM Tens t1, Tens t2, Tens t3),
Millions (N) AS (SELECT t1.N FROM Thousands t1, Thousands t2),
Tally (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM Millions),
Months (N, MonthStart) AS (SELECT TOP (25*12)
N, DATEADD(MONTH, N, '1999-12-01T00:00:00')
FROM Tally)
SELECT N,
MonthStart,
NextMonth = DATEADD(MONTH, 1, MonthStart),
MonthDescr = CONVERT(CHAR(3), caMonth.MonthName) + '-' +
RIGHT(CONVERT(CHAR(4), YEAR(MonthStart)),2),
caMonth.MonthName,
YearMonth = (YEAR(MonthStart)*100) + caMonth.MonthNbr,
caMonth.MonthNbr
FROM Months
CROSS APPLY (SELECT MonthNbr = DATEPART(MONTH, Months.MonthStart),
MonthName = DATENAME(MONTH, Months.MonthStart)) caMonth
ORDER BY Months.MonthStart;
You can use SET IDENTITY_INSERT to insert N into the Identity column. And to address the Y2K issue brought up (running out of dates), just change the "25" to 5000 to take it up to year 7000. Now, if queries based on this table are still in use then, I think we've got bigger problems to worry about... :-D:w00t:
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
June 24, 2012 at 1:14 pm
WayneS (6/23/2012)
I think that you did a great job in this article, except in one place (and I'm somewhat surprised that Jeff didn't mention this).
Including the WHILE loop, there are good many things about Todd's Calendar Table code that I would have someone change to pass a production code review but didn't mention because I didn't want to take any focus away from what Todd was trying to say in his good article. The point he's trying to get across is much more important.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 8, 2013 at 1:03 pm
Is there an error in the last code section of your article?
snippit:
-- This returns the first day of the current month.
SET @StartYear2 = DATEADD(YY, -1, DATEADD(MM, DATEDIFF(MM, 0, GETDATE()), 0));
-- Get the MonthID for this month.
SELECT
@StartMonth = MonthID
FROM CalMonth
WHERE MonthStart = @StartYear2;
-- Set to same month the year before.
SET @StartMonth = @StartMonth - 12
I would think that the @StartMonth should be -24 instead of -12. (2 years of data)
It also fixes the last 5 lines of SecondYearMonth(same as FirstYearMonth) and Revenue=0.00.
FirstYearMonth Year1Revenue SecondYearMonth Year2Revenue MonthOrd
JUN-08 814652.29 JUN-09 815557.55 0
JUL-08 850882.41 JUL-09 831792.32 1
AUG-08 844237.14 AUG-09 837465.42 2
SEP-08 836915.91 SEP-09 833296.03 3
OCT-08 838332.34 OCT-09 855292.19 4
NOV-08 819412.58 NOV-09 827802.45 5
DEC-08 838095.67 DEC-09 851131.06 6
JAN-09 858064.48 JAN-09 0.00 7
FEB-09 771925.00 FEB-09 0.00 8
MAR-09 855450.64 MAR-09 0.00 9
APR-09 821391.97 APR-09 0.00 10
MAY-09 853181.60 MAY-09 0.00 11
January 31, 2014 at 1:33 pm
This article is an awesome start. I assume that anyone can take this knowledge and tailor it to fit their needs. I have a few questions though:
1) What are the cons of using this table?
2) Does this table take into account different time zones, holidays, leap year etc..
or would that be conditions that the developer would have to add to their queries?
3) What are some things to be aware of when querying this table?
Thanks for a great article/lesson. Cheers!
Mark
February 14, 2014 at 5:44 am
I've used calendar tables for all manufacturing sites; they always contained one row per day.
In my opinion the very biggest reason to create a calendar table is to know the cross referencing of all the following columns for any given date:
Julian date
gregorian date
accounting month,
accounting week of month,
serial day in accounting month,
serial day in accounting year,
daylight savings time indicator
company holiday indicator
manufacturing workday indicator,
First, second, and third shift hours,
staff workday indicator,
Day of week
Knowing all of these makes it easy to calculate the time to complete a manufacturing order and which accounting periods are affected. And it is easy to convert other types of dates depending on whom you are reporting to.
February 14, 2014 at 5:51 am
I've used tables containing base data for calculating TimeZones and Daylight Savings Times for any year. But, let me tell you they were a royal mess to deal with. There are so many exceptions, it all breaks down in complexity. Going forward, I will always create a table which contains everything resolved. It will be much faster, just like Intel burns precalculated cosine tables into CPUs for simplicity and speed.
February 14, 2014 at 7:31 am
I've done something similar for a couple of databases. Because the reporting systems are billing and HR related, the table also contains pre-calculated columns indicating weekday_abbr (Mon - Fri), is_business_day (0 or 1), is_month_business_day_1 (0 or 1), is_holiday (0 or 1), etc. This is one of those table design techniques common in the dimensional modeling realm that can be leveraged in the OLTP and DW realm.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
Viewing 15 posts - 31 through 45 (of 46 total)
You must be logged in to reply to this topic. Login to reply