August 29, 2003 at 12:00 am
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/jwiner/normalizingdates.asp
September 5, 2003 at 4:38 am
Jon,
1) I've done the same thing for the same problem.
It's also an easy way to detect missing samples, e.g. you expect a value every hour but the one at 23:00 yesterday is missing (see the sketch after this list).
2) It's fast. My client's DB is 4 GB, and it just flies.
3) Joe Celko has an article about a calendar that is worth reading.
4) What are you doing about daylight saving time? Every year there is missing data in the spring and extra data in the autumn. I have no solid solution for this problem yet.
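Regarding point 1, something along these lines is what I have in mind (the table and column names are only illustrative, not from the article):
-- Sketch: find expected slots with no collected sample, assuming a
-- pre-populated time table (dbo.TimeTable) and a data table (dbo.Samples).
SELECT t.TheDate
FROM dbo.TimeTable t
LEFT JOIN dbo.Samples s
    ON s.SampleTime = t.TheDate
WHERE t.TheDate >= '20030904'
  AND t.TheDate < '20030905'
  AND s.SampleTime IS NULL   -- no row was collected for this expected slot
ORDER BY t.TheDate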
Regards,
Henrik Staun Poulsen
Stovi Software
Denmark
September 5, 2003 at 5:12 am
Great article! I write SPs to manipulate gas usage data for my company, and you gave me some good ideas. The daylight saving issue is a big one, but I just sum the data for the extra hour and leave the missing hour blank.
September 5, 2003 at 6:11 am
Glad to hear others have encountered similar problems and attacked them with similar solutions... I was never sure if I was off base or not with my approach. It's always nice to be validated.
Honestly, I never considered the daylight saving time issue until the two of you mentioned it... thanks for the heads up. Though the more I think about it, I'm not sure collecting data in the traditional way and tagging it with a datetime value would resolve the daylight saving problem either.
Do you happen to have a link to the Joe Celko calendar article? I wouldn't mind giving it a read.
Thanks again for the input.
September 5, 2003 at 6:27 am
Sorry, I don't have a link to that article. You're right about traditional data collection; the DST issue is a tricky one. I have a DST procedure that runs parallel with the data collection/append process.
September 5, 2003 at 8:12 am
Maybe I'm totally off here, but if you add another datetime column defaulting to getutcdate() (instead of getdate()) you could solve your DST problems by using the change in time difference to separate the doubled up hour records.
The blank hour is still going to be blank because there was no such hour!
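Something along these lines is what I mean (the table and column names are just for illustration):
-- Sketch: capture both local and UTC time; the UTC offset separates the two
-- passes of the repeated autumn hour. Names are illustrative only.
CREATE TABLE dbo.Readings (
    ReadingID  int IDENTITY(1,1) PRIMARY KEY,
    LocalTime  datetime NOT NULL DEFAULT (getdate()),
    UtcTime    datetime NOT NULL DEFAULT (getutcdate()),
    Value      decimal(10,2) NOT NULL
)
GO
SELECT ReadingID, LocalTime, UtcTime,
       DATEDIFF(hour, UtcTime, LocalTime) AS UtcOffsetHours,  -- shifts at DST
       Value
FROM dbo.Readings
ORDER BY UtcTime
GO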
September 5, 2003 at 10:21 am
pshotts,
I'm not so sure about the UTC date.
We have a third-party helpdesk application which stores the dates as UTC dates.
Now I have to query the past data, and I simply do not know when a specific ticket was opened. My query works fine for current data, but before and after the daylight saving change it causes errors.
As an example: I'm checking how many tickets are opened per hour and per day, and also when my customers start calling my helpdesk.
Right now I do not know whether in March they really started to work at 8 or (as usual) at 7, because my report shows me 7 o'clock.
So UTC or not UTC...
Bye
Gabor
September 5, 2003 at 12:56 pm
Jon;
I like the idea of having the extra columns available for utility purposes. I think, however, that it's a mistake to define the columns and then populate them as your stored proc is doing.
This technique opens you up to data errors. What prevents an incorrect value in any of the derived date variables?
Perhaps a better approach would be to use computed columns or a VIEW that calculates all of the derived date information.
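For example, a view along these lines would expose the derived parts without ever storing them (the column names here are just illustrative):
CREATE VIEW dbo.TimeTableView
AS
SELECT TimeID,
       TheDate,
       DATENAME(weekday, TheDate) AS TheDay,
       DATENAME(month, TheDate)   AS TheMonth,
       YEAR(TheDate)              AS TheYear
FROM dbo.TimeTable
GO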
TroyK
September 5, 2003 at 1:06 pm
Hi Troy,
Not sure I follow you in regards to incorrect data within the date variables. Could you elaborate?
September 5, 2003 at 1:45 pm
Jon,
I agree with you that there should be no problem with your table.
The only reason for using calculated date(part)s would be if you need to use relative date ranges, e.g. Previous Quarter, This Quarter or whatever.
You might like to take a look at my date ranges function:
http://www.sqlservercentral.com/scripts/contributions/864.asp
(There is a new improved version waiting for approval - Mail me directly if you are interested)
It only goes down to the level of a whole day, but there is no reason not to expand it down to the quarter hour level you are interested in.
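As a rough sketch of the quarter-hour expansion (this assumes a numbers/tally table dbo.Numbers with values 0 through 95, which is not part of the posted function):
DECLARE @Day datetime
SET @Day = '20030905'

SELECT DATEADD(minute, 15 * n.N, @Day) AS QuarterHourStart
FROM dbo.Numbers n
WHERE n.N BETWEEN 0 AND 95   -- 96 quarter hours in a day
ORDER BY QuarterHourStart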
September 8, 2003 at 12:35 pm
quote:
Hi Troy,Not sure I follow you in regards to incorrect data within the date variables. Could you elaborate?
Hi Jon;
Here's a simplified script that demonstrates the problem I see:
-------- BEGIN SQL SCRIPT --------
SET NOCOUNT ON
-- *********** Original version ************
CREATE TABLE dbo.TimeTable_1(
TimeID int NOT NULL PRIMARY KEY, -- A guess on the PK
TheDate datetime NOT NULL,
-- Dependent columns, but defined independently:
-- erroneous data can be inserted...
TheDay varchar(20),
TheMonth varchar(20),
TheYear char(4)
)
GO
CREATE PROC dbo.UpdateTimeTable_1
AS
UPDATE dbo.TimeTable_1
SET TheDay = DATENAME(weekday,TheDate),
TheMonth = DATENAME(month,TheDate),
TheYear = YEAR(TheDate)
GO
INSERT INTO dbo.TimeTable_1(TimeID, TheDate)
SELECT 1, getdate() UNION ALL
SELECT 2, getdate()
GO
-- Check data
SELECT TheDate, TheDay, TheMonth, TheYear FROM TimeTable_1
-- Need an extra call to the sproc to get data corrected...
EXEC dbo.UpdateTimeTable_1
-- Check data
SELECT TheDate, TheDay, TheMonth, TheYear FROM TimeTable_1
-- A rogue update: nothing is preventing this...
UPDATE TimeTable_1
SET TheMonth = 'June' WHERE TimeID = 1
-- A more blatant example
UPDATE TimeTable_1
SET TheMonth = 'FooBar' WHERE TimeID = 2
GO
-- Check data
SELECT TheDate, TheDay, TheMonth, TheYear FROM TimeTable_1
GO
-- Clean up test box
DROP PROC UpdateTimeTable_1
DROP TABLE TimeTable_1
GO
-- *********** Now, try with computed columns ************
CREATE TABLE dbo.TimeTable_2(
TimeID int NOT NULL PRIMARY KEY,
TheDate datetime NOT NULL,
-- Computed columns: no incorrect data allowed
TheDay AS DATENAME(weekday,TheDate),
TheMonth AS DATENAME(month,TheDate),
TheYear AS YEAR(TheDate)
)
GO
INSERT INTO dbo.TimeTable_2(TimeID, TheDate)
SELECT 1, getdate() UNION ALL
SELECT 2, getdate()
GO
-- Check data
SELECT TheDate, TheDay, TheMonth, TheYear FROM TimeTable_2
-- Try rogue update with TimeTable_2: DBMS prevents the bad data...
UPDATE TimeTable_2
SET TheMonth = 'FooBar' WHERE TimeID = 2
GO
SELECT TheDate, TheDay, TheMonth, TheYear FROM TimeTable_2
GO
-- Clean up test box
DROP TABLE TimeTable_2
GO
-------- END SQL SCRIPT --------
How do you prevent the possibility of users introducing data errors in the TimeTable_1 sample from the above script?
TroyK
September 8, 2003 at 1:05 pm
I think I understand where you are going...
The only way anything can be written to the production TimeTable is via the stored procedure UpdateTimeTable. No applications or users have write permissions on this table; only execute permission is granted on the stored procedure. The stored procedure gets called from a SQL job that is scheduled to run daily, and it is never referenced anywhere in the web application.
So, the short of it is that no one except the DBA or myself has rights on the SQL Server to run this stored procedure. There is really no way for any 'bad' data to get into the table unless the DBA or I were to intentionally put it in there.
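In rough terms the permissions look something like this (the role names are made up for illustration):
-- No direct write access to the table for application logins:
DENY INSERT, UPDATE, DELETE ON dbo.TimeTable TO WebAppRole
-- Only execute rights on the maintenance procedure:
GRANT EXECUTE ON dbo.UpdateTimeTable TO TimeTableAdminRole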
September 8, 2003 at 3:44 pm
quote:
I think I understand where you are going...The only way anything can be written to the production TimeTable is via the stored procedure UpdateTimeTable. No applications or users have write permissions on this table; only execute permission is granted on the stored procedure. The stored procedure gets called from a SQL job that is scheduled to run daily, and it is never referenced anywhere in the web application.
So, the short of it is that no one except the DBA or myself has rights on the SQL Server to run this stored procedure. There is really no way for any 'bad' data to get into the table unless the DBA or I were to intentionally put it in there.
But why define a table that even has the possibility of bad data in the first place?
Would you define two tables that have a FK-to-PK relationship, but not explicitly declare the relationship, hoping that all applications programmed against them correctly enforce the data integrity externally?
These are the kinds of problems that the Relational Model was invented to solve.
Do you have other exceptions like this in your system? Are they all documented? What happens when you or the DBA leave the company?
Here's an article by Joe Celko that goes into more detail:
http://www.intelligententerprise.com/030531/609celko1_1.shtml?/database
If that link won't work, try googling for "Celko" and "Logic of Failure".
TroyK
September 9, 2003 at 6:14 am
Troy,
I'm not sure we're going to be able to come to an agreement on what is wrong with the table and stored procedure (I don't see an issue)...
Your point that bad data can get introduced to the table is true of any table, and the possibility is extremely remote. As to your relationship (primary key/foreign key) question: the TimeTable has an identity column used as the primary key. Any tables that reference the TimeTable with foreign keys do so using that identity column. The relationships are explicit; foreign key constraints exist throughout the database, including on the TimeTable.
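The foreign keys are declared like any other, for example (the referencing table name is just for illustration; the TimeID column name follows your sample script):
ALTER TABLE dbo.UsageData
ADD CONSTRAINT FK_UsageData_TimeTable
    FOREIGN KEY (TimeID) REFERENCES dbo.TimeTable (TimeID)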
I read the article you pointed me to, but I don't see how it applies. Referential integrity is enforced in my DB, not in the application.
Does anyone else care to chime in to help me clarify what Troy is trying to get at?
September 9, 2003 at 9:34 am
I'd be interested to hear other points of view on this as well.
Some thoughts:
- I'm not opposed to using an indexed view or computed column; it's often a good way to maintain extra data without the hassle of triggers (see the sketch after this list).
- I agree with Jon that since all access is gated through the stored proc, and given that the proc is documented, I don't see his approach as being dangerous or hard to maintain. After all, this table represents a good-sized chunk of what he is doing, not a minor piece that might be overlooked.
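For instance, an indexed view along these lines would materialize the derived parts without triggers (names are illustrative; only deterministic expressions such as DATEPART and YEAR can be indexed, which is why the language-dependent DATENAME is left out):
CREATE VIEW dbo.TimeTableParts
WITH SCHEMABINDING
AS
SELECT TimeID,
       TheDate,
       DATEPART(quarter, TheDate) AS TheQuarter,
       DATEPART(month, TheDate)   AS MonthNumber,
       DATEPART(day, TheDate)     AS DayOfMonth,
       YEAR(TheDate)              AS TheYear
FROM dbo.TimeTable
GO
-- The unique clustered index materializes the view and keeps it in sync:
CREATE UNIQUE CLUSTERED INDEX IX_TimeTableParts ON dbo.TimeTableParts (TimeID)
GO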
Andy