August 18, 2014 at 1:08 pm
HI, all
I'm working on setup where we have period like 201401 ( jan 2014), etc... And it'ts INT, however it doesnt help if you need to some datediff operation.
I can go to ref table and get relative first day for 2 periods and do datediff on them, but I would like to make something more beautiful, i.e. preset PeriodID in some INT value = let say number of month since 1900 and do operation directly, is it possible.
Declare @pera = 201310;
@PerB = 201402
select @PerB - @pera --to be 4 month??
Thanks
Mario
August 18, 2014 at 1:15 pm
That's why your dates should be stored as date data types (date, time, datetime, smaldatetime, datetime2). Otherwise, you'll get on complex calculations like this:
Declare @pera int = 201310,
@PerB int = 201402
select DATEDIFF( MM, CONVERT( date, CONVERT( char(6), @pera) + '01', 112),
CONVERT( date, CONVERT( char(6), @PerB) + '01', 112))
August 18, 2014 at 1:25 pm
Thanks, Luis
There is table that has corresponding date as date for each periodID, but I think your solution is more lighter and faster.
Taken.
M
August 18, 2014 at 2:06 pm
First let me strongly agree that you should not store data in forms you can't work with. This isn't a date, it's not really an integer, it's a year number and a month number that have been combined into some human-readable compound form. It violates the rules of database normalization. But if you're stuck with it, there are many ways to do the conversion. You may want to create a scalar user-defined function to do the conversion, so you don't have to repeat the code in every query.
The simplest forms use conversion to/from strings, but involve more CPU overhead. Using nothing but integer arithmetic is the most efficient method, but the expressions are complicated and will be harder to debug and maintain.
Declare @pera INT = 201310, @PerB INT = 201402;
-- Compute integer month numbers since 1/1/1900, then subtract
SELECT ((@PerB / 100 - 1900) * 12 + @PerB % 100 - 1)
- ((@PerA / 100 - 1900) * 12 + @pera % 100 - 1)
-- Convert to dates, use DATEDIFF
-- Using SQL 2012 functions
SELECT DATEDIFF(MONTH,
DATEFROMPARTS(@PerA / 100, @pera % 100, 1),
DATEFROMPARTS(@PerB / 100, @PerB % 100, 1))
SELECT DATEDIFF(MONTH,
CAST(CONCAT(@PerA, '01') AS DATE),
CAST(CONCAT(@PerB, '01') AS DATE))
-- Using earlier SQL versions
SELECT DATEDIFF(MONTH,
DATEADD(MONTH, @pera % 100 - 1, DATEADD(YEAR, @pera / 100 - 1900, 0)),
DATEADD(MONTH, @PerB % 100 - 1, DATEADD(YEAR, @PerB / 100 - 1900, 0)))
SELECT DATEDIFF(MONTH,
CAST(CAST(@PerA AS CHAR(6)) + '01' AS DATE),
CAST(CAST(@PerB AS CHAR(6)) + '01' AS DATE))
-- The last one can be shortened with implicit conversions
SELECT DATEDIFF(MONTH, LTRIM(@PerA) + '01', LTRIM(@PerB) + '01')
August 19, 2014 at 7:32 am
There really is no reason to have surrogate keys for dates; not in a fully normalized and not in a star-schema database. A Date datatype consumes only 3 bytes, and it's representation never changes over time. Even when something like a 201401 period is required, this can be coded using type Date with value 2014/01/01. You can then query the table by date and perform datediff operations without joining to a date lookup table.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
August 19, 2014 at 7:42 am
In our env we have all by Period= calendar month, so there is still good biz reason to have single column as PeriodID. Surer y enough in it's ref table we have start/end day for each PeriodID.
How you can refer to single period ID in other way, I think we can also by First day of each month? But I think it will be too confusing for single users.
Thanks
Mario
August 19, 2014 at 11:01 am
CELKO (8/19/2014)
I'm working on setup where we have period like 201401 ( jan 2014), etc... And it is INTEGER , however it doesn't help if you need to some datediff operation.
Since SQL is a database language, we prefer to do look ups and not calculations. They can be optimized while temporal math messes up optimization. A useful idiom is a report period calendar that everyone uses so there is no way to get disagreements in the DML.
The report period table gives a name to a range of dates that is common to the entire enterprise.
CREATE TABLE Something_Report_Periods
(something_report_name CHAR(10) NOT NULL PRIMARY KEY
CHECK (something_report_name LIKE <pattern>),
something_report_start_date DATE NOT NULL,
something_report_end_date DATE NOT NULL,
CONSTRAINT date_ordering
CHECK (something_report_start_date <= something_report_end_date),
etc);
These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantages are that it will sort with the ISO-8601 data format required by Standard SQL and it is language independent. The pattern for validation is '[12][0-9][0-9][0-9]-00-00' and '[12][0-9][0-9][0-9]-[01][0-9]-00'
Thank you for passing by Joe.
Your assertions imply an absolute truth and you can't guarantee that. A couple of days ago, I improved the performance of a query by using date calculations instead of a look up table. The query using the look up table ran for over 10 minutes without returning results while the calculation completed in 3 seconds (including display time).
I'm not saying that you're wrong, I'm just saying that you're not completely right. So please, count me out of your "we" statement because I (and many others) prefer to use the best tool for each case.
August 19, 2014 at 12:56 pm
I was just trying to group a table by weeks. My manager told me to use a table like the following (I've changed the names and I have no control on this structure so please don't ask me to change it):
CREATE TABLE [dbo].[Weeks](
[PeriodYear] [int] NOT NULL,
[PeriodWeek] [int] NOT NULL,
[StartDate] [datetime] NOT NULL,
[EndDate] [datetime] NOT NULL
)
CREATE CLUSTERED INDEX [IX_Weeks] ON [dbo].[Weeks]
(
[StartDate] ASC,
[EndDate] ASC
)
GO
The table has 17 years with just 52 or 53 rows per year.
The problem was that a bad execution plan was generated caused by the join on this table. As you can see, the rows are short, the range is short as well, the key is ordered but it still won't be the best option for that particular case.
Sure, calendar tables are important and sometimes vital. I just wanted to point out that the best option will depend on each scenario.
August 19, 2014 at 2:00 pm
CELKO (8/19/2014)
I'm working on setup where we have period like 201401 ( jan 2014), etc... And it is INTEGER , however it doesn't help if you need to some datediff operation.
Since SQL is a database language, we prefer to do look ups and not calculations. They can be optimized while temporal math messes up optimization. A useful idiom is a report period calendar that everyone uses so there is no way to get disagreements in the DML.
The report period table gives a name to a range of dates that is common to the entire enterprise.
CREATE TABLE Something_Report_Periods
(something_report_name CHAR(10) NOT NULL PRIMARY KEY
CHECK (something_report_name LIKE <pattern>),
something_report_start_date DATE NOT NULL,
something_report_end_date DATE NOT NULL,
CONSTRAINT date_ordering
CHECK (something_report_start_date <= something_report_end_date),
etc);
These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantages are that it will sort with the ISO-8601 data format required by Standard SQL and it is language independent. The pattern for validation is '[12][0-9][0-9][0-9]-00-00' and '[12][0-9][0-9][0-9]-[01][0-9]-00'
Mr. Celko,
Unfortunately YYYY-MM-00 and YYYY-00-00 are not dates. Also this is MS SQL Server not Oracle MySQL. And also, aren't you the one who insists EVERYTHING should be ANSI STANDARD?
August 19, 2014 at 5:58 pm
CELKO (8/19/2014)
5. Keep the rows short by using computed columns that are not materialized until needed for somethings. Overloading rows in a Calendar is a common problem.
Careful now... "It Depends" on what is being calculated and how it will be used. Non "materialized" (i.e. non "PERSISTED") computed columns cannot be indexed and, if someone makes the mistake of joining to such a column, I believe that the whole table will need to be read so that the entire column can be calculated before the join takes place. I believe it would perform as poorly as a non-SARGable date calculation on a column in a WHERE clause.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply