Changing dates to Integers

Question

Post reply

Changing dates to Integers

SW_Lindsay

SSCommitted

Points: 1983
More actions
June 4, 2010 at 11:49 am

#225350

Is there any value to change date and time fields into integers.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

divyanth SSCarpal Tunnel Points: 4303 More actions · Answer 1

what do you mean by a value?? Are you looking for something similar to this SQL statement below in ssis.

Select Convert(Varchar,Getdate(),109)

Lowell SSC Guru Points: 323518 More actions · Answer 2

depends on what "kind" of integer you want:

select convert(int,getdate())

--40332 days since 01-01-1990, SQL's internal start date

select YEAR(getdate()) * 10000

+ MONTH(getdate()) * 100

+ DAY(getdate())

--an integer "20100604"

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

SW_Lindsay SSCommitted Points: 1983 More actions · Answer 3

advantage. Is it easier to use? Take up less space? Create better indexes?

so instead of storing '06/04/2010 14:06' as a date time or small date time or date and time.

converting the 06/04/2010 to 20100604 and putting it into an int column and 14:06 to 1406 and placing it in a int column.

SW_Lindsay SSCommitted Points: 1983 More actions · Answer 4

lowell the second one 20100604 as an integer over just a datetime column

Lowell SSC Guru Points: 323518 More actions · Answer 5

there is absolutely no advantage to storing dates as anything except dates.

never store datetime as any other datetype except datetime.

what is the date of the integer 20105404?

internally, SQL stores it as numbers anyway...how it is displayed is up to your regional settings.

if you save it as anything except datetime, you'll have to convert back and forth, and it will not be easy to use the builtin tools like DATEADD DATEDIFF and the BETWEEN operator without converting stuff.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

jvanderberg SSCertifiable Points: 6762 More actions · Answer 6

You really don't need to turn you dates into integer "date ids". SQL Server is highly optimized to use dates as indexes. Internally, they're stored as integers anyways, so the comparison is very quick. Any conversion you add is just extra manipulation that will have to occur.

--J

SW_Lindsay SSCommitted Points: 1983 More actions · Answer 7

our bi team insists on converting all dates to integers and they say they are easier to use. I as the DBA thought what you just wrote but I wondered if I was missing something.

Our bi is based on Kimball Method.

jvanderberg SSCertifiable Points: 6762 More actions · Answer 8

The Kimball method is great (and it's probably what most of us use), but it's general purpose and system independant. SQL Server has been optimized to the point where this is not necessary. Trust me, there's nothing easy about using date ids. The fact that you're on a forum trying to figure out how to do it supports this.

Personally, I used date ids in a data warehouse I set up some time ago, and I regret it. Not only is it a hassle on the ETL side, it's also a hassle to use. I'm sure it can't possibly be more efficient to have to join just to get a date. Also, I do sometime query the warehouse directly. When I do, I have to use conversion functions on my dates before making comparisons (and pretty much every query against a data warehouse has a date in the WHERE clause). If I had it to do over again, I'd use smalldatetime.

As we said, dates are stored as integers any ways. Furthermore, they're stored as sequential numbers; so tomorrow will be stored as a larger number than today. Internally, SQL Server can perform comparison operations and BETWEEN without converting the data in any way.

I think you should make the case that they're not necessary. Do a little research and I think you'll find a lot of information out there that supports what we are saying. You and your team will be much happier if you just use the datetime or smalldatetime fields.

--J

Jeff Moden SSC Guru Points: 1004561 More actions · Answer 9

It's real simple.... a thousand professional opinions aren't worth a hoot. A single well written test will tell all. Have the BI "experts" prove their hypothesis using a small million row example. Only the code knows the truth.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

SW_Lindsay SSCommitted Points: 1983 More actions · Answer 10

That was my first reaction - show me the benchmark.

Koen Verbeeck SSC Guru Points: 259155 More actions · Answer 11

jvanderberg (6/4/2010)
The Kimball method is great (and it's probably what most of us use), but it's general purpose and system independant. SQL Server has been optimized to the point where this is not necessary. Trust me, there's nothing easy about using date ids. The fact that you're on a forum trying to figure out how to do it supports this.
Personally, I used date ids in a data warehouse I set up some time ago, and I regret it. Not only is it a hassle on the ETL side, it's also a hassle to use. I'm sure it can't possibly be more efficient to have to join just to get a date. Also, I do sometime query the warehouse directly. When I do, I have to use conversion functions on my dates before making comparisons (and pretty much every query against a data warehouse has a date in the WHERE clause). If I had it to do over again, I'd use smalldatetime.
As we said, dates are stored as integers any ways. Furthermore, they're stored as sequential numbers; so tomorrow will be stored as a larger number than today. Internally, SQL Server can perform comparison operations and BETWEEN without converting the data in any way.
I think you should make the case that they're not necessary. Do a little research and I think you'll find a lot of information out there that supports what we are saying. You and your team will be much happier if you just use the datetime or smalldatetime fields.
--J

What if your date is actually a surrogate key (for example, a surrogate key in a fact table that links with the time dimension). Do you still use the datetime datatype? Or just a random integer? (Or a meaningful integer, which is the topic of this thread)

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Patrick Seegers SSC Veteran Points: 281 More actions · Answer 12

[font="Courier New"]Hello,

I have found storing the date as an integer to be useful as an integer surrogate key to a date dimension table with attributes you can not easily get from tsql, for example business events tied to specific dates. (ex. a HolidayFlag or something descriptive text about the day) (Kimball Method as mentioned by others above)

I concede that the date storage is as efficient as an integer, but propose that converting to an integer (Year(getDate())*100000+month(getDate()*100+day(getdate()) is not any more difficult than truncating the date to reflect a consistent date key surrogate value (dateadd(dd,0, datediff(dd,0, getDate())).

The consistency of always using integer surrogate keys for dimension tables seems beneficial...

One practice is the use of -1 as a key for an 'Unknown' date...

There is additional value in using a smart surrogate key for a date dimension as documented by Kimball et al.

Are there any opposing views on this usage?

Thank you

[/font]

Jeff Moden SSC Guru Points: 1004561 More actions · Answer 13

Patrick Seegers (6/9/2010)
[font="Courier New"]Are there any opposing views on this usage?
Thank you
[/font]

Sure... read the rest of the posts on this thread.

Hmmmm.... "unknown date".... sounds like the perfect use for NULL unless it's an EndDate in which case the "unknown date" should be 9999-12-31.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Cris E Ten Centuries Points: 1255 More actions · Answer 14

There is additional value in using a smart surrogate key for a date dimension as documented by Kimball et al.

Are there any opposing views on this usage?

Patrick, on the internet there are always opposing views, but I'm with you on the integer surrogate key practice. All of the benefits that the native functions bestow are pretty much irrelevant when you're talking about a key field, and in a date dimension you'll also have date fields where those functions can be used.

Cris E

[font="Arial"]Are you lost daddy? I asked tenderly.
Shut up he explained.[/font]
- Ring Lardner

Changing dates to Integers

Cookies on SQLServerCentral