December 15, 2009 at 3:53 am
Hi,
I have an Inventory cube which captures the Expiry Date of stock, as well as a completely unrelated Patient Admissions cube which captures the dates when patients get admitted to a hospital. Both these cubes share the same time dimension: DimTime. For the Admissions cube, the Admission Dates are restricted only to dates in the past (I run the ETL packages after 12 midnight), whereas the Inventory cube stores dates that extend into the future (Eg: 40 nos. of Item A expire on 31st December 2020).
Now, technically, I can either create the time dimension on the server (SSAS), or I can manually create one during the ETL process. Looking at both scenarios, I would like to know the following:
1.CREATING TIME DIMENSION ON SERVER (SSAS): The first time I build a time dimension in SSAS, I'm asked for a date range to populate the server time dimension. Is there any automated way to update the date range in this dimension? I know that an obvious solution is to make sure the date range is huge during time dimension creation, but is there any way I can extend the date range from, say, SSIS, whenever there is a record which has an Expiry Date that does not fall under the date range of the time dimension? If not, what's the best way to achieve this automatically?
2.CREATING TIME DIMENSION DURING ETL (SSIS): Let's take for example that there are Patient Admissions for the years 2007, 2008 and 2009, and Inventory Expiry Dates has records for the years 2007, 2008, 2009, 2010, 2020 and 2025. Now, how do I load the Time Dimension? There are stored procedures which populate the Time Dimension based on FromYear and ToYear parameters which I use to load the Time Dimension. In this case, FromYear could be 2007, and ToYear would be 2025. Do I actually have to populate the table with records for all the years between 2007 and 2025, or do I load only years for which there are records?
3.If I create a Time Dimension on server, the dimension key is a datetime field. Is there any significant performance hit if I link fact and dimension table by datetime columns, as against an INT column which stores the date in the YYYYMMDD format?
4.Is it possible to manually add a column to the Server Time Dimension (something like a named calculation in the DSV) which converts the date into a YYYYMMMDD format?
5.Seriously, which is better? Server time dimension or regular Time Dimension populated through ETL?
I'm sure these questions may seem quite elementary for those experienced in DW/BI, but being new and having worked alone with no guidance except for forums such as this, you guys are my only help. Do revert!
December 16, 2009 at 11:12 pm
Someone please help!
March 14, 2010 at 8:58 am
I read your post but a bit confused on what you are asking.
As for date and time dimensions we have one of each used across all datamarts.
The time dimension has 1 row per second so 86,400 records that is static we don't have a process to load or update this. manually created it one time using a script.
The date dimension is similar in that we created it once it extends from 1980 - 2025 that was mostly an arbitrary decision, as long as it covered all dates in the past by enough years.
The keys for both tables are integers for the time its HHMMSS midnight is 0 and the last second before midnight is 115959.
Date table is similar YYYYMMDD so today is 20100314. This also has the benefit of being readable.
In addition to the key field each has a datetime type that stores the date or time for that record then the breakdown of each date part and other fields such as accounting period, fiscal qtr. etc.....
I actually would like to include moon phase in my date dim, might make for some interesting analysis.
Any date or time field is converted to this integer format for storage in the data marts. Any datetime field is split into 2 fields both in the correct integer format. There is no actual link to the date or time dim tables since the converted integer value I know exists in the dim tables so no lookup is required.
One thing we also do and this is for the benefit of analysis is we added fields to the dim table called, Today, Yesterday, Tomorrow, Last month, last 90 days, and a few others.
We run a stored procedure at midnight that updates these fields based on the current date. when this happens the field that has the word tomorrow is Nulled out and the next day now has the word tomorrow, so on and so fourth....
hth - tom
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply