May 24, 2012 at 10:38 pm
I would like to get advice from some battle tested SSAS developers on how to best manage dates in SSAS. I am lucky enough to be involved in the cube build out for my company, and I dont want to create issues down the road.
I need to manage timelines for accounts. Each account will move through 5 stages, and at the end of each stage I need to record the transaction date. Start to Finish I will have 6 dates (including the starting one) and I need to measure the times book ending each stage to determine home much time an account lived in each stage.
Seems easy coming from a relational angle, but in the dimensional world I have not yet seen the light at the end of the tunnel.
Here are some of my thoughts no the topic:
- Concerned that date reference dimensions for each of the 6 dates will cause bloat and slow cube processing
- Unclear about how to track the time between each stage, don't want to force Null values into my fact table
I could just use some advice or pitfalls to avoid
May 25, 2012 at 9:32 am
We have a date dimension that covers past and future (I didn't build it so don't know offhand how far we go to either extreme, but far enough for our business).
We maintain an "unknown member" in every dimension, always -1 key, but for the Date dim, we also have -2, before any "known date" (as far as the dim table knows) and -3, farther in the future than "known". This makes it very easy to ferret out Bad Date Data.
Each date used in a dimension or fact table becomes a date key which corresponds to the correct date value in the Day dimension. Our date keys are YYYYMMDD integers. Hooking to the Day dimension, we can get any piece easily so slice n dice by day or month or quarter or year is a simple matter.
We then create Aging Buckets, which are arbitrary per use. Many times they are 30 day Buckets, so I could have invoices out for 1-30 days, 31-60, 61-90, however far out you wish to go. I generally materialize the bucket name on the DW table for use in the cube but sometimes we leaving the bucketing for the cube only.
Hope this helps!
May 25, 2012 at 9:37 am
also, we create Lag fields. Integers normally expressed as number of days.
Time from inception to event, time from any event to any other event. We also move things through stages and it is important to us how long a stage takes for Normal Situations and Worst Case situations. Large Lags create opportunities for business improvement and will be very obvious.
It sounds to me like Lag would be very useful in your situation in order to assess average time from this to that, and to also have Start to Finish Lag.
May 25, 2012 at 10:19 am
Very insightfull, will have to stew on it for a while. Thanks!
May 26, 2012 at 7:27 pm
I came up with a strategy to manage dates in SSAS, and I almost have the kinks worked out. I added a second fact table that has the following columns:
DateId - represents the dates I want to manage
AccountId - represents the account the date is tied to
ActionId - represents what kind of date it is, ie. ship date, order date, cancel date etc.
AccountCount - represents the column that will be aggregated
From the DW I unioned this data together, and made it available in my DSV. Next I added a FK relationship to my Date Dimension on the DateId column making my date hierarchies available to every date in my cube without creating a ton of reference dimensions.
I also added an Action Dimension to provide color on each ActionId.
This approach is working out really well, and has made counting actioned dates extremely easy.
BUT where I am having issues, is now my Date dimension is a Conformed Dimension with FK relationships to two fact tables. When I try to process my cube, if the same dates are not present in both fact tables, then the cube fails due to missing keys. It's very strange, all dates are in my Date Dimension,
Have you ever encountered issues like this with Conformed Dimensions?
May 27, 2012 at 12:37 am
I feel pretty dumb, I finally figured out why the cube would not process. The date dimension was being limited in the DSV. Moral of this, when all else fails just do start from the begining and walk through each step.
May 29, 2012 at 7:01 am
Lots of good ideas here already, but let me add:
I don't use the YYYYMMDD system for my keys, but integers instead. It's very Ralph Kimball, and it's disputed because it loses human readability. I like it, however, because it's easy to subtract one date from the other to populate columns between the dates. I can also use a small integer. The users shouldn't see the keys in any case, and if I need to see the date, I just join to the calendar table. The join is easy enough that I don't see the justification to loss the advantages of the smaller field and easier calculation.
Don't worry about having multiple dates joining to the same date table. I have about 10 dates on one of my main cubes. But note that if using SSAS 2005 you can create a single date dimension and use it over and over. This eases the maintenance. It can be a little confusing if someone is using lots of dates in the user interface, but I haven't found this to be a real problem.
I use "0" for my unknown and "-1" when the date is N/A. This also fits better into using small integers for dates and avoids any null values. Some of your dates sound as if at first they will be N/A.
We also use aging buckets. The earlier buckets can be broken out down to the day, but the later buckets can only be broken out to 10 days and the highest can't be broken out at all. Otherwise your table would keep growing.
A good date dimension is essentially, so take care with it. Good luck!
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply