October 28, 2010 at 2:33 pm
Hi all,
I've a question about date dimensions related to constructing hierarchies for events that may span more than one year.
I'd like to include religious holidays in my date dimension as this could prove extremely useful for the marketing department where I work (I'm considering one field for each major religion). Hierarchies based on year are relatively straightforward (e.g. Year-Quarter-Month etc.) but what is the best hierarchical approach to handle an event which may span more than one year (e.g. a religious holiday that starts in December and ends in January)?
Academic holidays aren't a problem as I can just add in a field for academic year that my end users will easily understand, but religious holidays seem to be a lot trickier for me to get my head around. I could possibly add separate "religious year" fields, but apart from instinctively feeling that it's a bit like overkill I can see this seriously confusing my end users even more than it could confuse me!!!
Can anyone offer any advice or point me in the right direction?
Cheers,
RF
p.s. As a fan of ancient history I've often considered following Roman Polytheism. Not for any spiritual reason, but simply down to the fact that they had so many religious festivals! Trouble is, I can't see my manager being too happy if I insist work respect my religious rights and give me paid time off in March to celebrate Bacchanalia... 😀
_____________________________________________________________
MAXIM 106:
"To know things well, we must know the details; and as they are almost infinite, our knowledge is always superficial and imperfect."
Francois De La Rochefoucauld (1613-1680)
October 29, 2010 at 7:50 am
RainbowFfolly (10/28/2010)
I've a question about date dimensions related to constructing hierarchies for events that may span more than one year.I'd like to include religious holidays in my date dimension as this could prove extremely useful for the marketing department where I work (I'm considering one field for each major religion).
Have you considered a snow-flake design where SDIM_RELIGIOUS_HOLIDAY table hanging from DIM_DATE defines such events?
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.October 29, 2010 at 9:48 am
Hi Pablo,
No I hadn't even thought of it. I think I may have been unwittingly brainwashed by Kimball into believing that if Ralph hates snowflakes then so should I (and as he is my leader I should hate them even more fervently than he does). Never mind 'denormalisation', I need the same kind of 'deconditioning' that a member of a bizarre and wacky religious cult needs to return to a normal life without the long, flowing, turquoise robes and free-love they've become accustomed to... :w00t:
PaulB-TheOneAndOnly (10/29/2010)
Have you considered a snow-flake design where SDIM_RELIGIOUS_HOLIDAY table hanging from DIM_DATE defines such events?
Would this affect my end-users 'experience' of the data warehouse? I've tried to keep it as simple as possible and would be concerned that it could complicate things for them. Fortunately, my DW is still at design stage - and I'm admittedly still learning - so this approach would have very little overhead or impact on its development.
To my eternal shame, my original DW ended up as a glorified staging area thanks to our marketing department and other IT staff hijacking it quite early on. They discovered that they could get a report out of it in seconds which would otherwise take hours to retrieve from our unwieldly Dataflex-based OLTP, and by not looking to the future effectively killed the proverbial goose that laid the golden egg. They understandably see it as a big success (even if on a personal level I definitely don't) so I've now got a lot of leeway in the organisation to build a proper data warehouse that I can be proud of. I can also use the original DW as an efficient staging area to provide its data.
Hmmmm... all I need now is a catchier name than "Data Warehouse - The Next Generation" to sell the proper DW to staff when I want them to start using it. 😀
Cheers and thanks for the reply,
RF
_____________________________________________________________
MAXIM 106:
"To know things well, we must know the details; and as they are almost infinite, our knowledge is always superficial and imperfect."
Francois De La Rochefoucauld (1613-1680)
October 29, 2010 at 10:35 am
Events may have more dimension attributes than just time and they might have multiple date ranges involved. For example, a book signing might be in store 1 on Jan 3-5, store 2 on Jan 7-8.
You need to look closely at all attributes of events. It has been a while since I did a design for events, but I believe it actually became another fact table to be able to tie together the various dimension attributes.
I'm not trying to suggest a particular design here, but trying to point out that events may be more complex to model than they seem at first. You might be better starting with an ER model and then tranform that into the data warehouse.
October 29, 2010 at 2:21 pm
RainbowFfolly (10/29/2010)
No I hadn't even thought of it. I think I may have been unwittingly brainwashed by Kimball into believing that if Ralph hates snowflakes then so should I (and as he is my leader I should hate them even more fervently than he does).
RF - I know what you mean but, always remember what Deng Xiaoping once said: "no matter whether it's a black cat or a white cat, any cat that catches a mouse is a good cat."
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.October 29, 2010 at 3:47 pm
PaulB-TheOneAndOnly (10/29/2010)
RainbowFfolly (10/29/2010)
No I hadn't even thought of it. I think I may have been unwittingly brainwashed by Kimball into believing that if Ralph hates snowflakes then so should I (and as he is my leader I should hate them even more fervently than he does).RF - I know what you mean but, always remember what Deng Xiaoping once said: "no matter whether it's a black cat or a white cat, any cat that catches a mouse is a good cat."
Thanks Pablo, and kudos for the very apt quote that gets your point across nice and succinctly. 😎
_____________________________________________________________
MAXIM 106:
"To know things well, we must know the details; and as they are almost infinite, our knowledge is always superficial and imperfect."
Francois De La Rochefoucauld (1613-1680)
October 30, 2010 at 7:43 am
Michael Valentine Jones (10/29/2010)
Events may have more dimension attributes than just time and they might have multiple date ranges involved. For example, a book signing might be in store 1 on Jan 3-5, store 2 on Jan 7-8.You need to look closely at all attributes of events. It has been a while since I did a design for events, but I believe it actually became another fact table to be able to tie together the various dimension attributes.
I'm not trying to suggest a particular design here, but trying to point out that events may be more complex to model than they seem at first. You might be better starting with an ER model and then tranform that into the data warehouse.
Hi Michael,
I've been pretty much coming to the same conclusion that it would be more complex than I first imagined. One issue was that it's value would be seriously reduced if the end-users could only easily analyse sales during an event, and not in the more important period before it (e.g. Sales on Christmas Day are minimal and not of serious value to marketing, whereas the sales in the two months before are incredibly valuable). To do this I think I'd need to provide them with relative days/weeks/months against the event, and I feel the date dimension would be the wrong place to store that information. There could be workarounds to holding events in the date dimension, but the DW is supposed to be intuitive and quick, and I'm loathe to let either of those two qualities go - and that's before I even begin to say how I'd feel knowing that I've bodged and kludged something ugly and wrong.
Nonetheless, it'll be a great learning experience for me as I've never had to deal with events like this before. Hopefully it'll be extremely reusable (I'd like to also include major sporting events) and would provide an ideal model if I'm ever faced with solving an event related issue in the future.
I'll think on it later this week, and if I come up with something I'm happy with I'll post it here for criticism and advice etc. so I can fine tune it (or remodel if it's a monstrosity!). As it's data that is not specific to a single company (and I'll probably put it together in my own time for fun and experience) I'll upload the data and DDL for the structure when it's finished in case anyone else finds it could be of use to them.
Cheers,
Stuart
_____________________________________________________________
MAXIM 106:
"To know things well, we must know the details; and as they are almost infinite, our knowledge is always superficial and imperfect."
Francois De La Rochefoucauld (1613-1680)
October 30, 2010 at 7:48 am
Correct design decisions aside, I'm sorely tempted to include these events in the date dimension for the time being. The reason for this is that by storing this data, I can send staff an email each morning showing how many days there are before religious festivals and major sporting events - raising the profile and "selling" the DW is always a positive thing, and if I do this right (and make it fun and/or interesting) then I can increase the awareness of my DW throughout the whole company at almost no cost to myself or the IT Department.
A lateral benefit of this is that by adding the production of this email to the end of my overnight ETL process, I'll have created a company-wide early warning system to tell me when the ETL has failed - I'm fairly certain that as soon as I arrive in work (and probably before I even sit at my desk) someone will ask me why they didn't receive an email letting them know how long they've got to order flowers for Valentine's Day or order tickets for the 2012 Olympics in London...
So, what I'm thinking is that I can implement this quickly and keep the actual data hidden from end-users of the DW until I create the correct structure to handle these events and allow them to slice 'n' dice using it. And when I do create the correct structure I'll have all the data instantly available to populate it.
It might not be the most obvious usage of data within a DW, but as far as I'm concerned "data is data is data" and if any application of that data has value - and causes no issues with the performance of the DW - then I'll use it without a second thought... 😉
_____________________________________________________________
MAXIM 106:
"To know things well, we must know the details; and as they are almost infinite, our knowledge is always superficial and imperfect."
Francois De La Rochefoucauld (1613-1680)
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply