November 6, 2012 at 9:59 am
In the application data that I wish to transform into a data warehouse, people and businesses are basically treated in identical ways. In the OLTP database the core data about both types of entities are stored in the same table, with the "LastName" column doing double duty as the "BusinessName" column as well. There is an indicator column to distinguish individuals from business, but there really is very little else to distinguish them.
In the data warehouse, should businesses and people be commingled in the same dimension? If so, should the "LastName" column still do double duty, or should there also be a separate "BusinessName" column? Filters on the entities' names would be complicated by such separation, which is one reason that a single column is being used for both types of names in the OLTP database.
If the business data and people data are in separate dimensions, wouldn't that complicate the fact table structure? In such a case there would be two separate dimensions instead of one to reference for the "who did it" data for transactions.
Any help or insight is welcome.
Thanks in advance,
Geoff
November 6, 2012 at 12:19 pm
Hello Geoff
If you are confident that true distinction between people and businesses won't occur then keeping them together presents no problem.
However, you still may want your fact table to represent both people and business so you would still require two foreign key values in your fact table to the base dimension so create two views based off this dimension.
By using views, you could then incorporate Business Name verses Last Name as well. Your end users will thank you for it. It's one less thing for them to deal with.
Greg
November 6, 2012 at 12:37 pm
Thanks for the reply, Greg. Based on the use cases for the facts in the database I'm dealing with, I don't think any fact tables will reference more than one entity of the dimNames table at one time. So I think I will retain an "IsBusiness" flag column but I will leave the business name as a LastName value, just as it is in the OLTP. If I build a view exclusively for businesses, I will just put an alias of "BusinessName" onto the LastName column.
Speaking of the "IsBusiness" column, would you recommend that this column be a bit data type or should it be a varchar for literal "Yes" and "No" text? I ask because I know a data warehouse is supposed to be about making the data accessible to non-techies and to that end it should avoid "codes" for values. But I also know that a bit data type is much more efficient for storage and for querying than is a varchar data type.
November 6, 2012 at 12:48 pm
Hi Geoff,
I would use a literal "Yes" or "No". (Or possibly a "Y" vs. "N" )
Greg
November 21, 2012 at 11:39 am
As a general rule, I go with "don't store different things in the same container". Your case is a little confusing, as I don't know how the data is reported on. Are they reporting on entities (which may well be either people or businesses) or are they reporting on people and businesses?
How you model the data should reflect how the DW users intend to report on the data.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply