Dimension design problem

  • Let's say you have a dimension that rolls up as a nice hierarchy, like Company -> Department -> Office. I know you can model this all in one table (star) or as three separate tables (snowflake).

    Now, let's say you have two different fact tables. The first one, FactExpense, is at the granularity of the Office - it tracks expenses, and each expense is tied back to a particular office. The second one, FactBudget, is actually at the granularity level of the Department - budgets are done only down to the department level and office never comes into it.

    My thought is that in this scenario, you couldn't roll all levels of the hierarchy into one table, because that would mean your dimension is intersecting with at least one of the fact tables at the wrong grain - it would intersect FactBudget at the office level even though that fact table is the grain of the department. So you would need at least two different dimensions, like DimCompanyDepartment and DimOffice, snowflaked off each other so they could intersect the two fact tables at the right grain.

    The competing thought on my design team is that you could have all three things in one dimension table, DimCompanyDepartmentOffice, and for the FactBudget table you would just arbitrarily pick one of the Offices (say, always the first office for a department) as the intersection point for the fact and dimension. So, if DepartmentA had two offices, Office1 and Office2, and you're recording that DepartmentA had a 2007 budget of 400K, you would make an entry in the BudgetFact table with the amount and the dimensional key from DimCompanyDepartmentOffice that represents Office1.

    That solution, while it might technically work, seems wrong to me because the fact and dimensions are at different grains. Am I correct in looking at it this way?

  • Granularity has more to do with the fact tables than it does the dimension tables.

    The point behind designing dimensions is to make them conformed so that you can use it with whatever business process interacts with it. Dimensions exist outside the grain of the facts that use them.

    With your example Company->Department->Office, you would have one dimension record for each existing combination of the available attribute (ie ACME, Inc. -> Marketing -> Tulsa, OK and ACME, Inc. -> Sales -> Tulsa, OK). The assignment of dimension records to the fact record is not arbitrary at all. It is determined by which dimension was "responsible" for that fact record being present.

    Another example is a Date dimension. Fiscal Year -> Fiscal Quarter -> Fiscal Month fits the same construct, but you wouldn't want those in separate dimensions. And any fact table (whether an atomic "beep" level fact, or an aggregate monthly fact) would link to the date dimension based on its grain.

    Another approach would be a hierarchical dimension, which gets even more complicated. This would not be the easiest forum to get to that level of detail. Kimball's Warehouse Toolkit book covers it (pages 162-168).

    Hope this helps!

    Dan

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply