Under the gun, need help quick!

  • Hi gang,

    I am hoping one of the kind uber-brains here will be able to point me in the right direction.

    I am building a DW that primarily needs to count rows. At this current juncture, there is no data to be summarized. However, I am having a tough time bending the examples to do what I need.

    Here is what I have:

    We have customer data and item data. I have the customer data broken-up and flattened into a Dim and a Fact. In the FactCustomer table is a customer_id that points to the id of the DimCustomer table. That all works fine. My problem is with the Item data. A customer has a proprietary code, which is one of 7 characters. An item is also coded with the same value but an item can have one or more of the 7 codes.

    What I am interested in is the number of items that match a person by the code. So if I have two customers, one with code "A" and the other "B", and I have three items, one with "A", one with "AB" and one with "ABCD", I want to report the number of items. The columns would be the codes themselves, and the rows would be the customers, and the data in the middle is the quantity of items. Something like this:

    A B C D

    Cust 1 3 - - -

    Cust 2 - 2 - -

    It seems to me that I may not need a Fact table at all, or perhaps the Item table is both Fact and Dimension... I am just not sure how to make this happen. The OLTP data is WAY over-normalized and too granular, so I have a bunch of SSIS packages to aggregate and flatten it into OLAP-structured tables. Point being, I can modify the organzation of the data however it will be most beneficial.

    Can someone walk me through this, or at the very least, point me in the right direction? As always, this was due yesterday, and people are already asking for changes.

    Thanks a MILLION for any input!

    Cheers,

    Chris

  • Anyone have any input on this?

  • I'd break the item codes out into single-character many-to-many. Basically, they become a dimension.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hey there!

    Thanks for the response.

    Do you mean each code becomes a row in its own dimension table (like DimCode)?

    What about if I left them as a string, and created a calculated measure? Would that get me where I need to go?

    Cheers,

    Chris

  • I'd declare item dimention as fact mesure with "Fact" relationship to itself.

  • I don't understand that at all... can you clarify, please?

  • or perhaps the Item table is both Fact and Dimension

    I'd implemented that way.

  • Ok gang...

    I have made some progress, but it has left me with an error that I cannot figure out. It doesn't make sense to me how this is even happening, which probably indicates that I am missing something completely.

    I have two hierarchical axes I want as dimensions, with a count of rows as the report data at their intersection. The row axes are CODE and SIZE, and the columns axes are TYPE (general) and TYPE (specific). The CODE and SIZE come from the DimCustomer table which has about 240K rows, and whose PK is the customer id from the OLTP system. This table links to FactCustomer, which is 31.5M rows of item/customer intersection info (one row per customer per item), and it has a surrogate key with a FK pointing to the PK of DimCustomer. The 3rd player is DimFactItem (about 60K rows) which is one row per item/CODE combination (the same CODE as for a customer, except an item can have multiple codes whereas a customer can have only one). The FactCustomer table has a FK to the PK of DimFactItem.

    Here is what I want (the numbers in the body of the report are just examples):

    TYPE(gen)/TYPE(spec) -> Coat Jacket...

    C1 C2 J1 J2 J3...

    CODE/SIZE

    vvvvvvvvv

    X

    abc 30 44 ...

    def 35 51 ...

    ghi 33 39 ...

    Y

    abc ... ...

    def ... ...

    ghi ... ...

    Z

    abc

    def

    ghi

    What I can get right now is everything but the SIZE level info. So for CODE X above, there are a total of 30+44+35+51+33+39 = 232 Coats. I end up seeing this (without expanding the TYPE specific):

    TYPE(gen)/TYPE(spec) -> Coat Jacket...

    CODE/SIZE

    vvvvvvvvv

    X

    abc 232

    def 232

    ghi 232

    and if I expand TYPE specific:

    TYPE(gen)/TYPE(spec) -> Coat Jacket...

    C1 C2 J1 J2 J3...

    CODE/SIZE

    vvvvvvvvv

    X

    abc 98 134 ...

    def 98 134 ...

    ghi 98 134 ...

    The DimCustomer table has the CODE and SIZE info, and it is joining to the DimFactItem table through the FactCustomer table. If the filtering works at the CODE level, why is it not working at the SIZE level?

    One thing that may be important, but I am not sure, is that when I look at the count of DimCustomer as the report data (middle of the report), and I just look at rows of CODE/SIZE, I see the customers broken-down as expected:

    X

    abc 3200

    def 1893

    ghi 8874

    So I know the logic is working and is sound.

    Am I missing a relationship or something?

    Also, is there any way to see the "query" or MDX that generates the report data? I'd love to be able to right-click on a cell and see the "formula" that is generating it. I think I could figure out this problem if I knew what was going on under the hood, so to speak.

    Thanks for ANY insight!

    Chris

  • Just to close this thread:

    I solved this issue by re-architecting the data, making the Fact table contain only FKs to Item and Customer and then using a Many-to-Many relationship between Items and Customers THROUGH the Fact table.

    Thanks!

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply