Star Schema draft

  • I'm trying to figure out if this star schema works for counting the quantity of products in the warehouse for each day.  This is my schema......Are the dimension and fact tables alright?

     

     

    2

  • Here are some initial thoughts, without knowing what type of business you are modeling or what the requirements are:

    Time_Dimension

    I would expect more columns here than just the key.  I don't know if your business has accounting or sales calendars that are different from the standard calendar, but even for the standard calendar you would typically want attributes such as week, month, quarter, etc to describe the date in more detail.  You'll also want to determine how to store special non-dates here such as an "unknown date", "not applicable date", or maybe "no end date" if something has date ranges.  Here's some good info to get you started:

    https://www.kimballgroup.com/2004/02/design-tip-51-latest-thinking-on-time-dimension-tables/

    Product_Dimension

    Does the products sold by this company have product lines, classifications, or some other type of categorization?  I'd expect some sort of hierarchy represented here.

    Order_Dimension

    I'd expect a separate Order Key like other dimensions have.

    Warehouse_Dimension

    I'd expect a separate Warehouse Key like other dimensions have.  I think you would want some description and address information here, and other attributes about the warehouse.

    Order_Fact

    Something about the Order_Fact table feels awkward to me.  What is the grain of transactions in this table?

    https://www.kimballgroup.com/2008/11/fact-tables/

    Shipment_Fact

    Does your business need to know what warehouse a shipment originated from?

    Return_Fact

    Does your business need to know what customer the return is from?

    Warehouse_Fact

    Is this table a reflection of what the current balance of each product is or is it supposed to be a reflection of incoming and outgoing transactions of products for a warehouse?

  • Ok..The order table is to record the number of products order by each customer so I deem as a transaction.... Also for the warehouse table, I would like to find the remaining quantity of each product in the respective warehouse at the end of each day....It's because I feel that sometimes the number of quantity orders by the customer might not tally with the quantity of product ships to the customer...so to calculate the remaining quantity I need the shipping quantity from each order Id to do the calculation..

  • I guess what confused me about Order_Fact was the Order_Status, so I wasn't sure if there would be multiple rows in here for the same order item at various points in time.  What you're describing sounds more like it's an Accumulating Snapshot Fact table:

    https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/accumulating-snapshot-fact-table/

    You may want more than 1 different date associated to an order, e.g. Order_Date_Key, Cancel_Date_Key, Complete_Date_Key, etc

    Your Warehouse_Fact table sounds more like a Periodic Snapshot Fact table:

    https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/periodic-snapshot-fact-table/

    I hope this information helps, the Kimball group literally wrote the book on designing and reporting from star schema dimensional models.

  • If I would like the warehouse_fact to monitor the quantity of products left after each day, how should I organize it? I think of including the warehouse_id into the shipment_fact too. How would it calculate the remaining quantity of products?

  • I would imagine whatever your daily process is to update the Warehouse_Fact would have to be able to find what products where shipped from a given warehouse and what products were returned to a given warehouse.  If you have your dimension key in both Shipment_Fact and Return_Fact then this could be easily obtained, otherwise you would have to get the shipment and return data from your OLTP system again to maintain Warehouse_Fact.

  • Thank you for your suggestion. I have another point which is better between using the product dimension table for the quantity on hand and create another inventory dimension for storing the quantity on hand? For most of the examples I have seen, most of them use inventory table for storing the quantity, with only the minority uses the product table for storing the quantity on hand along with the product information...

  • You typically wouldn't want to put a value like quantity on hand in a dimension table.  Dimension tables tend to have many columns but few rows.  Think of it this way, if you have existing fact rows pointing to a product row, and the quantity on hand changes, you now have 2 different rows in product dimension that are the same except for quantity.

    For example, if a product is "red widgets", and someone wants to do order analysis on how many red widgets are ordered per month over the last 2 years, then you'd have many different red widgets entries in your dimension table that you need to group together, adding more work for the database engine to do in the aggregation.  If there is only one red widgets row, then the database engine seeks to that row in the dimension and can join to the fact table for all items that match that one key.

    A true difference would be if the product dimension is what's called a "Slowly Changing Dimension" where you want to track differences in the product's attributes over time.  An example is if the supplier you get the red widgets from changes, you may want to have a different row in the product dimension for that to do analysis on something like if the new supplier's red widgets are more often returned from customers than the old supplier.

  • Ok. I will need to create another inventory fact table. Inventory fact table will consist of date key, inventory key (degenerate dimension key), product key, warehouse key and quantity on hand. Then, I will use the inventory key will relate to warehouse fact table where the shipped goods, returned goods and quantity on hand will be visible on the table. Otherwise, I will not be able to relate the quantity on hand to the warehouse fact table.. am I right?

    • This reply was modified 4 years, 10 months ago by  TestKing2017.
  • Call your Time_Dimension Date_Dimension instead.  Kimball called this time in his first edition book, but revised it to Date by the second edition.  Eventually you may have a need for a real (clock) time dimension.

    Your Order fact table looks more like an Order_Detail fact table.  The level of granularity is a strict immutable rule in OLAP designs.  You would probably also need an Order fact table to store metrics that have only to do with the overall order.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply