Data Warehouse Design Suggestion

  • Hi all, I am about to build my first DW for our company. I tried build one last year, but I was pulled into another project, and never had chance complete the DW project. So here I go again... 🙁 Hopefully, I get some good tips and advice.

    Anyway, this time, I decide to start with small easy DW for our sales team.

    Business Requirement: Our sales manager wants a dashboard/report to show following:

    1. Sales by Region: show where are customers are from

    2. Sales by Sales Rep: show how each reps are performing and track up-sale records.

    3. Sales by Shipment: show where product is shipped and what shipment type was used(we ship to other warehouse or customer direct)

    Our sales manager is only concerned about how many products his team sold, and his teams performance. Nothing else. I asked him about other possible analysis scenarios, but he feels that any complex sales analysis should be performed by business analyst, so based on his requirement this is my DW design:

    DimSalesPerson

    - Sales Person Key

    - Sales Person Name

    - Start Date

    - End Date

    - Active

    - Years Employed

    DimCustomer

    - Customer Key

    - City

    - State

    - Zip

    - Country

    DimShipment

    - Shipment Key

    - Shipment Type

    - Shipment Carrier

    DimTime

    - Time Key

    - Date

    - Year

    - Day

    - Quarter

    - Holiday Flag

    FactSales

    - Customer Key

    - OrderDate Key

    - ShipmentDate Key

    - Shipment Key

    - SalesPerson Key

    - Sales Type (Return/Sales)

    - Invoice Number

    - Order Number

    - Total Qty Sold

    - Total Amount

    Questions I have:

    1. On my fact table, I created a column to distinguish sales type between return and sales. When transaction is a return, sales type will be flagged with return and qty and amount would be negative number. Am I doing this right?

    2. How would you track up sales record? When customer places an order, we could have following scenarios:

    a. Sale rep takes an order then release order for shipment

    b. Sale rep A takes an order then place it on hold, customers calls again, Sales rep B reopens add more item qty, then release the order.

    c. Sale rep A takes an order then place it on hold, customers calls again, Sales rep B reopens, but this time decrease order qty, then release the order.

    I already created trigger in our OLTP to track any upsales/down sale. My current idea is using Sales Type field on fact table. Is there any better way to track up-sale?

    3. Our sale manager wants to look at sales by shipped region. I have Shipment Dimension but it only has the attributes tracks shipment type (ship to customer vs ship to warehouse) and carrier (UPS, FedEx). To track shipped region, do I add more attributes in current shipment dimension or just create a field on factSales table? My current idea is creating Oder Detail Dimension table with following attributes:

    DimOrderDetail

    - Order Detail Key

    - Invoice Number

    - Order Number

    - Customer City

    - Customer State

    - Customer Country

    - Shipment City

    - Shipment State

    - Shipment Zip

    - Shipment Type

    - Shipment Carrier

    Thank you for reading my post!

  • You've really got a lot here. You're not asking for someone to simply solve a problem, but to validate your design. I don't want to get into a back and forth over this, but here are some items you may want to look at:

    1. You have more fact table key fields than dimensions. That doesn't seem right.

    2. I would move the location information to a separate table, so that it's just about location. You can join that to the customer table on zip code. SSAS will be able to handle that. The current configuration will only show you information for where customers are. You may want to see where things aren't as well.

    3. It's possible to update rows. You just need a way to bring the changes to your warehouse. This known as an accumulating snapshot table.

    It's good that you are starting small. It will help you get the basics down. I highly recommend you read Ralph Kimball's The Data Warehouse Toolkit. I found that invaluable when I started and still re-read it regularly.

    Good luck!

  • Thanks for your kind suggestion.

    After going over your comments, and I made some changes to my DW:

    DimSalesPerson

    - Sales Person Key

    - Sales Person Name

    - Start Date

    - End Date

    - Active

    - Years Employed

    DimCustomer

    - Customer Key

    - Location Key

    DimShipmentType

    - Shipment Key

    - Shipment Type

    - Shipment Carrier

    DimLocation

    - Location Key

    - City

    - State

    - Zip

    - Country

    DimTime

    - Time Key

    - Date

    - Year

    - Day

    - Quarter

    - Holiday Flag

    DimOrderDetail

    - OrderDetail Key

    - Invoice Number

    - Order Number

    FactSales

    - Customer Key

    - OrderDate Key (DimTime plays multiple role)

    - ShipmentDate Key (DimTime plays multiple role)

    - ShipmentLocation key (DimLocation act as Shipment Location Dim)

    - ShipmentType Key

    - SalesPerson Key

    - Sales Type (Return/Sales/upsale)

    - OrderDetailKey

    - Total Qty Sold

    - Total Amount

    I was trying to avoid any relatioships between dimensions to simplify the DW, but when you mentioned SSAS it all made sense 🙂

  • Two comments.

    First one. Not totaly sure about the snowflake DimCustomer-DimLocation structure... it looks like a plain 1-1 relationship which I usually try to avoid at all costs.

    If the idea is to keep a history of particular locations it can be implemented as a Type 2 SCD on DimCustomer.

    Second one. Original specifications suggest that a single Invoice may include several items; if that is the case and there is one row per Invoice on FactSales table some columns like "Total Qty Sold" do not make a lot of sense and others like "Taxes", "S&H", etc are missing. On the other hand, if the idea is to have one row on FactInvoice per each Item then I do not see where are the totals of the Invoice kept. DimOrderDetail table doesn't help me to understand either.

    I may be missing something or missunderstanding something - would you clarify please?

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Thank you SSCrazy for your input!

    1. I think you are right on Type 2 SCD for Customer Location.

    2. While I am only required to capture sales data for one product sku, our typical sales transaction could have multiple invoices and shipment dates.

    Here is typical Sales transaction scenario we have:

    a. Customer initially orders 5 products.

    b. New PO gets created by sales rep A.

    c-1. Customer calls back, orders 3 more products, but this time with sales rep B, and sales rep B close/release the order.

    c-2. Although rep B closed the order, system still puts initial sale rep A as sales person in invoice. So trigger is implemented to capture up-sale data.

    d. From total of 8 ordered qty, 3 goes to our warehouse for storage, 5 goes directly to customer.

    e. 2 invoices get generated by system. 1 for warehouse storage, 1 for direct shipment to customer.

    f. We have 1-to-1 relationship on invoice and shipment. However, since we have 1-3 day delay on invoice to actual shipment, invoice date and ship date are usually different.

    One of my manager's requirement is to track up-sale record for individual sales rep. So I am setting granularity for fact table as 'per transaction action by sales rep'.

    'Sales Type' column in fact table is created to distinguish transaction type between sales/return/up-sale.

    So this is my latest design:

    DimSalesPerson

    - Sales Person Key

    - Sales Person Name

    - Start Date

    - End Date

    - Active

    - Years Employed

    DimCustomer

    - Customer Key

    - Customer City

    - Customer State

    - Customer Zip

    - Customer Country

    DimShipmentType

    - Shipment Key

    - Shipment Type

    - Shipment Carrier

    DimLocation

    - Location Key

    - City

    - State

    - Zip

    - Country

    DimTime

    - Time Key

    - Date

    - Year

    - Day

    - Quarter

    - Holiday Flag

    FactSales

    - Customer Key

    - OrderDate Key (DimTime plays multiple role)

    - ShipmentDate Key (DimTime plays multiple role)

    - ShipmentLocation key (DimLocation act as Shipment Location Dim)

    - ShipmentType Key

    - SalesPerson Key

    - Sales Type (Return/Sales/upsale)

    - PO Number

    - Invoice Number

    - Qty Sold

    - Amount

    Once again, thanks for reading my post and your kind advice!

  • sdhan79 (1/15/2012)


    Hi all, I am about to build my first DW for our company. I tried build one last year, but I was pulled into another project, and never had chance complete the DW project. So here I go again... 🙁 Hopefully, I get some good tips and advice.

    Anyway, this time, I decide to start with small easy DW for our sales team.

    Business Requirement: Our sales manager wants a dashboard/report to show following:

    1. Sales by Region: show where are customers are from

    2. Sales by Sales Rep: show how each reps are performing and track up-sale records.

    3. Sales by Shipment: show where product is shipped and what shipment type was used(we ship to other warehouse or customer direct)

    Our sales manager is only concerned about how many products his team sold, and his teams performance. Nothing else. I asked him about other possible analysis scenarios, but he feels that any complex sales analysis should be performed by business analyst, so based on his requirement this is my DW design:

    I guess my statement would be that those are actually some pretty simple reports to come up with. Why is it that anyone thinks they need a DW for this?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes, we could easily utilize reporting service to create reports and simple dashboard. But before we deploy something more complex (Sales DW), I wanted start it off with small and practice SSIS/SSAS :).

  • I agree that it is a good idea to start with a single, simple datamart.

    On the other hand, do not constraint your design to what your initial user spells out, be sure your design does everything your initial user wants but also be sure you do not miss any opportunity of creating a more robust system.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • First one. Not totaly sure about the snowflake DimCustomer-DimLocation structure... it looks like a plain 1-1 relationship which I usually try to avoid at all costs.

    In a sense, it is a 1-1, as a customer, from what I gather, can't have more than one location. Keeping them combined, however, would not show all the possible locations, only locations where there were customers. Normally I am very strict on using a star schema. I use this particular combination in my own DW for point of sales and it works very well. The location table also connects directly to the fact table in mine, but that doesn't seem to apply in this case.

  • My previous employer was a hard sell on dw so I whipped up "serve yourself reporting" so I could get the confusing-to-the-users terms out of there and they could focus on "what's in it for me?" Simple stuff, for sure, but that made it quick so I could deliver before they could forget about the project.

    Sometimes, you gotta do proof of concept before getting the go-ahead to undertake a larger initiative.

  • Couple of points.

    This dimension seems to have no customer information except location. Wouldn't you at least want a name?

    DimCustomer

    - Customer Key

    - Customer City

    - Customer State

    - Customer Zip

    - Customer Country

    One of your requirments was:

    "Sales by Shipment: show where product is shipped and what shipment type was used(we ship to other warehouse or customer direct)"

    However you have no product dimension, so how would you know where a product is shipped? You should consider additional, more detailed Sales and Shipment fact tables that show information at the order item level to be able to track sales and shipments by product.

  • Thanks for your kind feedback!

    Good point on product dimension.

    Before I post my another updated design of DW, I would like to ask some question.

    My sales manager wants to capture following data for sale rep performance evaluation:

    - Sales person for initiating order.

    - Sales person for releasing order.

    - Sales person for up selling product during order.

    Example scenario:

    Customer calls Sales Rep A. Sale Rep A starts order in system, customer orders 5 products. However, customer wants to hold order for 7 days (we allow 7 day holding on all orders). Few days later, customer calls again, but this time Sales Rep B answers the phone and he release the order, and he sold 3 more products to existing order.

    For this situation, how would you create sales fact table? Do you create another fact table for upsale transaction?

    Thank you very much reading my post and feedback!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply