What is the meaning actually for DWH

  • Hello..

    Im working on a data facility almost since 8 years on a BI and DWH design … and there was always a question on my head about designing the DWH

    when we talk about storing data .. are we talking about every part , column , row , cell of data or are we talking about a history based on indicator or KPI…

    for example if we are talking about a Dwh for a labour force survey, we are talking about unemployment rate which is critical indicator …

    in the DWH I care about the employment status and nationality and gender for example and I care about fully completed surveys only…

    but some times some requests come to me like , give us a phone number of all participants.. or whats the participants with (delete flag =1)

    those informations I dont care what Im talking about the indicator.. so whats the correct design of the DWH … can any one help me

  • The premise of a Data Warehouse is to store historical information, so that you can go back in time and see what the value of a metric and/or KPI was at a given point in the past. If your Data Warehouse does not account for that, you probably have a good reason to rethink the design.

    In my opinion, your Data Warehouse should also account for enquiries like "what was the phone number of the participant in this survey a year ago", even if not directly used in your metrics and/or KPIs. This doesn't necessarily mean that you store every single attribute of every single entity, but it does mean that you should think about (and work with the business to understand if) including some of these peripheral attributes will be useful.

  • My personal feeling is that phone numbers for people that have taken a survey are actually no one's business and should not be included in a DW.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hypothetically speaking, Jeff 🙂

  • Heh... I saw no indication of your comment being hypothetical.  I'm concerned that others won't, either. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I tend to subscribe to the idea of it's better to have the data and not need it than need the data and not have it.  There's obviously some limits to what you might reasonably want and storage considerations but in general if something was important initially it might be important again.

  • A data warehouse should be built to answer business questions in an accurate and timely manner.

    If they are driving their campaigns from the data warehouse, i.e. using it for something other than its main purpose, that's fine but then you need to apply all your good CRM practices to your data warehouse - things like maintaining contact permissions, etc.

     

  • Yes, The phone number could be a PII.  So it should be a choice for the information provider.

    DBASupport

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply