Best practice data warehouse

  • I am building or rebuilding a datawarehouse for our company. We have multiple datasources being ETLed into this datawarehouse. My first thought was to prefix the tables to identify the source, ie... (tERP, tWhse, tHR)

    Is it best practice to do this or to use a different schema like (ERP.Table, Whse.Table, HR.Table) Or again is there another best practice that works well.

  • Usually the company set up a standard for the table names. So it is up to you how to  name your tables.  The most important thing to me is to identify which one is dimension table  and which one is fact table.  So my table name ended with tbl_XXXXX_dim (dimension) and tbl_xxxx_fact (fact table.

  • Usually, in a warehouse, you would want to combine the data from your sources into a single place for reporting.

    For instance, all of your accounts may come from your finance system and your sales system and end up in the AccountDim dimension in your warehouse.  Of course the source is often important information, so you may need to include a data source identifier field in the dimension table.

     

  • Thank you for the advice.

  • We are building a data warehouse where I work, and we are using schemas to identify the source systems in our staging environment.  Once the data actually moves to the warehouse, we won't use the source system schema.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply