Calculated field within Dimension

  • Hi everyone,

    I would like to get your input on the best practice/method for calculating a dimension attribute that is derived from a fact record. Something along the lines of "last sale amount" attribute in the customer dimension.

    Options that comes to mind...

    Option1:

    populate client dim, populate sales fact and then as a 3rd seperate process, calculate the attr value by querying the fact table by customer key

    Option2:

    During the fact etl process, calculate the value per customer id and update the client dim.

    Options3:

    Leave it for the reporting tool

    Any other options? Thanks!



    What's this "backup strategy" everyone is on about?

  • T_VR (11/22/2011)


    Hi everyone,

    I would like to get your input on the best practice/method for calculating a dimension attribute that is derived from a fact record. Something along the lines of "last sale amount" attribute in the customer dimension.

    Options that comes to mind...

    Option1:

    populate client dim, populate sales fact and then as a 3rd seperate process, calculate the attr value by querying the fact table by customer key

    Option2:

    During the fact etl process, calculate the value per customer id and update the client dim.

    Options3:

    Leave it for the reporting tool

    Any other options? Thanks!

    Personally I do not like the idea of including such calculated attribute on a DIM table - DIM table should include the attributes used to dice and slice the FACT data.

    If "last amount sale" is a granular piece of data then it should be stored on FACT table, if "last amount sale" is nothing but the summarization of a invoice then such information should be located on invoice describing table like fact_invoice.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • T_VR (11/22/2011)


    Hi everyone,

    I would like to get your input on the best practice/method for calculating a dimension attribute that is derived from a fact record. Something along the lines of "last sale amount" attribute in the customer dimension.

    Options that comes to mind...

    Option1:

    populate client dim, populate sales fact and then as a 3rd seperate process, calculate the attr value by querying the fact table by customer key

    Option2:

    During the fact etl process, calculate the value per customer id and update the client dim.

    Options3:

    Leave it for the reporting tool

    Any other options? Thanks!

    In general I would prefer Opt#3. I would also not consider it as Dimensional Attribute.

  • Thanks for the feedback,

    Maybe my exampe wasn't the best one...

    If we change the dim attribute to "last purchased date"

    1) would you include that in your dimension?

    2) if so, which option would you go to do that?



    What's this "backup strategy" everyone is on about?

  • T_VR (11/22/2011)


    Thanks for the feedback,

    Maybe my exampe wasn't the best one...

    If we change the dim attribute to "last purchased date"

    1) would you include that in your dimension?

    2) if so, which option would you go to do that?

    Do you really think it adds any value to Customer Dimension? Can it be a property of Customer?

    It's a fact and need to be calculated at runtime.

    If you add it to Dimension, you need to load it in each ETL (don't know the frequency). How your SCD will identify this change?

  • Thanks Dev,

    The main advantage for me was that this is something that is used often within queries and this solution could have made the report not hit the fact table to get the value. So basically shift the processing from the report to the ETL. However, it seems this might not be the best way of going about it 🙂



    What's this "backup strategy" everyone is on about?

  • T_VR (11/22/2011)


    Thanks Dev,

    The main advantage for me was that this is something that is used often within queries and this solution could have made the report not hit the fact table to get the value. So basically shift the processing from the report to the ETL. However, it seems this might not be the best way of going about it 🙂

    You can still do it if your SCD is simple (overwrite). I would still not prefer it.

  • T_VR (11/22/2011)


    The main advantage for me was that this is something that is used often within queries and this solution could have made the report not hit the fact table to get the value. So basically shift the processing from the report to the ETL. However, it seems this might not be the best way of going about it 🙂

    I see... specification suddendly changed from "last purchased amount" to "last purchased date" - no problem 😀

    "last purchased date" is certainly a granular attribute describing a purchase. Hard to believe a query against data warehouse would ever not hit a fact table, after all there are the "facts" business people likes so much to see 🙂

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (11/22/2011)

    I see... specification suddendly changed from "last purchased amount" to "last purchased date" - no problem 😀

    Hey, if business can do it so can I 😀

    Thanks for all the input! much appreciated!



    What's this "backup strategy" everyone is on about?

  • The main advantage for me was that this is something that is used often within queries and this solution could have made the report not hit the fact table to get the value. So basically shift the processing from the report to the ETL. However, it seems this might not be the best way of going about it 🙂

    Personally from what you're saying I think this probably is the best way of going about it, can there are potentially good uses for it in a dimension, although it is unusual, at least in my experience.

  • I guess YMMV, depending on your report vs ETL performance, but in general I would have expected it to be more expensive to do it in the ETL, quite cheap to do it in the report, and also the data is properly factual.

    Having said that, I have done it in the ETL before (at someone elses insistance) but more often pushed it back successfully.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply