Adding new SCD 2 attribute to existing Dimension AND history required

  • I've just had a lively discussion with a dev here on the best/proper way to approach adding a new SCD2 attribute to an existing dimension table. The problem I'm having is for the requirement to install history for this attribute.

    Easier to explain with an example

    Dimension: Loan

    New Attribute: IsLoanPerforming ('Yes' , 'No', 'Unknown')

    Dimension has about 30 attributes, all tracked as SCD2 to retain full history. Dimension has been in use for about 3 years.

    The new 'IsLoanPerforming' can be derived through a mix of dimension attributes and fact info (Fact EndOfDay values) but has been requested its added as a standard attribute. As it can be derived request has been made to update the historical info for this.

    My usual approach to adding a new scd2 dimension attribute is to only inforce it from the go-live date forwards. Unless the attribute can be wholly derived from other attributes in the same dimension.. The reason? complete pain to rejig the whole dimension and dependant fact tables.. eg if a particular loan x say had a startdate of 2011-01-01 and enddate of 2011-01-30 , but the loan switched from 'Yes' to 'No' for IsLoanPerfoming on 2011-01-15, I would have to take the old record change the enddate to 2011-01-14 and insert a new record taking all the values from the previous just updated one, except changing IsLoanPerforming of course.

    Then I meed to goto every fact table that has a foregin key to this dimension and mess around with the foregin keys for all dates forward of 2011-01-14 to repoint the fact to the newly inserted record rather than the old one. Thats one change done. now multiply out a gizllion fold

    The other developer has requested as a solution to getting around the problem of rejigging the dimension to add history for this new attribute is to move the attribute to the Fact End of Day Value table. that way it much easier to add the history. This is a totally non additive fact and it really doesn't sit well with me to add this.

    Anyone any comments/experience of the correct way to resolve this? Is it right to migrate to Fact table in this manner? I know its easier, but for me it just seems to be a 'quick' fix

  • I'm with you, a SCD of the Type 2 has to be tracked on the DIMension table, that's why we call it Slowly Changing Dimension 🙂

    On my experience SCD2 tracking strategy always include a from/to pair of dates to document the time period each state of the SCD was valid.

    Question is... do business requirements are asking to calculate all previous states of the SCD2 back in time?

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Funnily enough that is how we left it, With the analyst to go back to the biz to see how important history is for this.

    I suspect the answer will be 'of course the history is important' though...

    :w00t:

    I reckon that once a 'framework' is designed for this 'recreating' of history it should be easy enough to apply it the next 'inevitable' time I get a request like this. I get the feeling that biz will push for the quick 2 day fix (adding to fact table) compared to the probably one week + it would take to dev the proper scd2 history in the dimension.

  • nahk.fussuy (7/29/2011)


    Funnily enough that is how we left it, With the analyst to go back to the biz to see how important history is for this.

    I suspect the answer will be 'of course the history is important' though...

    :w00t: we can see the future! we should be making money out of it.

    nahk.fussuy (7/29/2011)


    I reckon that once a 'framework' is designed for this 'recreating' of history it should be easy enough to apply it the next 'inevitable' time I get a request like this. I get the feeling that biz will push for the quick 2 day fix (adding to fact table) compared to the probably one week + it would take to dev the proper scd2 history in the dimension.

    Business shouldn't dictate how to technically implement a solution.

    Moving it to the FACT table is bad dimensional design, tell them they can't feed broken glass to a pig and expect the pig to survive.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I like that metaphor. rofl.

    The biz always errs on the side of caution, that's why its easy to predict what they will say!! 😀

  • Business shouldn't dictate how to technically implement a solution.

    yes, but in a lot of cases (dare I say most?), the business controls the purse strings and therefore has more input on the design decisions (than they should). As the SME's, it is our job to present options with pros and cons...preferably in writing (to cover your ***).

  • This one is a tough one, and w/o more information it isn't possible to say with certainty the best way forward.

    With things as complicated as you say, I would generally prefer the "from this point forward" strategy. Unless you are prepared to reload the dimension and the fact table in the same way you might if the cube were just coming on line for the first time, what you are suggesting would not simply be a lot of work, which alone isn't a showstopper, but it seems to me to have a high percentage probability that not everything is going to be pointed where it should be.

    If this is truly a dimension attribute, sticking it in the fact table simply because it's easier is going to deprive the dw of some of it's analytical value. If it's not, however, or at least not in the same way as the other ones, then it should be in the fact table as a new dimension key.

    In any case, I'm glad I'm not in your shoes. Good luck!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply