Need Strategy for Historic Data Changes!

  • We have a web site database (registered users etc) and a reporting database which must be kept up to date with the

    web site db. Trouble is, registered user data changes over time so somebody called "Miss Jane Smith" in February could be "Miss Jane Doe" in April (she got married in March!).

    This means a report with February data, produced retrospectively in July,

    incorrectly shows her name as "Doe", which it wasn't - it was "Smith" in Feb, not "Doe" until March.

    Help! Does anyone have a strategy for solving this problem?

  • How about a log table that keep track of changes to data that is important?

    -Roy

  • And in my experience, you'll probably need more than just the log table, if you are doing reports (since logs can be inefficient in joins for reports). You'll probably need to build into your DB design model a way for tracking point-in-time changes for key info that must be maintained at that time. You would pick the changes that were required to track and make history tables that track that attribute. The live app has one setting and uses that, but the report joins the live setting to the history tables (which are optimized by only including the pertinent data) to see what the value was in history (may be the same, may not be). But then that means that the app or something has to be aware to populate those history tables.

  • jeff.mason (6/21/2010)


    And in my experience, you'll probably need more than just the log table, if you are doing reports (since logs can be inefficient in joins for reports). You'll probably need to build into your DB design model a way for tracking point-in-time changes for key info that must be maintained at that time. You would pick the changes that were required to track and make history tables that track that attribute. The live app has one setting and uses that, but the report joins the live setting to the history tables (which are optimized by only including the pertinent data) to see what the value was in history (may be the same, may not be). But then that means that the app or something has to be aware to populate those history tables.

    I think you're on the right track. Add a history table with start/end dates. The current "live" data has a null end date. Use a trigger on the user table to update the null end date with the current time, and then insert a new live record, when any of the fields in interest change. Kinda like:

    if UPDATE(first_name) begin

    update h

    set enddate = getdate()

    from history h

    JOIN inserted i

    ON i.pk = h.pk

    insert into h (columns, StartDate)

    select (columns, GetDate())

    from inserted

    end -- if UPDATE(first_name) begin

    You know, it's not too often that normalized data actually causes a problem.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • The best method for tracking changes over time is to build an OLAP database from your live db. You can design it to track the desired changes over time and build your reporting accordingly. However, this is no small task. Modeling and implementing an OLAP db is a specialized area of its own within the db community. Certainly not an unattainable skill set for anyone working regularly with SQL Server, but not a day or 2 project.

    Spend a few minutes googling SQL Server OLAP db, SQL Server Cube, SQL Server Business Intelligence, SQL Server Data Warehouse.

    How ever you decide to go good luck.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply