How to identify changed values from a history table?

  • We have several tables with system_versioning set to on.

    There is a requirement now to produce a solution (function / stored procedure) to return a list of changed columns of a given table within a range of days. Below is an example of one of the tables - customer_sites and a bit of data:

    /**** to drop off the existed table *******
    ALTER TABLE customer_sites
    SET (SYSTEM_VERSIONING = OFF)


    DROP TABLE IF EXISTS customer_sites_history;
    GO
    DROP TABLE IF EXISTS customer_sites;
    GO
    *********************************************/CREATE TABLE customer_sites(
         customer_site_id INT IDENTITY(1,1) NOT NULL,
    customer_id   INT,
    site_name        VARCHAR(100),
    site_description VARCHAR(1000),
    site_address     NVARCHAR(1000),
    created_by       INT,  --- app user id
    created_on       DATETIME2 DEFAULT SYSUTCDATETIME(),
    time_start   DATETIME2(0) GENERATED ALWAYS AS ROW START HIDDEN NOT NULL,
    time_end         DATETIME2(0) GENERATED ALWAYS AS ROW END HIDDEN NOT NULL,
    ip_address   VARCHAR(48) NULL,
    PERIOD FOR SYSTEM_TIME (time_start, time_end));
    GO


    ALTER TABLE customer_sites
    ADD CONSTRAINT customer_sites_PK PRIMARY KEY (customer_site_id);
    GO


    ALTER TABLE customer_sites
    SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.customer_sites_history));


    INSERT INTO customer_sites (customer_id,site_name,site_description,site_address,created_by,ip_address)
    VALUES (1,'Site 1','Storage','Great London Area',1,'10.10.1.1'),
       (1,'Site 2','Manufacturing','Shetland',1,'10.10.1.1');
    GO


    DELETE cs
    FROM customer_sites cs 
    WHERE cs.site_name = 'Site 1';


    UPDATE cs
    SET   cs.site_address = 'London'
    FROM customer_sites cs
    WHERE cs.site_name = 'Site 2';

    As you can see, the table has two records for two sites then one of the sites was removed and another - amended.

    The result I need to achieve is four records with the following columns:

    customer_site_id

    time_start

    time_end

    action (for the two added rows it should be "Created" then one "Modified" and one "Deleted")

    field (name of the column - empty when "Created" or "Deleted" and  "site_address" when the column was amended)

    oldValue (empty when "Created" or "Deleted" and "Shetland" when amended)

    newValue (empty when "Created" or "Deleted" and "London" when amended)

    Any suggestion will be truly appreciated.

  • Okay well first off you should get into the habit of defining your keys as PK_FieldName rather than FieldName_PK as the former allows you to seemlessly use them in conjunction with Database Diagrams and the latter does not. This goes for all of the various prefix as follows:

    • PK_PrimaryKeyName
    • FK_ForeignKeyName_ForeignTableName
    • IX_IndexFieldName
    • UQ_UniqueFieldName
    • DF_DefaultFieldName
    • CK_CheckFieldName

    That aside as for your query issue, well that will kind of depend on what is being created within your dbo.customer_sites_history table as I believe that would hold a major key as to how to design the query to get the end results that you are asking for. So to make it easy on me, I would require you to supply me, using your examples above, what both tables (dbo.customer_sites  and  dbo.customer_sites_history) contain after the INSERT, then after the DELETE, and then after the UPDATE. Once I have this information, which will give me a solid idea of what all the data might look like, I can then teach/tutor you how to translate that into the query results you are looking for.

    Note I do run a free teaching/tutoring Discord Server if you want on going instruction or we can always do it hear. However, I teach/tutor students to fish (aka build their own solutions from scratch); I do not like giving them a fish (aka give them the full solution). That is because giving someone a fish today, means you have to give them one each day for the rest of their lives whereas teaching them to fish today means they can catch their own fish for the rest of their life.  Sure, one might need to help them learn more advanced fishing techniques but again you teach them the advanced technique or better yet how to easily find those more advanced techniques from online documentation and they then have it for life. So keep that in mind, when asking questions anywhere. If you can learn the how to behind it all then later on, you can repeat or expound upon that without needing to ask a semi-redundant question.

    Look forward to seeing those example table data views

    • This reply was modified 1 year, 1 month ago by  Dennis Jensen. Reason: Fix Typo
    • This reply was modified 1 year, 1 month ago by  Dennis Jensen. Reason: fix typo
  • First I want to say I've not used system versioned temporal tables before, but from reading about them, the below logic makes sense to me.

    Step 1 though is to read the docs on querying system versioned temporal tables to make sure it will do what I need it to do and so I know how to handle reading it and getting the results - https://learn.microsoft.com/en-us/sql/relational-databases/tables/creating-a-system-versioned-temporal-table?view=sql-server-ver16

    Now, once I'd read that, I would understand more of how to query temporal tables. Basically, I will want to query for all the changes then take that data and "see" what changed and where.

    Now, what makes your query a bit of a challenge (and thus fun) is the action and the fields after that as you would need to build logic around all of them. BUT they are not too hard to do.

    Action - If time_start = minimum time_start for the site id, then created. if time_start = max time_start for the site id, then that is deleted. else modified. I'm not sure the most efficient way to do this, but when I do something like this I tend to build a subquery or a CTE up that will capture all the data in the table along with a min and max for the datetime column I care about. I am a fan of CTE's as they are easy to read, understand, and debug and I find subqueries can get messy quickly, but use whichever you are more comfortable with.

    Field use the same logic as for action and populate as needed.

    Based on your requirements for old value and new value, I'd use the same logic, but personally I think that "old value" for a deleted item shouldn't be empty. It had a value before, why would you want to see "empty" now?  Just my opinion mind you.

    The method I would use is to get the old value using lag partitioned by the site id and ordered by time_end or time_start. new value is just value at that time period.

    Now, as I said, I haven't used these before so I apologize if my ignorance is showing, but I hope that the above is a bit helpful.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply