Identify Changed Records

  • I have a database that is refreshed from a backup nightly. I do not have access to the source database, nor the job that creates the backup.

    A few tables are large, ~40M rows. Rather than handle the entire table for ETL, I'd like to work with just the delta and new records. I have a couple questions:

    • If I enable change data capture on the database to which the backup is restored, and CDC is not enabled at the source, will this disable CDC on my database? ( my assumption is yes ).
    • Is there a way (other than snapshots) to cleanly and efficiently assess data changes between backups? Notes:

      • The modified dates in the database are not reliable
      • There is no audit table available
      • CDC is not enabled on the source data, and enabling it is not an option.

     

    Thanks

  • As you have no access to the source database and no way (in code) of identifying rows which have recently been added, modified or deleted, you are in a very difficult position.

    If there's no logic you can use to identify data changes in code, you will have to scan entire tables.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Rough. I had a feeling that was the case.

     

    I may try and pivot to keeping the nightly ETL performance efficient rather than focus on parsing changes.

    Thanks

  • You could keep a second copy of the database around. That will give you a comparison point. You have to sacrifice storage, but disks are pretty cheap.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply