May 29, 2019 at 3:54 pm
I have a database that is refreshed from a backup nightly. I do not have access to the source database, nor the job that creates the backup.
A few tables are large, ~40M rows. Rather than process the entire table during ETL, I'd like to work with just the changed and new records. I have a couple of questions:
Thanks
May 29, 2019 at 4:01 pm
As you have no access to the source database and no way (in code) of identifying rows that have recently been added, modified, or deleted, you are in a very difficult position.
If there's no logic you can use to identify data changes in code, you will have to scan entire tables.
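For what it's worth, even a full scan can be turned into a delta feed by comparing the restored table against your ETL target, so only changed and new rows flow downstream. A rough sketch, assuming a hypothetical dbo.Orders table keyed on OrderID, a Restored database holding last night's restore, and a Staging database holding the ETL target:

-- New rows: keys present in the restored copy but not yet in the target
INSERT INTO Staging.dbo.Orders (OrderID, CustomerID, Amount, ModifiedDate)
SELECT s.OrderID, s.CustomerID, s.Amount, s.ModifiedDate
FROM Restored.dbo.Orders AS s
WHERE NOT EXISTS (SELECT 1 FROM Staging.dbo.Orders AS t WHERE t.OrderID = s.OrderID);

-- Changed rows: same key, at least one differing column value
UPDATE t
SET t.CustomerID   = s.CustomerID,
    t.Amount       = s.Amount,
    t.ModifiedDate = s.ModifiedDate
FROM Staging.dbo.Orders AS t
JOIN Restored.dbo.Orders AS s
    ON s.OrderID = t.OrderID
WHERE EXISTS (SELECT s.CustomerID, s.Amount, s.ModifiedDate
              EXCEPT
              SELECT t.CustomerID, t.Amount, t.ModifiedDate);

The EXCEPT inside EXISTS is just a NULL-safe way of saying "any column differs". It still reads both 40M-row tables, but only the delta gets written.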
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
May 29, 2019 at 4:24 pm
Rough. I had a feeling that was the case.
I may pivot to keeping the nightly ETL efficient rather than focusing on identifying changes.
Thanks
May 30, 2019 at 12:21 pm
You could keep a second copy of the database around. That will give you a comparison point. You have to sacrifice storage, but disks are pretty cheap.
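Something along these lines, assuming hypothetical databases Today and Yesterday that each hold a copy of the same dbo.Orders table:

-- Rows that are new or changed since yesterday
SELECT *
FROM Today.dbo.Orders
EXCEPT
SELECT *
FROM Yesterday.dbo.Orders;

-- Rows that were deleted or changed since yesterday
SELECT *
FROM Yesterday.dbo.Orders
EXCEPT
SELECT *
FROM Today.dbo.Orders;

Join each result back to the other copy on the key column if you need to tell updates apart from inserts and deletes.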
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning