September 15, 2021 at 6:59 pm
I need to extract changed records from a client's db. The client's db I have access to is a read only copy and is updated once per day with transactions logs from some source db. I'm targeting 70 tables and only a handful have a "last_updated" column. So I was wondering if it is possible to setup CDC on this client's db given that fact that the only updates are from transaction logs? There are not many google references to this subject and everything I've read points to NO. Just wanted to ask here to make sure or see if anyone had other ideas on how I could extract the changed data.
September 15, 2021 at 9:28 pm
Yeah that's not really intended use of CDC to posthumously recreate the change capture from logs.
Another option is to keep two snapshots on your target, so pre and post restoring the log from source and then doing the deltas yourself. That's assuming you don't need to capture multiple changes during the same day and that you have the access on the target system to do that. How big the database in question and how many changes do you actually need to capture?
September 15, 2021 at 10:17 pm
Unfortunately, I only have read access to the database at this point. Not sure if that will change. My hope is that if I can find a solution and lay out the steps the client will implement the solution on my behalf. I don't need the intermediate changes through out the day only the final changes for the day so you diff solution would work. A diff solution is the only thing I've come up with so far but a diy solution is shaky. Who's going to support it when it breaks? The db size is about 1560 GB with 5800 tables. I need 76 tables with 280M rows and combined size of 60 GB. Not sure what the daily change volume is on those 76 tables. I'm going to try to capture that over the next week or so.
September 22, 2021 at 10:05 am
Hello,
In my application, with a DB running on SQL Server 2012, I've got a job (scheduled task) that periodically executes an expensive query and writes the results to a table that can later be queried by tellthebell application.
Ideally, I would like to run that expensive query only if something changed since the query has last executed. Since the source tables are very big, I cannot just select a checksum over all candidate columns or something like that.
I've got the following ideas:
Explicitly write a last changed timestamp, a "must be queries" flag, or something like this to a tracking table whenever I change something in a source table.
Use a trigger to do the same.
September 22, 2021 at 7:28 pm
Edit...
September 22, 2021 at 8:16 pm
Hello,
In my application, with a DB running on SQL Server 2012, I've got a job (scheduled task) that periodically executes an expensive query and writes the results to a table that can later be queried by the application.
Ideally, I would like to run that expensive query only if something changed since the query has last executed. Since the source tables are very big, I cannot just select a checksum over all candidate columns or something like that.
I've got the following ideas:
Explicitly write a last changed timestamp, a "must be queries" flag, or something like this to a tracking table whenever I change something in a source table. Use a trigger to do the same.
do you want to run the expensive query not only IF something has changed but also WHEN something has changed or are you doing that on a fixed schedule? And how long does your expensive query take to run and how big in GB is the size of the table it produces?
Sorry for all the questions but I do have a way to make the total apparent downtime for the table to be measured in several milliseconds but just need to make sure I'm not wasting either of our time.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 14, 2021 at 7:28 am
This was removed by the editor as SPAM
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply