How to detect changes in the data using T-SQL

  • Hi All

    Does anybody know how to detect changes in data using T-SQL, but without using triggers?

    I have a table that needs to be updated whenever any kind of data is inserted, updated, or deleted in any table. This table will indicate whether the data has changed, so the database can be flagged for a backup. The only problem is that there are a hundred users writing to the database at any given time, and using triggers can create a large amount of overhead, which I want to avoid.

    We did something once that might satisfy you, though honestly I'm not sure that your approach isn't better served with triggers, at least for setting an update date/time in the same table.

    What we needed was to track usage in a "sandbox" where people could create their own tables. We did not have control of the schema, so we couldn't depend on triggers, and we wanted to recognize stale tables (ones created and forgotten). So each week we run a script that uses CHECKSUM_AGG(BINARY_CHECKSUM(*)) and saves a checksum for each table in a modifications table. The following week we compare a new calculation against the prior week's to see if anything changed, and thus we know which tables have changed.

    It is fast and easy, but it is not 100% accurate. We found the checksum to be mediocre: certain changes can produce the same checksum far more frequently than one would expect (well, to be precise, with a 32-bit checksum I expected collisions to be virtually nonexistent, and we have only found a couple of cases so far, so it is not bad).
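    As a rough illustration of what that weekly comparison might look like (the table names here are made up, and our actual script is more involved):

        -- Store one aggregate checksum per watched table (hypothetical names)
        IF OBJECT_ID('dbo.TableChecksums') IS NULL
            CREATE TABLE dbo.TableChecksums
            (
                TableName    sysname  NOT NULL PRIMARY KEY,
                LastChecksum int      NULL,
                LastChecked  datetime NOT NULL DEFAULT (GETDATE())
            );

        DECLARE @NewChecksum int;

        -- One checksum over every row of the table being watched
        SELECT @NewChecksum = CHECKSUM_AGG(BINARY_CHECKSUM(*))
        FROM dbo.MyTable;

        -- Compare against last week's value
        IF EXISTS (SELECT 1
                   FROM dbo.TableChecksums
                   WHERE TableName = 'dbo.MyTable'
                     AND LastChecksum <> @NewChecksum)
            PRINT 'dbo.MyTable has changed since the last run';

        -- Record the new value for next week's comparison
        UPDATE dbo.TableChecksums
        SET    LastChecksum = @NewChecksum,
               LastChecked  = GETDATE()
        WHERE  TableName = 'dbo.MyTable';

        IF @@ROWCOUNT = 0
            INSERT INTO dbo.TableChecksums (TableName, LastChecksum)
            VALUES ('dbo.MyTable', @NewChecksum);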

    Knowing nothing about your application I hesitate to say this, but I've rarely seen cases where backing up individual tables really made sense in the long run. Are you sure you are solving a problem you need to solve by backing up tables triggered by changes, versus just backing up the entire database? How hard would it be to put the entire database back together from lots of pieces if you lost it?

    I agree. I think triggers are the way to go. I don't see why they would create a lot of overhead if you use them wisely. In my important tables where I need to know if and when modifications have been made, I use a datetime field. It has a default of getdate() so that inserts get a datetime automatically. Then it has an update trigger that sets that field to getdate() if it was not already set by the update query.

    No detectable overhead at all.
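    In case a concrete example helps, a minimal sketch of that pattern might look like this (the table, column, key, and trigger names are hypothetical):

        -- Add a last-modified column that defaults to the insert time
        ALTER TABLE dbo.Orders
            ADD LastModified datetime NOT NULL
                CONSTRAINT DF_Orders_LastModified DEFAULT (GETDATE());
        GO

        -- Keep the column current on updates, unless the update already set it
        CREATE TRIGGER trg_Orders_LastModified
        ON dbo.Orders
        AFTER UPDATE
        AS
        BEGIN
            SET NOCOUNT ON;

            IF NOT UPDATE(LastModified)
                UPDATE o
                SET    LastModified = GETDATE()
                FROM   dbo.Orders AS o
                INNER JOIN inserted AS i
                        ON o.OrderID = i.OrderID;   -- hypothetical key column
        END;
        GO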



    Thanks for the replies. I would just like to ask RawHide if you could perhaps give me some more details on using the datetime for data change notification. I understand that every time the data changes the datetime field will be updated, but I only need to be notified once. Let me explain my whole situation for better clarity.

    I have a database (actually about 150 of them), each containing about 250 tables. These databases are updated frequently at different times. I am using log shipping to update my databases. I have another database which has a table used to indicate any changes made to the data in each database. The table has two columns called databasename and active. If there are any changes made to any data in any database, the active column is set to 'Y' for the database that changed. Now, can you imagine a trigger on every table in every database to update this table to show active = 'Y'? I need this notification to happen once and once only, until the active column can be reset to 'N'. I thought about disabling the triggers in the database, but you cannot do this from within a trigger. Even if I create a stored procedure to disable the triggers, I cannot execute the stored procedure from the trigger.
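    For reference, here is roughly what that setup and a "flag once" trigger could look like (the database, table, and trigger names below are just placeholders):

        -- Central tracking table, one row per monitored database
        CREATE TABLE dbo.DatabaseActivity
        (
            databasename sysname NOT NULL PRIMARY KEY,
            active       char(1) NOT NULL DEFAULT ('N')
        );
        GO

        -- In each monitored database, a trigger per table would flag its
        -- database only when the flag is not already set
        CREATE TRIGGER trg_FlagDatabaseChanged
        ON dbo.SomeMonitoredTable
        AFTER INSERT, UPDATE, DELETE
        AS
        BEGIN
            SET NOCOUNT ON;

            UPDATE CentralDB.dbo.DatabaseActivity
            SET    active = 'Y'
            WHERE  databasename = DB_NAME()
              AND  active = 'N';   -- no row touched if already flagged
        END;
        GO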

    Any ideas will be greatly appreciated.

    Thanks

  • Well, my thinking was that you could query the field to see if the highest date in any of the tables is higher than the last time you did a backup.
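    Something along these lines, assuming each monitored table has such a column (the table and column names are illustrative):

        DECLARE @LastBackup datetime;
        SET @LastBackup = '20090101';   -- however you record the last backup time

        IF EXISTS
        (
            SELECT 1
            FROM
            (
                SELECT MAX(LastModified) AS MaxDate FROM dbo.Orders
                UNION ALL
                SELECT MAX(LastModified) FROM dbo.Customers
                -- ...one branch per monitored table
            ) AS t
            WHERE t.MaxDate > @LastBackup
        )
            PRINT 'Something changed since the last backup';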

    Have you already tried it with the triggers? If so, then the overhead that you're seeing may be because you have so many tables and databases that could potentially be trying to update the same table in your tracking db at once. Updates, of course, lock the table.

    Are deletes a possibility too? If not, here is a similar option that may work for you. Instead of using a field with a trigger that updates the datetime every time something happens, add a timestamp (rowversion) column to each table. Timestamps are binary numbers that are guaranteed to be unique within the database, and the value is automatically updated any time a row containing a timestamp column is inserted or updated.

    Then when it comes time to determine which db's to back up, you can check the @@DBTS of each database to see if it has been updated. You would need to track the timestamp value from when you last backed up the db.
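    A sketch of that approach, again with made-up names for the tables involved:

        -- Add a timestamp (rowversion) column to each table once
        ALTER TABLE dbo.Orders ADD RowVer timestamp;

        -- At check time, compare the database's last-used timestamp value
        -- against the value recorded at the previous backup
        DECLARE @LastBackupDbts binary(8);

        SELECT @LastBackupDbts = LastDbts
        FROM   CentralDB.dbo.BackupMarkers      -- hypothetical marker table
        WHERE  databasename = DB_NAME();

        IF @@DBTS > @LastBackupDbts
            PRINT 'At least one row was inserted or updated since the last backup';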



    The only real way to automate this is with a trigger. But the trigger can be really simple and just maintain the datetime column in the same table as mentioned above; then load your "backup" table just prior to the backup by querying each table as mentioned above.

    There are products to read the log and such (like Log Explorer), but I'm not sure how you would query them from your backup system. You could use Profiler, but I think the overhead there would be higher than that of the other method.

  • Hi all,

    Speaking of timestamps in the database, is it advisable to create created and updated timestamp columns on all tables of the database? We have one database here that is problematic and we are trying to re-model. One of the database architects I am working with is insisting on adding those timestamp columns to ALL tables in the database to monitor the changes. I am skeptical about it. I am thinking that not all tables in the database need to be monitored like that.

    What do the experts say about this?

    Thanks.

    Lhot

    We had someone do a similar thing at work and it caused, on average, 624 deadlocks a day because they used UPDATE instead of INSERT. And we only have about 100 users. They also caused "block outages" for several minutes, several times a day, while they processed this highly transactional target table. Be careful how you approach the problem of having hundreds of users hammering a small number of rows in a single table, whether you use triggers or not.

    --Jeff Moden



    I would say that you should only add them if you think the table needs to be monitored or if the timestamp data would be important. For example, I wouldn't add those fields to a static lookup table.

    To what Jeff posted, all I can say is WOW!!! There should be as close to zero deadlocks per day as possible. Things may happen occasionally, but on an average day there are no deadlocks in my database and very few delays because of regular locking.



  • Yup, once I identified the problem and fixed it, we went back to zero deadlocks.  Just wanted everyone to know that this type of thing, if done incorrectly, can really put a pinch on Trace 1204...
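    For anyone unfamiliar with it, "Trace 1204" here refers to trace flag 1204, which writes deadlock details to the SQL Server error log; enabling it globally looks like this:

        DBCC TRACEON (1204, -1);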

    --Jeff Moden



  • Yeah, I'm glad you mentioned that!! It's something people tend to not think about and then they wonder why it worked in development or test (with 1 user) and doesn't work in production.



    Guys, thank you so much for the input... it is all valuable to me. Jeff, I am interested in Trace 1204. Can you give me more of an idea of how updates on those tables with timestamps can cause the problems that show up under Trace 1204?

    Again, thank you for all your reliable input.

  • Sure, Marilou,

     

    To make a really long story shorter: if you have triggers, procs, whatever, all vying to update the same row in a tracking table as you suggest, the chances of deadlocks increase dramatically, especially if BEGIN TRAN/COMMIT pairs are used in code that updates more than one table at a time.

     

    I had a similar problem at work where some third-party geniuses didn't know about the IDENTITY property and made a sequence table, like in Oracle, instead. Again, long story short: instead of rewriting a couple hundred user procs, I rewrote the routines that get a NextID from the sequence table so that inserts occurred in a separate table (I couldn't change theirs) instead of updates by dozens of users/procs on a single row.

     

    It sounds as though your tracking table could cause the same type of problem under the load of many users. I would do inserts to the tracking table instead of updates to avoid the inevitable deadlocks that will be caused by using updates in this case. Once the backup occurs, truncate the tracking table instead of deleting from it.
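    A rough sketch of the insert-only variant (table and column names are made up):

        -- Append-only change log; concurrent sessions never fight over one row
        CREATE TABLE dbo.ChangeLog
        (
            ChangeID     int IDENTITY(1,1) NOT NULL PRIMARY KEY,
            databasename sysname  NOT NULL,
            ChangedAt    datetime NOT NULL DEFAULT (GETDATE())
        );
        GO

        -- Inside each monitored table's trigger, just insert
        INSERT INTO CentralDB.dbo.ChangeLog (databasename)
        VALUES (DB_NAME());

        -- At backup time: flag whatever appears in the log, then clear it
        SELECT DISTINCT databasename
        FROM   CentralDB.dbo.ChangeLog;

        TRUNCATE TABLE CentralDB.dbo.ChangeLog;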

    --Jeff Moden



    Pmfji (pardon me for jumping in), but there are a few things I'm not quite clear about...

    What is the end purpose of all this? It seems to me much like reinventing the wheel, using triggers to determine if something should be backed up or not...? If that is the end goal, knowing whether I should back up db x or not, then my suggestion is to drop the idea entirely. It's not going to work in the long run, and it has the potential of being an administrative nightmare.

    Just back up each db as 'regular' and be done with it - everything else is bound to break.

    /Kenneth

  • Finally, the voice of reason.  I agree with Kenneth 100%.

    --Jeff Moden


