Change Tracking (CT) and Change Data Capture (CDC) were both added to SQL Server in 2008. At first it seems like these two items ought to be synonyms, but they’re separate features. They are similar, but there are some differences, and you might choose to use them in different situations.
Change Tracking (CT)
CT is not as well known as CDC, and I see it talked about less. This is really a feature that allows the net changes made to data to be easily returned from a query. This only lets you know that a particular row has changed since your last query. You have no idea
- how many times it’s changed
- the various change values over time
The queries you run will return a table that lets you know which rows have changed since the last check, and then let you know the type of DML change. You need to join this table with the source table to actually get the data.
This is really useful for those applications that cache data and periodically query to update their caches. Lots of .NET frameworks allow this, and it’s a great way to limit the load on your database server.
Change Data Capture (CDC)
CDC is more well known, and seemed like a great tool when I first saw it, but like many useful enhancements, there is a bit of complexity that you have to work through in order to understand and use this feature.
CDC is a little more complex to implement, and it creates a bit more data in your database. You get a change table that is a copy of your table, along with a few additional columns that contain metadata. For each DML operation, you get a row(s) added to the change table. Inserts get one row (new data). Deletes get one row (old data) and updates get two rows (old and new data).
This gives you lots of history and information about your table, but it’s a lot of data. The amount can grow quickly in a busy database, so you need to be sure that you extract the information you need and prune the change tables periodically.
Which One?
Which feature should you use? Is CT better than CDC? They work differently, and they capture different amounts of data. There’s an entry in BOL that compares them, and you should understand the differences. However you really need to spend time working with both to make a decision about which one meets your needs.
Pick a table or two, enable one, test with some workload changes, then evaluate. Then repeat with the other. You might find that you need to use CT in some places, and CDC in others, depending on the downstream processing of the data.
Filed under: Blog Tagged: CDC, sql server, syndicated