In some database designs, you may encounter SQL Server columns with a TIMESTAMP or ROWVERSION data type. What are these, and how do you deal with them in SQL Server Integration Services?
Time? Version?
First, let's talk semantics. There are several unfortunately bad aspects to these names. First, the name "TIMESTAMP" is deprecated - newer versions of SQL Server use the name ROWVERSION instead. I can imagine that there was confusion that the data type somehow represented an actual time value (it does not), or matched the ISO standard definiton of a TIMESTAMP data type (it does not). Unfortunately the ROWVERSION name doesn't describe it that well to me either, as there is no relation to the primary key, or how many updates have been made to a particular row.
A better description would probably be to call the data type SEQUENCE... but that's already a keyword used for another construct in SQL Server. A column with a TIMESTAMP or ROWVERSION data type is an auto-incrementing, unique value for the entire database. (Perhaps it's implemented internally as a SEQUENCE... π ) Another wrinkle is that this column gets updated whenever any data in the row is changed by an UPDATE. So it's almost like an IDENTITY column for inserts and updates.
What's it Used For?
As highlighted in the opening of the MSDN page for ROWVERSION, this data type is (allegedly) typically used to version-stamp table rows. In my (limited) experience, it can be equally (mis)used similarly to a GUID column - a unique identifier for a row in a table, or a hash-like "has this been modified" value.
However it's used in the construction of the table, it usually enters into the Integration Services world as a column you have to compare against another table's column - typically to detect changes.
Some database developers may take advantage of the documented (but not guaranteed in my reading) "incrementing" behaviour, and use it to help determine which row is the newest when comparing two rows.
Careful reading of the MSDN page also shows that duplicate rowversion values are possible if SELECT INTO statements are used improperly. Something to watch out for there.
How Does it Appear to SSIS?
When you extract data from a table that has a ROWVERSION column, SSIS will see it as a DT_BYTES column of 8 bytes. This makes sense, as the MSDN page describes it as semantically equivalent to a binary(8) or varbinary(8).
Since it's a byte stream, that greatly limits what you'll be able to do with this data in SSIS - but then, the intent of this data type isn't to contain any more data than some kind of version stamp. Your most likely action inside SSIS is to compare this value with the value from another table to see if they're different...
Performing Comparisons
If you try the straightforward comparison in a Conditional Split or Derived Column... you'll get a syntax error. Despite claims to the contrary, SSIS can't use any comparison operators on DT_BYTES columns. (Except the ISNULL function.)
What to do? Convert the DT_BYTES to a data type that SSIS can compare - DT_WSTR. There's a tricky spot here though - you can't convert the DT_BYTES(8) to a DT_WSTR(8). The conversion code sometimes generates more character bytes - so double the space to a DT_WSTR(16), and you should be off to the races.