September 3, 2009 at 1:13 pm
So here's my situation.
I have a Production Table that has about 23 Columns.
The SSIS Package I'm developing in a nutshell, imports records from a .csv (flat file connection) and loads it into the production table.
The first step I do is to load the .csv contents into a Staging table.
I then take the Staging table (OLE DB Source) and do a lookup transformation on the Production Table to catch any records that don't exist in it (route the error rows to a OLE DB Destination to insert it into the Prodcution Table). <- This part is fine.
On the flip side, I need to check for and records that have changed. I know how to do it using the instructions below :
http://blogs.conchango.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx
Which is fine for a few columns, but with 23 Columns, thats alot of work.
So I did my research and found the Checksum Transformation
http://www.ssistalk.com/2007/03/09/ssis-using-a-checksum-to-determine-if-a-row-has-changed/
The problem I'm having is, the example above, creates a table with a column called 'hash', and eventually populates the same table (update or insert).
In order for me to do this I will have to add a column 'hash' into both my Staging Table and my Production Table. (which i want to avoid)...
My brain has been like mush, so I may be missing something, or unable to see this clearer...
Any one use this Checksum Transformation before?
Or is there another way to update a table with 23 columns..
I hope my question is clear.
Currently I'm cheating by updating everything even if nothing has changed; ie, if the rows don't exist, insert otherwise do an update.
September 8, 2009 at 10:09 am
You've got a couple solutions to look at. One, use the SCD transformation to do the compares for you. You'll still need to configure the updates types in the wizard for all 23 columns, but that's it. Second, use the method shown in your link, but create the expressions with SQL code so you don't have to manually type out one giant expression that checks all 23 columns. I've done this before. Here's an example of some code to create your expression. Note that this was used a while back and may not generate your expression 100% as you want it. It will, however, get you moving in the right direction and hopefully help save you time.
DECLARE @expression varchar(max)
SET @expression = ''
SELECT @expression = @expression + '('+
CASE --handle NULLs
WHEN t.name IN ('bit')
THEN '(ISNULL('+ c.Name +') ? (DT_BOOL)0 : (DT_BOOL)'+ c.Name +') != (ISNULL(Dest_'+ c.Name +') ? (DT_BOOL)0 : Dest_'+ c.Name +')) || '
WHEN t.name IN ('char','varchar','nchar','nvarchar', 'text', 'ntext', 'sysname')
THEN '(ISNULL('+ c.Name +') ? "" : '+ c.Name +') != (ISNULL(Dest_'+ c.Name +') ? "" : '+ c.Name +')) || '
WHEN t.name IN ('smalldatetime', 'datetime')
THEN '(ISNULL('+ c.Name +') ? (DT_DBTIMESTAMP)"01/01/2000" : '+ c.Name +') != (ISNULL(Dest_'+ c.Name +') ? (DT_DBTIMESTAMP)"01/01/2000" : Dest_'+ c.Name +')) || '
ELSE '(ISNULL('+ c.Name +') ? 0 : '+ c.Name +') != (ISNULL(Dest_'+ c.Name +') ? 0 : Dest_'+ c.Name +')) || '
END
FROM sys.columns c
INNER JOIN sys.types t
ON c.system_type_id = t.system_type_id
WHERE object_id = OBJECT_ID('put your table name here')
AND c.Name NOT IN ('put a comma separated list of your business key columns here')
ORDER BY Column_ID
SELECT @expression
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply