August 4, 2010 at 4:42 pm
Was hoping someone could help me here, I have a project to create an SSIS package (SQL Server 2005) that will read from a CSV source (flat-file) and import to a SQL table.
The flat file has a unique SEQ row that acts as a unique key/identifier, based on that the UPSERT will either update/insert.
I followed the article here to create a test project:
http://sqlserverpedia.com/blog/sql-server-bloggers/ssis-%E2%80%93-performing-an-upsert/
I have a trigger defined as:
CREATE TRIGGER [dbo].[trg_SSIS_Update]
ON [dbo].[vw_SSIS_Import]
INSTEAD OF INSERT
AS
BEGIN
UPDATE import
SET
DATE = i.Date,
NFIRST = i.NFIRST,
NLAST = i.NLAST,
BIRTHDAY = i.BIRTHDAY,
COUNTRY = i.COUNTRY
FROM dbo.csv_data import
INNER JOIN inserted i
ON i.SEQ = import.SEQ
END
Which is used by the UPSERT > OLEDB Destination so that it can update in bulk and not row by row for performance.
Now what I would like to do, is implement some kind of "tracking" to be able to troubleshoot if there are errors, how would one create a mechanism that would show me all new/updated rows that the ETL does?
For the sake of having to "troubleshoot" the data.
Any help is greatly appreciated.
August 5, 2010 at 3:27 am
What sort of errors are you expecting? Any serious errors may not, of course, even make it into the db, so your envisaged method might not be a comprehensive solution.
But you could always add 'created' date/user and 'last modified' date/user fields to your table too - they will help.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
August 6, 2010 at 4:35 am
That is a nice article you are using. But personally, I would create a temp table in the SSIS package and write the update rows to that table. Then perform the set based update using that temp table.
This way, the BI developers don't have to touch the database and create views/triggers for every table that needs updating. Triggers are a bit tricky regarding to maintenance, as they can easily be forgotten when you deploy your project (although having the right settings in SSMS helps a lot).
But back to your question:
as Phil suggested, add two columns: creation_datetime with a default of getdate() and update_datetime, also with a default of getdate(). Then, when a record is updated, the update_datetime is set to the current getdate().
If you need exact logging of how many records are inserted or updated, you can use the @@rowcount function.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply