November 3, 2011 at 10:49 pm
Comments posted to this topic are about the item Deleting Rows in Incremental Loads – Level 5 of the Stairway to Integration Services
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
November 4, 2011 at 5:06 am
As you're now using a staging table, you could do all three ADD/DELETE/UPDATE operations at once with an SQL MERGE. Are you going to cover that in the next part of the series?
November 4, 2011 at 10:07 am
Delete src
From Person.Contact src
Join StageDeletes stage
On stage.Email = src.EmailAddress
I think it should be deleted from dbo.Contact
November 4, 2011 at 12:37 pm
David Data (11/4/2011)
As you're now using a staging table, you could do all three ADD/DELETE/UPDATE operations at once with an SQL MERGE. Are you going to cover that in the next part of the series?
^^^ THIS ^^^, yes please! I believe the MERGE function still must be contained within a script task / execute T-SQL task (no SSIS component built for it...yet...) but its so powerful its worth the extra overhead 🙂
Peace.
- Savage
November 4, 2011 at 2:22 pm
Hi David,
You are correct, but I do not plan to cover the Merge statement in this series.
Andy
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
November 4, 2011 at 2:23 pm
Hi Erin,
You are correct - my intent was to delete the rows from the Destination table (dbo.Contact), not the source (Person.Contact).
Good catch and my bad!
Andy
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
December 12, 2011 at 6:42 am
I recently was surprised to find importing a large number of wide rows it was faster to use a SSIS merge of two sorted inputs with a conditional split rather than a lookup diverting no matches to a no match connection. Have you seen a general rule when it is optimal to use the merge rather than a lookup?
January 8, 2012 at 10:47 pm
The final test query should also be
Use AdventureWorks
go
Select Count(*) As RecCount
From dbo.Contact
Where FirstName = 'Andy'
And LastName = 'Leonard'
If you run the query before executing the package you get a count of 1. After executing the package the count is 0.
January 25, 2012 at 6:35 pm
Hi Davide,
How can one use the three ADD/DELETE/UPDATE operations at once with SQL MERGE?
I would really like to have a clue of how it works.
Thanks
January 26, 2012 at 6:00 am
SQL 2008 BOL "ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_1devconc/html/952595a6-cf1d-4ff5-8927-66f9090cf79d.htm". MS finally did something nice. It has it's limitations and place but doesn't everything. I use it quite a bit, but would really like to see it added to SSIS. Enjoy 🙂
August 1, 2012 at 7:47 am
Hi Andy,
Great series (integration services)! It was very helpful -- easy to follow and extremely relevant. I just have one thing I believe is an error. It was on the last part (delete rows). The final query to check if rows were deleted went against the person.contact table. However, the row was actually added to the dbo.contact table. So I believe the last query should also go against the dbo.contact table. Just an Fyi. Thanks for your great articles!
Michelle
September 21, 2012 at 8:11 am
Andy, for the TSQL step to apply updates and Deletes to the destination database, would you recommend adding indexes to the stage tables that match the indexes on the production tables (I'm assuming that in reality the join will be on some unique (or close to) identifier rather than an email address) to help the optimiser and therefore speed the package processing time?
September 21, 2012 at 1:10 pm
Hi Jonathan,
If it would improve performance, sure. I am unsure how much performance improvement could be realized unless there are a lot of staged rows, though. But as with many things related to SQL Server, I am certain the answer is "it depends."
:{>
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
December 20, 2012 at 3:12 pm
Andy,
Thank you for the great post. This tutorial helped me a lot. I think your delete script is deleting from the source table instead of the destination table. The script should have been:
[font="Courier New"]Delete dest
From dbo.Contact dest
Join StageDeletes stage
On stage.Email = dest.Email[/font]
The table should have been dbo.Contact instead of Person.Contact. And the test script should change to:
Use AdventureWorks
go
Select Count(*) As RecCount
From dbo.Contact
Where FirstName = 'Andy'
And LastName = 'Leonard'
August 7, 2013 at 10:55 am
Hi Andy,
Appreciate your good stuffs.. I know it is very late comment to this post. However I like to share my opinion and get your thoughts as well.
1. From your scenario, my view of the incremental data means the data that has been amented/inserted since certain time. according to this logic, the extracted source data in this article is not incremental one. it is FULL SET. if my assumption / definition is not correct, please advise me.
2. Lookup transformation with "cache" will degrade the performance for huge amounts of data due to memory usage. my opinion is that it can not be used for transactional tables (or fact tables). But surely it helps for small lookups table ( or dimension tables)
3. Lookup transformation with "no cache" again degrade a lot when we use the FULL DATA SET as it make millions of calls to the database
4. my option is "use the MERGE JOIN " route for delete operation by selecting only key colunms and compare them to identify the missing ones..
to be honest , I am a LOOKUP TRANSFORAMTION lover, but no other go for DELETE OPERATION as I am aware of. Requesting you provide me any other route/ logic
looking forward to hear your thoughts..
Thanks again for such a wonderful series,
Have a good day!!!
K.Singanan
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply