August 20, 2011 at 8:29 pm
Comments posted to this topic are about the item Level 3 of the Stairway to Integration Services - Adding Rows in Incremental Loads
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
September 15, 2011 at 10:56 pm
How do you handle a scenario where data exists in the target table but has been deleted from the source table? Also, how will this handle the load if the email ID gets updated in the source?
September 16, 2011 at 3:54 am
raghav.kaura (9/15/2011)
Also, how will this handle the load if the email ID gets updated in the source?
If your source system uses an attribute like email as a key column and that attribute can be changed, then either you have a very bad source system or you're using the wrong column as a key into the source data.
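For the delete question, a rough T-SQL sketch (table and column names borrowed from the article's example, so adjust to your schema) is to look for destination rows with no match in the source:

-- Find rows in the destination that no longer exist in the source.
-- Email as the business key is an assumption here.
SELECT d.Email
FROM dbo.Contact AS d
WHERE NOT EXISTS (SELECT 1
                  FROM Person.Contact AS s
                  WHERE s.EmailAddress = d.Email);

From there you can delete those rows or just flag them, depending on what the business wants.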
September 16, 2011 at 6:45 am
Below figure 20, there is a statement that reads, "In SSIS 2008 and SSIS 2008 R2, the Lookup Transformation provides this built-in output to catch records in the Lookup table (the dbo.Contact destination table, in this case) that do not exist in the source (the Person.Contact table) - it's the Lookup No Match Output."
Shouldn't this be the other way around? "In SSIS 2008 and SSIS 2008 R2, the Lookup Transformation provides this built-in output to catch records in the source table (the Person.Contact table, in this case) that do not exist in the Lookup table (the dbo.Contact destination table) - it's the Lookup No Match Output."
September 16, 2011 at 9:00 am
I just finished reading your SSIS articles and they are an excellent introduction. I can't wait to see the articles that handle updates and deletes. I hope I will not have to wait too long...
Is there any way to do something similar to replication and keep a log of inserts, updates, and deletes in the source so that only those changes are applied in the destination? It seems to me it is a waste of resources to do lookups periodically to detect changes between source and destination when SQL Server already does the job of tracking changes. Any suggestions?
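Something like SQL Server Change Tracking (available since 2008) is what I have in mind - a rough sketch only, using the article's source table as a stand-in:

-- Enable tracking on the source database and table (sketch only).
ALTER DATABASE AdventureWorks
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE Person.Contact
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

-- On each load, ask only for rows changed since the last run.
DECLARE @LastSyncVersion bigint = 0; -- the load process would persist this between runs
SELECT ct.ContactID,
       ct.SYS_CHANGE_OPERATION -- I = insert, U = update, D = delete
FROM CHANGETABLE(CHANGES Person.Contact, @LastSyncVersion) AS ct;
SELECT CHANGE_TRACKING_CURRENT_VERSION(); -- save this as the next @LastSyncVersion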
Keep up the good work.
Stephane
September 16, 2011 at 9:40 am
Hi Raghav,
You're getting ahead of me. Step 5...
:{>
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
September 16, 2011 at 9:42 am
Hi NYSystemAnalyst,
That's a poorly worded sentence. My apologies. The No Match output exists in 2008 and R2.
:{>
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
September 16, 2011 at 9:45 am
Thanks. This is a great series of articles and I look forward to reading more.
September 16, 2011 at 9:48 am
Thanks Stephane,
There is a way to create audit trails of record states, but that topic is beyond the scope of this series. I can get you started offline. Email me at andy.leonard@gmail.com.
:{>
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
September 19, 2011 at 12:17 pm
This is a great (and timely) article. I'm looking forward to the next installment. I watched your SQL Lunch presentation on re-executable T-SQL and it has changed my life. Thanks.
September 21, 2011 at 10:41 am
You have great skill in explaining things. Thanks for taking the time to share your information in a way that is actually helpful!
September 21, 2011 at 11:18 am
Hi JJ,
I have an excellent editor, Steve Jones.
:{>
Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics
October 1, 2012 at 8:05 am
Hi Andy,
Thanks for this great article! I have a question: what would you do if the Person.Contact table were located on another server? We definitely don't want to pull all the rows, since that hurts performance and can cause timeouts for large data sets. How can we join Person.Contact and dbo.Contact when they are on different servers, and then select only the rows we need to insert from Person.Contact?
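One rough approach (a sketch only; [RemoteServer] is a placeholder linked server name and the column names follow the article's example) is to let the engine do the anti-join across a linked server so only the missing rows travel:

-- Pull only the rows missing from the local destination, via a linked server.
INSERT INTO dbo.Contact (FirstName, LastName, Email)
SELECT s.FirstName, s.LastName, s.EmailAddress
FROM [RemoteServer].AdventureWorks.Person.Contact AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.Contact AS d
                  WHERE d.Email = s.EmailAddress);

In SSIS itself the data flow can also span servers, since the OLE DB Source and the Lookup each use their own connection manager, though all the source rows still flow through the pipeline; for very large tables it may be cheaper to stage just the remote keys locally first.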
Thanks again!
December 14, 2012 at 5:39 am
Andy, I'm enjoying the series but I have a small problem with the 'Update'.
It's working, but it only updates about 700 records each run, and I can't find any option to change this behavior.
The only thing I can imagine is that I use a very limited virtual server and that this is caused by the Full Cache option of the Lookup.
But I see the correct number of records at each Data Flow.
I do see two warnings, but I don't think they have anything to do with the update problem:
[SSIS.Pipeline] Warning: Could not open global shared memory to communicate with performance DLL; data flow performance counters are not available. To resolve, run this package as an administrator, or on the system's console.
and
[SSIS.Pipeline] Warning: The output column "LkUp_BusinessEntityID" (131) on output "Lookup Match Output" (103) and component "Lookup" (101) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.
Rob
March 29, 2013 at 2:19 pm
I agree that the second method is better than the first, but the duration of each method can only be measured correctly if you clear the procedure cache for the previous dataset. 😉
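For example, on a test server something like this before each timed run keeps the comparison fair (not for production):

CHECKPOINT;            -- flush dirty pages so the next command can drop them
DBCC DROPCLEANBUFFERS; -- start with a cold data cache
DBCC FREEPROCCACHE;    -- clear cached execution plans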