May 15, 2018 at 7:18 am
I have two tables in separate databases on separate servers. I have to pull data from one table filtered by the other table. I could use SSIS merge join or I could pull all the data (INSERT within the data flow component) and then filter the data based on the other table (DELETE via the SQL task). The result is the same.
I am curious in general in the differences of the two approaches. I offered no details because I am not concerned about my specific case as it works adequately either way. I'd just like to understand the differences better. My main concern is that it feels inefficient to effectively join across servers via the SSIS merge join.
Thanks.
May 15, 2018 at 7:28 am
dgl - Tuesday, May 15, 2018 7:18 AMI have two tables in separate databases on separate servers. I have to pull data from one table filtered by the other table. I could use SSIS merge join or I could pull all the data (INSERT within the data flow component) and then filter the data based on the other table (DELETE via the SQL task). The result is the same.I am curious in general in the differences of the two approaches. I offered no details because I am not concerned about my specific case as it works adequately either way. I'd just like to understand the differences better. My main concern is that it feels inefficient to effectively join across servers via the SSIS merge join.
Thanks.
Merge join in SSIS performs badly at scale and is to be avoided if possible.
The INSERT/DELETE option is also to be avoided, unless the number of rows being deleted is small.
Instead, I'd move the data to a (truncated and appropriately indexed) staging table on the target server and then use that to do your filtered INSERT.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
May 15, 2018 at 7:55 am
"Instead, I'd move the data to a (truncated and appropriately indexed) staging table on the target server and then use that to do your filtered INSERT."
This is what I was trying to describe as my second option. That is what I chose to do. I just wondered if I was missing something positive by not using the merge join.
Thanks!
May 15, 2018 at 8:36 am
dgl - Tuesday, May 15, 2018 7:55 AMThis is what I was trying to describe as my second option.
Your second option clearly mentions DELETEs. My suggestion requires none, other than the initial TRUNCATE of the staging table.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply