April 16, 2011 at 7:53 am
Hi, we have quite a basic SSIS task running and the performance isn't great. Our SSIS server is separate from the DB server.
The source of the compiled script task is a SELECT on a table with over a million rows; for each row it calls a web service to get updated information, then inserts a new row into another table. It is doing this one row at a time. The DB server is busy but not maxed out, with each insert taking milliseconds, but the throughput from the SSIS server should be higher: it's taking days when I would expect hours, which we have seen before on similar volumes.
I was thinking about batch size settings, but I'm not sure whether they are available for the .NET provider. The table has no indexes and the database uses the simple recovery model.
Any ideas?
Steve.
April 16, 2011 at 8:19 am
At first glance I would say that insert performance is not your problem.
Simple math: there are 86,400 seconds in a day, or 86,400,000 ms. Even if every insert took 100 ms, you could do 864,000 inserts a day.
I tend to think that querying the web service is really where performance becomes an issue.
Can you explain why you need to update all million records? Is there no better process?
CEWII
April 21, 2011 at 6:46 am
The web service is DEFINITELY the issue. How many records are hosted in your web service? Is it possible to load data from it in bulk? Instead of asking for one record, can you ask for 1000 records or more in one call to the web service?
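If the provider can do that, the loop becomes one call per thousand rows instead of one per row. A rough sketch of the shape of it (the AgencyService proxy and its GetRecordsBulk method are hypothetical stand-ins for whatever the real service would expose):

    using System.Collections.Generic;
    using System.Linq;

    // Sketch only: AgencyService and GetRecordsBulk are hypothetical stand-ins
    // for whatever proxy the real web service would generate.
    void LookupInBatches(IList<int> allIds, AgencyService service)
    {
        const int batchSize = 1000;
        for (int i = 0; i < allIds.Count; i += batchSize)
        {
            int[] chunk = allIds.Skip(i).Take(batchSize).ToArray();
            var results = service.GetRecordsBulk(chunk); // one round trip per 1000 keys
            // buffer 'results' here for a later set-based insert
        }
    }

Even at the same per-call latency, that cuts a million round trips down to a thousand.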
April 21, 2011 at 6:53 am
Thanks for the replies. This is agency data that enriches and corrects our dataset, which is why we run over the top of all records each month.
The database is fine and can handle a lot more inserts per second; it simply isn't getting them.
The web service is one record at a time and we can't alter that. However, we load all records into memory before cycling through them, so we intend to store the results in memory as well and then use the batch insert method to get them into the database.
Online resources suggest this method is up to 4x faster than sending batches of individual inserts.
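In outline, something like this with SqlBulkCopy (table and column names and the connection string are simplified placeholders):

    using System.Data;
    using System.Data.SqlClient;

    // Sketch of the plan: accumulate web service results in a DataTable,
    // then hand the whole thing to SqlBulkCopy in one shot.
    static DataTable BuildBuffer()
    {
        DataTable buffer = new DataTable();
        buffer.Columns.Add("RecordId", typeof(int));        // placeholder columns
        buffer.Columns.Add("EnrichedValue", typeof(string));
        // inside the web service loop: buffer.Rows.Add(id, enrichedValue);
        return buffer;
    }

    static void FlushBuffer(DataTable buffer, string connectionString)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();
            // TableLock allows a minimally logged load into a heap
            // under the simple recovery model.
            using (SqlBulkCopy bulk = new SqlBulkCopy(
                conn, SqlBulkCopyOptions.TableLock, null))
            {
                bulk.DestinationTableName = "dbo.EnrichedRecords"; // placeholder table
                bulk.BatchSize = 10000;   // commit in chunks, not one giant transaction
                bulk.BulkCopyTimeout = 0; // 0 = no timeout for the full load
                bulk.WriteToServer(buffer);
            }
        }
    }

Holding a million narrow rows in a DataTable is usually fine; if memory becomes a concern, we could flush every 100,000 rows and Clear() the table to keep the footprint flat.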
Steve.
April 21, 2011 at 7:01 am
Steve Newton (4/21/2011)
Thanks for the replies. This is agency data that enriches and corrects our dataset, which is why we run over the top of all records each month. The database is fine and can handle a lot more inserts per second; it simply isn't getting them.
The web service is one record at a time and we can't alter that. However, we load all records into memory before cycling through them, so we intend to store the results in memory as well and then use the batch insert method to get them into the database.
Online resources suggest this method is up to 4x faster than sending batches of individual inserts.
Steve.
Hi Steve,
Don't waste time trying to optimize the insert speed. No matter how slow the individual inserts are, it will not take days to insert 1 million records.
Your problem is the web service calls. You have to realize that each call has to be serialized and packed into a SOAP envelope, then unpacked and translated into the respective format on the server side. Add in the time this information has to travel back and forth over the network, and there is your issue.
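If you want to see it in numbers, time a sample of calls (the service proxy and its GetRecord method here are hypothetical stand-ins for your real proxy):

    using System;
    using System.Diagnostics;

    // Rough measurement: average latency over a sample of single-record calls.
    // AgencyService and GetRecord are hypothetical stand-ins.
    static void TimeSampleCalls(AgencyService service, int[] sampleIds)
    {
        Stopwatch sw = Stopwatch.StartNew();
        foreach (int id in sampleIds)
        {
            service.GetRecord(id); // one SOAP round trip per record
        }
        sw.Stop();
        Console.WriteLine("Average per call: "
            + (sw.ElapsedMilliseconds / (double)sampleIds.Length) + " ms");
    }

At, say, 300 ms per round trip, a million calls works out to roughly 83 hours of pure wait time, which lines up with the days you are seeing.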
You have two options:
- Find a way not to process 1 million records every time, but only the newer or updated records.
- Ask your third party to enhance the web service so it can check not just one record, but a list of records in one call.