May 16, 2008 at 10:48 am
Basically the issue is:
If I run T-SQL scripts to pull data from the staging area, remove duplicates, validate data, and perform lookups, I can do it for each of the 6 channels that I pull data from in one day.
However, replicating the same using SSIS packages takes about a day (at best!) for EACH channel.
I used these tools/packages to automate all my ETL/data-cleansing tasks; it works perfectly fine, but takes agonizingly long.
What can I do to improve the performance? I mean, I have done all the basic things, like setting the cache levels when I do a lookup, etc. Or should I just use my T-SQL scripts?
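The duplicate-removal step described above is commonly done in T-SQL with ROW_NUMBER(). A minimal sketch; the table and column names (dbo.Staging, ChannelID, RecordKey, LoadDate) are hypothetical placeholders, not from the original post:

```sql
-- Sketch: keep only the newest row per business key in a staging table.
-- dbo.Staging, ChannelID, RecordKey, and LoadDate are assumed names.
WITH Ranked AS
(
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ChannelID, RecordKey
               ORDER BY LoadDate DESC) AS rn
    FROM dbo.Staging
)
DELETE FROM Ranked
WHERE rn > 1;  -- rows ranked 2+ are duplicates of the newest row
```

Deleting through the CTE like this works because ROW_NUMBER() makes each duplicate group individually addressable in a single set-based statement.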
May 16, 2008 at 11:13 am
If you have TSQL scripts that will do what you need, I would put them into a job and let it do its thing. SSIS does have its place, specifically when moving data to/from different formats but I wouldn't bother with having it run scripts that can run simply inside of a job.
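Putting the scripts into a SQL Server Agent job can be done through the msdb procedures. A hedged sketch; the job name, step name, and the stored procedure being called are placeholders for whatever the actual cleansing scripts are:

```sql
-- Sketch: wrap an existing T-SQL cleansing script in an Agent job.
-- 'Channel ETL' and dbo.usp_CleanseChannelData are assumed names.
USE msdb;

EXEC dbo.sp_add_job
     @job_name = N'Channel ETL';

EXEC dbo.sp_add_jobstep
     @job_name  = N'Channel ETL',
     @step_name = N'Run cleansing script',
     @subsystem = N'TSQL',
     @command   = N'EXEC dbo.usp_CleanseChannelData;';

-- Register the job with the local server so the Agent will run it.
EXEC dbo.sp_add_jobserver
     @job_name    = N'Channel ETL',
     @server_name = N'(local)';
```

A schedule can then be attached with sp_add_jobschedule, or the job can be kicked off on demand with sp_start_job.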
May 19, 2008 at 10:09 am
Thanks. However, now that I've built it, I am looking to work/experiment with it for a few days.
I am now facing the following situation (I'm not sure if I should create a separate topic for this):
In one package
1] Import data(source)
2] Multicast to 3 Lookups
3] I then do insert and update accordingly
There are over 8,000 rows in the source table. When the package executes, it caches 4,300 rows (since I set it to full cache mode), then does lookups of 2,100, 4,300, and 2,161 rows.
It then does an update or insert accordingly, but the problem is that when it does this first 'batch', the flow 'pauses'. I am not sure at this point what is going on.
Has the package finished executing? Has it run into an error? Nothing is visible, since it stays in the "yellow" mode.
It also inserts the "error" fields correctly, so the settings look correct, but I'm not sure why it seems to hang.
If anyone has any idea, please do let me know!
thanks
May 20, 2008 at 4:17 am
I noticed the indefinite yellow state when the destination table was locked. Maybe you could run the table operations in sequence?
Do you use the fast load destination for the inserts?
Piotr
...and your only reply is slàinte mhath
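One way to confirm the kind of blocking described here is to query the DMVs while the data flow sits in yellow. A minimal sketch using the standard sys.dm_exec_requests view:

```sql
-- While the package appears hung, run this in a separate SSMS window
-- to see whether any session is blocked, and by whom.
SELECT session_id,
       blocking_session_id,
       wait_type,
       wait_time        -- milliseconds spent waiting so far
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
```

A non-empty result means one part of the package (e.g. the fast-load insert holding a table lock) is blocking another (e.g. the OLE DB Command doing the updates), which matches the "indefinite yellow" symptom.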
May 20, 2008 at 2:18 pm
Yes, I use the fast load option when finally inserting into an OLE DB destination. What is the difference between the various options?
May 20, 2008 at 6:13 pm
Well, the options are documented in Help. One of them allows locking the table during the insert; in my opinion, this can cause blocking. These row counts are not very dramatic. What is the performance of the data sources and data destinations? Do you call a SQL statement for each of the rows? Can you indicate the task or component that takes the longest time to turn green, apart from the blocked ones?
Piotr
...and your only reply is slàinte mhath
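The table-locking option mentioned above (the "Table lock" checkbox on the fast-load destination) corresponds roughly to the TABLOCK hint on a set-based insert. A sketch with hypothetical table and column names:

```sql
-- Rough T-SQL analogue of the fast-load "Table lock" option.
-- dbo.TargetTable and the columns are assumed names.
-- TABLOCK takes a table-level lock: faster bulk insert, but it
-- blocks other readers/writers on the table until it commits.
INSERT INTO dbo.TargetTable WITH (TABLOCK)
       (ChannelID, RecordKey, Amount)
SELECT ChannelID, RecordKey, Amount
FROM dbo.Staging;
```

This is why running the insert and the update paths concurrently against the same table can deadlock or stall: the table lock held by the fast load keeps the update branch waiting.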
May 21, 2008 at 11:10 am
Underdog,
Have these packages been deployed to production, or are you just running them locally? Just a hunch, but that may be why you're seeing such shoddy performance...
Also, you said that your T-SQL scripts run from a staging area. This implies there is already some ETL processing that occurs before your T-SQL scripts are executed. Do you have direct access to the source files? If so, you may look at taking the source data and performing all the transforms in memory (a feature SSIS brings to the table), which could help toward your overall performance goals...
Ben Sullins
bensullins.com
Beer is my primary key...
January 7, 2013 at 6:32 am
I am making lakhs (hundreds of thousands) of WCF calls from SSIS. What is the best way to do it? Performance is the key concern here.
January 7, 2013 at 6:49 am
Gurinder-356063 (1/7/2013)
I am making lakhs (hundreds of thousands) of WCF calls from SSIS. What is the best way to do it? Performance is the key concern here.
This is a 4+ year old thread. You really should post new questions in their own thread.