August 29, 2009 at 9:53 am
I have a design scenario to which i need your expertice. I want to automate the ETL process.I want to process the ETL for three applications A,B and C one after another. If there is any error in any one of three application, i want to restart the application only from that application. say application A and B processed succesfully and there is an error in apllication C during ETL and process is stoped. In that case we will go ahead and fix the error for application C and then we need to restart only from C onwards. can we restart from one particular package within a package. does sql server scduling agent help?
So the question is how would i design in this scenario. Do i need to create a package for each application and then create a main package for these three packages?
In the aboove scenario, how would i restart the application after fixing the error in C.
What is the ideal scenario and how would you do it if you were in my situaton.
Thanks for your time.
August 29, 2009 at 4:07 pm
You have an interesting problem that cannot use the all or none ADO.NET Command object Transaction operation and you cannot use unit of work transaction with SavePoint because you want your operation to stop and continue. This means you are not doing ETL or you have skipped design that will clean up the need for this execution model.
You could run the packages in one application, start over after failure and do the same thing with the other applications. You need some service to manage your execution and transactions including transaction SavePoints.
Kind regards,
Gift Peddie
August 29, 2009 at 6:00 pm
Peddie,
Thanks for your reply. I don't think i understand your answer. I am new to BIDS and learning righrt now. I would apprecite if you could explain it little bit more on this.
let me explain myscenario again here.
All these three applicatio data are in the sql server. I start the etl proccess and let it complete athough some errro. say, aapplication A is processed successfully without any error(no error in the eror table) then it kicks off application B and then finally C. All data for these three application are loaded into staging table. At the end of all three application, i check the error table to see any error on any particular application before i load the data into Destination bi mart.
suppose there are some errors in Application C and there is no error in app A and B. That means all good data are loaded for application A, B and C except the bad data in C. I would go ahead fix the bad data for C in the SOURCE, delete the good data that is loaded for app C from staging table and then i want to process ETL for app C only. I don't want to process app A and B. So my intention is basically to save the processing time of A and B.
Thanks
August 29, 2009 at 6:22 pm
If all you need is staging table you can run all three processes separately when one package is completed you start the run for the second and third. What you are running can affect data integrity because you are trying to run all concurrently.
Kind regards,
Gift Peddie
August 29, 2009 at 8:25 pm
As you suggested, i can run one after another. how can i make it a automatic process? create one ETL package for each aplication and then create a another package from which then call these three one after another sequencially? can you please step by step?
August 29, 2009 at 9:28 pm
I have given you an implementable plan now you need to look at your processes and design and build a solution. The first step is to build an automated package that will pull the data from the first application after you get that to work the rest is a variation of the first. If I give a you step by step then I am doing your work. Good luck.
Kind regards,
Gift Peddie
August 29, 2009 at 10:02 pm
You need to handle this witrh meta data in SQL side. Create one runid for each ETL execution. Unless all the tasks doesnt completed successfully dont create new runid. And Against runid you can add task details with task starttime and endtime. And before starting each task check for end time null so that once it get executed it will never get executed untill you create new runid.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply