March 28, 2022 at 12:56 pm
I know in SQL Server jobs, if a step fails, you can set the number of retries and interval. What I want to do though is if a step fails, retry the entire job from Step 1, not just the individual step.
The reason I need to do it this way is that the first step truncates the various tables, then steps 2 - 15 load data into those tables from various files. So if step #4 fails with half of its file loaded, retrying step #4 will give us duplicate records. I realize the best solution is to rework the SQL Job so I can retry each step safely, but until we get that accomplished, restarting the entire job would at least provide a solution until we go through all the various development cycles to rework this job.
However, I cannot seem to find a method to retry the entire job if a step fails.
March 28, 2022 at 2:32 pm
I can't see a way of doing that in SQL Agent.
What do the other steps do? Is this a sequence of T-SQL / stored procs, or do the steps include calls to other technologies (SSIS/SSAS/command line etc)?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 28, 2022 at 2:48 pm
We get files that need to be imported daily into our SQL Server because we do not have direct connections to import the data. Each file is imported by an SSIS package, and there are about 15 files that need to be imported. The first step in the job truncates the tables so the data is fresh each morning. So if a step fails, restarting that individual step will reload the data for that step, resulting in duplicates.
I know how I can rework this SQL Job to avoid this, and it is on my plate, but until I am able to rework the actual job and code, I was thinking that if a step failed, simply rerunning the entire SQL Job would be my best option, since then I know it is not loading duplicates. Hopefully I explained this well enough, but if not, please let me know.
March 28, 2022 at 3:39 pm
If step 4 fails, in the settings of the step, can you simply set it to go to step 1?
Of course, you run the risk of going into an endless loop.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
March 28, 2022 at 3:39 pm
Can you set step 4 to go to step 1 on failure?
This does not allow you to limit the number of times it will try to restart and could end up in an infinite loop, but if you have an alert that the step failed you can intervene if it keeps failing.
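If you would rather script that setting than click through the step's Advanced page, something like this should do it (a minimal sketch - the job name is a placeholder for your own):

-- Point step 4's "On failure action" back at step 1.
-- @on_fail_action = 4 means "go to the step named in @on_fail_step_id".
exec msdb.dbo.sp_update_jobstep
    @job_name        = N'Daily file import',  -- placeholder job name
    @step_id         = 4,
    @on_fail_action  = 4,
    @on_fail_step_id = 1;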
March 28, 2022 at 3:42 pm
Putting the truncate as step 1 in the SSIS package seems like the way to go – sounds like you've worked that out. This should be quick to implement.
Creating one Agent job per table is another (horrendously inelegant) possibility, with the added potential benefit of making some of the imports run in parallel.
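If you go the first route, the new first task in each package only needs to run something like this (a quick sketch - the table name is a placeholder for whichever table that package loads):

-- First task in the package: clear only this package's table, so re-running
-- the step can never create duplicates.
truncate table dbo.ImportTable01;  -- placeholder table name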
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 28, 2022 at 5:06 pm
I understand the risk. This is a stop gap until I can rework the SQL Job to avoid it. The reason it would fail is due to server load and resources.
March 28, 2022 at 5:08 pm
I understand the risk. This is a stop gap until I can rework the SQL Job to avoid it. The reason it would fail is due to server load and resources.
So there's no chance that one of the files could contain bad data?
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 28, 2022 at 5:23 pm
Phil:
"So there's no chance that one of the files could contain bad data?"
To answer that question, I cannot say there will "never" be bad data, but in all the time this has been running, that has not been an issue. The only times it has failed have been due to resource issues on the server caused by another team. This started about a month ago, when a team that pushes data to our server changed their processes; their process now bogs down our server to the point of jobs failing, inability to perform basic functions, etc. The bad thing is that their process is not predictable, and every day is a gamble on what will fail or work. So until I can rework what was a process with no issues to accommodate these newfound problems, I am just hoping to stop-gap the manual intervention that has been needed to rerun this job multiple times a day until it succeeds.
March 28, 2022 at 5:29 pm
OK, sure. Just wanted to make sure that you've considered that before setting the 'On error, go back to start and try again' option!
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
March 28, 2022 at 5:33 pm
Rather than just going back to the first step, I would go to a later, isolated step in the job, and ultimately that step would go back to the first step.
The difference being that the special step could log the fact that an error occurred, count the errors and/or make other determinations as to whether it was ok to go to Step 1 again.
For example, say the job currently has 8 steps. The eighth step exits the job on both good and bad completion. Add a 9th step to do the error checking / looping. The only way to reach that step is via an error in a previous step.
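A rough sketch of what that 9th step could run, assuming a simple log table (the table name and the threshold of 5 are just illustrative, not anything built in):

-- One-time setup (assumed): create table dbo.JobErrorLog (JobName sysname, ErrorTime datetime2);
-- Hypothetical "step 9": record the failure, then decide whether to loop back.
insert into dbo.JobErrorLog (JobName, ErrorTime)
values (N'Daily file import', sysdatetime());

if (select count(*)
    from dbo.JobErrorLog
    where JobName = N'Daily file import'
      and ErrorTime >= dateadd(hour, -24, sysdatetime())) > 5
begin
    -- Too many restarts today: fail this step, which fails the job and fires any alert.
    raiserror('Too many restarts in the last 24 hours - giving up.', 16, 1);
end
-- If we get here, the step succeeds; set its "On success" action to "Go to step 1".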
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 28, 2022 at 5:39 pm
I understand the risk. This is a stop gap until I can rework the SQL Job to avoid it. The reason it would fail is due to server load and resources.
Here's a possible dirty hack to prevent an infinite loop in the event of bad data. It should take only a few minutes to set up.
Create a table called dbo.retry with a single integer column retrycount with a single row, set to zero.
create table dbo.retry (retrycount int);
insert into dbo.retry (retrycount) values (0);
Job starts at step 1
Step 1) update dbo.retry set retrycount = 0
Step 2) update dbo.retry set retrycount += 1
Step 3) Query retrycount and raiserror if too great
declare @count int,
        @n int = 10;

set @count = (select max(retrycount) from dbo.retry);

if @count > @n
begin
    raiserror('too many failures', 16, 1);
end
Step 4) Truncate tables
Step 5) Insert 1
Step 6) Insert 2, etc...
On failure for all the insert steps, go to step 2. On success go to next step.
This will allow you to restart the job if any of the insert steps fail, but you can fail the job if the number of retries gets too big. This prevents the infinite loop.
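If you want to wire up the "on failure go to step 2 / on success go to next step" actions without clicking through every step in the GUI, something along these lines should work (a sketch only - the job name and the range of import step ids are assumptions you would adjust):

-- Route each import step's failure back to step 2 (the retry counter)
-- and its success to the next step.
-- Action codes: 3 = go to the next step, 4 = go to the step in the matching *_step_id.
declare @step int = 5;   -- first import step in the sketch above
while @step <= 6         -- last import step (extend to cover all 15 files)
begin
    exec msdb.dbo.sp_update_jobstep
        @job_name          = N'Daily file import',  -- placeholder job name
        @step_id           = @step,
        @on_success_action = 3,
        @on_fail_action    = 4,
        @on_fail_step_id   = 2;
    set @step += 1;
end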