Rollback the job running very long time

Question

Post reply

Rollback the job running very long time

Sunny Liu

Ten Centuries

Points: 1234
More actions
February 11, 2009 at 11:45 pm

#134668

Dears,
I encounter this question:
We scheduled SSIS Packages, generally, they run ok, but sometimes somejob run for very long time even two days. It neither fail nor success, but executing...
So I want whether there are some methods to monitor job's running time, if it runs more than one or two... hours, stop it and best to email administrators.
I considered for a long time ,but haven't any good idear, maybe we can discuss the topic together.:P

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

MANU-J. SSC-Dedicated Points: 31126 More actions · Answer 1

Get some ideas from http://www.sqlservercentral.com/Forums/Topic504330-338-1.aspx

MJ

EdVassie SSC Guru Points: 60472 More actions · Answer 2

If a job that is cancelled after running a long time has been doing SQL insert/update/delete logic all that time, it will probably take longer to roll back than it took to the time it was cancelled.

If a job that is cancelled after running a long time has been spending most of its time in CPU activity and has done very little SQL insert/update/delete work it should roll back fairly quickly.

You need to be aware of what your job is doing before you cancel it.

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

Brandie Tarvin SSC Guru Points: 173068 More actions · Answer 3

Ditto to the last post. Don't kill a rollback unless you have extenuating circumstances and don't care about possible data / log file corruption. By extenuating circumstances, I'm talking about "Server down, no one can work, losing millions of dollars an hour while the rollback happens".

Even then, you need to know what you'll be killing before you kill it and whether or not you can recover if you destroy the database or even just a few records in the process.

Just let the rollback ride.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Sunny Liu Ten Centuries Points: 1234 More actions · Answer 4

Thanks for your advices!

But, if the job executes very long time even more than two days, it will block the other jobs. Finally, result in almost all the jobs fail. So we want to be aware of this when the job runs longer time than normal too much. Then we can troubleshoot the case before it block others...

Jeff Moden SSC Guru Points: 1004372 More actions · Answer 5

The real key here is... you shouldn't have jobs that run that long nor get "stuck". Find out what the root cause is and fix it. Chances are, the code is in bad need of a fix.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sunny Liu Ten Centuries Points: 1234 More actions · Answer 6

Yeah, I understand that. But the question is that the job runs well all the time, it amazing or marvelous only sometimes. So I want to make an Auto_monitor for them, when they unnatural, I can check the root cause at the earliest possible time.

Chand00 SSCertifiable Points: 6466 More actions · Answer 7

Hi,

You must be having some idea on the time taken for each job to get completed on your server. Suppose if a job executes for 20 mins give it a 10 mins grace time and after 30 mins check for the status and if it does not gets executed then you can alert yourself through the job. I think below query might help you on how much time it may take to complete.

select percent_complete,estimated_completion_time,total_elapsed_time,* from sys.dm_exec_requests

Thanks

Chandra Mohan N

[font="Verdana"]Thanks
Chandra Mohan[/font]

Brandie Tarvin SSC Guru Points: 173068 More actions · Answer 8

Sounds like you need to run Profiler when you run this job so you can track down exactly where the problem occurs. I'm betting your job gets suspended in the middle of itself because it's trying to compete with itself for resources.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Jeff Moden SSC Guru Points: 1004372 More actions · Answer 9

sunny1_liu (2/15/2009)
Yeah, I understand that. But the question is that the job runs well all the time, it amazing or marvelous only sometimes. So I want to make an Auto_monitor for them, when they unnatural, I can check the root cause at the earliest possible time.

Sorry, Sunny... didn't mean to imply anything bad on your part. It's just that most folks don't even think of that. Glad to see you're on the ball.

I'm sure you'll agree... I hate intermittant problems. Profiler can help, but when such a thing happens on my systems, I find that I end up having to add some code to the offending proc to log things like what the input parameters where, what the row count of each section was, what the duration of each section was, and I turn on a proc that runs once a minute that checks for blocking and, if present for more than a couple of minutes, sends me an email with the spids and other information so I can go look at what's happening while it's in-process.

The standards I set up for companies that I'm new to, requires such things to be built in from the git.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sunny Liu Ten Centuries Points: 1234 More actions · Answer 10

Thank you Jeff. I agree with your views.

But in my opinion, profiler may consume many system resources, so we hardly turn it on unless really necessary.

Jeff Moden SSC Guru Points: 1004372 More actions · Answer 11

I usually don't turn it on either... especially when I know which proc is causing the problem.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Mike - CI SSCarpal Tunnel Points: 4146 More actions · Answer 12

To add a little to this, if you do want check for long running jobs and it is a scheduled job within the SQL Agent, you can use the xp_sqlagent_enum_jobs in SQL 2000 and 2005. It only works in pretty specific scenarios (job has to be scheduled) but it does work under that scenario. By converting the date and time pieces into a datetime field you can create a threshold (2 hours I think was what you mentioned) and set up something to email or page you if a job runs beyond that threshold. I have been using this since SQL 2000 and it has worked well and have it going on one instance of 2005. Do a search on that XP and you should find some good information if you would like to have something that proactively emails you based on duration of a job.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 13

sunny1_liu (2/16/2009)
But in my opinion, profiler may consume many system resources, so we hardly turn it on unless really necessary.

So don't use profile. You shouldn't be using profiler on a production system anyway. Use a server-side trace. Much less overhead, especially if the trace file is written to a fast, unused drive.

If you're not tracing regularly, how do you know what's normal performance and what isn't?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Brandie Tarvin SSC Guru Points: 173068 More actions · Answer 14

sunny1_liu (2/16/2009)
But in my opinion, profiler may consume many system resources, so we hardly turn it on unless really necessary.

If you do profiler right, it won't consume as many resources as you may think. And the reason I suggested it is because this *does* sound like a necessary situation.

You know what job is causing the problem, but you don't know which part of the job or proc is causing the problem. Profiler, or a server-side trace which is Profiler without the GUI, will help you determine where exactly the problem is in this job / proc. Then you can fix that specific issue instead of wasting a lot of time guessing.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.