Rollback the job running very long time

  • Dears,

    I encounter this question:

    We scheduled SSIS Packages, generally, they run ok, but sometimes somejob run for very long time even two days. It neither fail nor success, but executing...

    So I want whether there are some methods to monitor job's running time, if it runs more than one or two... hours, stop it and best to email administrators.

    I considered for a long time ,but haven't any good idear, maybe we can discuss the topic together.:P

  • Get some ideas from http://www.sqlservercentral.com/Forums/Topic504330-338-1.aspx

    MJ

  • If a job that is cancelled after running a long time has been doing SQL insert/update/delete logic all that time, it will probably take longer to roll back than it took to the time it was cancelled.

    If a job that is cancelled after running a long time has been spending most of its time in CPU activity and has done very little SQL insert/update/delete work it should roll back fairly quickly.

    You need to be aware of what your job is doing before you cancel it.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • Ditto to the last post. Don't kill a rollback unless you have extenuating circumstances and don't care about possible data / log file corruption. By extenuating circumstances, I'm talking about "Server down, no one can work, losing millions of dollars an hour while the rollback happens".

    Even then, you need to know what you'll be killing before you kill it and whether or not you can recover if you destroy the database or even just a few records in the process.

    Just let the rollback ride.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Thanks for your advices!

    But, if the job executes very long time even more than two days, it will block the other jobs. Finally, result in almost all the jobs fail. So we want to be aware of this when the job runs longer time than normal too much. Then we can troubleshoot the case before it block others...

  • The real key here is... you shouldn't have jobs that run that long nor get "stuck". Find out what the root cause is and fix it. Chances are, the code is in bad need of a fix.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yeah, I understand that. But the question is that the job runs well all the time, it amazing or marvelous only sometimes. So I want to make an Auto_monitor for them, when they unnatural, I can check the root cause at the earliest possible time.

  • Hi,

    You must be having some idea on the time taken for each job to get completed on your server. Suppose if a job executes for 20 mins give it a 10 mins grace time and after 30 mins check for the status and if it does not gets executed then you can alert yourself through the job. I think below query might help you on how much time it may take to complete.

    select percent_complete,estimated_completion_time,total_elapsed_time,* from sys.dm_exec_requests

    Thanks

    Chandra Mohan N

    [font="Verdana"]Thanks
    Chandra Mohan[/font]

  • Sounds like you need to run Profiler when you run this job so you can track down exactly where the problem occurs. I'm betting your job gets suspended in the middle of itself because it's trying to compete with itself for resources.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • sunny1_liu (2/15/2009)


    Yeah, I understand that. But the question is that the job runs well all the time, it amazing or marvelous only sometimes. So I want to make an Auto_monitor for them, when they unnatural, I can check the root cause at the earliest possible time.

    Sorry, Sunny... didn't mean to imply anything bad on your part. It's just that most folks don't even think of that. Glad to see you're on the ball.

    I'm sure you'll agree... I hate intermittant problems. Profiler can help, but when such a thing happens on my systems, I find that I end up having to add some code to the offending proc to log things like what the input parameters where, what the row count of each section was, what the duration of each section was, and I turn on a proc that runs once a minute that checks for blocking and, if present for more than a couple of minutes, sends me an email with the spids and other information so I can go look at what's happening while it's in-process.

    The standards I set up for companies that I'm new to, requires such things to be built in from the git.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you Jeff. I agree with your views.

    But in my opinion, profiler may consume many system resources, so we hardly turn it on unless really necessary.

  • I usually don't turn it on either... especially when I know which proc is causing the problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • To add a little to this, if you do want check for long running jobs and it is a scheduled job within the SQL Agent, you can use the xp_sqlagent_enum_jobs in SQL 2000 and 2005. It only works in pretty specific scenarios (job has to be scheduled) but it does work under that scenario. By converting the date and time pieces into a datetime field you can create a threshold (2 hours I think was what you mentioned) and set up something to email or page you if a job runs beyond that threshold. I have been using this since SQL 2000 and it has worked well and have it going on one instance of 2005. Do a search on that XP and you should find some good information if you would like to have something that proactively emails you based on duration of a job.

  • sunny1_liu (2/16/2009)


    But in my opinion, profiler may consume many system resources, so we hardly turn it on unless really necessary.

    So don't use profile. You shouldn't be using profiler on a production system anyway. Use a server-side trace. Much less overhead, especially if the trace file is written to a fast, unused drive.

    If you're not tracing regularly, how do you know what's normal performance and what isn't?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • sunny1_liu (2/16/2009)


    But in my opinion, profiler may consume many system resources, so we hardly turn it on unless really necessary.

    If you do profiler right, it won't consume as many resources as you may think. And the reason I suggested it is because this *does* sound like a necessary situation.

    You know what job is causing the problem, but you don't know which part of the job or proc is causing the problem. Profiler, or a server-side trace which is Profiler without the GUI, will help you determine where exactly the problem is in this job / proc. Then you can fix that specific issue instead of wasting a lot of time guessing.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply