Agent Job Ran Out of Schedule

  • We have a set of jobs scheduled as part of some home-grown log shipping.  One of the jobs is scheduled to run every ten minutes at HH:m2, i.e. 00:02, 00:12, 00:22 and so on.  This job copies the log from the network share to the target server and calls the next job in the series.  On Sunday morning I got a call because our reporting server was running out of space.  In a nutshell, some files weren't copied, which stopped files being restored, which stopped files being deleted, which meant we ran out of space.

    After digging into why the files weren't copied, I eventually spotted that the copy job ran at precisely 02:00 BST on Sunday morning.  It ran again at 02:02 BST as scheduled, then at 02:12 BST and so on.  The upshot was that a file got moved early, and this caused the PoSh copy scripts to skip an hour's worth of files until the timelines matched up again.

    What I'm intrigued by is why a job ran seemingly at random.  The 02:00 job should not have happened at all.  The previous execution at 00:52 UTC was successful and the next run should have been at 02:02 BST, not precisely on the hour.  I suspect it's significant that at 01:00 UTC the clocks went forward to 02:00 BST, but I don't see how that would trigger a run.  This happened across roughly 40 jobs on 4 servers, which makes it even more difficult to work out what went on.

    Can anybody help?

    Additional

    After a bit more investigating it appears that all the jobs we have that are scheduled to run throughout the day ran at precisely 02:00 BST regardless of their actual schedule pattern.  It seems it is almost certainly related to the time change.

    • This topic was modified 4 years, 7 months ago by  Neil Burton.


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537


  • That's really strange. I wouldn't expect the time change to reset the time to 2:00, unless the job somehow thought it had missed 2:02 and then ran with the time change?  Very strange.
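To make the time-change hypothesis concrete, here is a minimal Python sketch of the wall-clock arithmetic only. It does not model SQL Server Agent's scheduler, and it says nothing about why Agent fired at 02:00. The date is an assumption chosen for illustration (UK clocks went forward on Sunday 28 March 2021); the thread does not state the year of the incident. The names `SPRING_CHANGE_UTC`, `to_uk_local` and `skipped_slots` are hypothetical helpers, and the UK rule is hard-coded rather than read from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption for illustration: UK spring change on Sunday 28 March 2021,
# when 01:00 GMT became 02:00 BST. Not necessarily the date of the incident.
SPRING_CHANGE_UTC = datetime(2021, 3, 28, 1, 0, tzinfo=timezone.utc)

GMT = timezone(timedelta(hours=0), "GMT")
BST = timezone(timedelta(hours=1), "BST")


def to_uk_local(instant_utc):
    """Convert a UTC instant to UK wall-clock time around the spring change."""
    zone = BST if instant_utc >= SPRING_CHANGE_UTC else GMT
    return instant_utc.astimezone(zone)


# Three consecutive runs, ten minutes apart in real time, starting from the
# last successful run reported in the thread (00:52).
last_ok = datetime(2021, 3, 28, 0, 52, tzinfo=timezone.utc)
runs = [
    to_uk_local(last_ok + timedelta(minutes=10 * i)).strftime("%H:%M %Z")
    for i in range(3)
]
print(runs)  # ['00:52 GMT', '02:02 BST', '02:12 BST']

# Wall-clock slots of the HH:m2 schedule that never occur that morning,
# because local time jumps straight from 00:59:59 to 02:00:00.
skipped_slots = [f"01:{minute:02d}" for minute in range(2, 60, 10)]
print(skipped_slots)  # ['01:02', '01:12', '01:22', '01:32', '01:42', '01:52']

# The instant of the change itself carries the wall-clock label 02:00 BST.
print(to_uk_local(SPRING_CHANGE_UTC).strftime("%H:%M %Z"))  # 02:00 BST
```

Six scheduled slots have no wall-clock equivalent that morning, and the first local time that exists after 00:59 is exactly 02:00. That is at least consistent with the idea that the scheduler treated the skipped slots as missed and ran everything at the moment of the change, but the thread does not confirm it.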

  • Any chance this is malicious activity? No one kicked things off accidentally? Can you post the schedule code for these? Any similarities in them?

  • Steve Jones - SSC Editor wrote:

    Any chance this is malicious activity? No one kicked things off accidentally? Can you post the schedule code for these? Any similarities in them?

    I think it's extremely unlikely it was malicious or accidental.  The jobs were run by the service account that normally runs them and, from what I remember, were called by the schedule.  (The weekly housekeeping cleaned up the history and I've lost a load of the evidence.)  I'm not entirely sure what you mean by "post the schedule code", but I've had a look in the sysschedules table and they all look as they should, apart from the expected differences in intervals and names.  There are also no weird one-off schedules that could have triggered them.

    It's very puzzling and unfortunately it's probably going to be six months before I get a chance to investigate again.  I wouldn't be surprised to see the same kind of thing when the clocks go back.



  • I'd save the logs and scripts for the jobs and recheck when the clocks go back. There I'd expect perhaps a double run. Here, with the clocks jumping from 1 -> 2 (or 2 -> 3), maybe the jump itself caused the strange timing.

    By "post the schedule code" I meant: post the DDL for the schedule for the job(s), or at least for one that looked weird.
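The "double run" concern for the autumn change can be sketched the same way. This is wall-clock arithmetic only, under an assumed illustrative date (UK clocks went back on Sunday 31 October 2021); whether SQL Server Agent actually fires twice for a repeated wall-clock time is not established in this thread. `AUTUMN_CHANGE_UTC` and `to_uk_local_autumn` are hypothetical names, and the UK rule is hard-coded.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption for illustration: UK autumn change on Sunday 31 October 2021,
# when 02:00 BST became 01:00 GMT (01:00 UTC).
AUTUMN_CHANGE_UTC = datetime(2021, 10, 31, 1, 0, tzinfo=timezone.utc)

BST = timezone(timedelta(hours=1), "BST")
GMT = timezone(timedelta(hours=0), "GMT")


def to_uk_local_autumn(instant_utc):
    """Convert a UTC instant to UK wall-clock time around the autumn change."""
    zone = GMT if instant_utc >= AUTUMN_CHANGE_UTC else BST
    return instant_utc.astimezone(zone)


# Fourteen runs, ten minutes apart in real time, starting at 00:52 BST
# (23:52 UTC on the previous day).
first = datetime(2021, 10, 30, 23, 52, tzinfo=timezone.utc)
labels = [
    to_uk_local_autumn(first + timedelta(minutes=10 * i)).strftime("%H:%M")
    for i in range(14)
]

# Wall-clock labels that occur more than once during the night.
repeated = sorted(label for label, n in Counter(labels).items() if n > 1)
print(repeated)  # ['01:02', '01:12', '01:22', '01:32', '01:42', '01:52']
```

Every slot between 01:00 and 02:00 local time occurs twice that night, once in BST and once in GMT, so those are the runs worth checking in the job history before the weekly housekeeping clears it.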
