In this article I’ll show you how to eliminate periodic spikes in the CPU that are caused by the jobs that execute periodically. Figure 1 shows such a situation. You can see the spikes occurring every minute.
Figure 1. CPU spikes caused by periodic-scheduled jobs
There is a good reason to take some steps toward reducing the spikes. When the CPU usage increases, it will eventually hit the ceiling for this hardware and make the instance not responsive at times.
At first I didn’t know where the spikes were coming from. After doing some tuning, I found they are caused by a few jobs. Then I used this query to find the jobs' scheduling details as well as some other info about them.
/*Script source: http://www.sqlprofessionals.com/blog/sql-scripts/2014/10/06/insight-into-sql-agent-job-schedules/ */SELECT [JobName] = [jobs].[name] ,[Category] = [categories].[name] ,[Owner] = SUSER_SNAME([jobs].[owner_sid]) ,[Enabled] = CASE [jobs].[enabled] WHEN 1 THEN 'Yes' ELSE 'No' END ,[Scheduled] = CASE [schedule].[enabled] WHEN 1 THEN 'Yes' ELSE 'No' END ,[Description] = [jobs].[description] ,[Occurs] = CASE [schedule].[freq_type] WHEN 1 THEN 'Once' WHEN 4 THEN 'Daily' WHEN 8 THEN 'Weekly' WHEN 16 THEN 'Monthly' WHEN 32 THEN 'Monthly relative' WHEN 64 THEN 'When SQL Server Agent starts' WHEN 128 THEN 'Start whenever the CPU(s) become idle' ELSE '' END ,[Occurs_detail] = CASE [schedule].[freq_type] WHEN 1 THEN 'O' WHEN 4 THEN 'Every ' + CONVERT(VARCHAR, [schedule].[freq_interval]) + ' day(s)' WHEN 8 THEN 'Every ' + CONVERT(VARCHAR, [schedule].[freq_recurrence_factor]) + ' weeks(s) on ' + LEFT( CASE WHEN [schedule].[freq_interval] & 1 = 1 THEN 'Sunday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 2 = 2 THEN 'Monday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 4 = 4 THEN 'Tuesday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 8 = 8 THEN 'Wednesday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 16 = 16 THEN 'Thursday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 32 = 32 THEN 'Friday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 64 = 64 THEN 'Saturday, ' ELSE '' END , LEN( CASE WHEN [schedule].[freq_interval] & 1 = 1 THEN 'Sunday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 2 = 2 THEN 'Monday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 4 = 4 THEN 'Tuesday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 8 = 8 THEN 'Wednesday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 16 = 16 THEN 'Thursday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 32 = 32 THEN 'Friday, ' ELSE '' END + CASE WHEN [schedule].[freq_interval] & 64 = 64 THEN 'Saturday, ' ELSE '' END ) - 1 ) WHEN 16 THEN 'Day ' + CONVERT(VARCHAR, [schedule].[freq_interval]) + ' of every ' + CONVERT(VARCHAR, [schedule].[freq_recurrence_factor]) + ' month(s)' WHEN 32 THEN 'The ' + CASE [schedule].[freq_relative_interval] WHEN 1 THEN 'First' WHEN 2 THEN 'Second' WHEN 4 THEN 'Third' WHEN 8 THEN 'Fourth' WHEN 16 THEN 'Last' END + CASE [schedule].[freq_interval] WHEN 1 THEN ' Sunday' WHEN 2 THEN ' Monday' WHEN 3 THEN ' Tuesday' WHEN 4 THEN ' Wednesday' WHEN 5 THEN ' Thursday' WHEN 6 THEN ' Friday' WHEN 7 THEN ' Saturday' WHEN 8 THEN ' Day' WHEN 9 THEN ' Weekday' WHEN 10 THEN ' Weekend Day' END + ' of every ' + CONVERT(VARCHAR, [schedule].[freq_recurrence_factor]) + ' month(s)' ELSE '' END ,[Frequency] = CASE [schedule].[freq_subday_type] WHEN 1 THEN 'Occurs once at ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') WHEN 2 THEN 'Occurs every ' + CONVERT(VARCHAR, [schedule].[freq_subday_interval]) + ' Seconds(s) between ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') + ' and ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_end_time]), 6), 5, 0, ':'), 3, 0, ':') WHEN 4 THEN 'Occurs every ' + CONVERT(VARCHAR, [schedule].[freq_subday_interval]) + ' Minute(s) between ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') + ' and ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_end_time]), 6), 5, 0, ':'), 3, 0, ':') WHEN 8 THEN 'Occurs every ' + CONVERT(VARCHAR, [schedule].[freq_subday_interval]) + ' Hour(s) between ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_start_time]), 6), 5, 0, ':'), 3, 0, ':') + ' and ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [schedule].[active_end_time]), 6), 5, 0, ':'), 3, 0, ':') ELSE '' END ,[AvgDurationInSec] = CONVERT(DECIMAL(18, 2), [jobhistory].[AvgDuration]) ,[Next_Run_Date] = CASE [jobschedule].[next_run_date] WHEN 0 THEN CONVERT(DATETIME, '1900/1/1') ELSE CONVERT(DATETIME, CONVERT(CHAR(8), [jobschedule].[next_run_date], 112) + ' ' + STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), [jobschedule].[next_run_time]), 6), 5, 0, ':'), 3, 0, ':')) END FROM [msdb].[dbo].[sysjobs] AS [jobs] WITh(NOLOCK) LEFT OUTER JOIN [msdb].[dbo].[sysjobschedules] AS [jobschedule] WITh(NOLOCK) ON [jobs].[job_id] = [jobschedule].[job_id] LEFT OUTER JOIN [msdb].[dbo].[sysschedules] AS [schedule] WITh(NOLOCK) ON [jobschedule].[schedule_id] = [schedule].[schedule_id] INNER JOIN [msdb].[dbo].[syscategories] [categories] WITh(NOLOCK) ON [jobs].[category_id] = [categories].[category_id] LEFT OUTER JOIN ( SELECT [job_id], [AvgDuration] = (SUM((([run_duration] / 10000 * 3600) + (([run_duration] % 10000) / 100 * 60) + ([run_duration] % 10000) % 100)) * 1.0) / COUNT([job_id]) FROM [msdb].[dbo].[sysjobhistory] WITh(NOLOCK) WHERE [step_id] = 0 GROUP BY [job_id] ) AS [jobhistory] ON [jobhistory].[job_id] = [jobs].[job_id]; GO
The script gives you a very good breakdown for the jobs. I found out that I have one group of periodic jobs scheduled every minute (Figure 2), at 00 seconds past the minute. There was also another group of jobs running at a five minute interval.
Figure 2. Periodic jobs scheduled every one minute
These jobs are making periodic updates in the databases of the instance. Sometimes the updates affect many rows so that the spikes are higher and wider. Sometimes the spikes aggregate the impact of the five minute scheduled jobs as they, too, were scheduled to run every five minutes starting at 00 seconds past the minute. All the jobs were starting at the same moment, at 00:00 (Figure 3), i.e. in the first second of the minute. When they all start at the same moment, they cause the CPU usage to jump immediately, causing those spikes.
Figure 3. Example scheduling for the periodic jobs
I changed the jobs schedules, so they start at different moments (Figure 4). I made a simple alteration for the jobs schedules: the first job continued to execute on the 00 second of each minute, the second job was rescheduled to start its running in the 5th (05) second of each minute, and the third job was rescheduled to start in the 10th (10) second of each minute.
Figure 4. The one-minute interval jobs rescheduled
This simple naming convention in Figure 4 for the periodic jobs helps me to know how to schedule the next one-minute periodic job that will be added in the future. I will also know what scheduling to set up for it instantly, so that some kind of planning and regulation is preventing us from having extremely high spikes caused by the jobs executions. Of course, the rescheduling can be done some other ways that would be good too, or even better on another instance.
After several minutes running I got the following figure for the CPU (Figure 5).
Figure 5. CPU usage after rescheduling the one-minute periodic jobs
You can see the different behavior of the CPU usage by reducing the spikes in the right half of Figure 5. Then I continued my rescheduling for the other schedules of the group of jobs running every five minutes. After that, I got even smoother execution of the CPU (Figure 6).
Figure 6. CPU spikes after rescheduling all periodic jobs
Ending
The scheduling of the jobs can be important for the CPU usage. As we’ve seen in Figure 1, there were quite high spikes caused by the not well thought through scheduling of the jobs. All the jobs were running quite frequently on every minute, so having spikes such frequently is not good for the instance or my clients.
This instance had low CPU usage so the spikes were not dangerous or alerting any triggers. However, in the near future we plan to up the average CPU usage significantly, and then we would have probably had those spikes hitting the ceiling, if not prevented ahead of time.