May 21, 2012 at 10:08 am
I am working with the data from msdb.dbo.sysjobhistory and want to take the average duration of each job. If you are not familiar with the table, here is a simplistic model:
create table #jobhistory
(
jobId uniqueidentifier
,run_duration int
)
These are the only columns necessary for my example.
run_duration is in the format of hhmmss, and if the job only took 3 seconds the value would be '3'.
Average of course doesn't return valid data. So I am trying to find the best way to find the average. Here is what I have so far.
create table #jobhistory
(
jobId int /* really uniqueidentifier, but using int for simplicity */
,run_duration int
)
insert into #jobhistory
select 1,3
union all select 1,31003
union all select 1,233
union all select 2,5
union all select 2,101
select jobid,
RIGHT('00'+CONVERT(VARCHAR,AVG((SUBSTRING(right('000000'+convert(varchar,run_duration),6),1,2) * 60 * 60)
+ (SUBSTRING(right('000000'+convert(varchar,run_duration),6),3,2) * 60)
+ (SUBSTRING(right('000000'+convert(varchar,run_duration),6),5,2))) / (60*60)),2) + ':'
+ RIGHT('00'+CONVERT(VARCHAR,AVG((SUBSTRING(right('000000'+convert(varchar,run_duration),6),1,2) * 60 * 60)
+ (SUBSTRING(right('000000'+convert(varchar,run_duration),6),3,2) * 60)
+ (SUBSTRING(right('000000'+convert(varchar,run_duration),6),5,2))) % (60*60) / 60),2) + ':'
+ RIGHT('00'+CONVERT(VARCHAR,(AVG((SUBSTRING(right('000000'+convert(varchar,run_duration),6),1,2) * 60 * 60)
+ (SUBSTRING(right('000000'+convert(varchar,run_duration),6),3,2) * 60)
+ (SUBSTRING(right('000000'+convert(varchar,run_duration),6),5,2))) % (60))),2)
from #jobhistory
GROUP BY jobId
drop table #jobhistory
I want the output to be in the format of HH:MM:SS.
Thanks for any input!
May 21, 2012 at 10:24 am
Something like this might read a bit better (it's a horrible design, I'm surprised MS haven't deprecated it yet and designed something better):
;
--The semi-colon really belongs in the previous statement
WITH Job_History AS (
select jobid,
RIGHT('00000'+CAST(AVG(run_duration) AS VARCHAR(6)),6) AS HHMMSS
from #jobhistory
GROUP BY jobId)
SELECT jobid, LEFT(HHMMSS,2)+':'+SUBSTRING(HHMMSS,3,2) + ':' + RIGHT(HHMMSS,2)
FROM Job_History
May 21, 2012 at 10:30 am
That is prettier, but I think the duration will have to be converted to seconds prior to the average and back to their format post the average.
You are right the structure is terrible for general use!
Thanks for the reply!
May 21, 2012 at 10:30 am
Here is my answer to the problem:
create table #jobhistory
(
jobId int /* really uniqueidentifier, but using int for simplicity */
,run_duration int
)
;
insert into #jobhistory
select 1,3
union all select 1,31003
union all select 1,233
union all select 2,5
union all select 2,101
;
SELECT
jobId,
AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) AvgRunInSeconds,
CAST(AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) / 3600 AS VARCHAR) + ':' +
RIGHT('0' + CAST((AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) % 3600) / 60 AS VARCHAR),2) + ':' +
RIGHT('0' + CAST((AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) % 3600) % 60 AS VARCHAR),2) AS FormatTime
FROM
#jobhistory jh
GROUP BY
jh.jobId
;
DROP TABLE #jobhistory;
May 21, 2012 at 10:33 am
dkschill (5/21/2012)
That is prettier, but I think the duration will have to be converted to seconds prior to the average and back to their format post the average.You are right the structure is terrible for general use!
Thanks for the reply!
Yep, you're correct, logic failure on my part. Lynn's looks good
May 21, 2012 at 10:34 am
Modulus is much nicer than the substring with the cast!
Thanks!
May 21, 2012 at 11:17 am
Cleaned it up some:
CREATE FUNCTION dbo.itvfn_FormatTimeHHMMSS(
@TimeInSeconds INT
)
RETURNS TABLE WITH SCHEMABINDING
AS RETURN
(SELECT
CAST(@TimeInSeconds / 3600 AS VARCHAR) + ':' +
RIGHT('0' + CAST((@TimeInSeconds % 3600) / 60 AS VARCHAR),2) + ':' +
RIGHT('0' + CAST((@TimeInSeconds % 3600) % 60 AS VARCHAR),2) AS FormatTime)
;
go
create table #jobhistory
(
jobId int /* really uniqueidentifier, but using int for simplicity */
,run_duration int
)
;
insert into #jobhistory
select 1,3
union all select 1,31003
union all select 1,233
union all select 2,5
union all select 2,101
;
WITH AvgTime AS (
SELECT
jobId,
AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) AvgRunInSeconds
FROM
#jobhistory jh
GROUP BY
jh.jobId
)
SELECT
at.jobId,
at.AvgRunInSeconds,
ft.FormatTime
FROM
AvgTime at
CROSS APPLY dbo.itvfn_FormatTimeHHMMSS(at.AvgRunInSeconds) ft
GO
DROP TABLE #jobhistory;
DROP FUNCTION dbo.itvfn_FormatTimeHHMMSS;
May 21, 2012 at 11:27 am
Do you have a rule of thumb for functions? I am timid to use them after some disastrously
horrible code I had written at one point 🙂
I hate to be one of those people that are of the opinion of:
It didn't work for me once so I will NEVER use it again!
Thanks for the help!
May 21, 2012 at 11:38 am
Depends on what I am going to be using the function(s) for in my code. Inside a SELECT list, I have learned to stay away from scalar functions. They may make since from a code reuse view, but they create hidden RBAR (Row By Agonizing Row) processing. In the code above, you see me using an inline table valued function that returns a one row table and is used in a CROSS APPLY in the FROM clause.
Although the original code may be just a little faster, especially if applied to a million row table, the inline table valued function as part of a CROSS APPLY is almost as fast. What I have done is cleanup the code and allowed for easier reuse of the code for converting secods to hhmmss format.
I would still use scalar functions where appropriate, such as in a WHERE clasue where it is evaluated only once when the query is run.
Unless needed, I would also refrain from using multi-statement table valued functions, as they can also be slow.
Each has its uses, but the best thing that I can say is to test, test, and test again until you find a solution that works and scaled well for your application.
May 21, 2012 at 11:41 am
I guess it just always depends...I should of known.
Thanks for the tips though!
May 21, 2012 at 7:35 pm
Just for fun:
create table #jobhistory (jobId INT, run_duration int)
insert into #jobhistory
select 1,3 union all select 1,31003
union all select 1,233 union all select 2,5 union all select 2,101
;WITH CTE (jobid, rd, hh, mm, ss) AS (
SELECT jobid, run_duration
,run_duration / 10000 % 100, run_duration / 100 % 100, run_duration % 100
FROM #jobhistory)
SELECT jobid, CONVERT(VARCHAR(8), DATEADD(second, AVG(hh*3600 + mm*60 + ss), 0), 108) As [HH:MM:SS]
FROM CTE
GROUP BY jobId
DROP TABLE #jobhistory
Of course, this will only work if your run time does not exceed 24 hours.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
May 21, 2012 at 7:58 pm
Here's why Dwain:
CREATE FUNCTION dbo.itvfn_FormatTimeHHMMSS(
@TimeInSeconds INT
)
RETURNS TABLE WITH SCHEMABINDING
AS RETURN
(SELECT
CAST(@TimeInSeconds / 3600 AS VARCHAR) + ':' +
RIGHT('0' + CAST((@TimeInSeconds % 3600) / 60 AS VARCHAR),2) + ':' +
RIGHT('0' + CAST((@TimeInSeconds % 3600) % 60 AS VARCHAR),2) AS FormatTime)
;
go
create table #jobhistory (jobId INT, run_duration int);
insert into #jobhistory
select 1,3 union all
select 1,31003 union all
select 1,233 union all
select 2,5 union all
select 2,101 union all
select 2,250 union all
select 3,250100 union all
select 3,243430;
WITH CTE (jobid, rd, hh, mm, ss) AS (
SELECT jobid, run_duration
,run_duration / 10000 % 100, run_duration / 100 % 100, run_duration % 100
FROM #jobhistory)
SELECT jobid, CONVERT(VARCHAR(8), DATEADD(second, AVG(hh*3600 + mm*60 + ss), 0), 108) As RunTimeSeconds
FROM CTE
GROUP BY jobId;
go
WITH AvgTime AS (
SELECT
jobId,
AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) AvgRunInSeconds
FROM
#jobhistory jh
GROUP BY
jh.jobId
)
SELECT
at.jobId,
at.AvgRunInSeconds,
ft.FormatTime
FROM
AvgTime at
CROSS APPLY dbo.itvfn_FormatTimeHHMMSS(at.AvgRunInSeconds) ft
GO
DROP TABLE #jobhistory;
DROP FUNCTION dbo.itvfn_FormatTimeHHMMSS;
jobid RunTimeSeconds
----------- --------------
1 01:04:13
2 00:01:18
3 00:47:45
(3 row(s) affected)
jobId AvgRunInSeconds FormatTime
----------- --------------- ------------------------------------
1 3853 1:04:13
2 78 0:01:18
3 89265 24:47:45
(3 row(s) affected)
May 21, 2012 at 8:00 pm
I like it! I will stick with the earlier post (for the extra time it covers), but this one is really great!
May 21, 2012 at 8:49 pm
Right. I did say mine only works if the run time does not exceed 24 hours.
For the sake of the OP, we should hope that it does not. 😛
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
May 21, 2012 at 9:04 pm
Sigh! If you're jobs are running over 24 hours, you could use this instead.
;WITH CTE (jobid, rd, hh, mm, ss) AS (
SELECT jobid, run_duration
,run_duration / 10000 % 100, run_duration / 100 % 100, run_duration % 100
FROM #jobhistory),
CTE2 (jobid, AvgRunTime) AS (
SELECT jobid, DATEADD(second, AVG(hh*3600 + mm*60 + ss), 0)
FROM CTE GROUP BY Jobid)
SELECT jobid, CAST(DATEDIFF(hour, 0, AvgRunTime) AS VARCHAR(5)) +
RIGHT(CONVERT(VARCHAR(8), AvgRunTime, 108),6) As [HH:MM:SS]
FROM CTE2
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply