Best way to get average duration

  • I am working with the data from msdb.dbo.sysjobhistory and want to take the average duration of each job. If you are not familiar with the table, here is a simplistic model:

    create table #jobhistory

    (

    jobId uniqueidentifier

    ,run_duration int

    )

    These are the only columns necessary for my example.

    run_duration is in the format of hhmmss, and if the job only took 3 seconds the value would be '3'.

    Average of course doesn't return valid data. So I am trying to find the best way to find the average. Here is what I have so far.

    create table #jobhistory

    (

    jobId int /* really uniqueidentifier, but using int for simplicity */

    ,run_duration int

    )

    insert into #jobhistory

    select 1,3

    union all select 1,31003

    union all select 1,233

    union all select 2,5

    union all select 2,101

    select jobid,

    RIGHT('00'+CONVERT(VARCHAR,AVG((SUBSTRING(right('000000'+convert(varchar,run_duration),6),1,2) * 60 * 60)

    + (SUBSTRING(right('000000'+convert(varchar,run_duration),6),3,2) * 60)

    + (SUBSTRING(right('000000'+convert(varchar,run_duration),6),5,2))) / (60*60)),2) + ':'

    + RIGHT('00'+CONVERT(VARCHAR,AVG((SUBSTRING(right('000000'+convert(varchar,run_duration),6),1,2) * 60 * 60)

    + (SUBSTRING(right('000000'+convert(varchar,run_duration),6),3,2) * 60)

    + (SUBSTRING(right('000000'+convert(varchar,run_duration),6),5,2))) % (60*60) / 60),2) + ':'

    + RIGHT('00'+CONVERT(VARCHAR,(AVG((SUBSTRING(right('000000'+convert(varchar,run_duration),6),1,2) * 60 * 60)

    + (SUBSTRING(right('000000'+convert(varchar,run_duration),6),3,2) * 60)

    + (SUBSTRING(right('000000'+convert(varchar,run_duration),6),5,2))) % (60))),2)

    from #jobhistory

    GROUP BY jobId

    drop table #jobhistory

    I want the output to be in the format of HH:MM:SS.

    Thanks for any input!

  • Something like this might read a bit better (it's a horrible design, I'm surprised MS haven't deprecated it yet and designed something better):

    ;

    --The semi-colon really belongs in the previous statement

    WITH Job_History AS (

    select jobid,

    RIGHT('00000'+CAST(AVG(run_duration) AS VARCHAR(6)),6) AS HHMMSS

    from #jobhistory

    GROUP BY jobId)

    SELECT jobid, LEFT(HHMMSS,2)+':'+SUBSTRING(HHMMSS,3,2) + ':' + RIGHT(HHMMSS,2)

    FROM Job_History

  • That is prettier, but I think the duration will have to be converted to seconds prior to the average and back to their format post the average.

    You are right the structure is terrible for general use!

    Thanks for the reply!

  • Here is my answer to the problem:

    create table #jobhistory

    (

    jobId int /* really uniqueidentifier, but using int for simplicity */

    ,run_duration int

    )

    ;

    insert into #jobhistory

    select 1,3

    union all select 1,31003

    union all select 1,233

    union all select 2,5

    union all select 2,101

    ;

    SELECT

    jobId,

    AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) AvgRunInSeconds,

    CAST(AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) / 3600 AS VARCHAR) + ':' +

    RIGHT('0' + CAST((AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) % 3600) / 60 AS VARCHAR),2) + ':' +

    RIGHT('0' + CAST((AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) % 3600) % 60 AS VARCHAR),2) AS FormatTime

    FROM

    #jobhistory jh

    GROUP BY

    jh.jobId

    ;

    DROP TABLE #jobhistory;

  • dkschill (5/21/2012)


    That is prettier, but I think the duration will have to be converted to seconds prior to the average and back to their format post the average.

    You are right the structure is terrible for general use!

    Thanks for the reply!

    Yep, you're correct, logic failure on my part. Lynn's looks good

  • Modulus is much nicer than the substring with the cast!

    Thanks!

  • Cleaned it up some:

    CREATE FUNCTION dbo.itvfn_FormatTimeHHMMSS(

    @TimeInSeconds INT

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS RETURN

    (SELECT

    CAST(@TimeInSeconds / 3600 AS VARCHAR) + ':' +

    RIGHT('0' + CAST((@TimeInSeconds % 3600) / 60 AS VARCHAR),2) + ':' +

    RIGHT('0' + CAST((@TimeInSeconds % 3600) % 60 AS VARCHAR),2) AS FormatTime)

    ;

    go

    create table #jobhistory

    (

    jobId int /* really uniqueidentifier, but using int for simplicity */

    ,run_duration int

    )

    ;

    insert into #jobhistory

    select 1,3

    union all select 1,31003

    union all select 1,233

    union all select 2,5

    union all select 2,101

    ;

    WITH AvgTime AS (

    SELECT

    jobId,

    AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) AvgRunInSeconds

    FROM

    #jobhistory jh

    GROUP BY

    jh.jobId

    )

    SELECT

    at.jobId,

    at.AvgRunInSeconds,

    ft.FormatTime

    FROM

    AvgTime at

    CROSS APPLY dbo.itvfn_FormatTimeHHMMSS(at.AvgRunInSeconds) ft

    GO

    DROP TABLE #jobhistory;

    DROP FUNCTION dbo.itvfn_FormatTimeHHMMSS;

  • Do you have a rule of thumb for functions? I am timid to use them after some disastrously

    horrible code I had written at one point 🙂

    I hate to be one of those people that are of the opinion of:

    It didn't work for me once so I will NEVER use it again!

    Thanks for the help!

  • Depends on what I am going to be using the function(s) for in my code. Inside a SELECT list, I have learned to stay away from scalar functions. They may make since from a code reuse view, but they create hidden RBAR (Row By Agonizing Row) processing. In the code above, you see me using an inline table valued function that returns a one row table and is used in a CROSS APPLY in the FROM clause.

    Although the original code may be just a little faster, especially if applied to a million row table, the inline table valued function as part of a CROSS APPLY is almost as fast. What I have done is cleanup the code and allowed for easier reuse of the code for converting secods to hhmmss format.

    I would still use scalar functions where appropriate, such as in a WHERE clasue where it is evaluated only once when the query is run.

    Unless needed, I would also refrain from using multi-statement table valued functions, as they can also be slow.

    Each has its uses, but the best thing that I can say is to test, test, and test again until you find a solution that works and scaled well for your application.

  • I guess it just always depends...I should of known.

    Thanks for the tips though!

  • Just for fun:

    create table #jobhistory (jobId INT, run_duration int)

    insert into #jobhistory

    select 1,3 union all select 1,31003

    union all select 1,233 union all select 2,5 union all select 2,101

    ;WITH CTE (jobid, rd, hh, mm, ss) AS (

    SELECT jobid, run_duration

    ,run_duration / 10000 % 100, run_duration / 100 % 100, run_duration % 100

    FROM #jobhistory)

    SELECT jobid, CONVERT(VARCHAR(8), DATEADD(second, AVG(hh*3600 + mm*60 + ss), 0), 108) As [HH:MM:SS]

    FROM CTE

    GROUP BY jobId

    DROP TABLE #jobhistory

    Of course, this will only work if your run time does not exceed 24 hours.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Here's why Dwain:

    CREATE FUNCTION dbo.itvfn_FormatTimeHHMMSS(

    @TimeInSeconds INT

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS RETURN

    (SELECT

    CAST(@TimeInSeconds / 3600 AS VARCHAR) + ':' +

    RIGHT('0' + CAST((@TimeInSeconds % 3600) / 60 AS VARCHAR),2) + ':' +

    RIGHT('0' + CAST((@TimeInSeconds % 3600) % 60 AS VARCHAR),2) AS FormatTime)

    ;

    go

    create table #jobhistory (jobId INT, run_duration int);

    insert into #jobhistory

    select 1,3 union all

    select 1,31003 union all

    select 1,233 union all

    select 2,5 union all

    select 2,101 union all

    select 2,250 union all

    select 3,250100 union all

    select 3,243430;

    WITH CTE (jobid, rd, hh, mm, ss) AS (

    SELECT jobid, run_duration

    ,run_duration / 10000 % 100, run_duration / 100 % 100, run_duration % 100

    FROM #jobhistory)

    SELECT jobid, CONVERT(VARCHAR(8), DATEADD(second, AVG(hh*3600 + mm*60 + ss), 0), 108) As RunTimeSeconds

    FROM CTE

    GROUP BY jobId;

    go

    WITH AvgTime AS (

    SELECT

    jobId,

    AVG(((jh.run_duration / 10000) * 3600) + (((jh.run_duration % 10000) / 100) * 60) + ((jh.run_duration % 10000) % 100)) AvgRunInSeconds

    FROM

    #jobhistory jh

    GROUP BY

    jh.jobId

    )

    SELECT

    at.jobId,

    at.AvgRunInSeconds,

    ft.FormatTime

    FROM

    AvgTime at

    CROSS APPLY dbo.itvfn_FormatTimeHHMMSS(at.AvgRunInSeconds) ft

    GO

    DROP TABLE #jobhistory;

    DROP FUNCTION dbo.itvfn_FormatTimeHHMMSS;

    jobid RunTimeSeconds

    ----------- --------------

    1 01:04:13

    2 00:01:18

    3 00:47:45

    (3 row(s) affected)

    jobId AvgRunInSeconds FormatTime

    ----------- --------------- ------------------------------------

    1 3853 1:04:13

    2 78 0:01:18

    3 89265 24:47:45

    (3 row(s) affected)

  • I like it! I will stick with the earlier post (for the extra time it covers), but this one is really great!

  • Right. I did say mine only works if the run time does not exceed 24 hours.

    For the sake of the OP, we should hope that it does not. 😛


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Sigh! If you're jobs are running over 24 hours, you could use this instead.

    ;WITH CTE (jobid, rd, hh, mm, ss) AS (

    SELECT jobid, run_duration

    ,run_duration / 10000 % 100, run_duration / 100 % 100, run_duration % 100

    FROM #jobhistory),

    CTE2 (jobid, AvgRunTime) AS (

    SELECT jobid, DATEADD(second, AVG(hh*3600 + mm*60 + ss), 0)

    FROM CTE GROUP BY Jobid)

    SELECT jobid, CAST(DATEDIFF(hour, 0, AvgRunTime) AS VARCHAR(5)) +

    RIGHT(CONVERT(VARCHAR(8), AvgRunTime, 108),6) As [HH:MM:SS]

    FROM CTE2


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply