hourly average

  • I have a data in table

    ID UTCTD PRODUCED

    1 2009-01-28 06:09:34.563 140

    1 2009-01-28 06:49:11.261 150

    1 2009-01-28 06:56:21.123 160

    1 2009-01-28 07:02:14.163 180

    1 2009-01-28 07:24:22.161 190

    1 2009-01-28 07:46:41.130 200

    1 2009-01-28 08:02:14.163 160

    1 2009-01-28 08:12:22.167 170

    1 2009-01-28 08:41:31.197 180

    1 2009-01-28 08:52:54.262 170

    2 2009-01-28 06:09:34.563 110

    2 2009-01-28 06:49:11.261 130

    2 2009-01-28 07:56:21.123 180

    2 2009-01-28 07:02:14.163 200

    2 2009-01-28 08:24:22.161 110

    2 2009-01-28 08:46:41.130 120

    2 2009-01-28 08:02:14.163 130

    2 2009-01-28 09:12:22.167 120

    2 2009-01-28 09:41:31.197 120

    2 2009-01-28 09:52:54.262 140

    I want ouput hourly avg

    ID SUTC EUTC AVG

    1 2009-01-28 06:00:00.000 2009-01-28 06:59:59.000 150

    1 2009-01-28 07:00:00.000 2009-01-28 07:59:59.000 190

    1 2009-01-28 08:00:00.000 2009-01-28 08:59:59.000 170

    2 2009-01-28 06:00:00.000 2009-01-28 06:59:59.000 120

    2 2009-01-28 07:00:00.000 2009-01-28 07:59:59.000 190

    2 2009-01-28 08:00:00.000 2009-01-28 08:59:59.000 120

    2 2009-01-28 09:00:00.000 2009-01-28 09:59:59.000 130

  • --create table #temp

    --(

    --id int

    --, utctd datetime

    --, produced int

    --)

    --insert into #temp

    --select 1, '2009-01-28 06:09:34.563', 140 union all

    --select 1 ,'2009-01-28 06:49:11.261', 150 union all

    --select 1 ,'2009-01-28 06:56:21.123', 160 union all

    --select 1 ,'2009-01-28 07:02:14.163', 180 union all

    --select 1 ,'2009-01-28 07:24:22.161', 190 union all

    --select 1, '2009-01-28 07:46:41.130', 200

    --select distinct tt.id

    --, convert(datetime, convert(varchar(10),utctd,101), 101)

    --, convert(datetime, convert(varchar(10),utctd,101), 101)+ '23:59:59.000'

    --, averages.aver

    --from #temp tt

    --join (

    --select distinct id

    --, datepart(yyyy,utctd) yr

    --, datepart(dd,utctd) dy

    --, datepart(hh,utctd) hr

    --, avg(produced) aver

    --from #temp

    --group by id

    --, datepart(yyyy,utctd)

    --, datepart(dd,utctd)

    --, datepart(hh,utctd)

    --)averages

    --on tt.id = averages.id

    --and datepart(yyyy,tt.utctd) = averages.yr

    --and datepart(dd,tt.utctd) = averages.dy

    --and datepart(hh,tt.utctd) = averages.hr

  • i probably should have used a cte just to get into the habit.

  • --oops i forgot to put the hours in there...the below should do it

    create table #temp

    (

    id int

    , utctd datetime

    , produced int

    )

    insert into #temp

    select 1, '2009-01-28 06:09:34.563', 140 union all

    select 1 ,'2009-01-28 06:49:11.261', 150 union all

    select 1 ,'2009-01-28 06:56:21.123', 160 union all

    select 1 ,'2009-01-28 07:02:14.163', 180 union all

    select 1 ,'2009-01-28 07:24:22.161', 190 union all

    select 1, '2009-01-28 07:46:41.130', 200 union all

    select 2 , '2009-01-28 06:09:34.563', 110 union all

    select 2 , '2009-01-28 06:49:11.261', 130 union all

    select 2 , '2009-01-28 07:56:21.123', 180 union all

    select 2 , '2009-01-28 07:02:14.163', 200 union all

    select 2 , '2009-01-28 08:24:22.161', 110 union all

    select 2 , '2009-01-28 08:46:41.130', 120 union all

    select 2 , '2009-01-28 08:02:14.163', 130 union all

    select 2 , '2009-01-28 09:12:22.167', 120 union all

    select 2 , '2009-01-28 09:41:31.197', 120 union all

    select 2 , '2009-01-28 09:52:54.262', 140

    select distinct tt.id

    , dateadd(hh,datepart(hh, utctd) , convert(datetime, convert(varchar(10),utctd,101), 101) )

    , dateadd(hh,datepart(hh, utctd) , convert(datetime, convert(varchar(10),utctd,101), 101) )+ '00:59:59.000'

    , averages.aver

    from #temp tt

    join (

    select distinct id

    , datepart(yyyy,utctd) yr

    , datepart(dd,utctd) dy

    , datepart(hh,utctd) hr

    , avg(produced) aver

    from #temp

    group by id

    , datepart(yyyy,utctd)

    , datepart(dd,utctd)

    , datepart(hh,utctd)

    )averages

    on tt.id = averages.id

    and datepart(yyyy,tt.utctd) = averages.yr

    and datepart(dd,tt.utctd) = averages.dy

    and datepart(hh,tt.utctd) = averages.hr

  • You can cheat the hell out of this one, BaldingLoopMan... 🙂

    Using your good test data setup...

    SELECT ID,

    DATEADD(hh,DATEDIFF(hh,0,UtcTd),0) AS StartTime,

    DATEADD(hh,DATEDIFF(hh,0,UtcTd),0) + '00:59:59' AS EndTime,

    AVG(Produced) AS HourlyAverage

    FROM #temp

    GROUP BY ID, DATEADD(hh,DATEDIFF(hh,0,UtcTd),0)

    ORDER BY ID, StartTime

    Take a look at the differences in the execution plan. Notice the slightly "fat" arrow in yours? It has 256 actual rows in it which is the perfect square of 16, the number of rows in the original data. That means you have an "accidental" cross join in the code which will pretty much beat up your IO system in the face of scalability.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Another way to skin the same cat except this looks a little cleaner...

    SELECT ID, StartTime, StartTime + '00:59:59' AS EndTime, AVG(Produced) AS HourlyAverage

    FROM (

    SELECT ID, DATEADD(hh,DATEDIFF(hh,0,UtcTd),0) AS StartTime, Produced

    FROM #Temp

    )d

    GROUP BY ID, StartTime

    ORDER BY ID, StartTime

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rpatil22 (1/29/2010)


    I have a data in table

    Heh... but we don't. We have text on a page and would have to spend a bit of time turning it into data in a table. 😉

    You're relatively new so take a look at the article at the first link in my signature line below. If you post the way that article suggests, people will beat your door down trying to help instead of you having to wait for an hour and a half until some good soul like the BaldingLoopMan took the time to turn your text into readily consumable data.

    It's just a thought... 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • trying to wrap my mind around what u guys did.

    first:

    DATEDIFF ( datepart , startdate , enddate )

    So DATEDIFF(hh,0,UtcTd) = 956142. What is that. The number of hours since when? 1900? At least that’s what it looks like after i do the math. However i'd like clarification if possible.

    Ok so rather than creating a unique year, month, day, hour. You used the unique hour since the beginning of time or sql time. Interesting. Damnit why didn’t i think of that.

    DATEADD (datepart , number, date )

    Then you’re doing a DATEADD(hh,DATEDIFF(hh,0,UtcTd),0) simply to get the date back into the proper format. That's very cool and is now logged to mem. Thanks.

    Three things i learned.

    1) the solution was screaming for unique hour and nothing more so i should have used the unique hour

    2) datediff subtracting 0 just to get the unique hour and dateadd adding 0 to get the date back into a datetime format.

    3) by converting the utcTD to a unique hour when it gets put back to a datetime you lose the mins seconds which makes sense.

    This is why i love this site. Funny, what goes around comes around. I helped the original thread creator w/ formatting the proper inserts and so on and in return i ends up learning something that has changed my thinking in relation to grouping by dd, yy, mm, hr, mi, or sec.

    Nice work...Thanks guys.

  • when i say thanks guys i mean thanks Jeff.

    I didnt realize you had replied several times. Either way nice work!

  • BaldingLoopMan (2/3/2010)


    trying to wrap my mind around what u guys did.

    ...

    You might want to have a look at Lynns blog to find some helpful date routines.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • BaldingLoopMan (2/3/2010)


    trying to wrap my mind around what u guys did.

    first:

    DATEDIFF ( datepart , startdate , enddate )

    So DATEDIFF(hh,0,UtcTd) = 956142. What is that. The number of hours since when? 1900? At least that’s what it looks like after i do the math. However i'd like clarification if possible.

    Ok so rather than creating a unique year, month, day, hour. You used the unique hour since the beginning of time or sql time. Interesting. Damnit why didn’t i think of that.

    DATEADD (datepart , number, date )

    Then you’re doing a DATEADD(hh,DATEDIFF(hh,0,UtcTd),0) simply to get the date back into the proper format. That's very cool and is now logged to mem. Thanks.

    Three things i learned.

    1) the solution was screaming for unique hour and nothing more so i should have used the unique hour

    2) datediff subtracting 0 just to get the unique hour and dateadd adding 0 to get the date back into a datetime format.

    3) by converting the utcTD to a unique hour when it gets put back to a datetime you lose the mins seconds which makes sense.

    This is why i love this site. Funny, what goes around comes around. I helped the original thread creator w/ formatting the proper inserts and so on and in return i ends up learning something that has changed my thinking in relation to grouping by dd, yy, mm, hr, mi, or sec.

    Nice work...Thanks guys.

    Correct... "0" used as a date is shorthand for "1900-01-01 00:00:00.000". It's known as a "date base date" because it does have a numeric value of zero behind the scenes. SQL Server is capable of dates before that and has a "lowest date possible" of 1753-01-01 because of the change to a Gregorian calendar the year before. Here's some code that shows these two dates as their underlying date serial numbers...

    SELECT CAST(CAST('1900-01-01' AS DATETIME) AS FLOAT),

    CAST(CAST('1753-01-01' AS DATETIME) AS FLOAT)

    The DATEADD(hh,DATEDIFF(hh,0,UtcTd),0) is just a way to get whole hours. It's one of the easier ways for me to remember and is rivaled for speed by only one other method that I forget all the time. The difference in speed is very small even at a million rows so I use the one that's easiest to remember (for me, anyway). It does avoid making a conversion to a character based datatype which really slows things down.

    Thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply