February 4, 2010 at 8:38 am
I am trying to get an average of a rowcount while I group by the hour.
This runs without error. It returns the total # of rows, grouped by the hour of a datetime column.
select count(*), convert(char(2),EventTime,108)
from sg
group by convert(char(2),EventTime,108)
The table has one column: EventTime defined as datetime.
What I want is the average of the count(*): I need the average count of the rows each day, grouped by hour. I cannot use avg(count(*)) as it generates an error.
What I am doing is saving logins in this table and want to report the average number of logins per hour across a month. So the number of logins corresponds to the count(*).
The way I have the sql written above would give me the total # of logins in each hour.
How do I accomplish this?
Steve
February 4, 2010 at 8:45 am
Please post table definitions, sample data and desired output. Read this to see the best way to post this to get quick responses.
http://www.sqlservercentral.com/articles/Best+Practices/61537/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 4, 2010 at 9:01 am
create table sg
(EventTime datetime)
insert into sg values('02/01/10 07:17:00')
insert into sg values('02/01/10 07:33:00')
insert into sg values('02/01/10 07:46:00')
insert into sg values('02/01/10 08:10:00')
insert into sg values('02/01/10 08:37:00')
insert into sg values('02/02/10 07:57:00')
insert into sg values('02/02/10 08:17:00')
insert into sg values('02/02/10 08:27:00')
insert into sg values('02/02/10 08:37:00')
insert into sg values('02/02/10 09:00:00')
insert into sg values('02/02/10 09:04:00')
output should be the average of row counts grouped by hour:
avg_count hour
======== ====
2 --------------- 07
2.5 -------------- 08
1 -------------- 09
February 4, 2010 at 10:14 am
What do you mean by 'average of row counts'. Averages over what?
I see 4 rows in the hour 7-8, 5 in the hour 8-9 and 2 in the hour 9-10. I have no idea whatsoever how to get to the values you want (2,2.5 and 1), unless by 'average' you mean 'divide by 2'
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 4, 2010 at 10:55 am
Gail,
First - thanks for taking time to looking at this.
What I mean is I need to find what the average count of rows is per day, grouped by hour.
The table is accumulating logins through an event notification. I will have one row for each login that occurs.
What I need to know is the average # of logins for each hour in the day for the month.
In the example I show, there is only Feb 1 and Feb 2 data, so for the 7 AM hour, it's 3 logins on Feb 1st, then 1 on Feb 2nd, so an average of 2 rows (4 rows over two days is 2).
For the 8 AM hour, it's 2, then 3 rows, so an avg of 2.5.
Hope this explains it.
February 5, 2010 at 12:04 pm
I made a workaround using another table.
February 5, 2010 at 8:37 pm
sgambale (2/4/2010)
create table sg(EventTime datetime)
insert into sg values('02/01/10 07:17:00')
insert into sg values('02/01/10 07:33:00')
insert into sg values('02/01/10 07:46:00')
insert into sg values('02/01/10 08:10:00')
insert into sg values('02/01/10 08:37:00')
insert into sg values('02/02/10 07:57:00')
insert into sg values('02/02/10 08:17:00')
insert into sg values('02/02/10 08:27:00')
insert into sg values('02/02/10 08:37:00')
insert into sg values('02/02/10 09:00:00')
insert into sg values('02/02/10 09:04:00')
output should be the average of row counts grouped by hour:
avg_count hour
======== ====
2 --------------- 07
2.5 -------------- 08
[font="Arial Black"]1[/font] -------------- 09
Actually, the red number above should be "2".
No need for a separate table on this...
WITH
cteHourlyCount AS
(
SELECT DATEDIFF(hh,0,EventTime) AS DaySerial,
DATEPART(hh,EventTime) AS Hour,
CAST(COUNT(*) AS DECIMAL(9,2)) AS HourlyCount
FROM sg
GROUP BY DATEDIFF(hh,0,EventTime),
DATEPART(hh,EventTime)
)
SELECT AVG(HourlyCount) AS Avg_Count,
Hour
FROM cteHourlyCount
GROUP BY Hour
--Jeff Moden
Change is inevitable... Change for the better is not.
February 8, 2010 at 7:47 am
Jeff,
Thanks.
It works well.
Steve
February 8, 2010 at 11:59 am
Thanks for the feedback, Steve.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 9, 2010 at 4:45 am
Could be I'm just being thick (it wouldn't be the first time), but I don't follow how the DATEDIFF(hh,0,EventTime) can work.
select datediff(hh,0,'09 feb 2010 12:34:56') returns 965196
and select datediff(hh,0,'09 feb 2010 13:34:56') returns 965197.
Surely for this query, we would want these to be returning the same thing?
February 9, 2010 at 5:30 am
It's not you that's thick... it's me and my fingers that are thick. Thank you for the catch, Andrew
[font="Arial Black"]@ Steve,[/font]
My most sincere apologies, Steve. I did a copy and paste but forgot the replace. The correct code should be like this...
WITH
cteHourlyCount AS
(
SELECT DATEDIFF([font="Arial Black"][highlight]dd[/highlight][/font],0,EventTime) AS DaySerial,
DATEPART(hh,EventTime) AS Hour,
CAST(COUNT(*) AS DECIMAL(9,2)) AS HourlyCount
FROM sg
GROUP BY DATEDIFF([font="Arial Black"][highlight]dd[/highlight][/font],0,EventTime),
DATEPART(hh,EventTime)
)
SELECT AVG(HourlyCount) AS Avg_Count,
Hour
FROM cteHourlyCount
GROUP BY Hour
--Jeff Moden
Change is inevitable... Change for the better is not.
February 9, 2010 at 5:46 am
That makes a lot more sense :-).
Astonishingly, I haven't seen the DATEDIFF with a zero parameter before. That's really useful.
February 9, 2010 at 5:54 am
Thanks for the feedback. Just to be sure, the "0" is the date serial number for 1900-01-01. Any number would have worked here but "0" is pretty easy to type and easy for me to remember. Lot's of folks use the same method.
Also, I sent Steve (the OP) an email to make sure he's aware of the bloody mistake I made. Thanks again for the catch, Andrew.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 9, 2010 at 7:27 am
Andrew Watson-478275 (2/9/2010)
Could be I'm just being thick (it wouldn't be the first time), but I don't follow how the DATEDIFF(hh,0,EventTime) can work.select datediff(hh,0,'09 feb 2010 12:34:56') returns 965196
and select datediff(hh,0,'09 feb 2010 13:34:56') returns 965197.
Surely for this query, we would want these to be returning the same thing?
DATEDIFF(hh, 0, EventTime) returns a partition value for the aggregate.
select datediff(hh,0,'09 feb 2010 12:00:00') -- 965196
select datediff(hh,0,'09 feb 2010 12:59:59') -- 965196
select datediff(hh,0,'09 feb 2010 13:00:00') -- 965197
Convert the number back to a date:
SELECT DATEADD(hh, 965196, 0), DATEADD(hh, 965197, 0)
So what's 0 in this series?
SELECT CAST(0 AS DATETIME)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
February 9, 2010 at 3:45 pm
Just to clarify a bit... Andrew's question was more on why I used "hh" instead of "dd" and he was right... it should have been "dd" where I made the change.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply