December 14, 2012 at 3:51 pm
I have a bunch of data with timestamps - there may be any number of rows in a given time period, and I want to do averages and min/max for the time periods which are generated.
First I use a CTE to generate the times - the interval is not always the same.
;with Times as (select convert(datetime2, 'Jun 29 2012 4:48:29:000PM') as PeriodTime
union all
select dateadd(minute, 1, PeriodTime) as PeriodTimeStart
from Times where dateadd(minute, 1, PeriodTime) <= 'Jul 4 2012 12:00:00:000AM')
select PeriodTime from Times order by PeriodTime option (maxrecursion 32767)
I have logic earlier in the procedure to make sure I don't go over the 32767 rows limit - it picks either second, minute, hour, or day, and uses that in the CTE in the dateadd function. So, at the end of all that, I have a table with rows that look like this...
PeriodTime
2012-06-29 16:48:29.0000000
2012-06-29 16:49:29.0000000
2012-06-29 16:50:29.0000000
2012-06-29 16:51:29.0000000
2012-06-29 16:52:29.0000000
2012-06-29 16:53:29.0000000
2012-06-29 16:54:29.0000000
2012-06-29 16:55:29.0000000
2012-06-29 16:56:29.0000000
2012-06-29 16:57:29.0000000
2012-06-29 16:58:29.0000000
2012-06-29 16:59:29.0000000
2012-06-29 17:00:29.0000000
2012-06-29 17:01:29.0000000
2012-06-29 17:02:29.0000000
2012-06-29 17:03:29.0000000
Now, I have my data in another table and what I would like to do is get the average, min, and max of that data based on the times in the temp table. How can I do that? My best attempt is this... but it's not getting all the rows?
select [time] = min(Times.PeriodTime), Y = AVG([POSITION]), YMin = MIN([POSITION]), YMax = MAX([POSITION])
from [TLM_UMC-TGT_X_0], Times where TLM_TIME between Times.[PeriodTime] and dateadd(minute, 1, Times.[PeriodTime])
option (maxrecursion 32767)
I know that I could do this with three sub-queries, but isn't there a better way? THIS... is ugly... (and slow)
;with Times as (select convert(datetime2, 'Jun 29 2012 4:48:29:000PM') as PeriodTime
union all
select dateadd(minute, 1, PeriodTime) as PeriodTimeStart
from Times where dateadd(minute, 1, PeriodTime) <= 'Jul 4 2012 12:00:00:000AM')
select [time] = Times.PeriodTime,
Y = (select AVG([POSITION]) from [TLM_UMC-TGT_X_0] where TLM_TIME between Times.[PeriodTime] and dateadd(minute, 1, Times.[PeriodTime])),
YMin = (select MIN([POSITION]) from [TLM_UMC-TGT_X_0] where TLM_TIME between Times.[PeriodTime] and dateadd(minute, 1, Times.[PeriodTime])),
YMax = (select MAX([POSITION]) from [TLM_UMC-TGT_X_0] where TLM_TIME between Times.[PeriodTime] and dateadd(minute, 1, Times.[PeriodTime]))
from Times option (maxrecursion 32767)
December 14, 2012 at 4:40 pm
Could you post the DDL for your tables, some sample data (INSERT INTO statements), and expected results based on the sample data? This would really help us help you.
December 15, 2012 at 5:32 pm
If I understand correctly, maybe something like below.
[Btw, as a general performance guideline, make sure the datatype of the time in the table exactly matches your list of times. Your times list looks to be datetime2, so I'll assume your table also has datetime2. If not, change all the types to match the time type in your table.]
DECLARE @MinutesInterval int
SET @MinutesInterval = 1 --for min; for hr, set to 60; etc..
;WITH Times (
...
), Times_Range AS (
SELECT
MIN([PeriodTime]) AS StartTime,
DATEADD(MINUTE, 1, MAX([PeriodTime])) AS EndTime
)
SELECT
Times.[PeriodTime],
AVG(tlm.[POSITION]) AS Y,
MIN(tlm.[POSITION]) AS YMin,
MAX(tlm.[POSITION]) AS YMax
FROM Times
LEFT OUTER JOIN [TLM_UMC-TGT_X_0] tlm ON
tlm.TLM_TIME >= Times.[PeriodTime] AND
tlm.TLM_TIME < DATEADD(MINUTE, @MinutesInterval, Times.[PeriodTime])
WHERE
tlm.TLM_TIME >= Times_Range.StartTime AND tlm.TLM_TIME < Times_Range.EndTime
GROUP BY
Times.[PeriodTime]
ORDER BY
Times.[PeriodTime]
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 17, 2012 at 9:27 am
Yes Scott, thanks. That's the kind of change I was looking for. I think performance will be better with that but I'll test and let you know. I was just a little frustrated - I sit here next to a slamming door all day and it really messes with my head and sometimes I'm not sure if I'm doing the best code I can. In a normal environment maybe I would have come up with the grouping query myself, but thank you very much, it is really helpful!
December 17, 2012 at 9:34 am
Lynn Pettis (12/14/2012)
Could you post the DDL for your tables, some sample data (INSERT INTO statements), and expected results based on the sample data? This would really help us help you.
No I can't really do that for two reasons: one is that the database is classified and I probably posted too much already, and the other is you need thousands of points to make any sense of it - this data has 100 to 1000 rows per second.
The telemetry table ([TLM_UMC-TGT_X_0]) has a TLM_TIME column of datetime2, and a POSITION column of float - those are the only columns that matter for this issue. So, you can assume it's...
create table [TLM_UMC-TGT_X_0] (
TLM_TIME datetime2,
POSITION float
)
There is an index on the TLM_TIME, but it's not the clustered index until the next release - that should give a massive improvement in queries by TLM_TIME, which is all we ever do.
December 17, 2012 at 9:51 am
How about something like this:
SET NOCOUNT ON;
USE ProofOfConcept;
GO
IF OBJECT_ID(N'dbo.Periods') IS NOT NULL
DROP TABLE dbo.Periods;
GO
CREATE TABLE dbo.Periods
(Period INT PRIMARY KEY);
GO
INSERT INTO dbo.Periods
(Period)
SELECT TOP (32768)
ROW_NUMBER() OVER (ORDER BY t1.OBJECT_ID) - 1 /* 0 through 32767 */
FROM sys.columns AS t1
CROSS JOIN sys.columns AS t2
CROSS JOIN sys.columns AS t3;
GO
IF OBJECT_ID(N'dbo.[TLM_UMC-TGT_X_0]') IS NOT NULL
DROP TABLE dbo.[TLM_UMC-TGT_X_0];
GO
CREATE TABLE dbo.[TLM_UMC-TGT_X_0]
(ID INT IDENTITY
PRIMARY KEY,
TLM_TIME DATETIME,
POSITION FLOAT);
GO
INSERT INTO dbo.[TLM_UMC-TGT_X_0]
(TLM_TIME, POSITION)
VALUES ('20120629 16:48:29.000', 1),
('20120629 16:48:29.000', 5),
('20120629 16:49:29.000', 10);
GO
-- If you just want periods that have data
SELECT DATEADD(MINUTE, Period, '20120629 16:48:29'),
AVG(POSITION),
MIN(POSITION),
MAX(POSITION)
FROM dbo.Periods AS P
INNER JOIN dbo.[TLM_UMC-TGT_X_0] AS T
ON T.TLM_TIME >= DATEADD(MINUTE, Period, '20120629 16:48:29')
AND T.TLM_TIME < DATEADD(MINUTE, Period + 1, '20120629 16:48:29')
GROUP BY DATEADD(MINUTE, Period, '20120629 16:48:29')
ORDER BY DATEADD(MINUTE, Period, '20120629 16:48:29');
-- If you also want periods without data
SELECT DATEADD(MINUTE, Period, '20120629 16:48:29'),
T.MinPosition,
T.MaxPosition,
T.AvgPosition
FROM dbo.Periods AS P
OUTER APPLY (SELECT MIN(POSITION) AS MinPosition,
MAX(POSITION) AS MaxPosition,
AVG(POSITION) AS AvgPosition
FROM dbo.[TLM_UMC-TGT_X_0] AS T
WHERE T.TLM_TIME >= DATEADD(MINUTE, Period, '20120629 16:48:29')
AND T.TLM_TIME < DATEADD(MINUTE, Period + 1, '20120629 16:48:29')) AS T
ORDER BY P.Period;
Once you have the Periods table created, you don't need to do the calculations on the fly any more. (This makes it a classic Numbers table [aka Tally table] solution. Those are popular because they do this kind of thing very, very efficiently.)
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply