April 28, 2015 at 4:43 pm
I need to provide a minimum value over a 12 hour time range of data. I'm struggling with performance issues due to the amount of data. Currently I log about 100 devices reporting once per minute into a table. Also about once per minute I need to pull the minimum value reported for each device in the last 12 hours. Currently I'm maintaining a separate table with entries for just the last 12 hours and just performing a Select Min(Temp) Where DeviceID=x, but it already holds about 700,000 records at any given time. The number of devices will increase substantially and this will no longer be viable. I was wondering if anyone had any thoughts of a different strategy that may be more efficient. Thanks.
Sample Table
ID DeviceID Temp InsertDate
1 10 55 04-28-2015 8:00 AM
2 65 74 04-28-2015 8:00 AM
3 44 23 04-28-2015 8:00 AM
4 10 87 04-28-2015 8:01 AM
5 65 65 04-28-2015 8:01 AM
April 28, 2015 at 5:33 pm
dav_wu (4/28/2015)
I need to provide a minimum value over a 12 hour time range of data. I'm struggling with performance issues due to the amount of data. Currently I log about 100 devices reporting once per minute into a table. Also about once per minute I need to pull the minimum value reported for each device in the last 12 hours. Currently I'm maintaining a separate table with entries for just the last 12 hours and just performing a Select Min(Temp) Where DeviceID=x, but it already holds about 700,000 records at any given time. The number of devices will increase substantially and this will no longer be viable. I was wondering if anyone had any thoughts of a different strategy that may be more efficient. Thanks.Sample Table
ID DeviceID Temp InsertDate
1 10 55 04-28-2015 8:00 AM
2 65 74 04-28-2015 8:00 AM
3 44 23 04-28-2015 8:00 AM
4 10 87 04-28-2015 8:01 AM
5 65 65 04-28-2015 8:01 AM
Can you provide DDL to create a table and then populate it with the sample data above? Also needed would be your expected output based on that sample data.
If you simply need to do your MINs over specific time slots I suggest you take a look at this article to set up your time slots: Time Slots - An Essential Extension to Calendar Tables [/url]
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 28, 2015 at 6:24 pm
When you post your DDL, sample data and expected results, please include the purpose of the Temp column if it's pertinent to the query you need to run. Also, do you have any indexes on your base table already?
April 28, 2015 at 10:49 pm
There are few ways of doing this, which is best depends on the usage. This sample makes the assumption that the table is only queried for a specific insert date value, NOT for a range of insert date values. It uses a scalar value function and a computed column.
😎
USE tempdb;
GO
SET NOCOUNT ON;
GO
/* DDL and sample data */
IF OBJECT_ID(N'dbo.TBL_SAMPLE_ROLLING_TIME') IS NOT NULL DROP TABLE dbo.TBL_SAMPLE_ROLLING_TIME;
CREATE TABLE dbo.TBL_SAMPLE_ROLLING_TIME
(
ID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_DBO_TBL_SAMPLE_ROLLING_TIME_ID PRIMARY KEY CLUSTERED
,DeviceID INT NOT NULL
,Temp INT NOT NULL
,InsertDate DATETIME NOT NULL
);
DECLARE @SAMPLE_SIZE INT = 10080;
DECLARE @NUM_DEVICES INT = 50;
DECLARE @MAX_VALUE INT = 100 / 2;
DECLARE @START_DATE DATETIME = CONVERT(DATETIME,'2015-01-01 00:00:00',0);
;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
,DEVICES(DeviceID) AS (SELECT TOP(@NUM_DEVICES) NM.N FROM NUMS NM)
INSERT INTO dbo.TBL_SAMPLE_ROLLING_TIME (DeviceID,Temp,InsertDate)
SELECT
DV.DeviceID
,(ABS(CHECKSUM(NEWID())) % @MAX_VALUE) + (ABS(CHECKSUM(NEWID())) % @MAX_VALUE) AS Temp
,DATEADD(MINUTE,NM.N,@START_DATE) AS InsertDate
FROM NUMS NM
CROSS JOIN DEVICES DV;
/* Index for the calculation */
CREATE UNIQUE NONCLUSTERED INDEX UNQNCLIDX_DBO_TBL_SAMPLE_ROLLING_TIME_DEVICEID_INSERTDATE_INCL_TEMP ON dbo.TBL_SAMPLE_ROLLING_TIME
( DeviceID ASC , InsertDate ASC ) INCLUDE ( Temp );
/* Function for the calculated column */
DECLARE @FUNCTION_STR NVARCHAR(MAX) = N'
CREATE FUNCTION dbo.SVF_GET_12H_MIN_TEMP
(
@DEVICE_ID INT
,@CURR_DT DATETIME
)
RETURNS INT
AS
BEGIN
RETURN
(
SELECT MIN(SRT.Temp)
FROM dbo.TBL_SAMPLE_ROLLING_TIME SRT
WHERE SRT.DeviceID = @DEVICE_ID
AND DATEDIFF(MINUTE,SRT.InsertDate, @CURR_DT) < 721
)
;
END
';
EXEC (@FUNCTION_STR);
/* Add the calculated column */
ALTER TABLE dbo.TBL_SAMPLE_ROLLING_TIME ADD TEMP_12H_MIN AS (dbo.SVF_GET_12H_MIN_TEMP(DeviceID,InsertDate));
Sample query
/* NOTE: Only do select on a specific time or identity
otherwise the query is very expensive / slow
*/
SELECT
SRT.DeviceID
,SRT.TEMP_12H_MIN
FROM dbo.TBL_SAMPLE_ROLLING_TIME SRT
WHERE SRT.InsertDate = '2015-01-05 08:50:00.000';
Sample output
DeviceID TEMP_12H_MIN
----------- ------------
1 1
2 0
3 1
4 1
5 2
6 1
7 2
8 0
9 4
10 1
11 1
12 1
13 2
14 1
15 2
16 1
17 2
18 1
19 0
20 1
21 1
22 1
23 0
24 2
25 0
26 3
27 0
28 1
29 1
30 0
31 1
32 3
33 3
34 2
35 1
36 1
37 1
38 1
39 2
40 2
41 0
42 0
43 1
44 2
45 3
46 0
47 0
48 2
49 1
50 0
Execution plan
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply