September 2, 2016 at 3:42 pm
I'm trying to emulate a historian database which is just several thousand rows
with 3 columns:
- "Time" of type DateTime
- "Tag" of type varchar(whatever)
- "TagValue" of type float.
From online posts I've been pulling together tidbits of code (and learning) but could use some direction.
The historian only needs a few tags but I'd like to repeat the same sets of code for each tag.
So that I can adjust the "trend" values of the TagValue.
For example, say I generate 4k of rows for Tag A.
The DateTime values will be unique for each Tag.
I'll use RAND with limits for the TagValue values - but I want to adjust those limits per group of consecutive DateTimes.
Like so:
rows 1 to 1000 => rand vals between 4.0 and 5.0
rows 1000 to 2000 => rand vals between 5.0 and 6.0
rows 2000 to 3000 => rand vals between 6.0 and 4.0
rows 3000 to remaining => rand vals between 7.0 and 6.0
It seems like I should be able to use an inner SELECT for each of those groups and use ROW_COUNT and a function call to iterate through the rows within the group.
As I select a row, call the function to get a new rand value within the same limits.
Then I could repeat the code for the next group, but tweak the rand-val limits.
But have no idea how to make that sub-query/group/iteration.
Then repeat the whole process for Tags B and C. For my purposes 3 or 4 tags would be sufficient.
Here's what I got so far:
USE [SIM_MyHistorianDB]
IF OBJECT_ID('dbo.historian', 'U') IS NOT NULL
DROP TABLE [dbo].[historian]
-- from http://stackoverflow.com/questions/7149271/sql-server-2008-generate-a-series-of-date-times
-- create table with DateTime columns StartRange, EndRange
CREATE TABLE [dbo].[historian]
(
StartRange DateTime NULL
,EndRange DateTime NULL
)
declare @StartTime datetime = '2011-07-20 11:00:33',
@EndTime datetime = '2011-08-20 15:37:34',
@Interval int = 555
;WITH cSequence AS
(
SELECT
@StartTime AS StartRange,
DATEADD(SECOND, @Interval, @StartTime) AS EndRange
UNION ALL
SELECT
EndRange,
DATEADD(SECOND, @Interval, EndRange)
FROM cSequence
WHERE DATEADD(SECOND, @Interval, EndRange) < @EndTime
)
INSERT INTO [dbo].[historian](
StartRange
,EndRange
)
SELECT * FROM cSequence OPTION (MAXRECURSION 0);
-- Drop the EndRange column
ALTER TABLE [dbo].[historian]
DROP COLUMN EndRange;
-- Rename 'StartRange' to 'Time'
EXEC sp_rename 'historian.StartRange', 'Time', 'COLUMN';
GO
-- I know the above 2 steps were a funky way to get to this point (still learning)
-- Add the Tag and Tag's Value column
ALTER TABLE [dbo].[historian]
ADD Tag varchar(10), TagValue float
GO
-- Set the Tag name for *this* tag
UPDATE [dbo].[historian]
SET Tag = 'P_MAN_CHG'
-- the function for getting a rand value with limits
-- from http://blog.sqlauthority.com/2012/11/20/sql-server-using-rand-in-user-defined-functions-udf/
IF OBJECT_ID (N'dbo.RandFn', N'FN') IS NOT NULL
DROP FUNCTION RandFn;
GO
CREATE FUNCTION RandFn(@RAND FLOAT, @min-2 int, @max-2 int)
RETURNS FLOAT
AS -- Returns a random float value in range [min max]
BEGIN
DECLARE @rndValue FLOAT
SET @rndValue = @RAND*(@max - @min-2) + @min-2
RETURN @rndValue
END
GO
Thanks for any help.
September 3, 2016 at 5:17 am
Quick suggestion (if I understood your question correctly)
😎
USE TEEST;
GO
SET NOCOUNT ON;
DECLARE @SAMPLE_SIZE BIGINT = 4000;
DECLARE @StartTime DATETIME = CONVERT(DATETIME,'2011-07-20 11:00:33',120);
DECLARE @EndTime DATETIME = CONVERT(DATETIME,'2011-08-20 15:37:34',120);
DECLARE @TIME_RANGE INT = 2 * (DATEDIFF(SECOND,@StartTime,@EndTIME) / @SAMPLE_SIZE);
;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
,BASE_DATA AS
(
SELECT
NM.N
,CONVERT(BIGINT,ABS(CHECKSUM(NEWID())),0) % @TIME_RANGE AS TIME_VAL
,CASE
WHEN NM.N < 1001 THEN 4
WHEN NM.N BETWEEN 1001 AND 2000 THEN 5
WHEN NM.N BETWEEN 2001 AND 3000 THEN 6
WHEN NM.N > 3000 THEN 7
END AS BASE_VALUE
,ABS(CHECKSUM(NEWID())) AS R_SEED
FROM NUMS NM
)
SELECT
BD.N
,DATEADD(SECOND,SUM(BD.TIME_VAL) OVER
(
ORDER BY BD.N
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
),@StartTime) AS [Time]
,CASE
WHEN BD.BASE_VALUE < 8 THEN 'P_MAN_CHG'
ELSE 'OTHER_TAG'
END AS Tag
,BD.BASE_VALUE + RAND(BD.R_SEED) AS TagValue
FROM BASE_DATA BD;
First few rows of the output
N Time Tag TagValue
----- ----------------------- --------- ----------------------
1 2011-07-20 11:16:31.000 P_MAN_CHG 4.98761653192708
2 2011-07-20 11:29:11.000 P_MAN_CHG 4.75271534782763
3 2011-07-20 11:34:02.000 P_MAN_CHG 4.53856566898698
4 2011-07-20 11:40:38.000 P_MAN_CHG 4.72297868147395
5 2011-07-20 11:51:20.000 P_MAN_CHG 4.20620937695261
6 2011-07-20 11:54:20.000 P_MAN_CHG 4.88951905633601
7 2011-07-20 11:54:32.000 P_MAN_CHG 4.39667228572923
8 2011-07-20 11:57:47.000 P_MAN_CHG 4.25596765660796
9 2011-07-20 12:12:16.000 P_MAN_CHG 4.24687268000568
10 2011-07-20 12:21:16.000 P_MAN_CHG 4.96680488930536
11 2011-07-20 12:31:04.000 P_MAN_CHG 4.95324994986705
12 2011-07-20 12:46:20.000 P_MAN_CHG 4.13457461207966
13 2011-07-20 12:52:55.000 P_MAN_CHG 4.16643477498641
14 2011-07-20 13:10:55.000 P_MAN_CHG 4.42943506135419
15 2011-07-20 13:32:32.000 P_MAN_CHG 4.46022334598849
16 2011-07-20 13:36:32.000 P_MAN_CHG 4.6216370828385
17 2011-07-20 13:42:22.000 P_MAN_CHG 4.12497315104819
18 2011-07-20 13:55:52.000 P_MAN_CHG 4.98936973787402
19 2011-07-20 14:11:56.000 P_MAN_CHG 4.75868663303926
20 2011-07-20 14:24:34.000 P_MAN_CHG 4.17822013303251
21 2011-07-20 14:40:38.000 P_MAN_CHG 4.26549995762316
22 2011-07-20 14:44:03.000 P_MAN_CHG 4.16613587118889
23 2011-07-20 14:56:55.000 P_MAN_CHG 4.18974123713843
24 2011-07-20 15:06:45.000 P_MAN_CHG 4.35508370063455
25 2011-07-20 15:21:39.000 P_MAN_CHG 4.20015389645265
26 2011-07-20 15:28:25.000 P_MAN_CHG 4.27800895472645
27 2011-07-20 15:38:28.000 P_MAN_CHG 4.70747701004639
28 2011-07-20 15:46:20.000 P_MAN_CHG 4.56643514761468
29 2011-07-20 16:05:20.000 P_MAN_CHG 4.19048080134641
30 2011-07-20 16:06:36.000 P_MAN_CHG 4.29610962436167
September 3, 2016 at 8:32 am
@Eirikur, Much thanks! The output is nearly there. Now I need to understand it.
I got as far as
(SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
I figure you chose 10 values arbitrarily but not sure why 10. Is 10 values processed at a time?
Can you write some pseudo-code for this code?
September 3, 2016 at 9:09 am
You are very welcome.
The code uses an inline tally table (numbers table) to generate the initial rows, the iTally works on the principal of cross joining a seed table of 10 rows in order to generate a far greater number of rows.
😎
-- SEED FOR AN INLINE TALLY TABLE
;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
SELECT
XT.N
,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
FROM T XT;
-- CROSS JOIN SHORTCUT "," RETURNS 10 x 10 ROWS
;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
FROM T T1,T T2;
-- COMBINED AS AN INLINE TALLY TABLE THIS CAN GENERATE UP TO 10^9 ROWS
DECLARE @SAMPLE_SIZE BIGINT = 4000;
;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
SELECT
NM.N
FROM NUMS NM;
September 3, 2016 at 9:31 am
Further explanation (comments in the code)
😎
-- Number of rows generated
DECLARE @SAMPLE_SIZE BIGINT = 4000;
-- First date or Start Date
DECLARE @StartTime DATETIME = CONVERT(DATETIME,'2011-07-20 11:00:33',120);
-- Last date or End Date
DECLARE @EndTime DATETIME = CONVERT(DATETIME,'2011-08-20 15:37:34',120);
-- Date increments in second calculated as the number of seconds between the Start Date and End Date
-- divided by the number of rows and then multiplied by 2 as the average value is close to 0.5 times
-- the difference.
DECLARE @TIME_RANGE INT = 2 * (DATEDIFF(SECOND,@StartTime,@EndTIME) / @SAMPLE_SIZE);
-- Inline tally table (iTally)
;WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
, NUMS(N) AS (SELECT TOP(@SAMPLE_SIZE) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
,BASE_DATA AS
(
SELECT
NM.N
-- Pseudo random time interval
,CONVERT(BIGINT,ABS(CHECKSUM(NEWID())),0) % @TIME_RANGE AS TIME_VAL
-- Base group assignement
,CASE
WHEN NM.N < 1001 THEN 4
WHEN NM.N BETWEEN 1001 AND 2000 THEN 5
WHEN NM.N BETWEEN 2001 AND 3000 THEN 6
WHEN NM.N > 3000 THEN 7
END AS BASE_VALUE
-- Seed for the RAND() function
,ABS(CHECKSUM(NEWID())) AS R_SEED
FROM NUMS NM
)
SELECT
BD.N
-- To guarantee the unique time values the query uses a
-- running total of the time interval values and adds it
-- to the Start Time
,DATEADD(SECOND,SUM(BD.TIME_VAL) OVER
(
ORDER BY BD.N
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
),@StartTime) AS [Time]
,CASE
WHEN BD.BASE_VALUE < 8 THEN 'P_MAN_CHG'
ELSE 'OTHER_TAG'
END AS Tag
-- The RAND() generates values between 0 and 1, adding the
-- output to the BASE_VALUE generates values between the
-- desired group values, i.e. 4 - 5 etc.
,BD.BASE_VALUE + RAND(BD.R_SEED) AS TagValue
FROM BASE_DATA BD;
September 3, 2016 at 9:56 am
Excellent write-up! Thanks again.
So if I wanted to adjust the code and expand on the portion that builds the random value added to the base value, I could just use the BD.BASE_VALUE as a selector in another case statement. To call my rand-function with a wider range for a specific case, e.g. '7' for example. Right?
September 3, 2016 at 3:10 pm
skippyV (9/3/2016)
Excellent write-up! Thanks again.So if I wanted to adjust the code and expand on the portion that builds the random value added to the base value, I could just use the BD.BASE_VALUE as a selector in another case statement. To call my rand-function with a wider range for a specific case, e.g. '7' for example. Right?
That's correct, just do a multiplication of the range desired
😎
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply