July 5, 2016 at 4:54 pm
Hi team,
I am currently working on weather data to see if there is rain in past 3 hours. I have data for 10 years.
The sample data is presented below
ID datetime rain
1 2016/01/01 01:00:00 0
1 2016/01/01 03:00:00 8
1 2016/01/01 05:00:00 6
1 2016/01/01 06:00:00 0
2 2016/01/01 08:00:00 6
2 2016/01/01 10:00:00 0
2 2016/01/01 15:00:00 0
3 2016/01/01 16:00:00 0
3 2016/01/01 20:00:00 10
3 2016/01/01 21:00:00 3
4 2016/01/01 23:00:00 4
4 2016/01/02 01:00:00 5
4 2016/01/02 05:00:00 0
I need to have a column to see if there is any rain in past 3 hours. if there is rain it should show '1' else '0'.
the output should look this
0
1
1
1
1
1
0
0
1
1
1
1
0
I have hourly data for 10 years. Any help is appreciated.
Thanks in Advance.
July 5, 2016 at 9:27 pm
Quick suggestion, look up the SIGN function
😎
;WITH SAMPLE_DATA(ID,datetime,rain) AS
(SELECT ID,CONVERT(DATETIME,datetime,112),rain FROM (VALUES
(1,'20160101 01:00:00', 0)
,(1,'20160101 03:00:00', 8)
,(1,'20160101 05:00:00', 6)
,(1,'20160101 06:00:00', 0)
,(2,'20160101 08:00:00', 6)
,(2,'20160101 10:00:00', 0)
,(2,'20160101 15:00:00', 0)
,(3,'20160101 16:00:00', 0)
,(3,'20160101 20:00:00',10)
,(3,'20160101 21:00:00', 3)
,(4,'20160101 23:00:00', 4)
,(4,'20160102 01:00:00', 5)
,(4,'20160102 05:00:00', 0)
) AS X(ID,datetime,rain))
SELECT
SD.ID
,SD.datetime
,SD.rain
,SIGN(SD.rain) AS RAIN_FLAG
FROM SAMPLE_DATA SD;
Output
ID datetime rain RAIN_FLAG
----------- ----------------------- ----------- -----------
1 2016-01-01 01:00:00.000 0 0
1 2016-01-01 03:00:00.000 8 1
1 2016-01-01 05:00:00.000 6 1
1 2016-01-01 06:00:00.000 0 0
2 2016-01-01 08:00:00.000 6 1
2 2016-01-01 10:00:00.000 0 0
2 2016-01-01 15:00:00.000 0 0
3 2016-01-01 16:00:00.000 0 0
3 2016-01-01 20:00:00.000 10 1
3 2016-01-01 21:00:00.000 3 1
4 2016-01-01 23:00:00.000 4 1
4 2016-01-02 01:00:00.000 5 1
4 2016-01-02 05:00:00.000 0 0
July 5, 2016 at 9:31 pm
Thank you very much for the reply. It works to check if there is rain in that hour. However, i also need to check previous 2 hours for each ID.
July 5, 2016 at 10:52 pm
Missed the three hours earlier, here is an improved query
😎
;WITH SAMPLE_DATA(ID,datetime,rain) AS
(SELECT ID,CONVERT(DATETIME,datetime,112),rain FROM (VALUES
(1,'20160101 01:00:00', 0)
,(1,'20160101 03:00:00', 8)
,(1,'20160101 05:00:00', 6)
,(1,'20160101 06:00:00', 0)
,(2,'20160101 08:00:00', 6)
,(2,'20160101 10:00:00', 0)
,(2,'20160101 15:00:00', 0)
,(3,'20160101 16:00:00', 0)
,(3,'20160101 20:00:00',10)
,(3,'20160101 21:00:00', 3)
,(4,'20160101 23:00:00', 4)
,(4,'20160102 01:00:00', 5)
,(4,'20160102 05:00:00', 0)
) AS X(ID,datetime,rain))
SELECT
SD.ID
,SD.datetime
-- Check if the earlier row was within the three hours
,SIGN(SIGN(CASE WHEN DATEDIFF(HOUR,LAG(SD.datetime,1) OVER
(
ORDER BY SD.datetime ASC
),SD.datetime) <= 3 THEN LAG(SD.rain,1)OVER
(
ORDER BY SD.datetime ASC
)
ELSE 0 END)
-- add the earlier value to the current flag
+ SIGN(SD.rain)) AS RAIN_FLAG
,SD.rain
FROM SAMPLE_DATA SD;
Output
ID datetime RAIN_FLAG rain
----------- ----------------------- ----------- -----------
1 2016-01-01 01:00:00.000 0 0
1 2016-01-01 03:00:00.000 1 8
1 2016-01-01 05:00:00.000 1 6
1 2016-01-01 06:00:00.000 1 0
2 2016-01-01 08:00:00.000 1 6
2 2016-01-01 10:00:00.000 1 0
2 2016-01-01 15:00:00.000 0 0
3 2016-01-01 16:00:00.000 0 0
3 2016-01-01 20:00:00.000 1 10
3 2016-01-01 21:00:00.000 1 3
4 2016-01-01 23:00:00.000 1 4
4 2016-01-02 01:00:00.000 1 5
4 2016-01-02 05:00:00.000 0 0
July 6, 2016 at 10:01 am
Dear Eirikur Eiriksson,
Thank you very much. The query example you have given worked.
July 6, 2016 at 10:33 am
Careful with LAG its only looking at the previous record for the 3 hr window. Your data could be confusing and given your dataset Eirikur solution does work, but you say you are taking readings every hour, it wont work under that paradigm.
The solution I posted for you on the 2008 section will work either way.
declare @tmpWeatherData table (ID int, RecDate Datetime, RainAMT int)
INSERT INTO @tmpWeatherData (ID,RecDate, RainAMT)
VALUES
(1,'2016/01/01 01:00:00',0),(1,'2016/01/01 03:00:00',8),(1,'2016/01/01 05:00:00',6),(1,'2016/01/01 06:00:00',0),(2,'2016/01/01 08:00:00',6)
,(2,'2016/01/01 10:00:00',0),(2,'2016/01/01 15:00:00',0),(3,'2016/01/01 16:00:00',0),(3,'2016/01/01 20:00:00',10),(3,'2016/01/01 21:00:00',3)
,(4,'2016/01/01 23:00:00',4),(4,'2016/01/02 01:00:00',5),(4,'2016/01/02 05:00:00',0)
SELECT
t1.ID,t1.RecDate,t1.RainAMT,
CASE
WHEN t2.RainAMT > 0 THEN 1
ELSE 0
END as RainPrev3Hrs
FROM @tmpWeatherData t1
OUTER APPLY (SELECT SUM(RainAMT) as RainAMT FROM @tmpWeatherData WHERE RecDate BETWEEN DateAdd(hh,-3,t1.RecDate) and t1.RecDate ) t2
ORDER BY t1.ID,t1.RecDate
July 6, 2016 at 10:35 am
Here is the final query I have used with a small modification with yours since I have hourly data. Once again, Thank You very much.
SELECT
SD.ID
,SD.datetime
-- Check if the earlier two rows were within the three hours
,SIGN(SIGN(CASE WHEN DATEDIFF(HOUR,LAG(SD.datetime,1) OVER
(
Partition by SD.ID ORDER BY SD.datetime ASC
),SD.datetime) <= 3 THEN LAG(SD.rain,1)OVER
(
ORDER BY SD.datetime ASC
)
ELSE 0 END)
+ SIGN(CASE WHEN DATEDIFF(HOUR,LAG(SD.datetime,2) OVER
(
Partition by SD.ID ORDER BY SD.datetime ASC
),SD.datetime) <= 3 THEN LAG(SD.rain,2)OVER
(
ORDER BY SD.datetime ASC
)
ELSE 0 END)
-- add the earlier value to the current flag
+ SIGN(SD.rain)) AS RAIN_FLAG
,SD.rain
FROM SAMPLE_DATA SD;
July 6, 2016 at 10:54 am
I'm just going to make my final argument against using the LAG function here, in this case.
What if you start getting data at 30 min intervals? The LAG (and LEAD for that matter) function is/are great for not having to join to the same table to get differing data for comparisons etc... but it gets very messy (which is where your SQL is headed) when you have to look at RANGE of values.
Just my opinion, best of luck Venkat, and again welcome to the forums!
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply