September 25, 2019 at 9:09 pm
Hey all,
I'm trying to update table rows
when a row's "myCount" value is less than other rows within +-5 mins of it.
I'm wondering if there's a way to do this without a while loop.
CREATE TABLE myTable(
rowId int IDENTITY(1,1),
myDatetime datetime,
myCount integer,
invalid bit
)
UPDATE myTable
SET invalid = 1
WHERE myCount < (myCount for all rows within +- 5 mins)
Code-Blooded
September 25, 2019 at 9:35 pm
Please provide some sample data (as INSERTs) and desired (post-update) results, based on the sample data.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
September 25, 2019 at 10:21 pm
CREATE TABLE aaaMyTable(
rowId int IDENTITY(1,1),
myDatetime datetime,
myCount integer,
invalid bit
)
INSERT INTO aaaMyTable(myDatetime, myCount, invalid)
VALUES
('2019-09-25 13:34:00.000', 3, 0),
('2019-09-25 13:35:00.000', 2, 0),
('2019-09-25 13:36:00.000', 2, 0),
('2019-09-25 13:44:00.000', 2, 0),
('2019-09-25 13:44:20.000', 2, 0)
SELECT * FROM aaaMyTable
rowID myDatetime myCount invalid
1 2019-09-25 13:34:00.000 3 0 --valid no higher "myCount" exists +-5 mins
2 2019-09-25 13:35:00.000 2 1 --invalid because a higher "myCount" exists within +-5 mins
3 2019-09-25 13:36:00.000 2 1 --invalid because a higher "myCount" exists within +-5 mins
4 2019-09-25 13:44:00.000 2 0 --valid it's been over +-5 mins since a higher myCount
5 2019-09-25 13:44:20.000 2 0 --valid
Code-Blooded
September 26, 2019 at 8:18 am
A self-join would work, but may not perform well on large data sets.
SELECT DISTINCT
t1.rowID
,t1.myDateTime
,t1.myCount
,CASE
WHEN t2.rowID IS NULL THEN 0
ELSE 1
END AS invalid
FROM aaaMyTable t1
LEFT JOIN aaaMyTable t2
ON t1.myDateTime BETWEEN DATEADD(mi,-5,t2.myDateTime) AND DATEADD(mi,5,t2.myDateTime)
AND t1.myCount < t2.myCount;
How granular is the five-minute requirement - does it need to be to the millisecond, the second, the minute? What time period does your data set span? A windowing function solution may perform better for a low-granularity requirement and a relatively short time span.
John
September 26, 2019 at 10:53 am
UPDATE a
SET a.invalid = 1
FROM aaaMyTable a
WHERE EXISTS(SELECT *
FROM aaaMyTable b
WHERE ABS(DATEDIFF(ss,a.myDatetime,b.myDatetime)) <= 300
AND a.rowId <> b.rowId
AND a.myCount < b.myCount)
AND a.invalid = 0
September 26, 2019 at 4:03 pm
This is a packing intervals problem. Here is a solution that only requires one scan of the table.
;
WITH intervals AS
(
SELECT
a.rowId
,a.myDatetime
,a.myCount
,a.invalid
,CASE WHEN LAG(a.myDatetime, 1, '1900-01-01')OVER(ORDER BY a.myDateTime) < DATEADD(MINUTE, -5, a.myDatetime) THEN 1 ELSE 0 END AS new_interval
FROM #aaaMyTable AS a
)
, interval_groups AS
(
SELECT
i.rowId
,i.myDatetime
,i.myCount
,i.invalid
,SUM(i.new_interval) OVER(ORDER BY i.myDatetime ROWS UNBOUNDED PRECEDING) AS interval_group
FROM intervals i
)
SELECT
ig.rowId
,ig.myDatetime
,ig.myCount
,CASE WHEN ig.myCount < MAX(ig.myCount) OVER(PARTITION BY ig.interval_group) THEN 1 ELSE 0 END AS invalid
FROM interval_groups ig
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
September 26, 2019 at 4:23 pm
These are great. I'm studying them now.
Thanks!!
Code-Blooded
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply