find group of time

Question

Post reply

find group of time

sharon-472085

Ten Centuries

Points: 1182
More actions
August 26, 2020 at 12:12 pm
Go to Answer
#3783718

hi all,
i have the follwing table : 10 lines
i want to show only row after 30 second (should be paramter :)) between them. ( as summery of previous records)
on this record set:
2020-08-26 00:00:00.000 --first line bring
2020-08-26 00:00:20.000 -- 30 second not over avoid this record
2020-08-26 00:00:25.000 -- 30 second not over avoid this record
2020-08-26 00:00:40.000 --aftr 30 seconds bring.
2020-08-26 00:01:00.000 -- 30 second not over avoid this record
2020-08-26 00:01:20.000 -- aftr 30 seconds bring.
2020-08-26 00:01:40.000 -- 30 second not over avoid this record
2020-08-26 00:02:00.000 --aftr 30 seconds bring.
i hope i was clear 🙂
thank you vary much
sharon

Jeff Moden

SSC Guru

Points: 1004406
More actions
August 26, 2020 at 2:03 pm
Answer
#3783768

ktflash wrote:
For those are too lazy to copy paste for testing, added 2nd Colum for True False:
You should speak to the OP about the "lazy" thing. 😉
--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
Change is inevitable... Change for the better is not.
Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

ktflash Default port Points: 1400 More actions · Answer 1

For those are too lazy to copy paste for testing, added 2nd Colum for True False:

create table #temp (
Date1 datetime2
   ,Wanted bit)


insert into #temp
values (cast('2020-08-26 00:00:00.000' as Datetime2),1)
      ,(cast('2020-08-26 00:00:20.000' as Datetime2),0)
  ,(cast('2020-08-26 00:00:25.000' as Datetime2),0)
  ,(cast('2020-08-26 00:00:40.000' as Datetime2),1)
  ,(cast('2020-08-26 00:01:00.000' as Datetime2),0)
  ,(cast('2020-08-26 00:01:20.000' as Datetime2),1)
  ,(cast('2020-08-26 00:01:40.000' as Datetime2),0)
  ,(cast('2020-08-26 00:02:00.000' as Datetime2),1)

I want to be the very best
Like no one ever was

https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help

Mark Cowne One Orange Chip Points: 26954 More actions · Answer 2

Recursive cte method, won't be very efficient

with cte as (
select top 1 Date1, cast(1 as bigint) as rn
from #temp
order by Date1

union all

select t.Date1, row_number() over(order by t.Date1) as rn
from #temp t
inner join cte c on t.Date1 > dateadd(second,30,c.Date1) and c.rn = 1
)
select Date1
from cte
where rn = 1
order by Date1;

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 3

sharon-472085 wrote:

hi all,
i have the follwing table : 10 lines
i want to show only row after 30 second (should be paramter :)) between them. ( as summery of previous records)
on this record set:
2020-08-26 00:00:00.000 --first line bring
2020-08-26 00:00:20.000 -- 30 second not over avoid this record
2020-08-26 00:00:25.000 -- 30 second not over avoid this record
2020-08-26 00:00:40.000 --aftr 30 seconds bring.
2020-08-26 00:01:00.000 -- 30 second not over avoid this record
2020-08-26 00:01:20.000 -- aftr 30 seconds bring.
2020-08-26 00:01:40.000 -- 30 second not over avoid this record
2020-08-26 00:02:00.000 --aftr 30 seconds bring.
i hope i was clear 🙂
thank you vary much
sharon

Is there a PK for the table?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jonathan AC Roberts SSCoach Points: 18361 More actions · Answer 4

How do you want the 30 second intervals defined?

i.e. do you want them divided into clock-face 30 second intervals or for the 30 second interval to start at the time of the found row?

sharon-472085 Ten Centuries Points: 1182 More actions · Answer 5

sharon-472085

Ten Centuries

Points: 1182

August 26, 2020 at 3:28 pm

#3783803

thank you vary much 🙂

next time not lazy:)

sharon-472085 Ten Centuries Points: 1182 More actions · Answer 6

sharon-472085

Ten Centuries

Points: 1182

August 26, 2020 at 3:30 pm

#3783804

thank you vary much 🙂

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 7

sharon-472085 wrote:

thank you vary much 🙂
next time not lazy:)

😀

Sorry... my reference was about creating readily consumable data. You can see one way to do that in the article at the first link in my signature line below. It also explains why it helps us help you.

Shifting gears back to the original topic, if the Recursive CTE method proves to be a bit slow, post back... I have a bit of an unorthodox but very effective method that will blast through a million rows in about a second or two for something like this.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sharon-472085 Ten Centuries Points: 1182 More actions · Answer 8

sharon-472085

Ten Centuries

Points: 1182

August 27, 2020 at 8:03 am

#3784036

tank you solve 🙂

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 9

sharon-472085 wrote:

tank you solve 🙂

I'm not quite sure how to read that... Does that mean that you already have a solution that solved the problem or that you need a faster solution and are asking what it is?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jason A. Long SSC-Insane Points: 23786 More actions · Answer 10

Jeff Moden wrote:

Shifting gears back to the original topic, if the Recursive CTE method proves to be a bit slow, post back... I have a bit of an unorthodox but very effective method that will blast through a million rows in about a second or two for something like this.

Sounds like you have a quirky update in mind... If so, I'll hold off on posting my solution so that I don't burst any bubbles or step on any toes. 😀

Jason A. Long SSC-Insane Points: 23786 More actions · Answer 11

Quick question for @sharon-472085... How do you want to handle rows that have duplicate date-time values?

For example you have two rows that both have a value of '2020-08-26 00:00:40.000'? If you want to "keep the first and drop the second", do you have a means to determine which one id first and which one is second?

Jason A. Long SSC-Insane Points: 23786 More actions · Answer 12

What the heck... I'm just going to post what I have... See what you think of this.

USE tempdb;
GO
--====================================================================================================================
-- Start by creating 1M rows of test data in tempdb 
--====================================================================================================================
IF OBJECT_ID('tempdb.dbo.TestData', 'U') IS NOT NULL 
BEGIN DROP TABLE tempdb.dbo.TestData; END;
GO

CREATE TABLE tempdb.dbo.TestData (
rid int NOT NULL PRIMARY KEY NONCLUSTERED,
some_date_time datetime NOT NULL 
);
GO

WITH 
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),-- 10
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),-- 100
cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n1 b),-- 10,000
cte_Tally (n) AS (
SELECT TOP (1000000)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n3 a CROSS JOIN cte_n3 b-- 100,000,000
)
INSERT tempdb.dbo.TestData (rid, some_date_time)
SELECT 
rid = ROW_NUMBER() OVER (ORDER BY sdt.some_date_time),
sdt.some_date_time
FROM 
cte_Tally t
CROSS APPLY ( VALUES (DATEADD(SECOND, ABS(CHECKSUM(NEWID())) % 30 + t.n, '2020-08-26')) ) sdt (some_date_time);

--====================================================================================================================
-- Working with the assumption that we dont want to blindly alter the data in our "real" table or mess with its indexes, 
-- we start by dumping the required columns into a #temp_table that we can manipulate without having a heart attack.
--====================================================================================================================
IF OBJECT_ID('tempdb..#quirky', 'U') IS NOT NULL 
BEGIN DROP TABLE #quirky; END;

CREATE TABLE #quirky (
rid int NOT NULL,
some_date_time datetime NOT NULL,
sdt_rid_code AS CONVERT(binary(8), some_date_time) + CONVERT(binary(4), rid) PERSISTED,-- << this is a binary concatination that combines the datetime column with the unique pk column.
is_keeper bit NOT NULL DEFAULT(0)--it will be necessary to break ties when multiple rows share the same datetime value.
);

INSERT #quirky(rid, some_date_time)
SELECT td.rid, td.some_date_time FROM tempdb.dbo.TestData td;

ALTER TABLE #quirky ADD PRIMARY KEY CLUSTERED (some_date_time, rid) WITH (FILLFACTOR = 100);-- Add a clustered PK. This is necessary to ensure the "quirky update" processes in the correct order.

--====================================================================================================================
-- This is the funky voodoo (quirky) magic happens... Read more about it in Jeffs excelent article... https://www.sqlservercentral.com/articles/solving-the-running-total-and-ordinal-rank-problems-rewritten
-- It chews through the full 1M rows on ~2 secs on my test machine.
--====================================================================================================================
DECLARE @keep_code binary(12) = 0x000000000000000000000000;

UPDATE q SET 
@keep_code = CASE WHEN q.some_date_time >= DATEADD(SECOND, 30, CONVERT(datetime, SUBSTRING(@keep_code, 1, 8))) THEN q.sdt_rid_code ELSE @keep_code END,
q.is_keeper = CASE WHEN q.sdt_rid_code = @keep_code THEN 1 ELSE 0 END 
FROM
#quirky q WITH (TABLOCKX)
OPTION(MAXDOP  1);

----------------------------------------------------------------------------------------------------------------------
SELECT * FROM #quirky q-- do a quick validation to see which rows are "keepers" and which ones are not, before deleting non-keepers.

----------------------------------------------------------------------------------------------------------------------
DELETE #quirky WHERE is_keeper = 0;-- get rid of the non-keepers.

----------------------------------------------------------------------------------------------------------------------
-- The #quirky table can now be used to filter the original dboTestData.
SELECT 
td.rid,
td.some_date_time
FROM
tempdb.dbo.TestData td
WHERE 
EXISTS (SELECT 1 FROM #quirky q WHERE td.rid = q.rid)
ORDER BY 
td.rid ASC;

--DROP TABLE dbo.TestData;
--DROP TABLE #quirky;

This reply was modified 4 years, 3 months ago by Jason A. Long.
This reply was modified 4 years, 3 months ago by Jason A. Long.