How to build 7 day slots based on start and end dates?

Question

How to build 7 day slots based on start and end dates?

BIMind

SSChasing Mays

Points: 628
More actions
April 8, 2010 at 6:14 pm

#221123

Hi All,
I have a interesting challenge. I have data lets say RowID, UserID, Date in a table. If UserID appears twice within a 7 day period, i need to delete the one with the lower RowID. The data is for one month only and each month min and max dates are different. i have to use those to define "one month", i cannot use standard months. So lets say its Jan 2 - Jan 30. if at the end of the month its not a complete 7 day period, use whatever days are left to create the last week.
the key is to build those 7 day time slots in a programmatic way which i am short-handed with. I am thinking CTEs or maybe a table variable but the logic is escaping me. Any gurus can shed some pointers?
Thanks!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Ben Teraberry SSChampion Points: 11845 More actions · Answer 1

Are you wanting to delete any user record if that user has another record within 7 actual days or are you only wanting to do this if it's in the same specific 7 day span (such as Sunday - Saturday)?

If it's the former, you can use something like this:

DELETE

FROM YourTable yt

WHERE EXISTS (SELECT sq.RowID

FROM YourTable sq

WHERE sq.UserID = yt.UserID and

sq.RowID > yt.RowID and

DATEDIFF(dd, yt.[Date], sq.[Date]) < 7)

On a side note, I personally dislike doing deletions like this. I would rather add a bit field 'IsActive' if possible and deactive records rather than deleting them.

└> bt

Forum Etiquette: How to post data/code on a forum to get the best help[/url]

SoundaraRajan_N Valued Member Points: 68 More actions · Answer 2

Hi buddy, got ur solution or looking for something else......

LutzM SSC Guru Points: 107049 More actions · Answer 3

Please explain a little more detailed what your looking for. The best way would be to give us some sample data and expected result. Furthermore, you need to define the "direction" where to start.

Example: date values: 3-20, 4-1, 4-6, 4-9, 4-14

If you start at the most recent date, you'd eleminate 4-9 and 4-1. But if you'd start with the oldest date, 4-6 and 4-14 need to be removed...

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

K Cline Ten Centuries Points: 1391 More actions · Answer 4

Example: date values: 3-20, 4-1, 4-6, 4-9, 4-14
If you start at the most recent date, you'd eleminate 4-9 and 4-1. But if you'd start with the oldest date, 4-6 and 4-14 need to be removed...

Also, how is your data entered? Is it done on a daily basis throughout the month or bulk entered into a table at the end of a period? That too makes a difference. For instance, if it were done on a daily basis the only records left from Imu92's example would be 3/20 and 4/14. 4/1 would be eliminated when 4/6 was entered, 4/6 would be eliminated when 4/9 was entered, and 4/9 would be eliminated when 4/14 was entered.

If it is data is entered interactively or on a daily basis then the answer is very straightfoward and can go in a trigger or in the procedure you use to insert the data (assuming the same procedure is used for all data inserts).

If you are bulk loading the data at the end of the time period then you need to establish very clear rules for eliminating the data. As has been pointed out, it makes a big difference in the data.

BIMind SSChasing Mays Points: 628 More actions · Answer 5

sorry for the late reply guys

berry...this worked perfect..i cant believe how simple this is...

if anyone is interested, here's the sample script

create table sample_7day (call_id int identity(1,1), UserID int, Post_date datetime)

insert into sample_7day values (1, '20100102')

insert into sample_7day values (1, '20100109')

insert into sample_7day values (1, '20100102')

insert into sample_7day values (2, '20100103')

insert into sample_7day values (3, '20100104')

insert into sample_7day values (3, '20100111')

insert into sample_7day values (3, '20100123')

insert into sample_7day values (4, '20100123')

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 6

bteraberry (4/8/2010)
Are you wanting to delete any user record if that user has another record within 7 actual days or are you only wanting to do this if it's in the same specific 7 day span (such as Sunday - Saturday)?
If it's the former, you can use something like this:
DELETE
FROM YourTable yt
WHERE EXISTS (SELECT sq.RowID
FROM YourTable sq
WHERE sq.UserID = yt.UserID and
sq.RowID > yt.RowID and
DATEDIFF(dd, yt.[Date], sq.[Date]) < 7)
On a side note, I personally dislike doing deletions like this. I would rather add a bit field 'IsActive' if possible and deactive records rather than deleting them.

Hi bteraberry,

First, I absolutely agree... I dislike doing deltions like this and would rather mark a column as you suggest.

On the code side of the house, I'm not sure what I'm doing differently than you but I get the following error when I run your code...

Msg 102, Level 15, State 1, Line 2

Incorrect syntax near 'yt'.

When I make the necessary fix, it has some unintended consequences...

--===== Create a table with all the same UserID and a year's worth of dates

SELECT IDENTITY(INT,1,1) AS RowID,

1 AS UserID,

DATEADD(dd,Number,'20100101') AS Date

INTO dbo.YourTable

FROM Master.dbo.spt_Values

WHERE Type = 'P'

AND Number BETWEEN 0 AND 364

SELECT * FROM dbo.YourTable

--===== Run the repaired delete

DELETE yt

FROM YourTable yt

WHERE EXISTS (SELECT sq.RowID

FROM YourTable sq

WHERE sq.UserID = yt.UserID and

sq.RowID > yt.RowID and

DATEDIFF(dd, yt.[Date], sq.[Date]) < 7)

SELECT * FROM dbo.YourTable

DROP TABLE dbo.YourTable

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 7

BIMind (4/10/2010)
sorry for the late reply guys
berry...this worked perfect..i cant believe how simple this is...
if anyone is interested, here's the sample script
create table sample_7day (call_id int identity(1,1), UserID int, Post_date datetime)
insert into sample_7day values (1, '20100102')
insert into sample_7day values (1, '20100109')
insert into sample_7day values (1, '20100102')
insert into sample_7day values (2, '20100103')
insert into sample_7day values (3, '20100104')
insert into sample_7day values (3, '20100104')
insert into sample_7day values (3, '20100111')
insert into sample_7day values (3, '20100123')
insert into sample_7day values (4, '20100123')

There's a potential problem with the code. Please post the actual code that you're using so one of us can test it to make sure that you're not going to delete more than you bargained for.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

BIMind SSChasing Mays Points: 628 More actions · Answer 8

Hmmm Jeff..thanks for the input...

i am surprised by the results with your code...i am wondering why...because it works perfectly with the same data i posted above

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 9

Here's one very high speed method that also has the advantage of being able to stop the code just prior to the final delete to verify what's actually going to be deleted (test data generator included)...

--=============================================================================

-- Create and populate a test table with known data.

-- This section isn't part of the solution. It's just to prove the code.

--=============================================================================

--===== Create a table with all the same UserID and a year's worth of dates

SELECT IDENTITY(INT,1,1) AS RowID,

1 AS UserID,

DATEADD(dd,Number,'20100101') AS Date

INTO dbo.YourTable

FROM Master.dbo.spt_Values

WHERE Type = 'P'

AND Number BETWEEN 0 AND 364

--===== Ok... now add dupes (triplicates really) for all those rows

-- just to make this a "worst case" problem

INSERT INTO dbo.YourTable

SELECT UserID, Date

FROM dbo.YourTable

CROSS JOIN (SELECT 1 UNION ALL SELECT 1)t(N) --Cross join causes the dupes

--===== Show the mess we have on our hands (this section can be deleted)

SELECT * FROM dbo.YourTable ORDER BY UserID, Date, RowID

--=============================================================================

-- Solve the problem

--=============================================================================

--===== First, copy the required data from your table

-- into someplace where we can do a little manipulation

SELECT *, CAST(0 AS TINYINT) AS KeepMe

INTO #Delete

FROM dbo.YourTable

--===== Add the quintessential clustered index

CREATE INDEX IX_#Delete_Guide

ON #Delete (UserID, Date, RowID) WITH FILLFACTOR = 100

--===== Declare and preset some handy variables

DECLARE @PrevUserID INT,

@KeepDate DATETIME,

@PrevKeepMe TINYINT

SELECT @PrevUserID = -1,

@KeepDate = '17530101',

@PrevKeepMe = 0

--===== Mark the rows to either be kept or deleted

UPDATE d

SET @PrevKeepMe = KeepMe

= CASE

WHEN UserID <> @PrevUserID THEN 1

WHEN DATEDIFF(dd,@KeepDate,Date) < 7 THEN 0

ELSE 1

END,

@KeepDate = CASE WHEN @PrevKeepMe = 1 THEN Date ELSE @KeepDate END,

@PrevUserID = UserID

FROM #Delete d WITH(TABLOCKX)

OPTION (MAXDOP 1)

--===== Now, do the final delete on the original table

DELETE yt

FROM dbo.YourTable yt

INNER JOIN #Delete d

ON yt.RowID = d.RowID

AND d.KeepMe = 0

--===== Show the original table after the deletes have occurred

SELECT * FROM dbo.YourTable ORDER BY UserID, Date, RowID

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 10

BIMind (4/10/2010)
Hmmm Jeff..thanks for the input...
i am surprised by the results with your code...i am wondering why...because it works perfectly with the same data i posted above

It's because there is no break in the dates for the given UserID... therefor, all the dates are within 7 days of some other date. The code I wrote above should do just fine and it's very fast. Read the comments in the code. The column order of the clustered index on the Temp Table is essential as is the OPTION (MAXDOP 1) hint and the WITH(TABLOCKX) hint.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 11

Oh... and just to be sure... this code also removes dupes if they occur on the same date. Only the earliest dupe according to the RowID will be kept. Change the clustered index so RowID is DESC if you want to keep the latest dupe according to RowID. If you want to keep all dupes for a given date, then we need to make a change. Lemme know, please.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

BIMind SSChasing Mays Points: 628 More actions · Answer 12

Jeff...this is pretty close...however this keeps the lowest row ID of any duplicates found and deletes the higher row IDs..i need it the other way around..

i tried row ID desc in the c. index and changing column positions but i cant get it how i want?

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 13

BIMind (4/12/2010)
Jeff...this is pretty close...however this keeps the lowest row ID of any duplicates found and deletes the higher row IDs..i need it the other way around..
i tried row ID desc in the c. index and changing column positions but i cant get it how i want?

I'm not sure which column positions you changed but put them back the way they were. Just changing the ROWID to DESC in the clustered index does the trick.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)