delete and determine dups by datetime field

Question

delete and determine dups by datetime field

Jay H

SSC Veteran

Points: 278
More actions
October 15, 2007 at 7:15 pm

#178538

Heya,
I am trying to determine how to most efficiently keep my table free from duplicates.
On an ongoing bases (every 20 minetues...or so), duplicate record will be inserted to this table. The differentiator will be a datetime field saying when the record was last modified.
I want to delete the old/er record(s) and keep the latest
PK LastModified_DateStamp
1 10/15/2007 02:14:07T657
1 10/15/2007 02:37:05T837
Here, I want to delete the record on top and keep the record at the bottom.
I would appreciate ant suggestions and/or pointers...
Thanks,
~J

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 1

What you're calling a "PK" obviously isn't where this table is concerned or you wouldn't be able to insert the duplicate rows.

However, to solve your problem, this will work... and, it's very fast... finds 950 thousand dupes in a million row table and deletes them in 35 seconds.

[font="Courier New"]DELETE&nbspt1

&nbsp&nbsp&nbspFROM&nbspyourtable&nbspt1,

&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbsp&nbspyourtable&nbsp&nbspt2

&nbsp&nbspWHERE&nbspt1.PK&nbsp=&nbspt2.PK&nbsp

&nbsp&nbsp&nbsp&nbspAND&nbspt1.LastModified_DateStamp&nbsp<&nbspt2.LastModified_DateStamp[/font]

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jay H SSC Veteran Points: 278 More actions · Answer 2

As a matter of fact, I am fully capable of inserting new record, that actually are duplicates into this table.

I am not quite sure why you say I would not be able to...did I leave some obvious information out...?

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 3

You called one of your columns "PK"... if it were trully a primary key, it would not allow the dupes.

Doesn't matter... the code I posted will find them as you asked...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jay H SSC Veteran Points: 278 More actions · Answer 4

I understand what you are saying.

And I apologize for not being more clear.

My primary key is a true primary key in another system. Instead of truncating the table and do a fresh pull (it is to time consuming and expensive), I am only pulling down the records that changed from last time the process ran. Hence, 'duplicates' will occur in my sql table.

Now, I am trying to figure out how to determine the 'oldest' record and to delete that record.

Could I used the datetime as mentioned above? Or do I have to write a procedure?

I am hesitant to use a procedure, b/c I think it will be to expensive timewise.

Any suggestions..?

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 5

Jay H (10/16/2007)
I understand what you are saying.
And I apologize for not being more clear.
My primary key is a true primary key in another system. Instead of truncating the table and do a fresh pull (it is to time consuming and expensive), I am only pulling down the records that changed from last time the process ran. Hence, 'duplicates' will occur in my sql table.
Now, I am trying to figure out how to determine the 'oldest' record and to delete that record.
Could I used the datetime as mentioned above? Or do I have to write a procedure?
I am hesitant to use a procedure, b/c I think it will be to expensive timewise.
Any suggestions..?

Like I said, the code I posted works... and it's nasty fast...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jay H SSC Veteran Points: 278 More actions · Answer 6

Thanks Jeff,

I really appreciate your persistence and help.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 7

Thanks for the feedback, Jay... if you get a chance, let me know what your timings are. Would really appreciate that...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)