Eliminating duplicate rows

Question

Post reply

Eliminating duplicate rows

zach-435178

Mr or Mrs. 500

Points: 523
More actions
September 10, 2007 at 2:30 pm

#197706

Hello all, my problem is that I was running update statements and at the bottom of the page there was a nice little insert statement that i didn't see. So every time I was executing my updates I was also inserting new rows into another table (rookie mistake, live and learn I guess).
So my question is does anyone know how to get rid these duplicate rows?
I want to keep the originals but just delete the duplicates... thanks in advance for any help

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

1 : Restore table from backup

2 : Make sure you always have a backup

3 : NEVER EVER WORK ON PRODUCTION SERVER.

4 : Search these forums. That answer is posteds few times a week here.

zach-435178 Mr or Mrs. 500 Points: 523 More actions · Answer 2

I have tried something like:

select

distinct emimageid, * into image_3 from image_1st

but nothing, I have also scanned over some different forums and i don't see anything that does what I am looking for.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 3

Ummm.... you need to identify to us what makes a "duplicate" and, perhaps, post the DDL for the table where the dupes exist. Elimination of duplicates is not difficult but we do need to know what makes a duplicate by column name(s).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

zach-435178 Mr or Mrs. 500 Points: 523 More actions · Answer 4

sorry i wasn't clear, the whole row is duplicate except for the ID column which is a GUID..

sometimes there is 3 of a row and sometimes just one

SwePeso SSC-Dedicated Points: 39758 More actions · Answer 5

Are you using SQL Server 2000 or SQL Server 2005?

N 56°04'39.16"
E 12°55'05.25"

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 6

No, I understood that... but typically there is some unique key (column or combination of columns) other than the GUID that would identify the unique row. Need that info to do this...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 7

Ninja's_RGR'us

SSC Guru

Points: 294069

September 11, 2007 at 8:14 am

#732795

Congrats on the 4K mile Jeff.

zach-435178 Mr or Mrs. 500 Points: 523 More actions · Answer 8

SQL 2005 management studio...

No sorry the only thing thats different is the GUIDs..

columns I have:

ID----GUID

ManCode----nvarchar(3)

emImageID----int

description----nvarchar(250)

path----nvarchar(150)

Everything is the same on the duplicate rows except for the GUID... also not all rows have a duplicate just some, thanks for your patience and input.

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 9

This might be considered "cheating" but the most straightforward way is to extract one row per "unique" set, blow everything out and rebuild:

--find the unique file names and extract them to a temp table

Select mancode, emimageID,description,path, min(id) as ID into #temptable from <mytable>

GROUP BY mancode, emimageID,description,path

--clean out the table

delete * from <mytable>

--put the unique values back in

Insert mytable (id,mancode,emimageID,description,path)

select id,mancode,emimageID,description,path from #temptable

--clean up after yourself

drop table #temptable

This is going to force table scans and all sorts of nasty performance since we have to do this on all the fields NOT indexed, so you probably want to make sure you only have to do this once. Do that by putting a unique constraint in place (so it is not possible to insert duplicate values). You should however make sure that this kind of constraint won't blow up your application.

It's ugly, but it will do the job.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

zach-435178 Mr or Mrs. 500 Points: 523 More actions · Answer 10

Thanks matt I think I am on the right path but when I run the query I get the error:

Msg 8117, Level 16, State 1, Line 1

Operand data type uniqueidentifier is invalid for min operator.

I tried changing the ID field from a GUID to a nvarchar and the query runs fine but I get the same results as when I started.

The GUIDs are the only thing different between the rows and not all rows have a duplicate just some

thanks again for helping

zach-435178 Mr or Mrs. 500 Points: 523 More actions · Answer 11

ok figured out what i needed to do, just bad programming to start with, thanks all

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 12

Thanks Remi!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 13

C'mon Zach... folks helped you out... the least you could do is post your solution

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

zach-435178 Mr or Mrs. 500 Points: 523 More actions · Answer 14

Well.... its kind of difficult to explain.

My situation was that I had inserted several rows into one table that were duplicates, but I had already related the table ID's to other tables therefore forcing me to keep the duplicates because I do not know which GUID id is tied to the other tables.

On the duplicate rows the column I use to relate is the ID column which is GUID, so when I had a one to many relation the GUID id's alternate between the GUID values even though the rest of the data is identical.

Example:

5lk34-adfnjo8 somedata

334d-fad9324 somedata

I apologize if this isn't clear, it is difficult to explain with out seeing my situation.

Thanks to all for the input