Duplicate identification

Question

Duplicate identification

indraprakash

SSChasing Mays

Points: 637
More actions
June 18, 2008 at 3:07 am

#381203

Is there any thing like rowid in Oracle?
How to delete duplicate records from SQL sever without using template table or create an extra column.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 1

indraprakash (6/18/2008)
Is there any thing like rowid in Oracle?
How to delete duplicate records from SQL sever without using template table or create an extra column.

There is no rowid that you could access. You could identify the duplicate rows using count and group by, and delete them one by one using

delete top (1) from tablename where somecolumn = duplicatevalue

This could be automated as well. However, if there are many duplicates, you could just select the unique entries by using the distinct keyword into a new table, and replace the old table with the new one.

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

indraprakash SSChasing Mays Points: 637 More actions · Answer 2

thanks,

But this is not the solution of my query. I want to know how to delete dulicate records without using any new table or any add any extra column

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 3

indraprakash (6/18/2008)
thanks,
But this is not the solution of my query. I want to know how to delete dulicate records without using any new table or any add any extra column

In a single statement you cannot do this. You need to differentiate the rows that you are deleting, because you want to delete only the duplicated ones. Becuase of this you need to use either "delete TOP" or "set rownumber", but this would mean that you need to execute this statement for each individual duplicated value. You can iterate through all the values, e.g. use cursors, but moving the unique data out then back to the table is probably easier.

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

noeld SSC Guru Points: 96590 More actions · Answer 4

indraprakash (6/18/2008)
Is there any thing like rowid in Oracle?
How to delete duplicate records from SQL sever without using template table or create an extra column.

Can you post DDL for your table ?

* Noel

Mauve SSChampion Points: 11316 More actions · Answer 5

Also knowing Oracle, I do wish that there was a ROWID equivalent in SQL Server. I'm also aware that it is used in "finding duplicate" queries.

In SQL Server 2005 you can use a CTE per the attached code snipit:

WITH numbered AS

(

SELECT

ROW_NUMBER() OVER (PARTITION BY {column name} ORDER BY {column name}) AS rowno,

{column name},

{column name2}

FROM

{table name}

)

DELETE FROM numbered

WHERE rowno > 1;

Working example (using a temp table as the base table) with a 2-column duplicate condition:

CREATE TABLE #SampleTestData (ID int NULL, Keyvalue VARCHAR(2));

INSERT INTO #SampleTestData (id, keyvalue) VALUES (1, 'aa');

INSERT INTO #SampleTestData (id, keyvalue) VALUES (2, 'bb');

INSERT INTO #SampleTestData (id, keyvalue) VALUES (1, 'aa');

SELECT * FROM #SampleTestData;

WITH numbered AS

(

SELECT

ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS rowno,

ID,

Keyvalue

FROM

#SampleTestData

)

DELETE FROM numbered

WHERE rowno > 1;

SELECT * FROM #SampleTestData;

Note: The "winks" in the above examples are closing (right) parens.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 6

JohnG (6/19/2008)
Also knowing Oracle, I do wish that there was a ROWID equivalent in SQL Server. I'm also aware that it is used in "finding duplicate" queries.
In SQL Server 2005 you can use a CTE per the attached code snipit:
.....

There is a simpler way 🙂 There is no need to use row_number if you want to get rid of the duplicates, you could just use

SELECT DISTINCT keyvalue FROM #SampleTestData

If you want to create an id for the unique value, you could do that easier by:

SELECT DISTINCT identity(int, 1,1) as id, keyvalue into #foo FROM #SampleTestData

I reckon neither of these will help the OP, as both of these create a separate table 🙁

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 7

Andras Belokosztolszki (6/18/2008)
indraprakash (6/18/2008)
thanks,
But this is not the solution of my query. I want to know how to delete dulicate records without using any new table or any add any extra column
In a single statement you cannot do this. You need to differentiate the rows that you are deleting, because you want to delete only the duplicated ones. Becuase of this you need to use either "delete TOP" or "set rownumber", but this would mean that you need to execute this statement for each individual duplicated value. You can iterate through all the values, e.g. use cursors, but moving the unique data out then back to the table is probably easier.
Regards,
Andras

You are incorrect Andras. Another poster has already put up code that works in SQL 2005. There is a mechanism for 2000 as well although it is certainly not as elegant as using OVER and ROW_NUMBER.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 8

...
You are incorrect Andras. Another poster has already put up code that works in SQL 2005. There is a mechanism for 2000 as well although it is certainly not as elegant as using OVER and ROW_NUMBER.

You are right, what confused my was that there was an id column in that example. Thanks for pointing this out. The CTE solution indeed works, and without the id column, where rows are indeed duplicates, the script would be like:

CREATE TABLE #SampleTestData (keyvalue VARCHAR(2));

INSERT INTO #SampleTestData (keyvalue) VALUES ('aa');

INSERT INTO #SampleTestData (keyvalue) VALUES ('bb');

INSERT INTO #SampleTestData (keyvalue) VALUES ('aa');

INSERT INTO #SampleTestData (keyvalue) VALUES ('bb');

INSERT INTO #SampleTestData (keyvalue) VALUES ('aa');

SELECT * FROM #SampleTestData;

WITH numbered AS

(

SELECT

ROW_NUMBER() OVER (PARTITION BY keyvalue ORDER BY keyvalue) AS rowno,

keyvalue

FROM

#SampleTestData

)

DELETE FROM numbered

WHERE rowno > 1;

SELECT * FROM #SampleTestData;

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

indraprakash SSChasing Mays Points: 637 More actions · Answer 9

indraprakash

SSChasing Mays

Points: 637

June 20, 2008 at 2:20 am

#831567

thanks Andras

indraprakash

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 10

indraprakash (6/20/2008)
thanks Andras
indraprakash

The solution actually came from JohnG, and I've learned something too 🙂

Glad your issue is solved

- Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

Mauve SSChampion Points: 11316 More actions · Answer 11

The solution actually came from JohnG, and I've learned something too
Glad your issue is solved
- Andras

I can't take credit for the actual solution. I'm just re-providing something that was provided earlier (from some thread) on the SQL ServerCentral site. So the real credit goes to this site.

Just like someone provided the "duplicate" discovery/deletion solution for Oracle using ROWID, which I have passed on numerous times, I'm also passing on something that was provided to me. Think of it as borrowing tools from a common "toolbox".

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]