The following was sent to me by my friend and colleague Dave Miller:
Dave's Email:
Wanted to pass along something I hadn't used before and found useful for easily getting rid of duplicates in a set of data. The functionality has been part of the SQL language for a while and has been supported since SQL Server 2005. It uses Common Table Expressions (CTEs) and the ROW_NUMBER() function.
The PARTITION BY portion of the statement specifies when to reset the row number. In my example I had:
PersonId | Type | PNumber |
A        | 1    | 123     |
A        | 1    | 345     |
B        | 1    | 123     | ***DELETE
A        | 2    | 123     |
B        | 1    | 123     |
B        | 1    | 123     | ***DELETE
A        | 2    | 123     | ***DELETE
B        | 1    | 1234    |
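For anyone who wants to try this out in a scratch database, a rough setup might look like the following. The Person.Phone table here, its column names, and the data types are stand-ins mirroring the example data above, not the real schema:

-- Stand-in table and sample rows mirroring the data above (not the real schema)
IF SCHEMA_ID('Person') IS NULL EXEC('CREATE SCHEMA Person');

CREATE TABLE Person.Phone (
    PersonId    CHAR(1)     NOT NULL,
    PhonetypeId INT         NOT NULL,
    PhoneNumber VARCHAR(20) NOT NULL
);

-- Single-row INSERTs so this also runs on SQL Server 2005
INSERT INTO Person.Phone VALUES ('A', 1, '123');
INSERT INTO Person.Phone VALUES ('A', 1, '345');
INSERT INTO Person.Phone VALUES ('B', 1, '123');
INSERT INTO Person.Phone VALUES ('A', 2, '123');
INSERT INTO Person.Phone VALUES ('B', 1, '123');
INSERT INTO Person.Phone VALUES ('B', 1, '123');
INSERT INTO Person.Phone VALUES ('A', 2, '123');
INSERT INTO Person.Phone VALUES ('B', 1, '1234');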
The following query will remove the duplicates:
WITH PersonPhones AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY PersonId, PhonetypeId, PhoneNumber
                              ORDER BY PersonId, PhonetypeId, PhoneNumber) AS GroupIndex,
           PersonId,
           PhonetypeId,
           PhoneNumber
    FROM Person.Phone pp
)
DELETE FROM PersonPhones WHERE GroupIndex > 1;
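If you'd rather see what will go before deleting anything, you can run the same CTE as a plain SELECT first. This is just the query above with the DELETE swapped for a SELECT:

-- Preview: nothing is deleted, duplicates show up with GroupIndex > 1
WITH PersonPhones AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY PersonId, PhonetypeId, PhoneNumber
                              ORDER BY PersonId, PhonetypeId, PhoneNumber) AS GroupIndex,
           PersonId,
           PhonetypeId,
           PhoneNumber
    FROM Person.Phone
)
SELECT GroupIndex, PersonId, PhonetypeId, PhoneNumber
FROM PersonPhones
ORDER BY PersonId, PhonetypeId, PhoneNumber;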
The CTE produces a result set with the following data, where anything that is a duplicate has a GroupIndex > 1:
GroupIndex | PersonId | Type | PNumber |
1          | A        | 1    | 123     |
1          | A        | 1    | 345     |
1          | A        | 2    | 123     |
2          | A        | 2    | 123     | ***DELETE
1          | B        | 1    | 123     |
2          | B        | 1    | 123     | ***DELETE
3          | B        | 1    | 123     | ***DELETE
1          | B        | 1    | 1234    |
This method can also be used to get the nth item of a group.
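For example, here is a quick sketch of pulling the second phone number per person and type. The partition is only on PersonId and PhonetypeId, and the ORDER BY inside OVER() decides which row counts as "second"; this variation is an illustration of the idea, not part of Dave's original query:

-- Return the 2nd phone number within each (PersonId, PhonetypeId) group
WITH NumberedPhones AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY PersonId, PhonetypeId
                              ORDER BY PhoneNumber) AS GroupIndex,
           PersonId,
           PhonetypeId,
           PhoneNumber
    FROM Person.Phone
)
SELECT PersonId, PhonetypeId, PhoneNumber
FROM NumberedPhones
WHERE GroupIndex = 2;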