May 1, 2014 at 10:26 am
Hello Everyone
I found some duplicate data as I was going thru the logic of a data pump. The entire row is not duplicated however.
I would like to delete only the one row.
This is a sample of the data:
DECLARE @SomeData TABLE
(
FirstName varchar(25)
, MiddleName varchar(25)
, LastName varchar(25)
, StreetAddress varchar(25)
, Suite varchar(25)
, City varchar(25)
, [State] varchar(25)
, PostalCode varchar(10)
, AreaCode varchar(5)
, PhoneNumber varchar(8)
)
INSERT INTO @SomeData
SELECT 'Joe','Al','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', NULL, NULL UNION ALL
SELECT 'Joe','Al','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', '256', '124-5879' UNION ALL
SELECT 'Sally','Ann','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', '256', '124-5879' UNION ALL
SELECT 'Steven','David','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', '256', '124-5879'
SELECT * FROM
@SomeData
As you can see, Joe Smith has two rows, but only one of the rows is complete. I would like to delete only the row that has a NULL value in the phone and area code for Joe Smith. There are a few thousand rows that are like this. They have duplicates all but the area code and phone number.
I am used to using a CTE to remove duplicates, but I am a little lost on this one. The things that I have tried, have not worked exactly as I planned.
I hope that someone has a better way.
Thank you in advance for your time, assistance, and suggestions
Andrew SQLDBA
May 1, 2014 at 10:40 am
I'd stage into a temporary table with an identity column and delete from the temp table where
1. The phone number or area code are null
2. And the name and address columns exist elsewhere with a different identity than the current row.
Did I capture the conditions correctly?
Thanks
John.
May 1, 2014 at 10:53 am
Thank You John
That is what I tried using a CTE, but I am not getting something correct. I am having trouble with select the correct row to delete.
Would you offer a sample of code?
Thanks
Andrew
May 1, 2014 at 12:01 pm
If you only see nulls in areacode and phonenumber you can get rid of them with temp table like john mentioned
select into #t1
from @somedata where AreaCode is not null and PhoneNumber is not null
May 1, 2014 at 12:09 pm
something like this... may be you need to modify it accordingly..
DECLARE @SomeData TABLE
(
FirstName varchar(25)
, MiddleName varchar(25)
, LastName varchar(25)
, StreetAddress varchar(25)
, Suite varchar(25)
, City varchar(25)
, [State] varchar(25)
, PostalCode varchar(10)
, AreaCode varchar(5)
, PhoneNumber varchar(8)
)
INSERT INTO @SomeData
SELECT 'Joe','Al','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', NULL, NULL UNION ALL
SELECT 'Joe','Al','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', '256', '124-5879' UNION ALL
SELECT 'Sally','Ann','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', '256', '124-5879' UNION ALL
SELECT 'Steven','David','Smith','123 Main Street',NULL,'Dallas', 'North Carolina', '45789', '256', '124-5879'
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY FirstName, MiddleName, LastName Order by AreaCode DESC, PhoneNumber DESC) AS Rnum FROM @SomeData
)
--SELECT * FROM CTE
DELETE FROM CTE
--WHERE Rnum = 1
WHERE Rnum <> 1
SELECT * FROM @SomeData
http://sqlsaga.com/sql-server/how-to-remove-duplicates-from-a-table-in-sql-server/[/url]
Good Luck 🙂 .. Visit www.sqlsaga.com for more t-sql code snippets and BI related how to articles.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply