March 18, 2016 at 3:01 pm
Hi,
We are trying to clean up some data in an old database which unfortunately contains various duplicate records.
In this particular instance, I'm looking to identify any employee record where there is another employee record that is identical with the exception of the Phone field. For these employees, we want to retrieve the employee with the blank Phone value.
Here is a simplified version of what I'm working with. Yes, even the EmployeePk is duplicated in some records. Its a mess.
CREATE TABLE
#Employee
(
EmployeePk INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
BranchFk INT,
Phone VARCHAR(15)
)
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (105, 'Bob','Smith',1,'(123) 456-7890')
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (105, 'Bob','Smith',1,'')
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (106, 'Jane','Jones',2,'(434) 434-4343')
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (106, 'Jane','Jones',2,'(434) 434-4343')
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (107, 'John','Brown',NULL,'(222) 222-2222')
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (107, 'John','Brown',NULL,'(222) 222-2222')
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone) VALUES (107, 'John','Brown',NULL,'')
--All records
SELECT * FROM #Employee
--Records I'd like to retrieve because it is a duplicate with a blank Phone value
SELECT EmployeePk = 105, FirstName = 'Bob',LastName = 'Smith',BranchFk = 1,Phone = ''
UNION
SELECT EmployeePk = 107, FirstName = 'John',LastName = 'Brown',BranchFk = NULL,Phone = ''
DROP TABLE #Employee
I greatly appreciate any assistance!
March 19, 2016 at 1:53 pm
Not sure I understand the requirement completely, but even if this is not exactly what you need, it should get you started:
SELECT EmployeePk,
FirstName,
LastName,
BranchFk,
CASE WHEN MIN(Phone) = MAX(Phone) THEN MIN(Phone) ELSE '' END AS Phone
FROM #Employee
GROUP BY EmployeePk,
FirstName,
LastName,
BranchFk;
Good luck sorting out that data mess... Don't forget to add constraints to the table as soon as th rubbish is gone, otherwise it'll all start over again.
March 20, 2016 at 5:39 am
If I understand the requirements correctly you want all the instances with blank Phone where the same FirstName,LastName,BranchFk exists with a non-blank phone.
😎
Quick suggestion
USE tempdb;
SET NOCOUNT ON;
GO
CREATE TABLE
#Employee
(
EmployeePk INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
BranchFk INT,
Phone VARCHAR(15)
)
INSERT INTO #Employee (EmployeePk, FirstName, LastName, BranchFk, Phone)
VALUES (105, 'Bob','Smith',1,'(123) 456-7890')
,(105, 'Bob','Smith',1,'')
,(106, 'Jane','Jones',2,'(434) 434-4343')
,(106, 'Jane','Jones',2,'(434) 434-4343')
,(107, 'John','Brown',NULL,'(222) 222-2222')
,(107, 'John','Brown',NULL,'(222) 222-2222')
,(107, 'John','Brown',NULL,'')
;
;WITH BASE_DATA AS
(
SELECT
EMP.EmployeePk
,COUNT(*) OVER
(
PARTITION BY EMP.FirstName
,EMP.LastName
,EMP.BranchFk
) AS INST_COUNT
,MAX(LEN(EMP.Phone)) OVER
(
PARTITION BY EMP.FirstName
,EMP.LastName
,EMP.BranchFk
) AS MX_LEN_PHONE
,EMP.FirstName
,EMP.LastName
,EMP.BranchFk
,EMP.Phone
FROM #Employee EMP
)
SELECT
BD.EmployeePk
,BD.FirstName
,BD.LastName
,BD.BranchFk
,BD.Phone
FROM BASE_DATA BD
WHERE BD.Phone = ''
AND BD.INST_COUNT > 1
AND BD.MX_LEN_PHONE > 0;
DROP TABLE #Employee;
Output
EmployeePk FirstName LastName BranchFk Phone
----------- ----------- ---------- ----------- ---------------
105 Bob Smith 1
107 John Brown NULL
March 21, 2016 at 10:13 am
Thank you for the responses! I'll give these a try.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply