Duplicate Employee Rows

Question

Post reply

Duplicate Employee Rows

catherine

SSC Rookie

Points: 35
More actions
November 26, 2007 at 7:06 am

#377184

I am trying to structure a stored procedure to do the following:
I have an employee table and an employment table. Employees are linked to employments by employeeId from employee table. Employees can have more than one employment.
The problem is that we have duplicate employee records within Employee table.
I need to write a script that will determine which employee record has the highest Employee Number. For each of the other duplicate records change the employments to reflect the employeeId of the record with the highest Employee Number. Then remove the employee record(s) with the lower Employee Number(s).
Can anyone suggest a good way of doing this.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Loner SSC-Insane Points: 21279 More actions · Answer 1

Can you show some example in each table?

Employee table has duplicate records, what is the difference between employeeid and Employee Number?

Employee and Employment is linked by employeeid. Does one employeeid have multiple Employment records?

JDixon-385826 SSCrazy Points: 2723 More actions · Answer 2

inside a cursor (I must note this is the general idea a cursor is not always the best way to go) Call something like ---->

--OUTSIDE CURSOR

CREATE TABLE #Temp1

(MaxID INT

,MinID INT

,LastName VARCHAR (50)

,FirstName VARCHAR (50))

DECLARE @a VARCHAR (50) --(OR Whatever Type)

,@B VARCHAR (50) --(OR Whatever Type)

--INSDIE CURSOR

DECLARE CURSOR...

OPEN...

INTO @a, @b-2

IF ...

BEGIN

INSERT INTO#Temp1...

SELECT MAX(EMPLOYEEID), MIN(EMPLOYEEID), @a, @b-2

FROM EMPLOYEE

WHERE LastName = @a

AND FirstName = @b-2

INTO @a, @B...

...

END

CLOSE...

This will help you compile a list.

I should note that the name field can be replaced with any identifier that should be the same between the two.

I should also note that many people have the same name so any unique identifer two all the records in the result set should be used.

Also using the same concept as laid out above you should be able to change it to suit your needs. If you have more then 2 rows per employee then this will have to be altered or at least ran again after you remove the rows until all row have been found <--- That may take some time all depending.

then when done with the clean up create a unique constraint on the unique identifier field --> not the table identity column <-- that would not work. That should eliminate the need for a stored procedure.

A great resource for this is Books online <--- Comes with SQL Server 2000

toniupstny Hall of Fame Points: 3986 More actions · Answer 3

Non _ cursor method....(If I made the wrong assumption on which is the duplicate employee_id (highest or lowest), change max to min and > to <)

select min(e1.employee_id) 'employee_id', e1.employee_fname 'employee_fname',

e1.employee_lname 'employee_lname', e2.employee_id 'Duplicate_employee_id',

e2.employee_fname 'Duplicate_employee_fName', e2.employee_lname 'Duplicate employee_lname'

from employee e1

join employee e2 on e1.employee_id < e2.employee_id and e1.employee_fname = e2.employee_fname

and e1.employee_lname = e2.employee_lname

group by e1.employee_fname, e1. employee_lname, e2.employee_id, e2.employee_fname,

e2.employee_lname