November 26, 2007 at 7:06 am
I am trying to structure a stored procedure to do the following:
I have an employee table and an employment table. Employees are linked to employments by employeeId from employee table. Employees can have more than one employment.
The problem is that we have duplicate employee records within Employee table.
I need to write a script that will determine which employee record has the highest Employee Number. For each of the other duplicate records change the employments to reflect the employeeId of the record with the highest Employee Number. Then remove the employee record(s) with the lower Employee Number(s).
Can anyone suggest a good way of doing this.
November 26, 2007 at 9:03 pm
Can you show some example in each table?
Employee table has duplicate records, what is the difference between employeeid and Employee Number?
Employee and Employment is linked by employeeid. Does one employeeid have multiple Employment records?
November 26, 2007 at 9:57 pm
inside a cursor (I must note this is the general idea a cursor is not always the best way to go) Call something like ---->
--OUTSIDE CURSOR
CREATE TABLE #Temp1
(MaxID INT
,MinID INT
,LastName VARCHAR (50)
,FirstName VARCHAR (50))
DECLARE @a VARCHAR (50) --(OR Whatever Type)
,@B VARCHAR (50) --(OR Whatever Type)
--INSDIE CURSOR
DECLARE CURSOR...
OPEN...
IF ...
BEGIN
INSERT INTO#Temp1...
SELECT MAX(EMPLOYEEID), MIN(EMPLOYEEID), @a, @b-2
FROM EMPLOYEE
WHERE LastName = @a
AND FirstName = @b-2
...
END
CLOSE...
This will help you compile a list.
I should note that the name field can be replaced with any identifier that should be the same between the two.
I should also note that many people have the same name so any unique identifer two all the records in the result set should be used.
Also using the same concept as laid out above you should be able to change it to suit your needs. If you have more then 2 rows per employee then this will have to be altered or at least ran again after you remove the rows until all row have been found <--- That may take some time all depending.
then when done with the clean up create a unique constraint on the unique identifier field --> not the table identity column <-- that would not work. That should eliminate the need for a stored procedure.
A great resource for this is Books online <--- Comes with SQL Server 2000
December 9, 2007 at 11:30 am
Non _ cursor method....(If I made the wrong assumption on which is the duplicate employee_id (highest or lowest), change max to min and > to <)
select min(e1.employee_id) 'employee_id', e1.employee_fname 'employee_fname',
e1.employee_lname 'employee_lname', e2.employee_id 'Duplicate_employee_id',
e2.employee_fname 'Duplicate_employee_fName', e2.employee_lname 'Duplicate employee_lname'
from employee e1
join employee e2 on e1.employee_id < e2.employee_id and e1.employee_fname = e2.employee_fname
and e1.employee_lname = e2.employee_lname
group by e1.employee_fname, e1. employee_lname, e2.employee_id, e2.employee_fname,
e2.employee_lname
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply