Finding Dupes based on previous weeks input.

Question

Finding Dupes based on previous weeks input.

dramaqueen

SSCommitted

Points: 1569
More actions
October 26, 2016 at 7:35 am

#400941

I am currently looking at identifying duplicate names and looking for behavior patters. A duplicate in this case is very high level and there is risk involved as you will see.
Definition of a duplicate: The same first name, surname and date of birth. Obviously there are genuine different people that hit this criteria (this dataset is approx. 3 million records) but it does not matter at the moment.
I am looking at a data set for the previous week and running a query to identify new people that have been entered into the database. I use a sub query to get this and there are about 757 from last week.
I need to create a SQL script that can take there 757 and cross reference / compare it to the same table to see if any duplicates come up that match the criteria of same first name, surname and data of birth.
I have tried to give it ago but it seems to just show the same records side by side so need to exclude people with matching ID's as this would be pointless.
SELECT O.ow_id,
O.ow_forenames,
O.ow_surname,
O.ow_birthdate,
O2.dp_person,
O2.ow_forenames,
O2.ow_surname,
O2.ow_birthdate
FROM props.dp_owner O
join (SELECT E.dp_person,
O3.ow_forenames,
O3.ow_surname,
O3.ow_birthdate
FROM props.dp_entry E
join props.dp_owner O3
ON E.dp_person = O3.ow_id
WHERE E.dp_date_entered BETWEEN To_date('17-OCT-16') AND To_date(
'23-OCT-16')
AND E.dp_person IS NOT NULL
GROUP BY E.dp_person,
O3.ow_forenames,
O3.ow_surname,
O3.ow_birthdate) O2
ON ( O.ow_forenames = O2.ow_forenames
AND O.ow_surname = O2.ow_surname
AND O.ow_birthdate = O2.ow_birthdate )
Thanks in advance,
Paul

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Luis Cazares SSC Guru Points: 183706 More actions · Answer 1

Is this Oracle? :crazy:

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Phil Parkin SSC Guru Points: 246992 More actions · Answer 2

dramaqueen (10/26/2016)
I am currently looking at identifying duplicate names and looking for behavior patters. A duplicate in this case is very high level and there is risk involved as you will see.
Definition of a duplicate: The same first name, surname and date of birth. Obviously there are genuine different people that hit this criteria (this dataset is approx. 3 million records) but it does not matter at the moment.
I am looking at a data set for the previous week and running a query to identify new people that have been entered into the database. I use a sub query to get this and there are about 757 from last week.
I need to create a SQL script that can take there 757 and cross reference / compare it to the same table to see if any duplicates come up that match the criteria of same first name, surname and data of birth.
I have tried to give it ago but it seems to just show the same records side by side so need to exclude people with matching ID's as this would be pointless.
SELECT O.ow_id,
O.ow_forenames,
O.ow_surname,
O.ow_birthdate,
O2.dp_person,
O2.ow_forenames,
O2.ow_surname,
O2.ow_birthdate
FROM props.dp_owner O
join (SELECT E.dp_person,
O3.ow_forenames,
O3.ow_surname,
O3.ow_birthdate
FROM props.dp_entry E
join props.dp_owner O3
ON E.dp_person = O3.ow_id
WHERE E.dp_date_entered BETWEEN To_date('17-OCT-16') AND To_date(
'23-OCT-16')
AND E.dp_person IS NOT NULL
GROUP BY E.dp_person,
O3.ow_forenames,
O3.ow_surname,
O3.ow_birthdate) O2
ON ( O.ow_forenames = O2.ow_forenames
AND O.ow_surname = O2.ow_surname
AND O.ow_birthdate = O2.ow_birthdate )
Thanks in advance,
Paul

So ... add a line to the end of the query to remove those:

WHERE o.ow_id <> o2.ow_id

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

dramaqueen SSCommitted Points: 1569 More actions · Answer 3

dramaqueen

SSCommitted

Points: 1569

October 26, 2016 at 8:29 am

#1908916

Perfect thanks. It worked and was really simple.