Find matching phone of person based on relation type - duplicates

  • I have a patient record and emergency contact information. I need to find duplicate phone numbers in emergency contact table based on relationship type (RelationType0 between emergency contact and patient. For example, if patient was a child and has mother listed twice with same number, I need to filter these records. The case would be true if there was a father listed, in any cases there should be one father or one mother listed for patient regardless. The link between patient and emergency contact is person_gu. If two siblings linked to same person_gu, there should be still one emergency contact listed.

    Below is the schema structure:

    Person_Info: PersonID, Person Info contains everyone (patient, vistor, Emergecy contact) First and last names

    Patient_Info: PatientID, table contains patient ID and other information

    Patient_PersonRelation: Person_ID, patientID, RelationType

    Address: Contains address of all person and patient (key PersonID)

    Phone: Contains phone # of everyone (key is personID)

    The goal to find matching phone for same person based on relationship type (If siblings, then only list one record for parent because the matching phones are not duplicates)

  • One method to do that is to join back to the table based on Phone where personid <> personid.

    You could also implement a row_number() function on the dataset, based on Phone and only select out the rows that have a row_number > 1.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Thanks for the schema, but to get a faster (and better answer) please provide the create table statements along with some sample code including the issue you are trying to solve along with the desired output of the query.

    With that said you should be able to use a group by with a count to come up with a list of duplicate phones by patient by contact type. If you provide the items I mention above then someone here will give the exact script you need.



    Microsoft Certified Master - SQL Server 2008
    Follow me on twitter: @keith_tate

    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Thank you so much, attached is a schema sample with some data.

  • That helps, can you provide the desired output from the sample data?



    Microsoft Certified Master - SQL Server 2008
    Follow me on twitter: @keith_tate

    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Here is an updated schema with better data to generate OP

    Kindly please get the O/P as following

    PersonName PatientID PatientName Phone Relation

    In the example I provided, the o/p should list one record for Larry and Nancy even though they are listed as parents for both Sam and Suzzie (just an example). In the phone tbl, the parents are listed more than one time. We want to filter duplicate matching phone for each person (mother has more than one record, father has more than one record)

    Thank you.

  • Using what Jason mentioned above, does this work:;with cte as (

    select

    c.Fname + ' ' + c.lname as PersonName

    ,pt.PatientID

    ,p.Fname + ' ' + p.Lname as PatientName

    ,ph.Phone_number

    ,ROW_NUMBER() over (partition by c.per_pk order by ph.Phone_number) as rn

    from dbo.Person_info p

    join dbo.Patient_T pt on p.Per_PK = pt.PatientID

    join dbo.[Patient_Parent_Relation_T] ppr on pt.PatientID = ppr.PatientID

    join dbo.Person_info c on ppr.Parent_ID = c.Per_PK

    join dbo.Phone ph on c.Per_PK = ph.Per_PK)

    select

    PersonName

    ,PatientID

    ,PatientName

    ,Phone_number

    from cte where rn = 1



    Microsoft Certified Master - SQL Server 2008
    Follow me on twitter: @keith_tate

    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Thank you SO VERY much, it worked...

  • One word of caution: based on the filtering being done - you will get inconsistent results if you run the query multiple times. As in - since your query is designed to return on phone+phone and a SINGLE patient that matches that contact, there's no guarantee than you will get the SAME patient next time you run the query. If someone doesn't understand that to be a possibility, including the patient info is likely to caused a LOT of confusion.

    Also - not to bring up a completely separate topic, but in this day and age, any presumptions around "each child should have only ONE mother and ONE father" is likely to be challenged by reality. I'd tread lightly with that one.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I agree, but records entered many times without checking if child's parent is there or not.

    I am verifying the query against real data, and let you know if I have any questions. Thanks again.

  • I ran the query against real data, and noticed the query does not take into account relationship_type. For example, mother and father can both have same phone. In that case, for each person (mother or father) the record is not duplicate if we take each person separate.

  • lsalih (8/11/2014)


    I ran the query against real data, and noticed the query does not take into account relationship_type. For example, mother and father can both have same phone. In that case, for each person (mother or father) the record is not duplicate if we take each person separate.

    You're running the PARTITION BY according to the contact person's primary key (which should return each person just once, regardless of what relationship they have).

    What is the ultimate intent:

    a. each distinct person+relationship+phone number, associated with ONE of the patients

    b. each distinct person+relationship+phone number, associated with ALL of the patients

    c. each distinct relationship+phone number, associated with ONE of the patients

    d. ???

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • lsalih (8/11/2014)


    I ran the query against real data, and noticed the query does not take into account relationship_type. For example, mother and father can both have same phone. In that case, for each person (mother or father) the record is not duplicate if we take each person separate.

    Can you provide a better sample of data to illustrate this and what your desired output should look like?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • The ultimate goal is to find duplicates. The reason we need to take relationship and phone into account because that is where most of the duplicates are.

    For example:

    Patient 1, patient 2 and 3 are siblings

    each time they were entered into the system a parent record was tied to them. Many times when data was entered, search by parent phone was not made to verify if parents exist or not. Thus same parent was entered more than once. The key is to find matching phone numbers for each relationship type tied to child, if more than one phone per relationship type exist then we need to list it. The reason why relationship type is used because we then know it is the same mother or father for the child.

    For example:

    Child1 parent1 M 123

    Child1 parent2 M 123

    O/P:

    Child1 Parent1 M 123

  • Sorry if we're being tentative - but de-duping tends to be tricky, especially when using phone # as the basis. I keep staring that this one - and am not yet sure how to take it on.

    A few questions:

    1. Have you de-duped the phone numbers by person already? If the same person is associated with the same phone # multiple times, that sounds like a fairly easy win.

    2. Say that you have 40 separate contacts all living in a half-way home. It is going to be VERY likely that they all have the same phone # (the number of the shelter), and will have the same relationship. How do you see handling this?

    3. Say that Larry and John adopt a child together, and they live under the same roof. Both will be listed as a father to said child. What's the behavior?

    4. How do you plan on using the sibling data? Trying to use that correlate parent relationship is going to be tricky (not to sound like a Jerry Springer show - siblings may not have the same set of parents).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply