Tracking "dups" in a table.

  • We have a solicitation database with the usual information: First name, last name, address, etc.

    I populated the database 6 months ago by dumping a list of names (based on geography) from a commercially available product. Now they want me to dump the names again from the same counties using the same product.

    Then after a selection process those names would be added to the same database as the original names. It looks like many of the names will be the same, resulting in duplicate database records. I don't want the dups but I do want to track the number of times we have solicited a party.

    I need suggestions on how to approach this problem.

    TIA,

    Bill

    I'd create an index based on fields in the original data, dump the "new" data to a temp table, create an index based on the same fields, and do a join against the indexes to recognize the dups. When a dup record is caught I'd update a "Count" field in the original record.

  • This was removed by the editor as SPAM

  • Write a SQL script something like the following:

    1. Import your "new" data into a #Temp table

    Maybe even in a disposable "temp" database

    2. Index the #Temp table if massive data

    3. INSERT INTO ExistingData

    SELECT * FROM NewData

    WHERE name NOT IN (SELECT NAME FROM ExistingData) -- or WHEREever

    4. Cleanup temp stuff

    Basic principal I use, is to get the new data into SQL tables, then either manipulate the new data, then update/InsertInto the existing, or perform the the update/inserts in one big step.



    Once you understand the BITs, all the pieces come together

  • I would use this little piece of code. First create a DupeId column in your main table. Then dump in your new data. Note: This will only work with exact matches.

    Select FullName, Count(FullName) as DupeId

    Into #Temp

    From Employees

    Group BY FullName

    HAVING Count(FullName)>1

    Update Employees

    Set Employees.DupeId = #Temp.DupeId

    From Employees, #Temp

    Where Employees.FullName = #Temp.FullName

    Drop Table #Temp

    Anyway, I hope this helps

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply