How to remove duplicate records

  • I have a master table in Access to which I append new records downloaded from the mainframe on a weekly basis. Some of the downloaded records match existing records in the master table in all fields except only the update_date field. I want to replace the record in the master table with the record from the mainframe if, and only if, the new record has a later update_date.

    If I leave the primary key turned on, Access just dumps the duplicate records without letting me see what they are. If I turn the primary key off, then I can find the duplicates via the duplicate wizard but need a query to analyze each record and, out of each pair of duplicates, keep the record with the newer update_date.

    I would greatly appreciate any suggests as to how to do append these new records.

  • Send the new records to a staging table. Then insert the new records that are not duplicates.

  • http://www.sqlservercentral.com/columnists/darjun/deletingduplicaterows.asp

    http://www.sqlservercentral.com/columnists/sramakrishnan/deletingduplicaterecords.asp




    My Blog: http://dineshasanka.spaces.live.com/

  • Thanks RGR'us and Dinesh for your prompt replies. I had searched the Access forum for old posts but did not think to do a wider search. The staging table solution and darjun's solution will add unique new records to the master table. sramakrishnan's solution uses a cursor that, I believe, Access does not understand/support. I should have explained this more clearly in my first post that my problem is to determine which or two or more duplicated records should be in the master table.

    The records in my master table have a number fields of which two are relevant: reg_number, which is the primary key, and update_date, which holds the value to be compared. The records downloaded from the mainframe either have new reg_numbers, in which case they can be appended directly from the staging table to the masterr table, or they have a reg_number that already exists in the master table, in which case I need to examine the value in update_date to determine which record, master table or staging table, is the newest one.

    If the record in the master table has the lastest update_date then the record in the staging table should be ignored/deleted. If the record in the staging table has the latest update_date, then the record in the master table should be deleted and the record in the staging table should be appended to the master table. Note: there may be more than one record in the staging table that has the same reg_number so each of these records must be examined.

    Again, I would greatly appreciate any suggestions/solutions.

    Regards,

  • You pretty much answered your own question. Insert the records that don't exists in master.

    Update the ones that exists and have a greater date version than the one in master.

    Drop the records in the staging table and you're done.

  • RGR'us,

    Thanks. It's obvious - once someone points it out to you.

  • It's often easier when you see the picture with some fresh eyes .

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply