Remove duplicate entries from a table with text fields!

  • I have a temporary table which I've populated with a set of records containing the following fields.

    org_code

    acct_code

    note_code

    Note_class

    Note_text (text)

    HTML_text (text)

     

    The table contains a few duplicate entries and I need to ultimately arrive at a table that eliminates the duplicate combinations of (org_code, Acct_code, Note_code and Note_class) essentially creating a unique index of the 4 columns.

    Example of dups

    SELECT CC025_ORG_CODE, CC025_EXT_ACCT_CODE, CC025_NOTE_CLASS, CC025_PROD_CODE

    FROM #TEMP_WEB_NOTES

    GROUP BY CC025_ORG_CODE, CC025_EXT_ACCT_CODE, CC025_NOTE_CLASS, CC025_PROD_CODE

    HAVING COUNT(*) > 1

    10 0000745 SFB RETAILSF

    10 0000745 WEB RETAILSF

    10 0012253 PTPG RTLCOMPSOUTH

    So essentially I need to remove all but one of each of the above records from the table, or create a new table that contains all other records and only one each of the above records.

     

    Thanks in advance

  • I've not used it with text columns, but if you give your temp table a unique identifier (say an Identity field), you could try something like:-

     

    delete a from #TEMP_WEB_NOTES as a

    inner join #TEMP_WEB_NOTES  as b on

    a.Column1 = b.Column1 and

    a.Column2 = b.Column2 and

    a.Column3 = b.Column3 and...etc

    ...where a.Identifier < b.Identifier

    I don't know if this would work if you wanted to remove duplicates within the text columns though...

  • How about just pushing the table into another temp table and use that:

    Insert #NewTempWebNotes (........) SELECT CC025_ORG_CODE, CC025_EXT_ACCT_CODE, CC025_NOTE_CLASS, CC025_PROD_CODE, max(Note_text), max(HTML_text)

    FROM #TEMP_WEB_NOTES

    GROUP BY CC025_ORG_CODE, CC025_EXT_ACCT_CODE, CC025_NOTE_CLASS, CC025_PROD_CODE

  • I discovered the answer from another post, here's the solution:

    DECLARE csr1 CURSOR FOR

    SELECT CC025_ORG_CODE, CC025_EXT_ACCT_CODE, CC025_NOTE_CLASS, CC025_PROD_CODE, COUNT(*)

    FROM #TEMP_SFCVB_WEB_NOTES -- Give your tablename having duplicate rows

    GROUP BY CC025_ORG_CODE, CC025_EXT_ACCT_CODE, CC025_NOTE_CLASS, CC025_PROD_CODE

    HAVING COUNT(*) > 1 --select values having at least one duplicate only

    FOR READ ONLY

    DECLARE @CC025_ORG_CODE VARCHAR(2)

    DECLARE @CC025_EXT_ACCT_CODE VARCHAR(8)

    DECLARE @CC025_NOTE_CLASS VARCHAR(6)

    DECLARE @CC025_PROD_CODE VARCHAR(24)

    DECLARE @COUNTER INT

    OPEN csr1

    FETCH NEXT FROM csr1 INTO @CC025_ORG_CODE, @CC025_EXT_ACCT_CODE, @CC025_NOTE_CLASS, @CC025_PROD_CODE, @COUNTER

    WHILE @@FETCH_STATUS <> -1

    BEGIN

    IF @@FETCH_STATUS <> -2

    BEGIN

    SET @COUNTER = @COUNTER - 1 --delete all duplicates except one (1)

    SET ROWCOUNT @COUNTER

    DELETE FROM #TEMP_SFCVB_WEB_NOTES -- Give same tablename given above for removing duplicate rows

    WHERE CC025_ORG_CODE = @CC025_ORG_CODE AND CC025_EXT_ACCT_CODE = @CC025_EXT_ACCT_CODE AND CC025_NOTE_CLASS = @CC025_NOTE_CLASS

    AND CC025_PROD_CODE = @CC025_PROD_CODE

    END --IF

    FETCH NEXT FROM csr1 INTO @CC025_ORG_CODE, @CC025_EXT_ACCT_CODE, @CC025_NOTE_CLASS, @CC025_PROD_CODE, @COUNTER

    END --WHILE

    CLOSE csr1

    DEALLOCATE csr1

    SET ROWCOUNT 0 --restore default of all rows

    Thank you all for your assistance

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply