Made stupid mistake, need help removing duplicates

  • That's good info but I still can't figure out what you want deleted from the data that you've provided.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • These are the duplicates that need to be deleted for this site:

    SELECT '154304W','3','2009','Diospyros virginiana','A','1','Aug 10 2009 10:06AM','1260'UNION ALL

    SELECT '154304W','3','2009','Celtis occidentalis','C','1','Aug 10 2009 10:07AM','1261'UNION ALL

    SELECT '154304W','3','2009','Ailanthus altissima','B','1','Aug 10 2009 10:08AM','1262'UNION ALL

    SELECT '154304W','3','2009','Acer negundo','A','13','Aug 10 2009 10:08AM','1263'UNION ALL

    SELECT '154304W','3','2009','Acer negundo','B','5','Aug 10 2009 10:08AM','1264'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','A','4','Aug 10 2009 10:08AM','1265'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','B','6','Aug 10 2009 10:08AM','1266'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','C','4','Aug 10 2009 10:08AM','1267'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','D','1','Aug 10 2009 10:08AM','1268'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','E','5','Aug 10 2009 10:08AM','1269'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','F','4','Aug 10 2009 10:12AM','1270'UNION ALL

    SELECT '154304W','3','2009','Robinia pseudoacacia','E','1','Aug 10 2009 10:13AM','1271'UNION ALL

    SELECT '154304W','3','2009','Fraxinus pennsylvanica','C','2','Aug 10 2009 10:15AM','1272'UNION ALL

    SELECT '154304W','3','2009','Fraxinus pennsylvanica','D','1','Aug 10 2009 10:15AM','1273'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','B','2','Aug 10 2009 10:15AM','1274'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','C','3','Aug 10 2009 10:15AM','1275'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','D','2','Aug 10 2009 10:15AM','1276'UNION ALL

    SELECT '154304W','3','2009','Ulmus americana','A','2','Aug 10 2009 10:15AM','1277'UNION ALL

    SELECT '154304W','3','2009','Quercus imbricaria','A','1','Aug 10 2009 10:15AM','1278'

  • Okay, so we're back to the original idea of what we were looking to do. However, before you trust this code, make sure that you check the results of the inner query (subquery) before you fire it off.

    I left in the construction code because it wasn't actually a temp table above where I stripped it from. 🙂 This should be perfectly valid in 2k.

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#WetlandTemp','U') IS NOT NULL

    DROP TABLE #WetlandTemp

    --===== Create the test table with

    CREATE TABLE #WetlandTemp

    (

    [SiteID] [nvarchar](25),

    [NumOfVisit] [int] NULL,

    [PYear] [int] NULL,

    [PSID] [nvarchar](255),

    [Szclass] [nvarchar](25),

    [PCount] [float] NULL,

    [DataEntryDate] [datetime] NULL,

    [sort_id] [int] IDENTITY(1,1) NOT NULL,

    )

    --===== All Inserts into the IDENTITY column

    SET IDENTITY_INSERT #WetlandTemp ON

    --===== Insert the test data into the test table

    INSERT INTO #WetlandTemp

    (SiteID, NumOfVisit, PYear, PSID, Szclass, PCount, DataEntryDate, sort_id)

    SELECT '154304W','3','2009','Diospyros virginiana','A','1','Aug 10 2009 10:06AM','1136'UNION ALL

    SELECT '154304W','3','2009','Celtis occidentalis','C','1','Aug 10 2009 10:07AM','1137'UNION ALL

    SELECT '154304W','3','2009','Ailanthus altissima','B','1','Aug 10 2009 10:08AM','1138'UNION ALL

    SELECT '154304W','3','2009','Acer negundo','A','13','Aug 10 2009 10:08AM','1139'UNION ALL

    SELECT '154304W','3','2009','Acer negundo','B','5','Aug 10 2009 10:08AM','1140'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','A','4','Aug 10 2009 10:08AM','1141'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','B','6','Aug 10 2009 10:08AM','1142'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','C','4','Aug 10 2009 10:08AM','1143'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','D','1','Aug 10 2009 10:08AM','1144'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','E','5','Aug 10 2009 10:08AM','1145'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','F','4','Aug 10 2009 10:12AM','1146'UNION ALL

    SELECT '154304W','3','2009','Robinia pseudoacacia','E','1','Aug 10 2009 10:13AM','1147'UNION ALL

    SELECT '154304W','3','2009','Fraxinus pennsylvanica','C','2','Aug 10 2009 10:15AM','1148'UNION ALL

    SELECT '154304W','3','2009','Fraxinus pennsylvanica','D','1','Aug 10 2009 10:15AM','1149'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','B','2','Aug 10 2009 10:15AM','1150'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','C','3','Aug 10 2009 10:15AM','1151'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','D','2','Aug 10 2009 10:15AM','1152'UNION ALL

    SELECT '154304W','3','2009','Ulmus americana','A','2','Aug 10 2009 10:15AM','1153'UNION ALL

    SELECT '154304W','3','2009','Quercus imbricaria','A','1','Aug 10 2009 10:15AM','1154'UNION ALL

    SELECT '154304W','3','2009','Diospyros virginiana','A','1','Aug 10 2009 10:06AM','1260'UNION ALL

    SELECT '154304W','3','2009','Celtis occidentalis','C','1','Aug 10 2009 10:07AM','1261'UNION ALL

    SELECT '154304W','3','2009','Ailanthus altissima','B','1','Aug 10 2009 10:08AM','1262'UNION ALL

    SELECT '154304W','3','2009','Acer negundo','A','13','Aug 10 2009 10:08AM','1263'UNION ALL

    SELECT '154304W','3','2009','Acer negundo','B','5','Aug 10 2009 10:08AM','1264'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','A','4','Aug 10 2009 10:08AM','1265'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','B','6','Aug 10 2009 10:08AM','1266'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','C','4','Aug 10 2009 10:08AM','1267'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','D','1','Aug 10 2009 10:08AM','1268'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','E','5','Aug 10 2009 10:08AM','1269'UNION ALL

    SELECT '154304W','3','2009','Acer saccharinum','F','4','Aug 10 2009 10:12AM','1270'UNION ALL

    SELECT '154304W','3','2009','Robinia pseudoacacia','E','1','Aug 10 2009 10:13AM','1271'UNION ALL

    SELECT '154304W','3','2009','Fraxinus pennsylvanica','C','2','Aug 10 2009 10:15AM','1272'UNION ALL

    SELECT '154304W','3','2009','Fraxinus pennsylvanica','D','1','Aug 10 2009 10:15AM','1273'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','B','2','Aug 10 2009 10:15AM','1274'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','C','3','Aug 10 2009 10:15AM','1275'UNION ALL

    SELECT '154304W','3','2009','Quercus palustris','D','2','Aug 10 2009 10:15AM','1276'UNION ALL

    SELECT '154304W','3','2009','Ulmus americana','A','2','Aug 10 2009 10:15AM','1277'UNION ALL

    SELECT '154304W','3','2009','Quercus imbricaria','A','1','Aug 10 2009 10:15AM','1278'

    --===== Set the identity insert back to normal

    SET IDENTITY_INSERT #WetlandTemp OFF

    DELETE wt

    FROM

    #WetlandTemp AS wt

    JOIN

    (SELECT

    PSID, SZclass, PCount,

    MAX(sort_id) AS RemoveValue

    FROM

    #WetlandTemp

    GROUP BY

    PSID, SZclass, PCount

    HAVING

    COUNT(*) > 1

    ) AS drv

    ON

    wt.sort_id = drv.RemoveValue

    SELECT * FROM #WetlandTemp


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • So I ran the test code and it worked. I'm guessing my next step is to try it on a section of the real table, but I had a couple of questions:

    Do I just change the source table from WetlandTemp to the name of the real table?

    In the real table, I've got data from about 30 sites, so when I'm writing the code do I need to include SiteID in the SELECT statement to make sure I'm just getting rid of duplicates within a site?

    When I run this code I also want to restrict it to year 2009, so I just include a WHERE statement, correct?

  • ryetimothy (12/7/2012)


    So I ran the test code and it worked. I'm guessing my next step is to try it on a section of the real table, but I had a couple of questions:

    Do I just change the source table from WetlandTemp to the name of the real table?

    In the real table, I've got data from about 30 sites, so when I'm writing the code do I need to include SiteID in the SELECT statement to make sure I'm just getting rid of duplicates within a site?

    When I run this code I also want to restrict it to year 2009, so I just include a WHERE statement, correct?

    Yes, Definately, and oh heck yes.

    Without seeing the actual table it's hard for me to say for sure. MAKE A BACKUP! Better yet, work things out on a non-production copy of the DB. Things can go horribly wrong the first few times you're trying to do work like this and you definately want to be able to protect yourself from data missteps.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • ryetimothy (12/7/2012)


    So I ran the test code and it worked. I'm guessing my next step is to try it on a section of the real table, but I had a couple of questions:

    Do I just change the source table from WetlandTemp to the name of the real table?

    In the real table, I've got data from about 30 sites, so when I'm writing the code do I need to include SiteID in the SELECT statement to make sure I'm just getting rid of duplicates within a site?

    When I run this code I also want to restrict it to year 2009, so I just include a WHERE statement, correct?

    NO!!!! You haven't actually tested the code against the real data yet. Make a copy of the real table and test it there espcially sinc eyou're addig caveat criteria to the code!

    Also, this is a rather wholesale change to the data. Make sure the database (or, at least the whole table) has been backed up before proceding with a real run.

    {edit} I see Craig already covered the backup recommendation. I didn't look any further down when I saw the message. All I saw was "Oh heck yes" and that prompted a bit of urgency on my part.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've definitely backed up the database before trying this, but it looks like it works!!!!!!!!!

    I should be able to fiddle with the code to take care of the rest of the duplicates. Thank you so much for your help!!!!!

  • Jeff Moden (12/7/2012)


    {edit} I see Craig already covered the backup recommendation. I didn't look any further down when I saw the message. All I saw was "Oh heck yes" and that prompted a bit of urgency on my part.

    LOL, no worries Jeff. I do cowboy a bit more than you but only when I am making assumptions about the skill levels of those involved. 🙂

    Timothy, glad it's working. Chain your changes into multiple scripts, do an end to end test in your Dev environment, make sure nothing 'extra' got lost, so you can roll it intact and fully tested to the real prod environment.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I've got a few things that have slipped through the cracks (for whatever reason), but it looks like a manageable number. Thanks again for all the help!

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply