Query Assistance

  • I recently took over when our DBA left. I have very little experience with SQL Server, and the most complicated SQL I've written is basic select, insert, update and delete statements. My first emergency occured today, when a client has duplicate records in his database because of an update we sent out a while ago. I know why the duplications are occuring, but since the update went out already, I can't recall it. I can only fix the existing problems.
     
    So, here's the setup:
    I have this single table of Steps. There are several fields, but the only ones we care about are here, along with some sample data.
    in_StepID  in_TaskID

    2589        1462

    2592        1466

    2680        1546

    2790        1550

    2896        1462

    2966        1466

     
    What I need returned, are rows that have the same in_TaskID occuring. So, in this case, I need in_StepIDs 2589, 2592, 2896 and 2966.
    I have this query which I wrote, which returns everything. I can't figure out how to get only the rows I need. I am fairly certain I need to use Count() but nothing I've done has worked.

    select

    b.in_TaskID, a.in_StepID

    from step as a inner join step as b

    on a.in_StepID = b.in_StepID

    group

    by b.in_TaskID, a.in_StepID

    order

    by b.in_TaskID

    Thank you very much for any help. I'm sure I'll need more once we get this part working.

    em

    P.S. I had to edit this when my manager complained that I had real information And Lucky suggested the table structure. Hopefully, this is what is meant:

    CREATE TABLE [TD_TASK_STEP](

    [in_StepID] [int] IDENTITY(1,1) NOT NULL,

    [dc_UnitsMultiplier] [decimal](14, 8) NULL,

    [dc_UnitsMultiplierCorp] [decimal](14, 8) NULL,

    [tx_StepDesc] [varchar](900) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [tx_Comments] [varchar](1024) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [in_StepOrder] [int] NOT NULL,

    [in_CorpStepID] [int] NULL,

    [tx_UserName] [varchar](8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [dt_LastChanged] [datetime] NULL,

    [in_ContentID] [int] NOT NULL,

    [in_TaskID] [int] NOT NULL,

    CONSTRAINT [PK__TD_TASK_STEP] PRIMARY KEY CLUSTERED

    (

      [in_StepID] ASC

    ) ON [PRIMARY]

    ) ON [PRIMARY]

  • Hello,

    Can you post the table structures which you are using in your query? It might give an insight for anybody to help you out.

    Thanks


    Lucky

  • Try this query in your production environment:

    select

        *

    from

        dbo.TD_TASK_STEP a

    where

        exists (    select

                        1

                    from

                        dbo.TD_TASK_STEP b

                    where

                        b.in_StepID <> a.in_StepID

                        and b.in_TaskID = a.in_TaskID)

    hth

     

  • declare @TD_TASK_STEP table (

        in_StepID int,

        in_TaskID int

        )

    insert into @TD_TASK_STEP

    select 2589, 1462 union all

    select 2592, 1466 union all

    select 2680, 1546 union all

    select 2790, 1550 union all

    select 2896, 1462 union all

    select 2966, 1466

    SELECT in_StepID, a.in_TaskID

    FROM @TD_TASK_STEP a

        INNER JOIN (

                    SELECT in_TaskID

                    FROM @TD_TASK_STEP

                    GROUP BY in_TaskID

                    HAVING COUNT(*) > 1

                    ) b

        ON a.in_TaskID = b.in_TaskID

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Emma,

    Both Lynn and John solutions will work for finding the duplicates... the question is, what do you want to do with them once you've found them?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you so much for your help!
     
    Here's what happened, and what I'm doing. We wrote a script that inserted several default Steps for given Tasks. Then, when the user selected these Tasks, the Steps would automatically appear. Unfortunately, we neglected to account for Tasks that already existed and had identical Steps associated, hence the duplicates.
     
    It was decided that I write a script that tests for these duplicates, so first I need to find them, then I need to compare the tx_StepDesc to see if they are identical (This is not something I approve of, since this is a user editable field, but I was overruled.) Finally, I check to see if one of the tx_UserName is PRODLOAD, the one we use when we work with the database. If PRODLOAD is found, I will delete it and the only record that will remain is the one the User created.
     
    Does that make any sense? Am I going about this the right way?
     
    em
  • Yes, it does make sense... What I'd recommend is writing a script that would populate a working table (permanent table that will last until this problem is resolved and then deleted) that captures your duplicate conditions and has a column where you could mark the records for deletion.  An automated process (built to do as you have described) would make the first update to the deletion column.  Then, a review (manual and by script) could be made to assure that there are no exceptions.  Fine tuning would be accomplished by manually changing the deletion flag for individual records.

    When complete, a simple delte using the working table as the driver could be accomplished.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you very much for your help. I did what you said with the working table, and that seemed to work ok. It turns out there was an extra complication, in that there were many duplicates that were supposed to be there. Oh, and it also turned out that the original person who was unsatisfied solved his own problem by going into the application and deleting the duplicates manually anyway, and no one else has mentioned that this is a problem So, I just finished it out of curiosity to see if I could get it working. If anyone else complains, I'm ready!
     
    em
  • Why the heck does he have access to the PRODUCTION TABLES ?

  • I never ask that question because the answer is usually something that raises my hackles

    It's usually ok for Developers to have read access but I like what we do better... we make a "clone" (not sure that's the right word) of the DB using our SAN software and, at midnight, we replace the reporting DB lock, stock, and barrel except for the DB where the Developers can keep their code.  It's sometimes a bit of a problem because their data will get overwritten unless they store it in the dev DB (like they're supposed to) but, for the most part, it's worked out really well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm not sure you would say he has direct access to the Production Tables; he deleted the duplicates through the application that sits on top of the database.
     
    We don't have one single database, we distribute a blank copy of the db and the account fills it up as they need. Supposedly, we're the only ones that can actually access it through the SQL Server tools. In reality, there are several accounts that have fiddled with it rendering our application unstable. Does that make sense? Is that normal?
     
    Jeff, that sounds like a great idea. We sort of do something similar, in that we have a corporate database which stores all of the information from all of our accounts in one location. It's kind of their backup (since many of them don't keep local backups ) and several company executives have expressed interest in mining that data; kind of neat that I get to learn that I became DBA at the right time!
     
    em

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply