Query Assistance

Question

Query Assistance

Belisama

SSC Enthusiast

Points: 188
More actions
September 8, 2006 at 12:30 pm

#115340

I recently took over when our DBA left. I have very little experience with SQL Server, and the most complicated SQL I've written is basic select, insert, update and delete statements. My first emergency occured today, when a client has duplicate records in his database because of an update we sent out a while ago. I know why the duplications are occuring, but since the update went out already, I can't recall it. I can only fix the existing problems.

So, here's the setup:
I have this single table of Steps. There are several fields, but the only ones we care about are here, along with some sample data.
in_StepID in_TaskID
2589        1462
2592        1466
2680        1546
2790        1550
2896        1462
2966        1466

What I need returned, are rows that have the same in_TaskID occuring. So, in this case, I need in_StepIDs 2589, 2592, 2896 and 2966.
I have this query which I wrote, which returns everything. I can't figure out how to get only the rows I need. I am fairly certain I need to use Count() but nothing I've done has worked.
select
b.in_TaskID, a.in_StepID
from step as a inner join step as b
on a.in_StepID = b.in_StepID
group
by b.in_TaskID, a.in_StepID
order
by b.in_TaskID
Thank you very much for any help. I'm sure I'll need more once we get this part working.
em
P.S. I had to edit this when my manager complained that I had real information And Lucky suggested the table structure. Hopefully, this is what is meant:
CREATE TABLE [TD_TASK_STEP](
[in_StepID] [int] IDENTITY(1,1) NOT NULL,
[dc_UnitsMultiplier] [decimal](14, 8) NULL,
[dc_UnitsMultiplierCorp] [decimal](14, 8) NULL,
[tx_StepDesc] [varchar](900) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[tx_Comments] [varchar](1024) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[in_StepOrder] [int] NOT NULL,
[in_CorpStepID] [int] NULL,
[tx_UserName] [varchar](8) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[dt_LastChanged] [datetime] NULL,
[in_ContentID] [int] NOT NULL,
[in_TaskID] [int] NOT NULL,
CONSTRAINT [PK__TD_TASK_STEP] PRIMARY KEY CLUSTERED
(
[in_StepID] ASC
) ON [PRIMARY]
) ON [PRIMARY]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

lucky-80472 SSCertifiable Points: 7793 More actions · Answer 1

Hello,

Can you post the table structures which you are using in your query? It might give an insight for anybody to help you out.

Thanks

Lucky

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

Try this query in your production environment:

select

*

from

dbo.TD_TASK_STEP a

where

exists ( select

1

from

dbo.TD_TASK_STEP b

where

b.in_StepID <> a.in_StepID

and b.in_TaskID = a.in_TaskID)

hth

John Rowan SSC Guru Points: 56440 More actions · Answer 3

declare @TD_TASK_STEP table (

in_StepID int,

in_TaskID int

)

insert into @TD_TASK_STEP

select 2589, 1462 union all

select 2592, 1466 union all

select 2680, 1546 union all

select 2790, 1550 union all

select 2896, 1462 union all

select 2966, 1466

SELECT in_StepID, a.in_TaskID

FROM @TD_TASK_STEP a

INNER JOIN (

SELECT in_TaskID

FROM @TD_TASK_STEP

GROUP BY in_TaskID

HAVING COUNT(*) > 1

) b

ON a.in_TaskID = b.in_TaskID

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

Jeff Moden SSC Guru Points: 1004436 More actions · Answer 4

Emma,

Both Lynn and John solutions will work for finding the duplicates... the question is, what do you want to do with them once you've found them?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Belisama SSC Enthusiast Points: 188 More actions · Answer 5

Thank you so much for your help!

Here's what happened, and what I'm doing. We wrote a script that inserted several default Steps for given Tasks. Then, when the user selected these Tasks, the Steps would automatically appear. Unfortunately, we neglected to account for Tasks that already existed and had identical Steps associated, hence the duplicates.

It was decided that I write a script that tests for these duplicates, so first I need to find them, then I need to compare the tx_StepDesc to see if they are identical (This is not something I approve of, since this is a user editable field, but I was overruled.) Finally, I check to see if one of the tx_UserName is PRODLOAD, the one we use when we work with the database. If PRODLOAD is found, I will delete it and the only record that will remain is the one the User created.

Does that make any sense? Am I going about this the right way?

em

Jeff Moden SSC Guru Points: 1004436 More actions · Answer 6

Yes, it does make sense... What I'd recommend is writing a script that would populate a working table (permanent table that will last until this problem is resolved and then deleted) that captures your duplicate conditions and has a column where you could mark the records for deletion. An automated process (built to do as you have described) would make the first update to the deletion column. Then, a review (manual and by script) could be made to assure that there are no exceptions. Fine tuning would be accomplished by manually changing the deletion flag for individual records.

When complete, a simple delte using the working table as the driver could be accomplished.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Belisama SSC Enthusiast Points: 188 More actions · Answer 7

Thank you very much for your help. I did what you said with the working table, and that seemed to work ok. It turns out there was an extra complication, in that there were many duplicates that were supposed to be there.

Oh, and it also turned out that the original person who was unsatisfied solved his own problem by going into the application and deleting the duplicates manually anyway, and no one else has mentioned that this is a problem

So, I just finished it out of curiosity to see if I could get it working. If anyone else complains, I'm ready!

em

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

Why the heck does he have access to the PRODUCTION TABLES ?

Jeff Moden SSC Guru Points: 1004436 More actions · Answer 9

I never ask that question because the answer is usually something that raises my hackles

It's usually ok for Developers to have read access but I like what we do better... we make a "clone" (not sure that's the right word) of the DB using our SAN software and, at midnight, we replace the reporting DB lock, stock, and barrel except for the DB where the Developers can keep their code. It's sometimes a bit of a problem because their data will get overwritten unless they store it in the dev DB (like they're supposed to) but, for the most part, it's worked out really well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Belisama SSC Enthusiast Points: 188 More actions · Answer 10

I'm not sure you would say he has direct access to the Production Tables; he deleted the duplicates through the application that sits on top of the database.

We don't have one single database, we distribute a blank copy of the db and the account fills it up as they need. Supposedly, we're the only ones that can actually access it through the SQL Server tools. In reality, there are several accounts that have fiddled with it rendering our application unstable.

Does that make sense? Is that normal?

Jeff, that sounds like a great idea. We sort of do something similar, in that we have a corporate database which stores all of the information from all of our accounts in one location. It's kind of their backup (since many of them don't keep local backups

) and several company executives have expressed interest in mining that data; kind of neat that I get to learn that I became DBA at the right time!

em