SQL query help to save my bacon

Question

SQL query help to save my bacon

javawanabe

SSC Rookie

Points: 49
More actions
March 7, 2004 at 8:39 am

#161230

Hi,
I made a boo-boo with an import query and would like some help. I maintain a hockey stats database, and for somereason the IS NULL command did not work on my prod server (worked great on my test system) and as a result, I imported the stats of some 450 players in DUPLICATE.
So, I am hoping that I can write a query which will allow me to identify the duplicates and delete them. The stats table has a team_id, player_id and season_id. So I know that if there is more than 1 instance of the same values for those 3 columns, it is a duplicate.
I am ssuming I need to write a query that groups on team_id, player_id and season_id where the count for each of those items is > 1. But I have had no luck so far.
Any help appreciated
JW

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

ray_higdon SSCarpal Tunnel Points: 4589 More actions · Answer 1

There are other cooler ways to do this but the fastest way might be to do a select distinct (those 3 columns) into a second table, dump the original table and re-import from the second table into the original

------------

Ray Higdon MCSE, MCDBA, CCNA

fperkins Ten Centuries Points: 1029 More actions · Answer 2

fperkins

Ten Centuries

Points: 1029

March 7, 2004 at 7:47 pm

#497732

Discussion of some alternatives

fperkins Ten Centuries Points: 1029 More actions · Answer 3

http://www.databasejournal.com/features/mssql/article.php/1438651

duh!! Sending the link might help.

javawanabe SSC Rookie Points: 49 More actions · Answer 4

Hmm, that was too easy. I'm always looking for the hard way!

JW

Jack Corbett SSC Guru Points: 184394 More actions · Answer 5

You could save yourself a lot of heartache if you put a unique index (constraint) on team_id, player_id and season_id then you could not insert duplicates without an error.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

ricks779 Newbie Points: 4 More actions · Answer 6

Deleting duplicates can be done in a single T-SQL command,

when the table has two unique keys that don't have any columns in common.

For a table with only one unique key, add an identity column,

perform the delete, then remove the identity column.

create table sample_tbl

(

id int identity,

key1 int,

key2 int,

attr1 varchar (12),

attr2 varchar (12),

attr3 varchar (12)

)

Remove the duplicate rows, keeping only one of each

delete a

from dbo.sample_tbl as a

join dbo.sample_tbl as b

on a.key1 = b.key1

and a.key2 = b.key2

where a.id < b.id -- "<" keeps the row with the highest identity value

-- ">" keeps the row with the lowest identity value

If you want to review the data before deleting...

select num_dups, *

from sample_tbl as a

-- get duplicate keys

join (select key1, key2, count (*) as num_dups

from dbo.sample_tbl

group by key1, key2

having count (*) > 1

) as b

on a.key1 = b.key1

and a.key2 = b.key2

order by a.key1, a.key2

vijendersaini SSC Rookie Points: 34 More actions · Answer 7

Hi !

If the table structure is like.

Tbl1

(

pid integer,

.... some fields

)

then u can delete the duplicates as

delete from tbl1 where pid not in (select max(pid) from tbl1)

If u don;t have one then u can just add one identity column with seed 1 and it will work like pid.

It shud work...

--Vijendra

"Beginning is Half Done"

ray_higdon SSCarpal Tunnel Points: 4589 More actions · Answer 8

"delete from tbl1 where pid not in (select max(pid) from tbl1)"

Wouldn't that delete all rows exept the one with the max PID?

------------

Ray Higdon MCSE, MCDBA, CCNA

vijendersaini SSC Rookie Points: 34 More actions · Answer 9

I think that was needed !!!

regards

Vijendra...

"Beginning is Half Done"