Select data but exclude Duplicates

Question

Post reply

Select data but exclude Duplicates

Charlottecb

Hall of Fame

Points: 3932
More actions
June 29, 2005 at 4:55 am

#165235

Hi all (SQL Server 2000)
Can someone help me with the following problem.
table (tblSiblings) structure as follows:
intSiblingID,
intStudent1ID
intStudent2ID

Data looks like the following
1    student 1    student 2
2     student 2    student 1
3     student 3    student 4
4    student 4    student 3
5     student x    student y
6     student y    student x
This was set up by a previous developer. I want to extract 1 of every unique relationship in a select statement. At present, every (as far as I am aware) relationship is set up twice (the second version being inverted). I don't care if I select the normal version or the inverted version, I just need one of each relationship eg 1 to 2, 3 to 4, x to y.
Any suggestions on how to code this. I was thinkinbg of a not in sub-select but can't get this to work.
any help appreciated.
thanks
CCB

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 1

With the data you provided, this should work:

SELECT x.intStudent1ID, x.intStudent2ID

FROM tblSiblings x

INNER JOIN tblSiblings y

ON x.intStudent1ID = y.intStudent2ID

AND x.intSiblingID < y.intSiblingID

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/

Charlottecb Hall of Fame Points: 3932 More actions · Answer 2

Charlottecb

Hall of Fame

Points: 3932

June 29, 2005 at 5:22 am

#569260

Perfect.

Thanks for the quick reply

CCb.

Anna Pimenova Mr or Mrs. 500 Points: 561 More actions · Answer 3

You may try the following code:

select S1.intSiblingID,

S1.intStudent1ID,

S1.intStudent2ID

from #stud S1 inner join #stud S2 on

S1.intStudent1ID = S2.intStudent2ID

and S1.intStudent2ID = S2.intStudent1ID

and S1.intSiblingID < S2.intSiblingID

order by S1.intSiblingID

But it will work for those data rows that *have* duplicates

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

Maybe you could add a little check constraint to avoid those kind of dulicates :

check : intStudent1ID <= intStudent2ID

that way 1, 1 is valid (but only once because I assume that you have a unique constraint in there... by the way this should really be your primary key clustered). Also 1, 2 is valid but 2, 1 is not so that solve your problem for good.

Charlottecb Hall of Fame Points: 3932 More actions · Answer 5

Hi Remi,

Thanks for the advice. I am actual extracting the data from a legacy DB and uploadinging it into a new one where the proper constraints are set up.

Regards

CCB

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 6

Happy to hear that .

Did you also know that intColName is a worst practice (in case the datatype changes down the line)??

Charlottecb Hall of Fame Points: 3932 More actions · Answer 7

Charlottecb

Hall of Fame

Points: 3932

June 29, 2005 at 9:50 am

#569384

Hi Remi,

No I d'dn't know that.

What do you use ?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

The actual name that makes sens. Here's an article on this matter :

Worst Practices - Part 1 of a Very Long Series!

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 9

Hungarian notation in the way it was recommended by Microsoft (it is no longer recommended) is wrong in every way and has always been. That goes for general programming and even more so for naming database objects. Columns should be named to show what property of the entity they hold, nothing more.

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

I agree for database objects. But I like to name my variables bVerdict for booleans, sSomething for strings. Saves a lot of time on long procedures. But I definitly stay away from that when naming columns. However I prefix the views with vw, the functions with fn and table functions with fnvw. That's it .

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 11

Joel Spolsky has a great article that discusses why Hungarian notation is stupid. Or rather the Hungarian notation as it was recommended (and used) by Microsoft, which is not at all the same as the inventor meant. The article is really about a bigger subject (still very interesting), so if only interested in Hungarian notation then skip to the end (the section called "I'm Hungary").

Making Wrong Code Look Wrong

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/