Subquery syntax?

Question

Subquery syntax?

kirk-510417

SSC Eights!

Points: 809
More actions
January 17, 2008 at 11:19 am

#180761

I am trying to learn the basics of subqueries. Here i want to return all columns from MyTable for aspecific value of Specialty for distinct name IDs.
NameID is not unique in MyTable. So WHERE Specialty = 'X1' will return many rows with the same NameID. I need one instance of each NameID where Specialty = 'X1'. What should the syntax be?
SELECT *
FROM MyTable
WHERE IN (SELECT DISTINCT NameID FROM MyTable WHERE Specialty = 'X1')

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Adam Haines SSC-Insane Points: 23217 More actions · Answer 1

The problem with this syntax is you can return a name id without Specialty = 'X1' because the outerquery is only looking for NameIDs. If a NameID has two entries one with Specialty = 'X1' , while the other does not, both are still returned.

You should have an additional filter in the where clause

Subquery Solution:

SELECT *

FROM MyTable

WHERE NameID IN (SELECT DISTINCT NameID FROM MyTable WHERE Specialty = 'X1')

AND Specialty = 'X1'

You should be able to query like this though.

SELECT DISTINCT NameID, Specialty, Col3, Col4 etc

FROM MyTable

WHERE Specialty = 'X1'

My blog: http://jahaines.blogspot.com

Lynn Pettis SSC Guru Points: 442470 More actions · Answer 2

It would help us help you if you provided the DDL for your tables, some sample data, and what your epected output would look like based on the sample data.

Take some time to read this article: http://www.sqlservercentral.com/articles/Best+Practices/61537/

😎

GSquared SSC Guru Points: 260824 More actions · Answer 3

Select Distinct on dbo.MyTable should give you what you want.

Using "Select Distinct" in a subquery for an "in" clause doesn't actually do anything except make more work for the server.

Think of it this way:

Select *

from dbo.MyTable

where name in ('Joe', 'Joe', 'Joe', 'Bob')

vs

Select *

from dbo.MyTable

where name in ('Joe', 'Bob')

Both will give exactly the same result, you'll get all the rows from dbo.MyTable where the name is either "Joe" or "Bob".

Adding "select distinct" to the subquery just tells SQL to get rid of the extra "Joe"s from the list. It doesn't affect the end result, and it actually slows the query down, because removing the duplicates takes more effort than the original match does.

So, if you want distinct results in your end result, put them in the outer query, not the subquery.

Just think of subqueries in your Where statement (the sample you are using), as a list of "these are all the values I want", just like a regular "In" statement. Don't use distinct, or group by, or anything else, just tell the server the list of values that are okay. It will work out how to use that list.

(I'm going into more detail on this because you said you were new to subqueries, so I'm explaining a bit more about them than is necessary for the exact example you gave. I hope that helps.)

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

kirk-510417 SSC Eights! Points: 809 More actions · Answer 4

I am new to MS SQL and am trying to learn how things should be done. Given there are many ways to do the samne thing.

The solution you suggest is exactly what I needed and your explanation has helped me understand both subquery and DISTINCT.

Thank you very much!

kirk-510417 SSC Eights! Points: 809 More actions · Answer 5

GSquared

I understand your explanation. Makes perfect sense. The problem I have had when working with DISTINCT is that all columns in the select list are engaged in the distinct. I have had requests, for example, for distinct NameID and Specialty but include all additional columns in the table. If the data were…

NameID…….Specialty……….ColX

67……………..X1………………A

67……………..X1………………B

These two rows would be unique and appear in the result set. So I could not use a DISTINCT. That is why I went with a subquery and * for the outer select field list. Is there a better way to get all columns in the output but “distinct” on only a few columns?

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 6

This kind of CTE would do the trick:

With myCTE (nameID, name, address, etc,RN) AS

(

Select MT.nameid,

MT.name,

MT.address,

MT.etc,

MT.ROW_NUMBER() OVER (partition by MT.nameID)

from dbo.MyTable MT

)

select myCTE.nameID,myCTE.name,myCTE.address,myCTE.etc

from MyCTE

WHERE myCTE.RN=1

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

JeffB Hall of Fame Points: 3541 More actions · Answer 7

kirk (1/17/2008)
GSquared
I understand your explanation. Makes perfect sense. The problem I have had when working with DISTINCT is that all columns in the select list are engaged in the distinct. I have had requests, for example, for distinct NameID and Specialty but include all additional columns in the table. If the data were…
NameID…….Specialty……….ColX
67……………..X1………………A
67……………..X1………………B
These two rows would be unique and appear in the result set. So I could not use a DISTINCT. That is why I went with a subquery and * for the outer select field list. Is there a better way to get all columns in the output but “distinct” on only a few columns?

Which one of the 2 rows would you want displayed then?

kirk-510417 SSC Eights! Points: 809 More actions · Answer 8

JeffB –

The requester did not care which row was returned. All of ColX was populated. He just wanted a value from ColX.

Matt –

You have blown past any thing I know. I will be looking up what CTE is, and then figure out what your doing here. Thanks for the homework. LOL

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 9

Sorry - I've been getting acquainted with them as well.

Look up Common Table Expressions in BOL to get a primer on how they work. Not bad for some things, but not great for others too....

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

GSquared SSC Guru Points: 260824 More actions · Answer 10

kirk (1/17/2008)
GSquared
I understand your explanation. Makes perfect sense. The problem I have had when working with DISTINCT is that all columns in the select list are engaged in the distinct. I have had requests, for example, for distinct NameID and Specialty but include all additional columns in the table. If the data were…
NameID…….Specialty……….ColX
67……………..X1………………A
67……………..X1………………B
These two rows would be unique and appear in the result set. So I could not use a DISTINCT. That is why I went with a subquery and * for the outer select field list. Is there a better way to get all columns in the output but “distinct” on only a few columns?

If they don't care which one of the ColX columns is displayed, you might try:

select NameID, Specialty, min(ColX) as ColX

from dbo.MyTable

where Specialty = 'X1'

group by NameID, Specialty

That will get you:

NameID Specialty ColX

67 X1 A

Will that do what you need? If so, then just use min() on each column other than the ones you want to group by (NameID and Specialty in this case).

Another solution would be:

select top 1 NameID, Specialty, ColX

from dbo.MyTable

where specialty = 'X1'

That will only return the top row, but it might be easier to build. The first one, you can modify the Where clause to allow for things like, "Where Specialty in ('X1', 'X2')", which would get you one row for each NameID and each specialty.

The second one will only return the first NameID also.

Do you have a unique key (primary key) on the table you are selecting from? If so, which column(s) is it on? Or a unique ID column? In either case, a join on a subquery (I can explain what that means) can be done which will get you the best possible results.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

kirk-510417 SSC Eights! Points: 809 More actions · Answer 11

There is a primary key in the table, MyTableID. Your description if the ways to do this is very good. I am learing a lot. How would the primary key be used to provide asolution?

GSquared SSC Guru Points: 260824 More actions · Answer 12

select NameID, Specialty, ColX

from dbo.MyTable

inner join

(select min(MyTableID) as MinID

from dbo.MyTable

where Specialty = 'X1'

group by NameID) Sub1

on MyTable.MyTableID = Sub1.MinID

That will give you the first instance of each NameID where the Specialty is X1.

The sub-query gets a list of the first MyTableID for each NameID for that specialty. The outer query then gives you the extra columns you want. If you want to add more to the Where clause of the sub-query, you can do so. It's a pretty flexible approach to this kind of thing.

Want the last entry instead of the first: change to max(MyTableID).

Want the specialty to be in a range: change Where to "Where Specialty in (...)"

And so on.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

kirk-510417 SSC Eights! Points: 809 More actions · Answer 13

Got it 🙂 You have really added to my knowledge base and tool bag.

Thank you very much!

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 14

Kirk -

I'm adding to your homework:).

The same idea can be achieve with the CTE (to ensure you always get the same record back within a group), as so (note that the only difference from before is the extra ORDER BY):

With myCTE (nameID, name, address, etc,RN) AS

(

Select MT.nameid,

MT.name,

MT.address,

MT.etc,

MT.ROW_NUMBER() OVER (partition by MT.nameID ORDER BY MyTableID) --<--right here

from dbo.MyTable MT

)

select myCTE.nameID,myCTE.name,myCTE.address,myCTE.etc

from MyCTE

WHERE myCTE.RN=1

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?