inner join vs correlated query

Question

Post reply

inner join vs correlated query

Raghavendra-499237

Default port

Points: 1435
More actions
December 17, 2010 at 12:52 pm

#229716

Hi All,
I want to fetch records from 4 tables. Fetching a single field from first 3 tables and more fields from 4th table.
Query using join is like this
Select a.field,b.field,c.field,d.* from d inner join a on a.id=d.id inner join c on a.id=c.id inner join b on a.id=b.id
Query using correlated query is like this
select (select field from a where a.id=d.id) as field1,(select field from b where b.id=d.id) as field2,
(select field from c where c.id=d.id) as field3,d.* from d
can anyone suggest me which approach is better
thanks in advance
Raghavendra

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Johan Bijnens SSC Guru Points: 135661 More actions · Answer 1

- Personally I don't like those column level select statements in the select clause.

Maybe the engine will even parse it into a join for you, but IMO it is a pain for maintenance afterward.

- What are your test results with regards to statistics IO and time ?

- Does sqlserver handle the query as you would expect ?

-- if not: why not ?

You may even want to post both sqlplans of your actual queries and have us take a look at them. ( post as attachment )

That provides the huge advantage of us being able to interpret the statistics sqlserver used to compile your query.

You should expect the column level correlated selects will produce RBAR behaviour. ( Read Jeff Modens articles on how to prevent that ! )

Always start with the simplest version. ( KISS )

i.e. select columnlist from object1 joinclause object2 onclause ...

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data and code to get the best help

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Who am I ? Sometimes this is me but most of the time this is me

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 2

ALZDBA (12/17/2010)[hrYou should expect the column level correlated selects will produce RBAR behaviour. ( Read Jeff Modens articles on how to prevent that ! )

"It Depends". The joins used in correlated subqueries won't cause the subquery to necessarily act as RBAR and. you are correct, correlated subqueries are frequently converted to joins. Sometimes, they're not and sometimes they can be even faster than a regular join. For example, even a WHERE IN can be faster than an INNER JOIN (lots of code races on this site to substantiate that claim) although the two operate slightly diffently when it comes to NULLS.

I can, however, just about guarantee that a correlated subquery with a SUM() or COUNT() along with an inequality in the correlation will cause a Triangular Join. Unless you're as tricky with internals as Paul White and know exactly which index to use, then that's where you run into problems with correlation.

The bottom line is to always test with enough data to show if something is scalable or not or get real good at reading the arrows in the actual execution plan.

That being said... I do agree that I prefer the "look" of joins over correlated subqueries and have found that joins are a lot easier to troubleshoot, as well. That's just a personal opinion.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Raghavendra-499237 Default port Points: 1435 More actions · Answer 3

Hi,

Please find attached scripts to create and insert data into tables, also I uploaded execution plan for both the queries.

I have written 2 queries one with inner join and other with correlated query, both are taking same amount of time to retrieve 1 million records.

first query

Select pname,location,scheduleinfo from providerschedule as ps with (nolock)

inner join provider as p with (nolock) on ps.pid=p.pid

inner join m_locations as ml with (nolock) on ps.locationid=ml.locationid

second query

Select (select pname from provider (nolock) where pid=ps.pid) as pname,

(select location from m_locations (nolock) where locationid=ps.locationid)as location,

scheduleinfo from providerschedule as ps with (nolock)

Johan Bijnens SSC Guru Points: 135661 More actions · Answer 4

I stick with my first advise on this one.

Keep it simple and code regular joins.

The column level nested selects will only give you benefits if cardinality meets a treshhold.

I certainly wouldn't accept your second query unless is was preceded with a nice comment on why this columnar select has been used.

( Although I see them on a regular basis, they always result in a remark and are a PITA for maintenance and performance tuning. Figuring out what the programmer wanted to do is always hard with these.)

As you have coded, you only expect No rows or a single row of data in your nested selects. :crazy:

You would expect the same result as a left join, but since it is at row level, multiple return values will generate an error.