removal of multiple joins

Question

removal of multiple joins

Sekhar-310813

Newbie

Points: 7
More actions
July 10, 2008 at 6:13 am

#184498

I have created two tables and entered the data as shown below.
Create table tablea(col1 varchar(10), col2 varchar(10), col3 varchar(10))
Create table tableb(col1 varchar(10), col2 varchar(10), col3 varchar(10))
insert into tablea
select 'a1','a2','a3'
union
select 'b1','b2','b3'
union
select 'd1','d2','d3'
union
select 'e1','e2','e3'
insert into tableb
select 'a1','a2','a3'
select 'a1','c2','c3'
select 'b11','d2','d3'
select 'b12','b2','b3'
The below query provides me the required output.
select a.*,b.col1, c.col2, d.col3 from tablea a
left outer join tableb b on a.col1=b.col1
left outer join tableb c on a.col2 = c.col2
left outer join tableb d on a.col3=d.col3
Can the above query be written in a different manner by restricting to a single join instead of multiple joins with tableb (may be with group by) ? Is it possible.
Thanks in advance.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 398916 More actions · Answer 1

So, you've got two tables that have three different and distinct relationships between the tables? I'm pretty sure that violates one of the rules of normalization, but I'm not sure which one. I'd be curious what business rule you're trying to meet with that design.

I tried this:

select a.*,b.col1, b.col2, b.col3

from tablea a

left join tableb b

on b.col1 = a.col1

or b.col2=a.col2

or b.col3 = a.col3

It gets the same data back and based on my tests took the query from 272ms to 80ms because instead of 3 scans it only does 1. However, I wouldn't want to try to scale this solution.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Sekhar-310813 Newbie Points: 7 More actions · Answer 2

There is a difference of two rows to the main query and the updated query when i used EXCEPT.

This query is an existing query and as it contains millions of records, i was looking for optimization of this query.

Any ideas?

Grant Fritchey SSC Guru Points: 398916 More actions · Answer 3

Well, eliminating all the excess JOINS does improve performance threefold in this example, so I think you can reasonably expect the something similar with larger data sets.

Because of the OR clause though, it's not going to use indexes very well. Just playing with the example, I actually saw the reads increase when I started putting indexes on the table. However, for the example, we're dealing with very small data sets, so indexes won't help much anyway. The one thing I can say is if you have at least one column that you join on that will be steady, that you can index, you can probably get decent performance out of this. The query would like something like this:

select a.*,b.col1, b.col2, b.col3

from tablea a

left join tableb b

on b.col1 = a.col1

AND (b.col2=a.col2

or b.col3 = a.col3)

Of course, from the example, this would change the result set. The thing is, because this is a violation of database design, it's going to continue to create problems for you however you go. You shouldn't have three independent relationships between tables.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

GSquared SSC Guru Points: 260824 More actions · Answer 4

This will probably perform better:

select a.*,b.col1, b.col2, b.col3

from tablea a

inner join tableb b

on a.col1=b.col1

union

select a.*,c.col1, c.col2, c.col3

from tablea a

inner join tableb c

on a.col2 = c.col2

union

select a.*,d.col1, d.col2, d.col3

from tablea a

inner join tableb d

on a.col3=d.col3

It depends on the indexes available. But, assuming your joins have indexes on them (pretty standard practice), this will allow index seeks, instead of index scans.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon