In this article we will look at JOINS. We will primarily focus towards beginners, but this may be a refresher for the experienced. We will see how each type of JOIN works. Specifically we will discuss these: INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN , SELF JOIN and then wind up with CROSS JOINS.
Note there are so many other terminologies used, like equi join, non-equi join, inner join etc, but there are only the 6 types of joins as mentioned above.
Let's start by creating the tables/data required to understand JOINS.
Step - 1
Use the following script to create the table and data. You can just cut and paste this code into Query Analyzer and execute it.
drop table t1,t2,t3 create table t1(f1 int) create table t2(f1 int) create table t3(f1 int) --=========================== set nocount on declare @j int set @j = 1 while @j <= 10 begin insert into t1 values(@j) set @j = @j + 1 end --=========================== set nocount on declare @k int set @k = 6 while @k <= 15 begin insert into t2 values(@k) set @k = @k + 1 end --=========================== insert into t3 values(6) insert into t3 values(2) insert into t3 values(3) insert into t3 values(12) insert into t3 values(13) --===========================
You will now have data like this in the three tables. Please note that the empty spaces in rows are only to show a pictorial mapping. There are only 10 records in t1, 10 records in t2 and 5 records in t3 (no empty rows or NULL values in any field)
t1t2t3 f1f1f1 ------ 1 22 33 4 5 666 77 88 99 1010 11 1212 1313 14 15
Step - 2
Try these queries
select * from t1 JOIN t2 on t1.f1 = t2.f1 select * from t1 INNER JOIN t2 on t1.f1 = t2.f1 select * from t1,t2 where t1.f1 = t2.f1
All the three queries above are essentially the same. The last one is referred to as the equi-join. What if you want all rows in t1 that are not in t2 (non-equi joins ?). Will this query work?
select * from t1,t2 where t1.f1 <> t2.f1
It did not quite do what you expected, huh?
Here is one way to achieve this:
select * from t1 where t1.f1 not in (select t2.f1 from t2)
Step - 3 - LEFT OUTER JOIN
Now suppose you want all the records from t1, whether they have a corresponding entry in t2 or not. Why will we ever want to do this? A simple situation could be that, you want a list of all the departments (t1) whether there are employees or not in that department (t2). For this example assume t1.f2 holds department ids, and t2.f1 as department ids fk'd into t1.f1. Here is the query
select t1.f1, t2.f1 from t1 LEFT OUTER JOIN t2 on (t1.f1 = t2.f1)
This is exactly similar in functionality to the above.
select t1.f1, t2.f1 from t1 LEFT JOIN t2 on (t1.f1 = t2.f1) t1.f1 t2.f1 ===== ===== 1 NULL 2 NULL 3 NULL 4 NULL 5 NULL 6 6 7 7 8 8 9 9 10 10
Step - 4 - RIGHT OUTER JOIN
Ok, now the reverse, assume we want all the employees, whether or not they have been assigned to any departments. Both the queries below are similar again
select t1.f1, t2.f1 from t1 RIGHT OUTER JOIN t2 on (t1.f1 = t2.f1) select t1.f1, t2.f1 from t1 RIGHT JOIN t2 on (t1.f1 = t2.f1) t1.f1 t2.f1 ===== ===== 6 6 7 7 8 8 9 9 10 10 NULL 11 NULL 12 NULL 13 NULL 14 NULL 15
Step - 5 - FULL OUTER JOIN
Ah, what if you now want a list of all rows in t1 (whether or not there are rows in t2) and all the rows in t2 (whether or not there are rows in t1)
select t1.f1, t2.f1
from t1 FULL OUTER JOIN t2
on (t1.f1 = t2.f1)
order by t1.f1
select t1.f1, t2.f1
from t1 FULL JOIN t2
on (t1.f1 = t2.f1)
order by t1.f1
t1.f1 t2.f1
===== =====
NULL 11
NULL 12
NULL 13
NULL 14
NULL 15
1 NULL
2 NULL
3 NULL
4 NULL
5 NULL
6 6
7 7
8 8
9 9
10 10
Step - 6
JOINS make much sense between two tables, but can obviously be extended to
more than two tables. However, just to demonstrate, as to how
complicated/confusing they can become, here is a FULL OUTER JOIN example
between t1, t2 and t3.
select a.f1, b.f1, c.f1 from t1 a FULL OUTER JOIN t2 b on (a.f1 = b.f1) FULL OUTER JOIN t3 c on (a.f1 = c.f1) t1.f1 t2.f1 t3.f1 ===== ===== ===== 6 6 6 7 7 NULL 8 8 NULL 9 9 NULL 10 10 NULL NULL 11 NULL NULL 12 NULL NULL 13 NULL NULL 14 NULL NULL 15 NULL 5 NULL NULL 4 NULL NULL 3 NULL 3 2 NULL 2 1 NULL NULL NULL NULL 13 NULL NULL 12
Observe the query and output carefully...and then see if you can get the following output, which you will agree makes more sense.
t1.f1 t2.f1 t3.f1 ===== ===== ===== 6 6 6 7 7 NULL 8 8 NULL 9 9 NULL 10 10 NULL NULL 11 NULL NULL 12 12 NULL 13 13 NULL 14 NULL NULL 15 NULL 5 NULL NULL 4 NULL NULL 3 NULL 3 2 NULL 2 1 NULL NULL
That's it for OUTER JOINS for now.
Step - 7 - SELF JOINS
When the same table is used in a query with two different aliases, such a join is referred to as a self-join. Let us see this case with an example. Create the following table and data.
create table jk_Workers(Worker int, Manager int) insert into jk_Workers values(111,NULL) insert into jk_Workers values(222,111) insert into jk_Workers values(333,222) insert into jk_Workers values(444,222) insert into jk_Workers values(555,222) insert into jk_Workers values(666,111) insert into jk_Workers values(777,111) insert into jk_Workers values(888,333) insert into jk_Workers values(999,222)
All the workers with their respective managers are stored in this table. And if we wish to list out the managers and the workers working under them, we could use a query similar to the following.
select b.Manager,b.Worker from jk_Workers a, jk_Workers b where a.Worker = b.Manager or b.manager is null and a.Manager is null order by b.Manager Manager Worker NULL 111 111 222 111 666 111 777 222 999 222 333 222 444 222 555 333 888
Step - 8 - CROSS JOINS
A cross join is referred to as a Cartesian product, which means, for each element in set-A pick all the values from set-B
select t1.f1, t2.f1 from t1 cross join t2 select t1.f1, t2.f1 from t1,t2
Both the above queries are same. The output of this query will be 100 rows. This is because, for each row in t1, all the rows in t2 would be matched. Unless you know what you are doing, you got to be careful in building your queries, to ensure you are not working on Cartesian products.
I am not sure of a real business need for a Cartesian product (I can be wrong as ever), but I know mathematically it is valid to have the same y values for different values of x (m=0), basically a horizontal line on the Cartesian plane.
Joins could be as complicated/confusing in real world scenarios, but if your fundamental understanding of how they behave is clear, you should be able to handle any situation.
Thankz for reading!