In the previous article, you saw how the BETWEEN operator could be used in joins to solve problems dealing with range-based data. In this article, I will show you how to take joins even further by using multiple criteria in joins as well as using the greater than, less than, and not equals operators in joins.
Compound Joins
Compound joins are joins which use multiple criteria combined with a logical operator such as AND. This is a relatively simple concept and is commonly used in database systems that employ compound primary keys.
For a simple example of a database schema in which compound joins are necessary, consider a school management system where one of the features is tracking which classes are taught in which classrooms. The system must match up the features of the classrooms to the needs of the classes. In order to perform these functions, the following two tables are defined:
CREATE TABLE tb_Classroom( BuildingName char(10) NOT NULL, RoomNumber int NOT NULL, RoomCapacity int NOT NULL, HasLabEquip smallint NOT NULL, CONSTRAINT PK_Classroom PRIMARY KEY(BuildingName, RoomNumber) ) CREATE TABLE tb_ClassSection( CourseID char(5) NOT NULL, SectionNumber smallint NOT NULL, BuildingName char(10) NOT NULL, RoomNumber int NOT NULL, InstructorID int NOT NULL, ScheduleID int NOT NULL, SectionCapacity int NOT NULL, RequiresLabEquip smallint NOT NULL, CONSTRAINT PK_ClassSection PRIMARY KEY(CourseID, SectionNumber), CONSTRAINT FK_ClassSection_Classroom FOREIGN KEY(BuildingName, RoomNumber) REFERENCES tb_Classroom(BuildingName, RoomNumber) )
In this example, the tb_Classroom table defines a list of classrooms in which classes are taught. The tb_ClassSection table contains instances of various courses taught at the school. A class section is taught in a particular classroom by an instructor according to a standard class schedule. Both the tb_Classroom and tb_ClassSection tables use natural compound primary keys.
One of the reports in the school management system lists the class sections being taught along with the capacity of their respective classrooms. In order to construct this report, the tb_ClassSection table must be joined the tb_Classroom table based upon the compound primary key of the tb_Classroom table. This can be accomplished by using a compound join to return rows where both the BuildingName AND RoomNumber columns match.
SELECT s.CourseID, s.SectionNumber, c.RoomCapacity FROM tb_ClassSection s INNER JOIN tb_Classroom c ON( s.BuildingName = c.BuildingName AND s.RoomNumber = s.RoomNumber )
This query is relatively straightforward. If you've been using SQL for a while, chances are you've seen queries like it. The query is a simple equijoin that uses the AND logical operator to include multiple criteria. Despite its simplicity, this example provides the basis for a much more powerful query construction tool.
Joins Using Inequality Comparison Operators
The school management system from the first example also contains a report listing all class sections in which the classroom is not large enough to accommodate the maximum number of students for the class section. To determine which class sections meet these criteria, the system must compare the class section's capacity to the capacity of the classroom in which it is being taught. If the classroom's capacity is less than the class section's capacity, then the class section should be included in the result set.
With this query, the trick is to first join each class section to the classroom in which is being taught and then add the additional criterion that the classroom's capacity is less than that of the class section. To do this, simply take the query from the last example and add the additional criterion.
SELECT s.CourseID, s.SectionNumber, c.RoomCapacity, s.SectionCapacity FROM tb_ClassSection s INNER JOIN tb_Classroom c ON( s.BuildingName = c.BuildingName AND s.RoomNumber = s.RoomNumber AND c.RoomCapacity < s.SectionCapacity )
A common mistake when constructing queries such as this is not including the equijoin criteria necessary to match up the rows to be compared by the inequality operator. If only the inequality comparison is included in the criteria, the query returns all the rows where a classroom's capacity is less than that of any class section, regardless of whether or not the class section was taught in that classroom.
Not Equals Joins
You may be wondering how in the world it could be useful to use the not equals operator in a join. For an example, consider another report in the school management system in which you must the misallocation of laboratory-equipped classrooms. This report must list all of the class sections that require laboratory equipment, but are scheduled to be taught in classrooms that do not have it. The report should also include all non-lab class sections being taught in laboratory classrooms. In the tables, the class sections that require laboratory equipment are indicated with a RequiresLabEquip value of 1, and the classrooms equipped with laboratory equipment are indicated with a HasLabEquip value of 1.
This problem follows a similar pattern to that of the capacity problem. The only difference is the use of the not equals operator in place of the less than operator. After matching the class section with the classroom in which it is being taught, the value of the RequiresLabEquip column must be compared with the HasLabEquip column. If these values are not equal, there is a laboratory equipment allocation problem and the class section should be included on the report. Applying these criteria result in the following query:
SELECT s.CourseID, s.SectionNumber, c.HasLabEquip, s.RequiresLabEquip FROM tb_ClassSection s INNER JOIN tb_Classroom c ON( s.BuildingName = c.BuildingName AND s.RoomNumber = s.RoomNumber AND c.HasLabEquip <> s.RequiresLabEquip )
When using the not equals operator in joins, it is even more vital to remember to use additional join criteria than it is when using the greater than and less than operators. In this case, if only the not equals criterion was specified, the query would perform a cross join and then exclude only the class section-classroom pairs where the laboratory indicator was not equal. If there were 100 classrooms and 500 class sections, this could possibly return a result set of 25,000 - 50,000 rows – definitely not what was intended.
Beyond the Basics
Compound joins, especially those employing the inequality and not equals operators, can be used with other advanced SQL techniques to construct queries that are even more powerful. The key to leveraging these advanced joins is to spell out the requirements of the problem clearly in terms of relating the available sets of data. Once the relationships between the sets of data are understood, these relationships can be encoded into join criteria. This technique, along with testing and validation of the output, will enable you to solve complicated problems effectively.