August 23, 2006 at 4:51 am
Nice article.
Another way of writing the same, for those who prefer the old convention (I dont recommend it).
SELECT
s.*, se.* FROM Students s ,
(SELECT * FROM StudentExam WHERE ExamName='SQL Server') se
where
s.StID !=se.StID
August 23, 2006 at 5:14 am
I'm not sure why one would use precisely this form... IMHO it is more complicated than necessary, and the complication does not bring any advantages :
SELECT s.* FROM Students s
LEFT JOIN (SELECT StID FROM StudentExam WHERE ExamName='SQL Server') se ON s.StID=se.StID
WHERE se.StID IS NULL
I would prefer to use one of these variants:
variant 1) - simplest
SELECT s.* FROM Students s
LEFT JOIN StudentExam E ON E.stID=s.stID AND ExamName='SQL Server'
WHERE E.StID IS NULL
variant 2) - to avoid possible duplicates
SELECT s.* FROM Students s
LEFT JOIN (SELECT DISTINCT StID FROM StudentExam WHERE ExamName='SQL Server') se
ON s.StID=se.StID
WHERE se.StID IS NULL
Is there any reason why to use the code as shown in the article?
Yes I know that if condition is NOT, there can't be any duplicities because I only take those that don't have any corresponding rows... but generally when speaking about similar JOINs, duplicities are things that can cause problems - so I thought I'll mention that... also because it is the main reason why/when I would use the derived table. Otherwise, variant 1 should be good enough.
August 23, 2006 at 7:10 am
SELECT s.* FROM Students s
JOIN StudentExam se
ON s.StID=se.StID
WHERE se.ExamName<>'SQL Server'
is definitely NOT the query to be used to retrieve:
The students that has not taken "SQL Server" exam
in the given data architecture...
It is a query to get:
The students that has taken an exam that is NOT "SQL Server".
And it successfully delivers the expected result.
What if there are students that has not taken ANY exams ?
So, I don't think this is a pitfall of "<>" operator at all !
The query should be in the final form that the author suggests to begin with....
August 23, 2006 at 7:19 am
Here is how I would answer the original question:
select s.* from Students s
where not exists
(select 1 from StudentExam se
where se.ExamName = 'SQL Server'
and s.StID=se.StID)
While I have seen some performance issues when using an in (select ...) clause exists seems to work great.
[font="Tahoma"]Bryant E. Byrd, BSSE MCDBA MCAD[/font]
Business Intelligence Administrator
MSBI Administration Blog
August 23, 2006 at 7:39 am
Here's your sample tables with indexes on them. Should eliminate the table scans in your query plan.
CREATE TABLE Students (
StID INT NOT NULL PRIMARY KEY NONCLUSTERED,
StName NVARCHAR(50) NOT NULL)
GO
CREATE CLUSTERED INDEX IX_Students
ON Students (StName)
GO
INSERT Students VALUES (1,'Jack')
INSERT Students VALUES (2,'Anna')
INSERT Students VALUES (3,'Bob')
GO
CREATE TABLE StudentExam (
StID INT NOT NULL,
ExamName VARCHAR(50) NOT NULL,
PRIMARY KEY (StID, ExamName))
GO
INSERT StudentExam VALUES (1,'SQL Server')
INSERT StudentExam VALUES (2,'VB.NET')
INSERT StudentExam VALUES (2,'C#.NET')
INSERT StudentExam VALUES (1,'XML')
GO
Then run your query and see if it comes back with a more efficient query plan.
August 23, 2006 at 8:36 am
Subqueries are usually a bad idea as a first option- how about a left join, perhaps with a DISTINCT in the SELECT clause?
August 23, 2006 at 8:50 am
The point of the article is excellent. It reinforces the fact that we as SQL programmers need to be careful and be sure of the results that our queries will return. It's a reminder to us veterans to not be complacent and whip out some crappy code when we're in a hurry. It's a nice lesson for new programmers who could easily fall into the trap of using the "Not Equal To" operator when they should really be using a sub-query or outer join.
August 23, 2006 at 8:56 am
Ehsan,
Bob does not show up in your results.
Regards,
Don
August 23, 2006 at 9:20 am
If you "Check out" Ehsan's example you'll find Jack is retrieved even though he did take the class and, even though Bob did not take the class, he is not retrieved.
"Stupid people compose a query and expect other unusual result "
Vladan's variant 1 example returns the correct results because the ExamName criteria is in the join, NOT the where clause. This causes all StudentExam columns in the join's result set to be NULL for students without a 'SQL Server' exam. The example then filters out the non-null rows from the result set with the where clause.
In Ehsan's example, putting the ExamName <> 'SQL Server' criteria in the where clause filters out all rows where ExamName = 'SQL Server'. This allows Jack to still be retrieved because there is a row in the result set where ExamName <> 'SQL Server'.
This also causes Bob to be filtered out because all of the StudentExam columns in the result set are NULL for Bob (who didn't take any exams). ANY COMPARISON OF A NON-NULL TO A NULL IS NULL. So, ExamName <> 'SQL Server' evaluates to NULL even though it seems like it should evaluate to TRUE and the row is filtered out. Bye bye Bob.
There's more good stuff in BOL on NULL comparisons in the "Comparison Search Conditions" subtopic under "null values" in the index.
Finally, Ehsan's example will return duplicates for students taking more than one exam other than 'SQL Server'. The beauty of Vladan's example is that you only get one row in the result set for each row in the "left" table without a match in the "right" table so "select distinct" is not required.
August 23, 2006 at 9:23 am
The question is one of membership, and the IN clause seems to be a very clear way to pose that question. After proper indexing, it yields the same query plan as the methods proposed, but in a more understandable query.
/* Students that have taken the SQL Server test */
SELECT s.*
FROM dbo.Students AS s
WHERE s.StID IN (SELECT StID FROM dbo.StudentExam WHERE ExamName = 'SQL Server')
/* Students that have not taken the SQL Server test */
SELECT s.*
FROM dbo.Students AS s
WHERE s.StID NOT IN (SELECT StID FROM dbo.StudentExam WHERE ExamName = 'SQL Server')
August 23, 2006 at 9:38 am
Regardless of the material, a thanks to Amin Sobati for writing it. I often wonder why any of these contributing authors would take the time to produce material when too often that maetrial recieves not just constructive criticism but out right crude & rude behaivor. Is it necessary to sue workds like 'stupid' in a constructive criticism?
Kindest Regards,
Just say No to Facebook!August 23, 2006 at 9:58 am
This is a classic example of an exclusion list and point sout the limitiations of the inequality operator. What is presented in this example is that members in one are are to be excluded from from others. (Note that the example is students who have not taken the SQL Server exam. Bob gets omitted. Does he get omitted because he has not taken an exam?). And as the author points out, "you can simply use a subquery to build the list of students who have taken SQL Server exam" and then build your exclusion using the NOT EXISTS clause:
SELECT s.*
FROM Students s
where
not exists ( select stid from studentExam se where s.StID = stid and examname = 'SQL Server' )
August 23, 2006 at 10:04 am
"ANY COMPARISON OF A NON-NULL TO A NULL IS NULL"
Actually comparisons of a NULL (or a non-NULL value) to a NULL results in UNKNOWN. Your point is taken though: The WHERE clause only returns results for rows comparisons that evaluate to TRUE. Comparisons that evaluate to FALSE or UNKNOWN are dropped from the results.
August 23, 2006 at 11:21 am
I 100% agree with the comment made by Duray AKAR. The query listed will never retrieve the results "the students that has not taken the "SQL Server" exam...will only return those results for students that have taken an exam that is not SQL Server. Very good point Duray AKAR.
Tim
Viewing 15 posts - 1 through 15 (of 34 total)
You must be logged in to reply to this topic. Login to reply