December 6, 2011 at 5:50 am
honza.mf (12/6/2011)
Probably I will use right joins and nested joins only in case of self-defense. I prefer the readability and I think from left to right.
Absolutely true. This question was intended to illustrate how a real-life problem sometimes forces you to choose the lesser of two evils. Both nested joins and right outer joins reduce readability and ease of understanding of a query - so pick your poison. I have been in this position a few times, and whatever I chose, I always felt it was not the optimal choice.
BTW, now that I no longer have to support SQL Server 2000, I can use a better solution using a Common Table Expression:
WITH CustomersWithRecentOrders AS
(SELECT c.Name, o.OrderDate, c.SalesRep
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
ON o.CustomerID = c.CustomerID
AND o.OrderDate > DATEADD(month, -4, CURRENT_TIMESTAMP))
SELECT e.Name AS SalesRep, co.Name AS Customer, co.OrderDate
FROM dbo.Employees AS e
LEFT JOIN CustomersWithRecentOrders AS co
ON co.SalesRep = e.EmployeeID
WHERE e.Position = 'SalesRep';
Note that this query does NOT produce the exact same execution plan as the original, but it DOES produce the exact same results. And of all the options, I think this one is the most readable.
December 6, 2011 at 5:51 am
SQL Kiwi (12/6/2011)
Hi Hugo,Writing a question to demonstrate the equivalence of SQL expressions is very brave, so well done for that. Small point: in the explanation you say, "the only way to rewrite the query is to reverse the table order", but this code expresses the same SQL semantic I think:
Would this not work as well?
;WITH cte_Subquery AS
(SELECT c.SalesRep, c.Name, o.OrderDate
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
ON o.CustomerID = c.CustomerID
AND o.OrderDate > DATEADD(month, -4, CURRENT_TIMESTAMP))
SELECT e.Name AS SalesRep, s.Name AS Customer, s.OrderDate
FROM dbo.Employees AS e
LEFT JOIN cte_Subquery AS s
ON e.EmployeeID = s.SalesRep
WHERE e.Position = 'SalesRep';
I always go with CTEs in this situation when possible, but I would love to know if I'm off the mark on this one.
EDIT: Of course I posted this 45 seconds after Hugo beat me to it. At least I know I was thinking along the right lines.
ron
-----
a haiku...
NULL is not zero
NULL is not an empty string
NULL is the unknown
December 6, 2011 at 5:52 am
Good hard question, made me think. I missed it but I did learn something from the excellent explanation. Thanks for submitting.
http://brittcluff.blogspot.com/
December 6, 2011 at 5:58 am
ronmoses (12/6/2011)
Would this not work as well?
Yup! (As you already found out after posting).
The only thing I don't like about your query is the semicolon before WITH. Semicolons belong at the end of each statement, not at the start.
December 6, 2011 at 6:16 am
Hugo Kornelis (12/6/2011)
BTW, now that I no longer have to support SQL Server 2000, I can use a better solution using a Common Table Expression:
Even in SQL Server 2000, you can write the equivalent subquery/derived table as expressed by the CTE:
SELECT
SalesRep = e.Name,
Customer = co.Name,
co.OrderDate
FROM dbo.Employees AS e
LEFT JOIN
(
SELECT
c.Name,
o.OrderDate,
c.SalesRep
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON
o.CustomerID = c.CustomerID
AND o.OrderDate > DATEADD(MONTH, -4, CURRENT_TIMESTAMP)
) AS co ON
co.SalesRep = e.EmployeeID
WHERE
e.Position = 'SalesRep';
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
December 6, 2011 at 6:17 am
Cadavre (12/6/2011)
I assumed that understanding the logic of join operators is integral to the majority of the visitors to this site.
Not, I believe, a safe assumption. In my daily routine as an "Accidental DBA", I write relatively few queries, and very, very few complex queries with joins. I visit the site to learn, so that I have a chance of writing the queries quickly and efficiently when needs arise. (Of course, something is apparently rubbing off, 'cause I apparently got today's question right AND learned something!)
December 6, 2011 at 6:17 am
michael.kaufmann (12/6/2011)
One minor flaw I'd attribute to speedily writing this query--the above one will only yield the same result as the original one if only employees are the customers and the IDs as customer and employee are equal.
Thanks Michael - I did indeed write the syntax quickly, just from the execution plan, before Hugo so helpfully posted table definitions and sample data. That'll teach me, I guess.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
December 6, 2011 at 6:19 am
Thanks Hugo,
Great question, and between your form and Paul's I now have 2 more cases that my T-SQL formatter doesn't handle correctly; at least it's not mangling the SQL, but the resulting indentation makes little sense (and the parse tree is just wrong).
Any objection to my adding it to my test suite (which is also the demo list on poorsql.com) once I've fixed it?
http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
December 6, 2011 at 6:22 am
Tao Klerks (12/6/2011)
Any objection to my adding it to my test suite (which is also the demo list on poorsql.com) once I've fixed it?
No problem for me!
Sorry for breaking your formatter - and good luck getting it fixed. 😉
December 6, 2011 at 6:27 am
Hugo Kornelis (12/6/2011)
The few timies where I did choose to use nested joins in actual code, I have always used parentheses and indentation for easier reading. In fact, my first draft of this question did use exactly that. I just took it out before submitting the question, because I am mean.:Whistling:
Yes you are mean! :laugh:
Actually, I think I rather misunderstood the point of the question. I thought you were intending to illustrate SQL/relational equivalences or something like that, but I now see you were aiming at the 'nested join' idea. I'll have to think about that a bit more to see if I understand the point completely.
Hugo Kornelis (12/6/2011)
I stiill stand by what I intended to write - that the only way to rewrite the query to a form that doesn't use nested joins involves reordering the table order and changing the left outer join to the far less cocmmon and harder to grasp right outer join form.
Hmm - but doesn't the CTE/derived table example break that assertion? Or are they still 'nested joins' somehow. I think that's the point I'm struggling with - what precisely 'nested joins' means.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
December 6, 2011 at 6:48 am
SQL Kiwi (12/6/2011)
Hmm - but doesn't the CTE/derived table example break that assertion?
It does. I only realised the CTE possibility later, after I already posted that earlier post.
(For some reason I can't really explain, I am not very fond of the derived table version - I've used derived tables when needed, but always found the syntax to be lacking readability).
December 6, 2011 at 7:13 am
Hugo Kornelis (12/6/2011)
(For some reason I can't really explain, I am not very fond of the derived table version - I've used derived tables when needed, but always found the syntax to be lacking readability).
I'm somewhere in the middle on this one. I'd like to get you and Adam Machanic together in a room on this point though - he's a fierce advocate of the benefits of derived tables over CTEs. Something about the ability to select blocks of code to run easily I think is one main argument in favour.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
December 6, 2011 at 7:25 am
This is one of the best QOTDs I have ever seen here. I had to really think this through to get it right. I ALMOST just checked #1 out of first impulse but figured it couldn't be that easy, especially once I saw the RIGHT JOIN answers. Wondering what would cause a DBA to make it that tough, I read the first query very carefully and realized that that the inner join would happen first and wouldn't happen the same on the query #1. So I ruled that out and #2 was easy to rule out. #3 screwed up the WHERE clause, so it had to be #4. A lot of work for first thing in the morning, but a good set of work as in my experience developers use nested joins not intentionally but by accident, as they are not up to speed on ANSI joins and are used to doing joins in the WHERE clause. And when they move to ANSI they sometimes do things like this. So very practical for real life troubleshooting.
December 6, 2011 at 7:30 am
Hugo Kornelis (12/6/2011)
honza.mf (12/6/2011)
Probably I will use right joins and nested joins only in case of self-defense. I prefer the readability and I think from left to right.Absolutely true. This question was intended to illustrate how a real-life problem sometimes forces you to choose the lesser of two evils. Both nested joins and right outer joins reduce readability and ease of understanding of a query - so pick your poison. I have been in this position a few times, and whatever I chose, I always felt it was not the optimal choice.
Yes, if I need to choose a poison, I use some antidote. Here it can be anything that makes query more readable: right join with conditions near the table, parentheses around the nested join, CTE, or subquery. Some are better for hand made queries, some for generated code, it depends.
Viewing 15 posts - 16 through 30 (of 43 total)
You must be logged in to reply to this topic. Login to reply