February 25, 2015 at 10:30 pm
Comments posted to this topic are about the item The Potential of Joins
February 26, 2015 at 3:31 am
Generally a good article, however there is a potential pitfall which should have been mentioned. This is that when using joins you do always have to be aware of the potential for unintential duplicates, which is not the case with exists/in
February 26, 2015 at 3:36 am
the general ideas outlined in this article are worth thinking over them.
However, what is stated therein, cannot be taken as matter of fact, because the optimizer
has its own ways.
Just an example: the queries (used my own database)
select * from kunden_zp_status where id_zp in (select id_zp from kd_zp where zp_name like 'aso%')
select s.* from kunden_zp_status s join kd_zp z on z.id_zp = s.id_zp and z.zp_name like 'aso%' option
behave alike : index seek on kd_zp - loop join - clustered index seek on kunden_zp_status ,
neither execution plan does show a sub-select
if you, however, force the optimizer to do what you have ordered it to do,
you will see the penalty from the sub-select
select * from kunden_zp_status where id_zp in (select id_zp from kd_zp where zp_name like 'aso%') option (force order)
select s.* from kunden_zp_status s join kd_zp z on z.id_zp = s.id_zp and z.zp_name like 'aso%' option (force order)
now the execution plan has a loop join from a sub-select for the first query, and a merge join for the second.
February 26, 2015 at 4:17 am
I'm curious as to why in the last example he used a LEFT OUTER JOIN, although in other queries in the article he used LEFT JOIN. From my knowledge, there is no difference between LEFT OUTER and LEFT type of JOINs in SQL Server.
Assuming that it doesn't return the same data, wouldn't it be a redundant condition to put WHERE Q.CustomerName IS NULL
if you have already used LEFT OUTER JOIN?
~ Just some guy trying to tune queries ~
February 26, 2015 at 5:11 am
radu.gheorghiu (2/26/2015)
I'm curious as to why in the last example he used a LEFT OUTER JOIN, although in other queries in the article he used LEFT JOIN. From my knowledge, there is no difference between LEFT OUTER and LEFT type of JOINs in SQL Server.Assuming that it doesn't return the same data, wouldn't it be a redundant condition to put
WHERE Q.CustomerName IS NULL
if you have already used LEFT OUTER JOIN?
I would not say so.
SELECT T.CustomerName,T.CustomerResponse,T.CalledOn
FROM #TMP T
LEFT OUTER JOIN (SELECT CustomerName
FROM #TMP
WHERE CustomerResponse = 'Not Interested'
) Q
ON Q.CustomerName = T.CustomerName
INNER JOIN (SELECT CustomerName,
MAX(CONVERT(DATE,CalledOn)) AS MaxCalledOn
FROM #TMP
GROUP BY CustomerName
)Q2
ON Q2.CustomerName = T.CustomerName
WHERE Q.CustomerName IS NULL
AND T.CalledOn = Q2.MaxCalledOn
First the inner join (Q2) restricts data to the last called date
Then this record set is joined against the main data set and finally the where statement filters out records that are not in the left join (Q)
February 26, 2015 at 6:37 am
I think a previous post alluded to this, but the last bit of SQL is fine where a customer would only be called once on a given date. If a customer could have been called twice, either the customer would appear twice or there would need to be a max/group in the code to pick one of the outcomes randomly.
February 26, 2015 at 6:43 am
The English of this article needs to be cleaned up. I'm thinking the author does not speak English as a primary language, which is totally fine, but if you're going to *publish* an article in English, the extra effort needs to be made in order to be professional.
There were also some inconsistencies such as the one radu pointed out, and one in scenario 2 where it was about a "NOT IN" sub query, but the query was put together "... where IN (select... where <> "Mountain Works")", which is slightly different.
Also, there are no benchmarks or Execution plans to prove the point, so we're just supposed to take the author's word for it???
Finally, I'm wondering how sub queries stack up when they are used in conjunction w/EXISTS and/or NOT EXISTS conditions.
February 26, 2015 at 7:34 am
It has been my experience when I have to refactor scripts that use sub-queries, that the use of join's has improved the efficiency. It would be nice though to see some stats that prove it out.
February 26, 2015 at 8:21 am
Thinking of the aggregate query.. curious if efficiencies could be gained using a WITH Statement or temp table and joining.
With Statement:
With S as (select ProductID,SalesOrderID,SUM(OrderQty) AS OrderQty FROM Sales.SalesOrderDetail GROUP BY ProductID,SalesOrderID)
Then join it as a table
Inner join S
ON P.ProductID = S.ProductID
Or create a temp table and you could take advantage of creating an index off your results if you happen to be linking off some data that isn't a key (like Customer_ID from an Order table)
February 26, 2015 at 8:24 am
Scenario 2 doesn't have a NOT IN in the original statement...I believe it belongs here:
select DISTINCT Name
from Production.Product
where ProductID IN (Select ProductID from Purchasing.ProductVendor
where VendorID NOT IN (select VendorID from Purchasing.Vendor
where Name<>'Mountain Works'
)
)
February 26, 2015 at 9:01 am
Yeah, scenario 2 examples are the same as scenario 1.
February 27, 2015 at 10:49 am
My boss is a subquery junkie. When I try to make sense of the queries she has saved as examples for me, I find this confusing. You've intrigued me with your use of joins especially in regards to a substitute for multiple group by's.
THANKS!
February 28, 2015 at 3:51 pm
I realize that the point of this article was to compare joins to subqueries but I would have thought to use the rank function. Here is how I would have done it:
;WITH Test AS (
SELECT
CustomerName,
CalledOn,
CustomerResponse,
RANK() OVER (PARTITION BY customername ORDER BY CONVERT(DATE,CalledOn) DESC) AS Rank
FROM #TMP
)
SELECT CustomerName,CustomerResponse,CalledOn
FROM Test
WHERE Rank=1
AND CustomerResponse<>'Not Interested'
There weren't enough records in the table to know which is fastest. I am curious to know what people think of using the Rank function for these purposes instead.
March 1, 2015 at 8:04 am
I had the same thought as you but I think you would want to use row_number not rank. Here is a good example of the difference between rank, and row_number.
http://stackoverflow.com/questions/7747327/sql-rank-versus-row-number
May 5, 2015 at 2:51 pm
Joins vs Exists. Neither is always right. Just be flexible.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply