January 17, 2006 at 2:03 pm
Hi guyz, please see if you can help me....
I have the following query which joins 4 tables.
SELECT A.julian_month, B.month_year, C.lienholder, C.product_credit_type
FROM A INNER JOIN B ON A.Orig_Day_Key = B.Day_Key
INNER JOIN C ON A.Requested_Product_Key = C.Product_Key
INNER JOIN D ON A.Application_Key = D.Application_Key
INNER JOIN E ON B.Julian_Month = E.Julian_Month
WHERE D.valid_application_flag = 'Y'
Now if I rewrite this query by using a subquery instead of a Join to the 'D' table, the query takes half the time to execute.
SELECT A.julian_month, B.month_year, C.lienholder, C.product_credit_type AS Requested_Product_Credit_Type, C.Period
FROM A INNER JOIN B ON A.Orig_Day_Key = B.Day_Key
INNER JOIN C ON A.Requested_Product_Key = C.Product_Key
INNER JOIN E ON B.Julian_Month = E.Julian_Month
WHERE A.Application_Key In (Select Application_Key From D
Where valid_application_flag = 'Y')
Now my question is, is there any issue in using a subquery instead of an Inner Join here? Also is there any chance that it will affect the functionality in the future (because there is one more join after the 'D' table Join).
Thanks in advance.....
January 17, 2006 at 2:27 pm
In the second query you can replace
WHERE A.Application_Key In (Select Application_Key From peoplefirst_datawarehouse.dbo.application
Where valid_application_flag = 'Y')
WHERE EXISTS (Select Application_Key From peoplefirst_datawarehouse.dbo.application
Where valid_application_flag = 'Y' AND Application_key=A.Application_Key)
*note your second query is more like
SELECT A.julian_month, B.month_year, C.lienholder, C.product_credit_type
FROM A INNER JOIN B ON A.Orig_Day_Key = B.Day_Key
INNER JOIN C ON A.Requested_Product_Key = C.Product_Key
INNER JOIN D ON A.Application_Key = D.Application_Key
AND D.valid_application_flag = 'Y'
INNER JOIN E ON B.Julian_Month = E.Julian_Month
which makes a differences because the join (first) & where (after all joins) are evaluated at a different time although the results are the same.
January 17, 2006 at 2:43 pm
>>Now my question is, is there any issue in using a subquery instead of an Inner Join here?
There is a huge difference depending on the cardinality of the data.
If table D does not have unique values for Application_Key, then the 2 queries will produce a different result set, which may be the reason for the performance difference.
January 17, 2006 at 3:14 pm
Hi OldHand, Thnx for the reply...
Do u mean to say to rewrite the query like this?
SELECT A.julian_month, B.month_year, C.lienholder, C.product_credit_type AS Requested_Product_Credit_Type, C.Period
FROM A INNER JOIN B ON A.Orig_Day_Key = B.Day_Key
INNER JOIN C ON A.Requested_Product_Key = C.Product_Key
INNER JOIN E ON B.Julian_Month = E.Julian_Month
WHERE EXISTS (Select Application_Key From peoplefirst_datawarehouse.dbo.application
Where valid_application_flag = 'Y' AND Application_key=A.Application_Key)
What is the advantage of using Where Exists here? I tried it this way & it is taking more time to execute this query now.
January 17, 2006 at 3:16 pm
PW, Application_Key is the primary key on table 'D'. So it will have unique values. Will the new query make an issue now?
January 17, 2006 at 3:56 pm
>>What is the advantage of using Where Exists here?
The advantage probably only appears once you reach a certain volume of data being returned from the IN () sub-query. How many applications are currently in table D with valid_application_flag = 'Y' ? How many do you foresee being in there in the future ?
January 17, 2006 at 4:07 pm
Currently, there are 7333689 rows in table 'D' with 'valid_application_flag' = Y.
This can increase in the future...
I am just concerned with this....
If we see the sequence of execution of statements in the first query,
SELECT A.julian_month, B.month_year, C.lienholder, C.product_credit_type
FROM A INNER JOIN B ON A.Orig_Day_Key = B.Day_Key
INNER JOIN C ON A.Requested_Product_Key = C.Product_Key
INNER JOIN D ON A.Application_Key = D.Application_Key
INNER JOIN E ON B.Julian_Month = E.Julian_Month
WHERE D.valid_application_flag = 'Y'
1. A is joined with B
2. B is joined with C
3. A is joined with D
4. B is joined with E
5. The resultset is Filtered with the 'Where' clause (WHERE D.valid_application_flag = 'Y').
Now in the query, which I rewrote,
SELECT A.julian_month, B.month_year, C.lienholder, C.product_credit_type AS Requested_Product_Credit_Type, C.Period
FROM A INNER JOIN B ON A.Orig_Day_Key = B.Day_Key
INNER JOIN C ON A.Requested_Product_Key = C.Product_Key
INNER JOIN E ON B.Julian_Month = E.Julian_Month
WHERE A.Application_Key In (Select Application_Key From D
Where valid_application_flag = 'Y')
the sequence will be like this...
1. A is joined with B
2. B is joined with C
3. B is joined with E
4. The resultset is then Filtered with the 'Where' clause (WHERE A.Application_Key In (Select Application_Key From D Where valid_application_flag = 'Y')).
Im just worried, whether this will affect the functionality, because the sequence of operation is not the same in both the cases..... I am getting the same results in both cases though.
I dont have much idea about the sequence in which the Joins, subqueries and 'Where' clause are executed.. Thanks....
January 18, 2006 at 1:27 am
sequence:
shopping cart : adding things that are in promotion and minimum profit 5 $
insert into shopping_cart
from promotion
inner join articles on articles.promotionid=promotion.id and promotion.discount>5$
->before adding it filters all promotions that has a discount>5 (no items to check afterwards)
insert into shopping_cart
from promotion
inner join articles on articles.promotionid=promotion.id
WHERE promotion.discount>5$
->adds all promotions (->lot more items)
->once filled,check if the discount>5 (if not take it out the shopping cart)
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply