January 20, 2009 at 12:55 pm
I have a query like this:
Select import.productid, import.clientid, .....,
Coalesce(industry.productdescription, client.productdescription, outside.productdescription) as productdescription
from import
left outer join dbo.product as industry
on import.productid = industry.productid
and industry.source=NULL
left outer join product as client
on import.productid = client.productid
and client.source=import.clientid
left outer join product as outside
on import.productid = outside.productid
and outside.source=99999
A product can have a description from an industry database and/or client and/or 3rd-part-external source. Only one description will be used, Preference is to use industry if exists, then client if exists, then external, else keep it NULL.
The statement above works, but takes quite a bit of time to run, and it gets repeated weekly for each client (85 clients). import has 20 million records, product has 31 million records.
Instead of joining 3 timesand using Coalesce to get the field, can this be a conditional join?For each import record
Join to product for industry description
If no industry record found
Join to product for client description
If no client record found
Join to product for external description
Note that this is a small piece of the query. There are more fields like productdescription and are also being currently handled with coalesce.
These other fields have the same preference for industry vs. client vs. external, so when the preferred
record is found for productdescription then I've got the values for the other fields as well.
January 20, 2009 at 1:23 pm
From your narrative - I am not sure that what you want to do is possible. Review the article in my signature to see how to post and get a better answer.
A couple of things I noticed right away are:
Usage of = NULL -- will not work, should be IS NULL
Wrong alias used in join (probably your biggest problem).
and industry.source=NULL
left outer join product as client
on import.productid = national.productid
and national.source=import.clientid
Product table is aliased as 'client' but you used the alias 'national' in the join criteria. This could potentially be a massive join - but, this also could just be a typo. I really can't tell because you didn't post sample tables and data to test with.
Another issue is not prefixing your object names with the schema. Using the two-part name and including the schema is a recommended best practice from Microsoft. Not using it can lead to multiple plans in the cache and cause performance issues.
Now, from what I can see - there is no need for a conditional join. You are joining to the Product table on the same key, so you really only need a single join. With that, you then use CASE and/or COALESCE to get the correct values. However, if you are joining to a different row based upon the source column - then, yeah - you need the multiple joins.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
January 20, 2009 at 1:35 pm
You're right on what you point out. This was a quick-typed sample query to demonstrate my question, so things like IsNull, object owners, etc are actually good in the real code, even if I typed it wrong here.
Each record in import is always going to the same product table by productid, it's the source column that is different for each of the 3 hits. So, this leads to the 3 separate joins, each potentially bringing back a record (hence the coalesce).
(import.productid=product.productid )
and (product.source IS NULL
or product.source = 99999
or product.source = import.clientid )
A join like the above would be great if there were only 1 record per productid, but since there can be zero to three records in Product for a productid, it seems that would be problematic for the updates).
January 20, 2009 at 1:42 pm
Based upon what you have stated, then you need to have the Product table joined three times.
I don't think your problems with the query are due to the multiple joins. I would be more concerned with the where clause and indexes on the columns involved in the where clause and the joins.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
January 20, 2009 at 1:55 pm
It'll be a clustered scan on Import, so that doesn't get any better.
The biggest goal is to reduce I/O, so I agree that the indexing is all-crucial for the Product selection. My hope was that there's some tricks I didn't know that could make the 2nd & 3rd joins conditional.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply