Conditional join

  • I have a query like this:

    Select import.productid, import.clientid, .....,

    Coalesce(industry.productdescription, client.productdescription, outside.productdescription) as productdescription

    from import

    left outer join dbo.product as industry

    on import.productid = industry.productid

    and industry.source=NULL

    left outer join product as client

    on import.productid = client.productid

    and client.source=import.clientid

    left outer join product as outside

    on import.productid = outside.productid

    and outside.source=99999

    A product can have a description from an industry database and/or client and/or 3rd-part-external source. Only one description will be used, Preference is to use industry if exists, then client if exists, then external, else keep it NULL.

    The statement above works, but takes quite a bit of time to run, and it gets repeated weekly for each client (85 clients). import has 20 million records, product has 31 million records.

    Instead of joining 3 timesand using Coalesce to get the field, can this be a conditional join?For each import record

    Join to product for industry description

    If no industry record found

    Join to product for client description

    If no client record found

    Join to product for external description

    Note that this is a small piece of the query. There are more fields like productdescription and are also being currently handled with coalesce.

    These other fields have the same preference for industry vs. client vs. external, so when the preferred

    record is found for productdescription then I've got the values for the other fields as well.

  • From your narrative - I am not sure that what you want to do is possible. Review the article in my signature to see how to post and get a better answer.

    A couple of things I noticed right away are:

    Usage of = NULL -- will not work, should be IS NULL

    Wrong alias used in join (probably your biggest problem).

    and industry.source=NULL

    left outer join product as client

    on import.productid = national.productid

    and national.source=import.clientid

    Product table is aliased as 'client' but you used the alias 'national' in the join criteria. This could potentially be a massive join - but, this also could just be a typo. I really can't tell because you didn't post sample tables and data to test with.

    Another issue is not prefixing your object names with the schema. Using the two-part name and including the schema is a recommended best practice from Microsoft. Not using it can lead to multiple plans in the cache and cause performance issues.

    Now, from what I can see - there is no need for a conditional join. You are joining to the Product table on the same key, so you really only need a single join. With that, you then use CASE and/or COALESCE to get the correct values. However, if you are joining to a different row based upon the source column - then, yeah - you need the multiple joins.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • You're right on what you point out. This was a quick-typed sample query to demonstrate my question, so things like IsNull, object owners, etc are actually good in the real code, even if I typed it wrong here.

    Each record in import is always going to the same product table by productid, it's the source column that is different for each of the 3 hits. So, this leads to the 3 separate joins, each potentially bringing back a record (hence the coalesce).

    (import.productid=product.productid )

    and (product.source IS NULL

    or product.source = 99999

    or product.source = import.clientid )

    A join like the above would be great if there were only 1 record per productid, but since there can be zero to three records in Product for a productid, it seems that would be problematic for the updates).

  • Based upon what you have stated, then you need to have the Product table joined three times.

    I don't think your problems with the query are due to the multiple joins. I would be more concerned with the where clause and indexes on the columns involved in the where clause and the joins.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • It'll be a clustered scan on Import, so that doesn't get any better.

    The biggest goal is to reduce I/O, so I agree that the indexing is all-crucial for the Product selection. My hope was that there's some tricks I didn't know that could make the 2nd & 3rd joins conditional.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply