LEFT JOIN with OR - performance

  • Hi,

    When I add an OR into a LEFT JOIN condition, it becomes horrendously slow. It's not just twice as slow, but orders of magnitude slower. What other approach can I take?

    Example:

    SELECT A.ID,

    SUM(CASE WHEN B.ID IS NOT NULL THEN 1 ELSE 0 END) AS [CountOfB]

    FROM A

    LEFT JOIN B

    ON A.ID = B.SomeColumn

    This will return in approximately 1 second.

    SELECT A.ID,

    SUM(CASE WHEN B.ID IS NOT NULL THEN 1 ELSE 0 END) AS [CountOfB]

    FROM A

    LEFT JOIN B

    ON (A.ID = B.SomeColumn OR A.ID = B.AnotherC)

    This will return in around 7 minutes!

    Can anyone suggest an alternate approach? Btw, in reality there are ten columns to test matches against in table B.

    Thanks.

  • EDIT: Sorry I just wasn't happy with my original solution.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Sorry, just typed in a simulation of what I actually have. Group is there 🙂

    Nice thinking. I was trying to avoid writing a ton of unions, I'll try your suggestion.

  • Why you are using SUM and CASE to count not null rows? Just use COUNT instead on a specyfic column and you will have same result:

    DECLARE @T1 TABLE (ID INT)

    DECLARE @T2 TABLE (ID1 INT, ID2 INT)

    INSERT INTO @T1 VALUES

    (1),(2),(3),(4),(5),(6),(7),(8)

    INSERT INTO @T2 VALUES

    (1,NULL),(1,NULL),(1,NULL),(3,NULL),(3,8),(5,8)

    SELECT T1.ID, COUNT(ISNULL(T2.ID1,T3.ID2))

    FROM @T1 T1

    LEFT JOIN @T2 T2 ON T1.ID = T2.ID1

    LEFT JOIN @T2 T3 ON T1.ID = T3.ID2

    GROUP BY T1.ID

  • The real joins actually have four other parts. It seemed unrealistic to post the entire schema and would have taken an age to anonymise.

    The originally suggested solution appears to work well, avoiding the need for loads of unions and an outer query and grouping.

  • CraigIW (6/25/2015)


    The real joins actually have four other parts. It seemed unrealistic to post the entire schema and would have taken an age to anonymise.

    The originally suggested solution appears to work well, avoiding the need for loads of unions and an outer query and grouping.

    Well I'm glad it worked out for you. Reason I pulled down my answer is I was focused on your performance issue that I didn't pay attention to what you were trying to pull in. So I wanted to test that the results would in fact come out as you wanted. That's why you should always provide some sample data.

    For simplicity I borrowed some Pan's code and modified as it looked like it was missing another column to join on.

    DECLARE @T1 TABLE (ID INT)

    DECLARE @T2 TABLE (SomeColumn INT, AnotherC INT)

    INSERT INTO @T1 VALUES

    (1),(2),(3),(4),(5),(6),(7),(8)

    INSERT INTO @T2 VALUES

    (1,9),(1,10),(1,4),(3,12),(3,14),(5,6)

    This was your code:

    SELECT A.ID,SUM(CASE WHEN B.SomeColumn IS NOT NULL THEN 1 ELSE 0 END) AS [CountOfB]

    FROM @T1 A

    LEFT JOIN @T2 B ON B.SomeColumn = A.ID OR B.AnotherC = A.ID

    GROUP BY A.ID

    My Solution:

    SELECT A.ID,COUNT(B.SomeColumn) AS [CountOfB]

    FROM @T1 A

    LEFT JOIN @T2 B ON B.SomeColumn = A.ID

    LEFT JOIN @T2 B2 ON B2.AnotherC = A.ID

    GROUP BY A.ID

    That should perform better and as it were, it sounds like that did the trick for you. Notice that Pan was right in that you don't need to use SUM as COUNT will do the trick just fine.

    Cheers,


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • My Solution:

    SELECT A.ID,COUNT(B.SomeColumn) AS [CountOfB]

    FROM @T1 A

    LEFT JOIN @T2 B ON B.SomeColumn = A.ID

    LEFT JOIN @T2 B2 ON B2.AnotherC = A.ID

    GROUP BY A.ID

    That should perform better and as it were, it sounds like that did the trick for you. Notice that Pan was right in that you don't need to use SUM as COUNT will do the trick just fine.

    Cheers,

    you have to consider B2.AnotherC as well, remember these are left joins and some of the B.SomeColumn may be nulls and vice versa. That's why i used ISNULL in my code above.

  • Kutang Pan (6/25/2015)


    My Solution:

    SELECT A.ID,COUNT(B.SomeColumn) AS [CountOfB]

    FROM @T1 A

    LEFT JOIN @T2 B ON B.SomeColumn = A.ID

    LEFT JOIN @T2 B2 ON B2.AnotherC = A.ID

    GROUP BY A.ID

    That should perform better and as it were, it sounds like that did the trick for you. Notice that Pan was right in that you don't need to use SUM as COUNT will do the trick just fine.

    Cheers,

    you have to consider B2.AnotherC as well, remember these are left joins and some of the B.SomeColumn may be nulls and vice versa. That's why i used ISNULL in my code above.

    You are right, I did miss that. However, you did originally miss the OP's second join condition. 😉

    Two heads are better than one.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply