June 25, 2015 at 8:03 am
Hi,
When I add an OR into a LEFT JOIN condition, it becomes horrendously slow. It's not just twice as slow, but orders of magnitude slower. What other approach can I take?
Example:
SELECT A.ID,
SUM(CASE WHEN B.ID IS NOT NULL THEN 1 ELSE 0 END) AS [CountOfB]
FROM A
LEFT JOIN B
ON A.ID = B.SomeColumn
This will return in approximately 1 second.
SELECT A.ID,
SUM(CASE WHEN B.ID IS NOT NULL THEN 1 ELSE 0 END) AS [CountOfB]
FROM A
LEFT JOIN B
ON (A.ID = B.SomeColumn OR A.ID = B.AnotherC)
This will return in around 7 minutes!
Can anyone suggest an alternate approach? Btw, in reality there are ten columns to test matches against in table B.
Thanks.
June 25, 2015 at 8:24 am
EDIT: Sorry I just wasn't happy with my original solution.
June 25, 2015 at 8:36 am
Sorry, just typed in a simulation of what I actually have. Group is there 🙂
Nice thinking. I was trying to avoid writing a ton of unions, I'll try your suggestion.
June 25, 2015 at 8:47 am
Why you are using SUM and CASE to count not null rows? Just use COUNT instead on a specyfic column and you will have same result:
DECLARE @T1 TABLE (ID INT)
DECLARE @T2 TABLE (ID1 INT, ID2 INT)
INSERT INTO @T1 VALUES
(1),(2),(3),(4),(5),(6),(7),(8)
INSERT INTO @T2 VALUES
(1,NULL),(1,NULL),(1,NULL),(3,NULL),(3,8),(5,8)
SELECT T1.ID, COUNT(ISNULL(T2.ID1,T3.ID2))
FROM @T1 T1
LEFT JOIN @T2 T2 ON T1.ID = T2.ID1
LEFT JOIN @T2 T3 ON T1.ID = T3.ID2
GROUP BY T1.ID
June 25, 2015 at 8:54 am
The real joins actually have four other parts. It seemed unrealistic to post the entire schema and would have taken an age to anonymise.
The originally suggested solution appears to work well, avoiding the need for loads of unions and an outer query and grouping.
June 25, 2015 at 11:32 am
CraigIW (6/25/2015)
The real joins actually have four other parts. It seemed unrealistic to post the entire schema and would have taken an age to anonymise.The originally suggested solution appears to work well, avoiding the need for loads of unions and an outer query and grouping.
Well I'm glad it worked out for you. Reason I pulled down my answer is I was focused on your performance issue that I didn't pay attention to what you were trying to pull in. So I wanted to test that the results would in fact come out as you wanted. That's why you should always provide some sample data.
For simplicity I borrowed some Pan's code and modified as it looked like it was missing another column to join on.
DECLARE @T1 TABLE (ID INT)
DECLARE @T2 TABLE (SomeColumn INT, AnotherC INT)
INSERT INTO @T1 VALUES
(1),(2),(3),(4),(5),(6),(7),(8)
INSERT INTO @T2 VALUES
(1,9),(1,10),(1,4),(3,12),(3,14),(5,6)
This was your code:
SELECT A.ID,SUM(CASE WHEN B.SomeColumn IS NOT NULL THEN 1 ELSE 0 END) AS [CountOfB]
FROM @T1 A
LEFT JOIN @T2 B ON B.SomeColumn = A.ID OR B.AnotherC = A.ID
GROUP BY A.ID
My Solution:
SELECT A.ID,COUNT(B.SomeColumn) AS [CountOfB]
FROM @T1 A
LEFT JOIN @T2 B ON B.SomeColumn = A.ID
LEFT JOIN @T2 B2 ON B2.AnotherC = A.ID
GROUP BY A.ID
That should perform better and as it were, it sounds like that did the trick for you. Notice that Pan was right in that you don't need to use SUM as COUNT will do the trick just fine.
Cheers,
June 25, 2015 at 12:13 pm
My Solution:
SELECT A.ID,COUNT(B.SomeColumn) AS [CountOfB]
FROM @T1 A
LEFT JOIN @T2 B ON B.SomeColumn = A.ID
LEFT JOIN @T2 B2 ON B2.AnotherC = A.ID
GROUP BY A.ID
That should perform better and as it were, it sounds like that did the trick for you. Notice that Pan was right in that you don't need to use SUM as COUNT will do the trick just fine.
Cheers,
you have to consider B2.AnotherC as well, remember these are left joins and some of the B.SomeColumn may be nulls and vice versa. That's why i used ISNULL in my code above.
June 25, 2015 at 12:40 pm
Kutang Pan (6/25/2015)
My Solution:
SELECT A.ID,COUNT(B.SomeColumn) AS [CountOfB]
FROM @T1 A
LEFT JOIN @T2 B ON B.SomeColumn = A.ID
LEFT JOIN @T2 B2 ON B2.AnotherC = A.ID
GROUP BY A.ID
That should perform better and as it were, it sounds like that did the trick for you. Notice that Pan was right in that you don't need to use SUM as COUNT will do the trick just fine.
Cheers,
you have to consider B2.AnotherC as well, remember these are left joins and some of the B.SomeColumn may be nulls and vice versa. That's why i used ISNULL in my code above.
You are right, I did miss that. However, you did originally miss the OP's second join condition. 😉
Two heads are better than one.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply