August 13, 2011 at 6:11 am
In terms of performance which Join is best while updating the records.
I do have two table Tab1(at most 1000 records) and Tab2(1million records)
UPDATE Tab1
SET COL2=TAB2.COL2
FROM TAB1
LEFT JOIN Tab2
ON Tab1.col1=Tab2.col1
or
UPDATE Tab1
SET COL2=TAB2.COL2
FROM TAB1
INNER JOIN Tab2
ON Tab1.col1=Tab2.col1
I need to know what exactly sql internals do when we go for Left join instead of INNER JOIN
August 13, 2011 at 8:08 am
nageshp (8/13/2011)
In terms of performance which Join is best while updating the records.I do have two table Tab1(at most 1000 records) and Tab2(1million records)
UPDATE Tab1
SET COL2=TAB2.COL2
FROM TAB1
LEFT JOIN Tab2
ON Tab1.col1=Tab2.col1
or
UPDATE Tab1
SET COL2=TAB2.COL2
FROM TAB1
INNER JOIN Tab2
ON Tab1.col1=Tab2.col1
I need to know what exactly sql internals do when we go for Left join instead of INNER JOIN
I have to ask, do you know what the difference between the following two queries?
SELECT
t1.COL1,
t2.COL1,
t1.COL2,
t2.COL2
FROM
TAB1 t1
INNER JOIN TAB2 t2
ON (t1.COL1 = t2.COL2)
;
SELECT
t1.COL1,
t2.COL1,
t1.COL2,
t2.COL2
FROM
TAB1 t1
LEFT OUTER JOIN TAB2 t2
ON (t1.COL1 = t2.COL2)
;
August 13, 2011 at 8:08 am
It has nothing to do with performance. Those two queries are not equivalent. You use the one that produces the results you need.
Inner join returns only matching records. Left join returns all the rows from the left table with matching values from the right or null. Depending which of the two you want, you use the appropriate join.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 13, 2011 at 11:21 am
Thanks Gila
Lynn
Regarding your query the first Inner Joins lists records that matches in both tables.LEFT OUTER JOIN returns the all records from table tab1 irrespective of matching records in tab2
Here my concern is that, Inner Join tries to find the matching records from tab2 (1 million records)It means excluding the 1000 records from tab1 it also looks for the remaining mach records from tab2
Scanning for the records that aren't present in tab1 is costly or not?
August 13, 2011 at 11:28 am
This has nothing to do with cost. It has everything to do with what's needed. If you need just matching rows, use an inner join. If you need all rows from left and matching from right tables, use a left join.
Performance comes after correct results. Returning incorrect results fast is worthless.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 13, 2011 at 11:43 am
Per yout post, Tab1 has 1000 records and Tab2 has a million records.
An inner join between those two tables could generate anywhere from 0 records (no matches between Col1 in both tables) to one million records (1 to many match between all the records).
What you have to determine is what are you looking for between these tables?
What you really should do is give us the actual problem you are trying to solve. Before doing that, please read the first two articles I reference below in my signature block. They will help you to understand what you need to provide us to get the help you really need to solve your problem or issue.
August 13, 2011 at 11:52 am
Gail,
Am getting the same result when I select only col1 from both Left and inner join like the below
SELECT Tab1.col1
FROM Tab1
LEFT JOIN Tab2
ON Tab1.col1=Tab2.col1
or
SELECT Tab1.col1
FROM Tab1
INNER JOIN Tab2
ON Tab1.col1=Tab2.col1
From the above two quires which one you prefer?
I have gone through the execution plans and Statistics IO and Time on for the Update statement s in my first post. I haven’t seen any differences in execution plan but through SET STATISITCS IO and TIME on there is difference. Left join elapsed time shows 7 ms where as right join shows 63 ms .
August 13, 2011 at 12:00 pm
nageshp (8/13/2011)
Gail,Am getting the same result when I select only col1 from both Left and inner join like the below
SELECT Tab1.col1
FROM Tab1
LEFT JOIN Tab2
ON Tab1.col1=Tab2.col1
or
SELECT Tab1.col1
FROM Tab1
INNER JOIN Tab2
ON Tab1.col1=Tab2.col1
From the above two quires which one you prefer?
I have gone through the execution plans and Statistics IO and Time on for the Update statement s in my first post. I haven’t seen any differences in execution plan but through SET STATISITCS IO and TIME on there is difference. Left join elapsed time shows 7 ms where as right join shows 63 ms .
Right Join? All I see is an INNER JOIN and a LEFT JOIN.
I will also tell you this, it is hard to help when we can't see what you see. All we can do at this point in time is guess.
Please read the articles I referenced in my earlier post.
August 13, 2011 at 12:05 pm
Thanks Lynn I have gone through your post(articles).Let me update the post with exact data and input .Typo it's not right join it's inner join
August 13, 2011 at 12:25 pm
nageshp (8/13/2011)
Gail,Am getting the same result when I select only col1 from both Left and inner join like the below
That means this is a very specific case, it is not the general case that they return the same.
If there is a row in Tab1 that does not have a matching row in Tab2, do you want it returned or not? If you do, you need a left join. If not, that's an inner join.
Be careful running one test and drawing wide-reaching conclusions based one test. The execution times will vary. Depends if data is in cache, queries are compiled, other stuff is running. Always discard the first result (data caching, plan caching) and then run at least 5 tests and look at the averages.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 13, 2011 at 12:52 pm
Thanks Gail for your update.
Lynn,
Here is the input.
CREATE TABLE TAB1(COL1 INT IDENTITY(1,1),COL2 DATETIME)
DECLARE @I INT
SET @I=1
WHILE (@I<1000)
BEGIN
INSERT INTO TAB1
SELECT GETDATE()
SET @I=@I+1
END
SELECT * FROM TAB1
CREATE TABLE TAB2(COL1 INT IDENTITY(1,1),COL2 DATETIME)
DECLARE @I INT
SET @I=1
WHILE (@I<50000)
BEGIN
INSERT INTO TAB2
SELECT GETDATE()
SET @I=@I+1
END
SELECT count(*) FROM TAB2
SELECT Tab1.col1
FROM Tab1
LEFT JOIN Tab2
ON Tab1.col1=Tab2.col1
or
SELECT Tab1.col1
FROM Tab1
INNER JOIN Tab2
ON Tab1.col1=Tab2.col1
From the above two quires which one you prefer?
UPDATE Tab1
SET COL2=Tab2.COL2
FROM Tab1
LEFT JOIN Tab2
ON Tab1.col1=Tab2.col1
or
UPDATE Tab1
SET COL2=Tab2.COL2
FROM Tab1
INNER JOIN Tab2
ON Tab1.col1=Tab2.col1
from the above which one will be the best?
August 13, 2011 at 1:09 pm
Again, this has nothing to do with preference or best. It has to do with getting the correct results.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 13, 2011 at 10:14 pm
The result set is same for the both the queries.
August 14, 2011 at 3:07 am
nageshp,
it is possible that you get same results for your UPDATE when using either LEFT JOIN or INNER JOIN. However, this is more accidental than reliable. This happens if all keys of your left table are available in your right table. If there are keys missing on the right side, your left table (the target) will be updated with NULL values.
Please copy-paste this into your SSMS and execute it. It will show you a (possible) difference in results:
DECLARE @target TABLE (
Id INT
,AnyText VARCHAR(10)
);
DECLARE @source TABLE (
Id INT
,AnyText VARCHAR(10)
);
-- ////////////////////////////
-- INNER JOIN
INSERT INTO @target VALUES
(1, 'abc')
,(2, 'def');
INSERT INTO @source VALUES
(1, 'xyz');
UPDATE t SET t.AnyText = s.AnyText
FROM @target t
INNER JOIN @source s ON t.Id = s.Id;
SELECT 'INNER JOIN', * FROM @target
-- ////////////////////////////
-- Clean up
DELETE FROM @target;
DELETE FROM @source;
-- ////////////////////////////
-- INNER JOIN
INSERT INTO @target VALUES
(1, 'abc')
,(2, 'def');
INSERT INTO @source VALUES
(1, 'xyz');
UPDATE t SET t.AnyText = s.AnyText
FROM @target t
LEFT JOIN @source s ON t.Id = s.Id;
SELECT 'LEFT JOIN', * FROM @target
Greets
Flo
August 14, 2011 at 5:13 am
The different types of JOIN operations are not interchangeable. They mean, and do, different things. You need to focus on that first. Changing the JOIN type as a means of performance tuning is more than a little bit crazy because it will result in different data being returned when the criteria is right. You need consistency in your queries and the only way to ensure that is to use the constructs the way they are designed.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply