December 9, 2014 at 2:16 am
I'm looking at a poor performing stored proc and I see a "where not exists in (select...)", So I rewrite it as an outer join.
But there is no change in performance.
Is that expected behaviour?
It is not what I was expecting, but I'm often wrong and that's why I test ☺
here is the code I'm using
prep the tables
if exists (select Name from sys.tables where object_id = object_id('MyTableOne'))
begin
drop table MyTableOne
end
;
;WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^4 or 10,000 rows
E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
SELECT TOP (1000000)
RowID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,RandomID = newid()
,RandomDate =getdate() + checksum(NEWID())%365
,RandomLetter = CHAR(65 + abs(checksum(NEWID()))%26)
into MyTableOne
FROM E8
go
alter table MyTableOne
alter column RowID int not null
go
ALTER TABLE [dbo].MyTableOne ADD CONSTRAINT [PK_MyTableOne] PRIMARY KEY CLUSTERED
(
RowID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IDX_MyTableOne] ON [dbo].[MyTableOne]
(
RowID ASC, RandomLetter ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
-- **********************
if exists (select Name from sys.tables where object_id = object_id('MyTableTwo'))
begin
drop table MyTableTwo
end
;
;WITH
E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 1*10^1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 1*10^2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 1*10^4 or 10,000 rows
E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
SELECT TOP (1000000)
RowID = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,RandomID = newid()
,RandomDate =getdate() + checksum(NEWID())%365
,RandomLetter = CHAR(65 + abs(checksum(NEWID()))%26)
into MyTableTwo
FROM E8
go
alter table MyTableTwo
alter column RowID int not null
go
ALTER TABLE [dbo].MyTableTwo ADD CONSTRAINT [PK_MyTableTwo] PRIMARY KEY CLUSTERED
(
RowID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IDX_MyTableTwo] ON [dbo].[MyTableTwo]
(
RowID ASC, RandomLetter ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
Here is the test Please note I flush the proc cache, so carefull where you run this!
set language british
go
-- Flush the procedure cache for one database only
DECLARE @intDBID INT;
SET @intDBID = (SELECT [dbid]
FROM master.dbo.sysdatabases
WHERE name = DB_NAME());
DBCC FLUSHPROCINDB (@intDBID);
set statistics time on;
set statistics io on;
select randomdate,RandomLetter
from MyTableOne T1
WHERE NOT EXISTS ( SELECT *
FROM MyTableTwo T2
where T1.RowID = T2.RowID
and T1.RandomLetter = T2.RandomLetter )
-- Flush the procedure cache for one database only
DBCC FLUSHPROCINDB (@intDBID);
select T1.randomdate,T1.RandomLetter
from MyTableOne T1
left outer join MyTableTwo T2
on T1.RowID = T2.RowID
and T1.RandomLetter = T2.RandomLetter
where T2.RowID is null
set statistics time off;
set statistics io off;
and my results are
Changed language setting to British.
(1 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(961577 row(s) affected)
Table 'MyTableTwo'. Scan count 17, logical reads 1580, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTableOne'. Scan count 17, logical reads 5295, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 2370 ms, elapsed time = 5653 ms.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
(961577 row(s) affected)
Table 'MyTableTwo'. Scan count 17, logical reads 1580, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTableOne'. Scan count 17, logical reads 5295, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 2952 ms, elapsed time = 5325 ms.
Thank you for the time to read this,
The cross joins I used to populate the tables is based on a script by Itzik Ben-Gan.
December 9, 2014 at 2:18 am
Ian_McCann (12/9/2014)
I'm looking at a poor performing stored proc and I see a "where not exists in (select...)", So I rewrite it as an outer join.But there is no change in performance.
Is that expected behaviour?
No. The outer joins are usually very slightly slower than NOT EXISTS. Not much, just a bit.
http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/
http://sqlinthewild.co.za/index.php/2010/04/27/in-exists-and-join-a-roundup/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 9, 2014 at 2:19 am
Just to clarify, I thought the sub select would perform badly because it references the outer select and some kind of recursion would happen.
December 9, 2014 at 2:20 am
Except if you look at the execution plan, there's no 'recursion' anywhere and correlated subqueries are not automatically horrible things.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 9, 2014 at 2:25 am
Thank you very much for your answer Gail.
( I've got no idea what recursion looks like in an exection plan, I 'll go away and build one :-))
December 9, 2014 at 2:30 am
btw, you say there's no change in performance in your tests, but there is.
Ian_McCann (12/9/2014)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(961577 row(s) affected)
Table 'MyTableTwo'. Scan count 17, logical reads 1580, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTableOne'. Scan count 17, logical reads 5295, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 2370 ms, elapsed time = 5653 ms.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 4 ms.
(961577 row(s) affected)
Table 'MyTableTwo'. Scan count 17, logical reads 1580, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTableOne'. Scan count 17, logical reads 5295, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 2952 ms, elapsed time = 5325 ms.
2.3 seconds CPU time for the NOT EXISTS, 2.9 seconds CPU time for the join.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 9, 2014 at 4:43 am
You are absolutly correct Gail.
In the example I've posted that is a significant difference.
In the proc that I'm tuning the duration is 50 minutes and the difference is less signficant.
On the plus side, that did give me 50 minutes to read through your links
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply