January 22, 2009 at 12:45 pm
Added missing T1 alias in first query
;WITH TABLE_ONE AS (
SELECT ROW_NUMBER() OVER(ORDER BY INTFIELD, VARCHARFIELD, DATEFIELD) AS RN, T1.*
FROM TBL1 T1
),
MATCH_TWO AS (
SELECT T1.*
FROM TABLE_ONE AS T1 INNER JOIN TBL2 AS T2
ON T1.INTFIELD = T2.INTFIELD AND
T1.VARCHARFIELD = T2.VARCHARFIELD AND
T1.DATEFIELD BETWEEN DATEADD(minute,-5,T2.DATEFIELD) AND DATEADD(minute,5,T2.DATEFIELD)
)
SELECT *
FROM TABLE_ONE
WHERE RN NOT IN (SELECT RN FROM MATCH_TWO)
This time it returns the right records with my test data. The problem is this would be a ressource monster with real data.
Just look at the estimated subtree cost ratio
0.0066852 (Left Outer Join)
0.0572076 (CTE)
In both cases it does a full table scan on Tbl1 and Tbl2 (my test tables have no key/indexes) but with the CTE it also builds a Work Table and puts a lot of extra effort there.
I am including my test script, try both our queries with statistics (time and IO) on and also compare execution plan, you will see what I mean.
/*
Create table Tbl1(
ColumnA varchar(20),
ColumnB datetime,
ColumnC int
)
insert Tbl1
select 'abc','20090120 08:17:59.120',1
union
select 'def','20090121 08:17:59.120',2
union
select 'ghi','20090122 08:17:59.120',3
union
select 'jkl','20090123 08:17:59.120',4
Create table Tbl2(
ColumnA varchar(20),
ColumnB datetime,
ColumnC int
)
insert Tbl2
select 'abc','20090120 08:17:59.120',1
union
select 'def','20090121 08:22:59.120',2
union
select 'ghi','20090122 08:12:58.120',3
union
select 'jkl','20090123 08:27:59.120',4
*/
January 22, 2009 at 1:25 pm
Maxim Picard (1/22/2009)
The problem is this would be a ressource monster with real data.Just look at the estimated subtree cost ratio
0.0066852 (Left Outer Join)
0.0572076 (CTE)
In both cases it does a full table scan on Tbl1 and Tbl2 (my test tables have no key/indexes) but with the CTE it also builds a Work Table and puts a lot of extra effort there.
I am including my test script, try both our queries with statistics (time and IO) on and also compare execution plan, you will see what I mean.
I ran both and WHERE NOT EXISTS and EXCEPT. The Left Join and WHERE NOT EXISTS are the best in performance and, IMO, readability, and maintainability.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 23, 2009 at 10:00 am
Dear All
It appears the problem has generated quite an interest. I am therefore attaching some sample data for Tbl1 and Tbl2. These are csv files, but as .csv files cannot be uploaded I have changed the ext to .doc. So if there is any problem change .doc to .csv ext type
Thanks and Regards every one
Viewing 3 posts - 16 through 17 (of 17 total)
You must be logged in to reply to this topic. Login to reply