September 4, 2008 at 4:12 pm
PW (9/4/2008)
>>but then the filter "D2.CORPOID IS NULL AND D.CORPOID IS NOT NULL" negates the join! The query will never return any rows!Nope, that's not true in this case.
The check on D2.CORPOID IS NULL is how the query is implementing "NOT EXISTS" functionality.
When you want to perform "NOT EXISTS" in SQL there are several options, here are 3 of them:
1. NOT EXISTS ( subquery )
2. NOT IN ( subquery )
3. LEFT JOIN Table2 and check NULL on 1 of the columns in Table2
The query is written to use method #3. The condition D2.CORPOID IS NULL evaluates to True in cases where the LEFT JOIN does not join to any matching rows.
I'm not sure I agree.
I tried this test.
Created 2 tables test1 and test2.
CREATE TABLE [dbo].[test1](
[oid] [int] NULL,
[corpoid] [int] NULL,
[data] [varchar](50) NULL
)
CREATE TABLE [dbo].[test2](
[oid] [int] NULL,
[corpoid] [int] NULL,
[data] [varchar](50) NULL
)
insert into test1 values (1, 10, 'a')
insert into test1 values (2, 20, 'b')
insert into test1 values (3, 30, 'c')
insert into test2 values (1, NULL, 'd')
insert into test2 values (2, NULL, 'e')
insert into test2 values (3, 30, 'f')
This returns no rows:
select * from test1
left join test2
on test1.corpoid = test2.corpoid
where test1.corpoid is null and test2.corpoid is not null
This returns the following 3 rows (as expected):
select * from test1
left join test2
on test1.corpoid = test2.corpoid
oid corpoid data oid corpoid data
----------- ----------- -------------------------------------------------- ----------- ----------- --------------------------------------------------
1 10 a NULL NULL NULL
2 20 b NULL NULL NULL
3 30 c 3 30 f
(3 row(s) affected)
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
September 4, 2008 at 4:21 pm
OK, more correctly re-stating #3:
3. LEFT JOIN Table2 and check NULL on 1 of the non-nullable columns in Table2
It wasn't clear that the column being joined on allows Nulls.
September 4, 2008 at 4:22 pm
I'm sorry, you are absolutely right, this does work indeed (returns 2 rows as expected):
select * from test1
left join test2
on test1.corpoid = test2.corpoid
where test1.corpoid is NOT null and test2.corpoid is null
My original query in the test had the "NOT" misplaced.
I'm getting tired! :w00t:
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
September 5, 2008 at 12:24 am
I hope this settles things!
CREATE TABLE#Sample
(
coID INT,
cotID INT,
ccbID INT,
mcT CHAR(1)
)
INSERT#Sample
SELECT1, 1, 100000, 'R' UNION ALL-- Test case 1 pass
SELECT2, 1, 100000, 'y' UNION ALL-- Test case 2 fail
SELECT3, 1, 100000, 'F' UNION ALL-- Test case 3 fail
SELECT3, 2, 100000, 'y' UNION ALL-- Test case 3 fail
SELECT4, 9, 100000, 'f' UNION ALL-- Test case 4 pass
SELECT4, 8, 100000, 'g'-- Test case 4 pass
-- Original query
SELECTd.coID,
MIN(d.cotID)
FROM#Sample AS d
LEFT JOIN(
SELECTcoID
FROM#Sample
WHEREccbID = 100000
AND mcT = 'Y'
GROUP BYcoID
) AS d2 ON d2.coID = d.coID
WHEREd.ccbID = 100000
AND d2.coID IS NULL
AND d.coID IS NOT NULL
GROUP BYd.coID
-- Peso
SELECTcoID,
MIN(cotID)
FROM#Sample
WHEREccbID = 100000
AND coID IS NOT NULL
GROUP BYcoID
HAVINGMAX(CASE WHEN mcT = 'Y' THEN 1 ELSE 0 END) = 0
DROP TABLE#Sample
Original query execution plan is
|--Stream Aggregate(GROUP BY)
|--Sort(ORDER BY)
|--Filter
|--Nested Loops(Left Outer Join)
|--Table Scan
|--Compute Scalar
|--Stream Aggregate
|--Table Scan
And the query plan for my suggestion
|--Filter
|--Stream Aggregate(GROUP BY)
|--Sort(ORDER BY)
|--Compute Scalar
|--Table Scan
And if I also create the suggested index, this is the execution plan for the original query
|--Stream Aggregate(GROUP BY)
|--Filter
|--Nested Loops(Left Outer Join)
|--Index Seek
|--Compute Scalar
|--Stream Aggregate
|--Table Scan
And this is the execution plan for my suggestion with the suggested index in place
|--Filter
|--Stream Aggregate(GROUP BY)
|--Compute Scalar
|--Index Seek
N 56°04'39.16"
E 12°55'05.25"
September 5, 2008 at 1:46 am
And for closure, this is a comparison between all queries with and without indexes.
CREATE TABLE#Sample_noindex
(
coID INT,
cotID INT,
ccbID INT,
mcT CHAR(1)
)
INSERT#Sample_noindex
SELECT1, 1, 100000, 'R' UNION ALL-- Test case 1 pass
SELECT2, 1, 100000, 'y' UNION ALL-- Test case 2 fail
SELECT3, 1, 100000, 'F' UNION ALL-- Test case 3 fail
SELECT3, 2, 100000, 'y' UNION ALL-- Test case 3 fail
SELECT4, 9, 100000, 'f' UNION ALL-- Test case 4 pass
SELECT4, 8, 100000, 'g'-- Test case 4 pass
CREATE TABLE#Sample_index
(
coID INT,
cotID INT,
ccbID INT,
mcT CHAR(1)
)
INSERT#Sample_index
SELECT1, 1, 100000, 'R' UNION ALL-- Test case 1 pass
SELECT2, 1, 100000, 'y' UNION ALL-- Test case 2 fail
SELECT3, 1, 100000, 'F' UNION ALL-- Test case 3 fail
SELECT3, 2, 100000, 'y' UNION ALL-- Test case 3 fail
SELECT4, 9, 100000, 'f' UNION ALL-- Test case 4 pass
SELECT4, 8, 100000, 'g'-- Test case 4 pass
CREATE NONCLUSTERED INDEX IX_tblName ON #Sample_index (ccbID, coID, cotID, mcT)
-- Original query without index
SELECTd.coID,
MIN(d.cotID)
FROM#Sample_noindex AS d
LEFT JOIN(
SELECTcoID
FROM#Sample_noindex
WHEREccbID = 100000
AND mcT = 'Y'
GROUP BYcoID
) AS d2 ON d2.coID = d.coID
WHEREd.ccbID = 100000
AND d2.coID IS NULL
AND d.coID IS NOT NULL
GROUP BYd.coID
-- Peso without index
SELECTcoID,
MIN(cotID)
FROM#Sample_noindex
WHEREccbID = 100000
AND coID IS NOT NULL
GROUP BYcoID
HAVINGMAX(CASE WHEN mcT = 'Y' THEN 1 ELSE 0 END) = 0
-- Original query with index
SELECTd.coID,
MIN(d.cotID)
FROM#Sample_index AS d
LEFT JOIN(
SELECTcoID
FROM#Sample_index
WHEREccbID = 100000
AND mcT = 'Y'
GROUP BYcoID
) AS d2 ON d2.coID = d.coID
WHEREd.ccbID = 100000
AND d2.coID IS NULL
AND d.coID IS NOT NULL
GROUP BYd.coID
-- Peso with index
SELECTcoID,
MIN(cotID)
FROM#Sample_index
WHEREccbID = 100000
AND coID IS NOT NULL
GROUP BYcoID
HAVINGMAX(CASE WHEN mcT = 'Y' THEN 1 ELSE 0 END) = 0
DROP TABLE#Sample_noindex,
#Sample_index
N 56°04'39.16"
E 12°55'05.25"
September 5, 2008 at 1:58 am
And the final test, as realistic as possible without index.
CREATE TABLE#Original
(
RowID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
coID INT,
cotID INT,
ccbID INT,
mcT CHAR(1)
)
INSERT#Original
(
coID,
cotID,
ccbID,
mcT
)
SELECTTOP 2402039
ABS(CHECKSUM(NEWID())) % 88589,
ABS(CHECKSUM(NEWID())) % 93500,
99932 + ABS(CHECKSUM(NEWID())) % 136,
CHAR(88 + ABS(CHECKSUM(NEWID())) % 2)
FROMsysobjects AS so1
CROSS JOINsysobjects AS so2
-- Original query
SELECTd.coID,
MIN(d.cotID)
FROM#Original AS d
LEFT JOIN(
SELECTcoID
FROM#Original
WHEREccbID = 100000
AND mcT = 'Y'
GROUP BYcoID
) AS d2 ON d2.coID = d.coID
WHEREd.ccbID = 100000
AND d2.coID IS NULL
AND d.coID IS NOT NULL
GROUP BYd.coID
-- Peso
SELECTcoID,
MIN(cotID)
FROM#Original
WHEREccbID = 100000
AND coID IS NOT NULL
GROUP BYcoID
HAVINGMAX(CASE WHEN mcT = 'Y' THEN 1 ELSE 0 END) = 0
DROP TABLE#Original
N 56°04'39.16"
E 12°55'05.25"
September 5, 2008 at 1:59 am
And as realistic as possible with index
CREATE TABLE#Original
(
RowID INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
coID INT,
cotID INT,
ccbID INT,
mcT CHAR(1)
)
INSERT#Original
(
coID,
cotID,
ccbID,
mcT
)
SELECTTOP 2402039
ABS(CHECKSUM(NEWID())) % 88589,
ABS(CHECKSUM(NEWID())) % 93500,
99932 + ABS(CHECKSUM(NEWID())) % 136,
CHAR(88 + ABS(CHECKSUM(NEWID())) % 2)
FROMsysobjects AS so1
CROSS JOINsysobjects AS so2
CREATE NONCLUSTERED INDEX IX_tblName ON #Original (ccbID, coID, cotID, mcT)
-- Original query
SELECTd.coID,
MIN(d.cotID)
FROM#Original AS d
LEFT JOIN(
SELECTcoID
FROM#Original
WHEREccbID = 100000
AND mcT = 'Y'
GROUP BYcoID
) AS d2 ON d2.coID = d.coID
WHEREd.ccbID = 100000
AND d2.coID IS NULL
AND d.coID IS NOT NULL
GROUP BYd.coID
-- Peso
SELECTcoID,
MIN(cotID)
FROM#Original
WHEREccbID = 100000
AND coID IS NOT NULL
GROUP BYcoID
HAVINGMAX(CASE WHEN mcT = 'Y' THEN 1 ELSE 0 END) = 0
DROP TABLE#Original
N 56°04'39.16"
E 12°55'05.25"
September 5, 2008 at 3:58 am
Thank you for all this work!
I will talk to the developer about changing the query accordingly.
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
September 5, 2008 at 6:25 am
Thank you for the feedback.
It's nice to see the query went from about 800 ms / 15500 reads down to 15 ms / 56 reads.
N 56°04'39.16"
E 12°55'05.25"
September 5, 2008 at 11:16 am
Peso (9/5/2008)
Thank you for the feedback.It's nice to see the query went from about 800 ms / 15500 reads down to 15 ms / 56 reads.
The only drawback I see with your solution - and don't take this the wrong way 😉 - is that it is not as readable/understandable as the original query.
I don't think I understand that last line fully:
HAVING MAX(CASE WHEN mcT = 'Y' THEN 1 ELSE 0 END) = 0
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
September 5, 2008 at 11:05 pm
What the query does, is that it select all records where coID is not NULL and where ccbID = 100000.
For this set of records, the query then group them over coID.
The HAVING PART then checks if any of the records for every group has a mcT = 'Y'.
If so, the query translate it to a 1, all other mcT values get a 0.
To be a valid group, no record must have a mcT value of 'Y', ie 1.
So max for any group need to be 0, ie no record for any group has a mcT value of 'Y'. Which is exactly what the LEFT JOIN is doing.
However, if the two only possible values for mCT are 'Y' and 'N', you can simplify the HAVING part as
HAVING MAX(mCT) = 'N'
N 56°04'39.16"
E 12°55'05.25"
September 6, 2008 at 5:02 am
Peso (9/5/2008)
What the query does, is that it select all records where coID is not NULL and where ccbID = 100000.For this set of records, the query then group them over coID.
The HAVING PART then checks if any of the records for every group has a mcT = 'Y'.
If so, the query translate it to a 1, all other mcT values get a 0.
To be a valid group, no record must have a mcT value of 'Y', ie 1.
So max for any group need to be 0, ie no record for any group has a mcT value of 'Y'. Which is exactly what the LEFT JOIN is doing.
However, if the two only possible values for mCT are 'Y' and 'N', you can simplify the HAVING part as
HAVING MAX(mCT) = 'N'
I see, it's clearer for me now. Thanks again!
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
September 6, 2008 at 6:08 pm
Heh... if YOU had a question on what it does, others will to. Time to break out the grossly underused "dash dash" function. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
September 7, 2008 at 5:01 am
hi
September 7, 2008 at 5:07 am
Jeff Moden (9/6/2008)
Heh... if YOU had a question on what it does, others will to. Time to break out the grossly underused "dash dash" function. 😉
I still don't fully get it, it's a tough one... 😉
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
Viewing 15 posts - 16 through 30 (of 35 total)
You must be logged in to reply to this topic. Login to reply