April 12, 2016 at 4:14 am
Hello, all!
Here is the table:
If OBJECT_ID('Test') Is Not Null
Drop Table Test
GO
Create Table Test
(
ID INT Identity (1,1) Primary Key Clustered,
Value INT not Null
)
Declare @i int = 0
While @i <= 100
Begin
Insert Into
dbo.Test(Value)
Select
ABS(CheckSUM(NEWID()) / 100000000)
Set @i += 1
End
This is the query:
--Set Statistics Profile Off
Select
Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
--Set Statistics Profile Off
And use the actual plan or statistic profile.
Could anyone explain why does Optimizer seeks the index twice to get ID.
P.S. If you rewrite the query like
Select
IsNull ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
you see just one seek and here is the difference between IsNull and Coalesce. Because Coalesce is the other form of the Case-operator. But the question is the different.
April 12, 2016 at 6:05 am
Can I suggest a non-looping way of doing this?
INSERT dbo.test
(
Value
)
SELECT TOP 100
ABS(CHECKSUM(NEWID()) / 100000000)
FROM sys.columns c1;
I think you'll find it's somewhat faster.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 12, 2016 at 6:11 am
Yeah, you are right 🙂
April 12, 2016 at 2:30 pm
Peter Shilovich (4/12/2016)
This is the query:
--Set Statistics Profile Off
Select
Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
--Set Statistics Profile Off
And use the actual plan or statistic profile.
Could anyone explain why does Optimizer seeks the index twice to get ID.
That is because COALESCE(Expr1, Expr2, ...) is defined to be equal to CASE WHEN Expr1 IS NOT NULL THEN Expr1 WHEN Expr2 IS NOT NULL THEN Expr2 ... ELSE NULL END. As you see, each expression is listed twice. And SQL Server faithfully evaluates it that way. In some cases that could be optimized. And in some cases that is not even possible because there are some builtin functions(**) that can return a different result on the second execution - and as unintuitive as it sounds, based on the official definitioin of COALESCE it has to be implemented like that.
(**) I am a bit tired now and my head will not clear. I cannot remember if NEWID() is one of those functions. If that is the case, then you might find some very unexpected results when you run this
April 12, 2016 at 3:00 pm
NewID() doesn't matter in this case
I rewrote it like
Select
Coalesce ((Select Value From Test Where ID = -2), -1)
or even
Select
Case
When
(Select Value From Test Where ID = -2) is not Null
Then
(Select Value From Test Where ID = -2)
Else
-1
End
and the result is the same. It seems that the (Select Value From Test Where ID = -2) is calculated every time though it makes sense in two cases, when the expression is nondeterministic or when there is no transaction. But here we have outer Select that means implicit transaction and the result is obviously deterministic. So there's a way Optimizer works.
April 12, 2016 at 3:09 pm
Peter Shilovich (4/12/2016)
NewID() doesn't matter in this case
Yes, I know. Perhaps I phrased it in a confusing way. What I wanted to say is that (a) there are cases when evaluating once or twice can make a difference; and (b) the optimizer will always evaluate twice, even in cases where it cannot make a difference.
April 13, 2016 at 7:13 pm
Peter Shilovich (4/12/2016)
NewID() doesn't matter in this case
Actually NEWID() does matter in this case.
Check the execution plan for
Select
Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
Select
Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
|--Compute Scalar(DEFINE: ([Expr1010]=CASE WHEN [Expr1012] IS NOT NULL THEN [Expr1014] ELSE (-1) END))
|--Nested Loops(Left Outer Join, PASSTHRU: (IsFalseOrNull [Expr1012] IS NOT NULL))
|--Nested Loops(Left Outer Join)
| |--Constant Scan
| |--Assert(WHERE: (CASE WHEN [Expr1011]>(1) THEN (0) ELSE NULL END))
| |--Stream Aggregate(DEFINE: ([Expr1011]=Count(*), [Expr1012]=ANY([tempdb].[dbo].[Test].[Value])))
| |--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1003]))
| |--Compute Scalar(DEFINE: ([Expr1003]=abs(checksum(newid())/(100000000))))
| | |--Constant Scan
| |--Clustered Index Seek(OBJECT: ([tempdb].[dbo].[Test].[PK__Test__3214EC270CBAE877]), SEEK: ([tempdb].[dbo].[Test].[ID]=[Expr1003]) ORDERED FORWARD)
|--Assert(WHERE: (CASE WHEN [Expr1013]>(1) THEN (0) ELSE NULL END))
|--Stream Aggregate(DEFINE: ([Expr1013]=Count(*), [Expr1014]=ANY([tempdb].[dbo].[Test].[Value])))
|--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1008]))
|--Compute Scalar(DEFINE: ([Expr1008]=abs(checksum(newid())/(100000000))))
| |--Constant Scan
|--Clustered Index Seek(OBJECT: ([tempdb].[dbo].[Test].[PK__Test__3214EC270CBAE877]), SEEK: ([tempdb].[dbo].[Test].[ID]=[Expr1008]) ORDERED FORWARD)
Because you use subquery it's executed on every evaluation attempt.
And on every other execution NEWID() returns different value, so the expression is different from time to time.
So, if you run it often enough you'll certainly see NULL returned by this COALESCE from time to time.
Try to run this:
SELECT *
FROM dbo.Test t
INNER JOIN (Select ABS(CheckSUM(NEWID())/100000000)) V (ID) ON V.ID = t.ID
It returns totally unpredictable number (well, limited by the number of rows in Test) of different rows from the table Test.
It happens because on every run of nested loop (INNER JOIN ) the subquery is re-executed with different resulting value.
Be careful with NEWID() and avoid using it in SELECT parts of queries.
_____________
Code for TallyGenerator
April 14, 2016 at 3:45 am
Yes, in this respect it matters and makes sense 'cause the subquery (Select ABS(CheckSUM(NEWID()) / 100000000)) is nondeterministic.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply