Why Query Optimizer doesn't reuse value

Question

Post reply

Why Query Optimizer doesn't reuse value

Peter Shilovich

SSC Eights!

Points: 882
More actions
April 12, 2016 at 4:14 am

#315504

Hello, all!
Here is the table:
If OBJECT_ID('Test') Is Not Null
Drop Table Test
GO
Create Table Test
(
ID INT Identity (1,1) Primary Key Clustered,
Value INT not Null
)
Declare @i int = 0
While @i <= 100
Begin
Insert Into
dbo.Test(Value)
Select
ABS(CheckSUM(NEWID()) / 100000000)
Set @i += 1
End
This is the query:
--Set Statistics Profile Off
Select
Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
--Set Statistics Profile Off
And use the actual plan or statistic profile.
Could anyone explain why does Optimizer seeks the index twice to get ID.
P.S. If you rewrite the query like
Select
IsNull ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
you see just one seek and here is the difference between IsNull and Coalesce. Because Coalesce is the other form of the Case-operator. But the question is the different.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 246955 More actions · Answer 1

Can I suggest a non-looping way of doing this?

INSERT dbo.test

(

Value

)

SELECT TOP 100

ABS(CHECKSUM(NEWID()) / 100000000)

FROM sys.columns c1;

I think you'll find it's somewhat faster.

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Peter Shilovich SSC Eights! Points: 882 More actions · Answer 2

Peter Shilovich

SSC Eights!

Points: 882

April 12, 2016 at 6:11 am

#1871353

Yeah, you are right 🙂

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 3

Peter Shilovich (4/12/2016)
This is the query:
--Set Statistics Profile Off
Select
Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)
--Set Statistics Profile Off
And use the actual plan or statistic profile.
Could anyone explain why does Optimizer seeks the index twice to get ID.

That is because COALESCE(Expr1, Expr2, ...) is defined to be equal to CASE WHEN Expr1 IS NOT NULL THEN Expr1 WHEN Expr2 IS NOT NULL THEN Expr2 ... ELSE NULL END. As you see, each expression is listed twice. And SQL Server faithfully evaluates it that way. In some cases that could be optimized. And in some cases that is not even possible because there are some builtin functions(**) that can return a different result on the second execution - and as unintuitive as it sounds, based on the official definitioin of COALESCE it has to be implemented like that.

(**) I am a bit tired now and my head will not clear. I cannot remember if NEWID() is one of those functions. If that is the case, then you might find some very unexpected results when you run this

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Peter Shilovich SSC Eights! Points: 882 More actions · Answer 4

NewID() doesn't matter in this case

I rewrote it like

Select

Coalesce ((Select Value From Test Where ID = -2), -1)

or even

Select

Case

When

(Select Value From Test Where ID = -2) is not Null

Then

(Select Value From Test Where ID = -2)

Else

-1

End

and the result is the same. It seems that the (Select Value From Test Where ID = -2) is calculated every time though it makes sense in two cases, when the expression is nondeterministic or when there is no transaction. But here we have outer Select that means implicit transaction and the result is obviously deterministic. So there's a way Optimizer works.

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 5

Peter Shilovich (4/12/2016)
NewID() doesn't matter in this case

Yes, I know. Perhaps I phrased it in a confusing way. What I wanted to say is that (a) there are cases when evaluating once or twice can make a difference; and (b) the optimizer will always evaluate twice, even in cases where it cannot make a difference.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Sergiy SSC Guru Points: 110208 More actions · Answer 6

Peter Shilovich (4/12/2016)
NewID() doesn't matter in this case

Actually NEWID() does matter in this case.

Check the execution plan for

Select

Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

Select

Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

|--Compute Scalar(DEFINE: ([Expr1010]=CASE WHEN [Expr1012] IS NOT NULL THEN [Expr1014] ELSE (-1) END))

|--Nested Loops(Left Outer Join, PASSTHRU: (IsFalseOrNull [Expr1012] IS NOT NULL))

|--Nested Loops(Left Outer Join)

| |--Constant Scan

| |--Assert(WHERE: (CASE WHEN [Expr1011]>(1) THEN (0) ELSE NULL END))

| |--Stream Aggregate(DEFINE: ([Expr1011]=Count(*), [Expr1012]=ANY([tempdb].[dbo].[Test].[Value])))

| |--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1003]))

| |--Compute Scalar(DEFINE: ([Expr1003]=abs(checksum(newid())/(100000000))))

| | |--Constant Scan

| |--Clustered Index Seek(OBJECT: ([tempdb].[dbo].[Test].[PK__Test__3214EC270CBAE877]), SEEK: ([tempdb].[dbo].[Test].[ID]=[Expr1003]) ORDERED FORWARD)

|--Assert(WHERE: (CASE WHEN [Expr1013]>(1) THEN (0) ELSE NULL END))

|--Stream Aggregate(DEFINE: ([Expr1013]=Count(*), [Expr1014]=ANY([tempdb].[dbo].[Test].[Value])))

|--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1008]))

|--Compute Scalar(DEFINE: ([Expr1008]=abs(checksum(newid())/(100000000))))

| |--Constant Scan

|--Clustered Index Seek(OBJECT: ([tempdb].[dbo].[Test].[PK__Test__3214EC270CBAE877]), SEEK: ([tempdb].[dbo].[Test].[ID]=[Expr1008]) ORDERED FORWARD)

Because you use subquery it's executed on every evaluation attempt.

And on every other execution NEWID() returns different value, so the expression is different from time to time.

So, if you run it often enough you'll certainly see NULL returned by this COALESCE from time to time.

Try to run this:

SELECT *

FROM dbo.Test t

INNER JOIN (Select ABS(CheckSUM(NEWID())/100000000)) V (ID) ON V.ID = t.ID

It returns totally unpredictable number (well, limited by the number of rows in Test) of different rows from the table Test.

It happens because on every run of nested loop (INNER JOIN ) the subquery is re-executed with different resulting value.

Be careful with NEWID() and avoid using it in SELECT parts of queries.

_____________
Code for TallyGenerator

Peter Shilovich SSC Eights! Points: 882 More actions · Answer 7

Yes, in this respect it matters and makes sense 'cause the subquery (Select ABS(CheckSUM(NEWID()) / 100000000)) is nondeterministic.