Why Query Optimizer doesn't reuse value

  • Hello, all!

    Here is the table:

    If OBJECT_ID('Test') Is Not Null

    Drop Table Test

    GO

    Create Table Test

    (

    ID INT Identity (1,1) Primary Key Clustered,

    Value INT not Null

    )

    Declare @i int = 0

    While @i <= 100

    Begin

    Insert Into

    dbo.Test(Value)

    Select

    ABS(CheckSUM(NEWID()) / 100000000)

    Set @i += 1

    End

    This is the query:

    --Set Statistics Profile Off

    Select

    Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

    --Set Statistics Profile Off

    And use the actual plan or statistic profile.

    Could anyone explain why does Optimizer seeks the index twice to get ID.

    P.S. If you rewrite the query like

    Select

    IsNull ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

    you see just one seek and here is the difference between IsNull and Coalesce. Because Coalesce is the other form of the Case-operator. But the question is the different.

  • Can I suggest a non-looping way of doing this?

    INSERT dbo.test

    (

    Value

    )

    SELECT TOP 100

    ABS(CHECKSUM(NEWID()) / 100000000)

    FROM sys.columns c1;

    I think you'll find it's somewhat faster.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Yeah, you are right 🙂

  • Peter Shilovich (4/12/2016)


    This is the query:

    --Set Statistics Profile Off

    Select

    Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

    --Set Statistics Profile Off

    And use the actual plan or statistic profile.

    Could anyone explain why does Optimizer seeks the index twice to get ID.

    That is because COALESCE(Expr1, Expr2, ...) is defined to be equal to CASE WHEN Expr1 IS NOT NULL THEN Expr1 WHEN Expr2 IS NOT NULL THEN Expr2 ... ELSE NULL END. As you see, each expression is listed twice. And SQL Server faithfully evaluates it that way. In some cases that could be optimized. And in some cases that is not even possible because there are some builtin functions(**) that can return a different result on the second execution - and as unintuitive as it sounds, based on the official definitioin of COALESCE it has to be implemented like that.

    (**) I am a bit tired now and my head will not clear. I cannot remember if NEWID() is one of those functions. If that is the case, then you might find some very unexpected results when you run this


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • NewID() doesn't matter in this case

    I rewrote it like

    Select

    Coalesce ((Select Value From Test Where ID = -2), -1)

    or even

    Select

    Case

    When

    (Select Value From Test Where ID = -2) is not Null

    Then

    (Select Value From Test Where ID = -2)

    Else

    -1

    End

    and the result is the same. It seems that the (Select Value From Test Where ID = -2) is calculated every time though it makes sense in two cases, when the expression is nondeterministic or when there is no transaction. But here we have outer Select that means implicit transaction and the result is obviously deterministic. So there's a way Optimizer works.

  • Peter Shilovich (4/12/2016)


    NewID() doesn't matter in this case

    Yes, I know. Perhaps I phrased it in a confusing way. What I wanted to say is that (a) there are cases when evaluating once or twice can make a difference; and (b) the optimizer will always evaluate twice, even in cases where it cannot make a difference.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Peter Shilovich (4/12/2016)


    NewID() doesn't matter in this case

    Actually NEWID() does matter in this case.

    Check the execution plan for

    Select

    Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

    Select

    Coalesce ((Select Value From Test Where ID = (Select ABS(CheckSUM(NEWID()) / 100000000))), -1)

    |--Compute Scalar(DEFINE: ([Expr1010]=CASE WHEN [Expr1012] IS NOT NULL THEN [Expr1014] ELSE (-1) END))

    |--Nested Loops(Left Outer Join, PASSTHRU: (IsFalseOrNull [Expr1012] IS NOT NULL))

    |--Nested Loops(Left Outer Join)

    | |--Constant Scan

    | |--Assert(WHERE: (CASE WHEN [Expr1011]>(1) THEN (0) ELSE NULL END))

    | |--Stream Aggregate(DEFINE: ([Expr1011]=Count(*), [Expr1012]=ANY([tempdb].[dbo].[Test].[Value])))

    | |--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1003]))

    | |--Compute Scalar(DEFINE: ([Expr1003]=abs(checksum(newid())/(100000000))))

    | | |--Constant Scan

    | |--Clustered Index Seek(OBJECT: ([tempdb].[dbo].[Test].[PK__Test__3214EC270CBAE877]), SEEK: ([tempdb].[dbo].[Test].[ID]=[Expr1003]) ORDERED FORWARD)

    |--Assert(WHERE: (CASE WHEN [Expr1013]>(1) THEN (0) ELSE NULL END))

    |--Stream Aggregate(DEFINE: ([Expr1013]=Count(*), [Expr1014]=ANY([tempdb].[dbo].[Test].[Value])))

    |--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1008]))

    |--Compute Scalar(DEFINE: ([Expr1008]=abs(checksum(newid())/(100000000))))

    | |--Constant Scan

    |--Clustered Index Seek(OBJECT: ([tempdb].[dbo].[Test].[PK__Test__3214EC270CBAE877]), SEEK: ([tempdb].[dbo].[Test].[ID]=[Expr1008]) ORDERED FORWARD)

    Because you use subquery it's executed on every evaluation attempt.

    And on every other execution NEWID() returns different value, so the expression is different from time to time.

    So, if you run it often enough you'll certainly see NULL returned by this COALESCE from time to time.

    Try to run this:

    SELECT *

    FROM dbo.Test t

    INNER JOIN (Select ABS(CheckSUM(NEWID())/100000000)) V (ID) ON V.ID = t.ID

    It returns totally unpredictable number (well, limited by the number of rows in Test) of different rows from the table Test.

    It happens because on every run of nested loop (INNER JOIN ) the subquery is re-executed with different resulting value.

    Be careful with NEWID() and avoid using it in SELECT parts of queries.

    _____________
    Code for TallyGenerator

  • Yes, in this respect it matters and makes sense 'cause the subquery (Select ABS(CheckSUM(NEWID()) / 100000000)) is nondeterministic.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply