Strange result using SUM with ISNULL(NULLIF...

  • Hi,

    I'm very grateful for any light that can be shed on this. My code was per the following:

    SUM( ISNULL( NULLIF(VAL1,0), VAL2) )

    My understanding is that this should return the SUM of (VAL1 if it's not 0, otherwise VAL2). However what we're getting if VAL1 is not 0, is VAL1 + VAL2!?

    I've got the correct values coming out with the following:

    ISNULL( NULLIF( SUM(VAL1),0), SUM(VAL2) )

    I really just want to understand what's happening here because until now, I thought I did :ermm:

  • david.jack (10/27/2016)


    Hi,

    I'm very grateful for any light that can be shed on this. My code was per the following:

    SUM( ISNULL( NULLIF(VAL1,0), VAL2) )

    My understanding is that this should return the SUM of (VAL1 if it's not 0, otherwise VAL2). However what we're getting if VAL1 is not 0, is VAL1 + VAL2!?

    I've got the correct values coming out with the following:

    ISNULL( NULLIF( SUM(VAL1),0), SUM(VAL2) )

    I really just want to understand what's happening here because until now, I thought I did :ermm:

    The first formula, SUM( ISNULL( NULLIF(VAL1,0), VAL2) ) is correct, take a look at this example

    😎

    USE TEEST;

    GO

    SET NOCOUNT ON;

    ;WITH SAMPLE_DATA(VAL1,VAL2) AS

    (

    SELECT 10 ,1 UNION ALL

    SELECT 20 ,2 UNION ALL

    SELECT 0 ,3 UNION ALL

    SELECT 40 ,4 UNION ALL

    SELECT 50 ,5

    )

    SELECT

    SUM( ISNULL( NULLIF(SD.VAL1,0), SD.VAL2) ) AS FIRST_SUM

    ,ISNULL( NULLIF( SUM(VAL1),0), SUM(VAL2) ) AS SECOND_SUM

    FROM SAMPLE_DATA SD;

    Output

    FIRST_SUM SECOND_SUM

    ----------- -----------

    123 120

    The second formula is skipping the VAL2 value where VAL1 = 0!

  • Hmmmm... it doesn't work in my scenario but, as per your assessment, the formula does seem to be correct, so my results must be the product of joins/grouping I guess. I'll investigate further with that in mind. Thank you for the feedback. Really helpful 😀

  • david.jack (10/27/2016)


    Hmmmm... it doesn't work in my scenario but, as per your assessment, the formula does seem to be correct, so my results must be the product of joins/grouping I guess. I'll investigate further with that in mind. Thank you for the feedback. Really helpful 😀

    You are very welcome.

    😎

    Quick question, do you have ANSI_WARNINGS on? Are you getting any ANSI NULL warnings like Null value is eliminated by an aggregate or other SET operation?

  • Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL).... but this is exactly what ISNULL(NULLIF... is useful for IMO.

    In the end I tracked this down to an error in my understanding of the source data. With multiple VAL1 and VAL2 rows returned, with some VAL1s being 0 or NULL and others not, within the same grouping, my formula was evaluating to both cases being true for the given grouping and providing a SUM of both values, which I didn't want, so my second formula actually does what's required. :crazy:

    Very grateful for the input.

    Cheers

  • david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL)....

    It doesn't matter if the field can be NULL if you set up your CASE expression correctly. Scott specifically set it up so that FALSE and UNKNOWN (NULL) values were grouped together, and therefore fell under the ELSE clause taking advantage of the three-value logic. You set it up so that TRUE and UNKNOWN needed to be grouped together, so you specifically had to test for NULL, thereby forcing the three-value logic into a two-value system.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL).... but this is exactly what ISNULL(NULLIF... is useful for IMO.

    In the end I tracked this down to an error in my understanding of the source data. With multiple VAL1 and VAL2 rows returned, with some VAL1s being 0 or NULL and others not, within the same grouping, my formula was evaluating to both cases being true for the given grouping and providing a SUM of both values, which I didn't want, so my second formula actually does what's required. :crazy:

    Very grateful for the input.

    Cheers

    I assumed a NULL value would be possible for VAL1, based on your original code.

    That's why I didn't use the otherwise obvious "WHEN VAL1 = 0" but rather "VAL1 <> 0". If VAL1 is NULL, then "VAL1 <> 0" will not be true, causing the ELSE value to be used. That's because NULL is never "=" or "<>" to anything.

    Similarly, if a column has a NULL value in it and you write:

    SELECT * FROM table_name WHERE column_name <> 5

    rows where column_name has a NULL value will not be included in the results of that SELECT either.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • drew.allen (10/28/2016)


    david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL)....

    It doesn't matter if the field can be NULL if you set up your CASE expression correctly. Scott specifically set it up so that FALSE and UNKNOWN (NULL) values were grouped together, and therefore fell under the ELSE clause taking advantage of the three-value logic. You set it up so that TRUE and UNKNOWN needed to be grouped together, so you specifically had to test for NULL, thereby forcing the three-value logic into a two-value system.

    Drew

    Hi Drew,

    Thanks again, however, as the next poster pointed out - it does matter if the value is a NULL. NULL is not 0, it's NULL. So no value can be < or > it. That's the only point I was making. By using a case statement, due to the possible NULL values, we have to have an extra condition (IS NULL or IS NOT NULL) as well as =0 or <>0 specified, making the whole statement longer still. Using ISNULL(NULLIF(VAL1,0),VAL2) makes VAL1 evaluate to NULL if it's 0, causing the ISNULL to kick in and use VAL2.

    I got to the bottom of my problem in the end and it is now resolved, so thanks again for your input.

  • ScottPletcher (10/28/2016)


    david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL).... but this is exactly what ISNULL(NULLIF... is useful for IMO.

    In the end I tracked this down to an error in my understanding of the source data. With multiple VAL1 and VAL2 rows returned, with some VAL1s being 0 or NULL and others not, within the same grouping, my formula was evaluating to both cases being true for the given grouping and providing a SUM of both values, which I didn't want, so my second formula actually does what's required. :crazy:

    Very grateful for the input.

    Cheers

    I assumed a NULL value would be possible for VAL1, based on your original code.

    That's why I didn't use the otherwise obvious "WHEN VAL1 = 0" but rather "VAL1 <> 0". If VAL1 is NULL, then "VAL1 <> 0" will not be true, causing the ELSE value to be used. That's because NULL is never "=" or "<>" to anything.

    Similarly, if a column has a NULL value in it and you write:

    SELECT * FROM table_name WHERE column_name <> 5

    rows where column_name has a NULL value will not be included in the results of that SELECT either.

    Thanks Scott,

    I understand the limitations imposed by the possible NULL values, but thanks for the clarification. In the end it was my understanding of the data that was wrong. I understood that, for a given grouping, all VAL1 values would be either (0 or NULL) or a value greater than 0, across the board, which would have worked for my original formula. However, it's actually possible for VAL1 column to have both 0/NULL values AND values greater than 0, so both conditions of my formula were true in many cases, which is why I was getting the SUM of both VAL1 and VAL2 columns. What I wanted to do was return the SUM of VAL2 if the SUM of VAL1 was 0 or NULL E.g.

    VAL1, VAL2

    1, 1

    2, 2

    3, 3

    NULL, 4

    0, 5

    With my original formula SUM(ISNULL(NULLIF(VAL1,0),VAL2), I was getting a VAL1 value and a VAL2 value (6 + 15 = 21), added together. With my new formula ISNULL(NULLIF(SUM(VAL1),0),SUM(VAL2)), I'm only getting the VAL2 total if the total for VAL1 column is NULL or 0, which is what I want.

    Thanks again for you help with this - your example really helped put me on the right track.

  • david.jack (10/31/2016)


    ScottPletcher (10/28/2016)


    david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL).... but this is exactly what ISNULL(NULLIF... is useful for IMO.

    In the end I tracked this down to an error in my understanding of the source data. With multiple VAL1 and VAL2 rows returned, with some VAL1s being 0 or NULL and others not, within the same grouping, my formula was evaluating to both cases being true for the given grouping and providing a SUM of both values, which I didn't want, so my second formula actually does what's required. :crazy:

    Very grateful for the input.

    Cheers

    I assumed a NULL value would be possible for VAL1, based on your original code.

    That's why I didn't use the otherwise obvious "WHEN VAL1 = 0" but rather "VAL1 <> 0". If VAL1 is NULL, then "VAL1 <> 0" will not be true, causing the ELSE value to be used. That's because NULL is never "=" or "<>" to anything.

    Similarly, if a column has a NULL value in it and you write:

    SELECT * FROM table_name WHERE column_name <> 5

    rows where column_name has a NULL value will not be included in the results of that SELECT either.

    Thanks Scott,

    I understand the limitations imposed by the possible NULL values, but thanks for the clarification. In the end it was my understanding of the data that was wrong. I understood that, for a given grouping, all VAL1 values would be either (0 or NULL) or a value greater than 0, across the board, which would have worked for my original formula. However, it's actually possible for VAL1 column to have both 0/NULL values AND values greater than 0, so both conditions of my formula were true in many cases, which is why I was getting the SUM of both VAL1 and VAL2 columns. What I wanted to do was return the SUM of VAL2 if the SUM of VAL1 was 0 or NULL E.g.

    VAL1, VAL2

    1, 1

    2, 2

    3, 3

    NULL, 4

    0, 5

    With my original formula SUM(ISNULL(NULLIF(VAL1,0),VAL2), I was getting a VAL1 value and a VAL2 value (6 + 15 = 21), added together. With my new formula ISNULL(NULLIF(SUM(VAL1),0),SUM(VAL2)), I'm only getting the VAL2 total if the total for VAL1 column is NULL or 0, which is what I want.

    Thanks again for you help with this - your example really helped put me on the right track.

    Ah, thanks for the clarification, I understand better now. You're quite right, you needed to adjust the expression to handle those conditions. I misunderstood the data.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • david.jack (10/31/2016)


    drew.allen (10/28/2016)


    david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL)....

    It doesn't matter if the field can be NULL if you set up your CASE expression correctly. Scott specifically set it up so that FALSE and UNKNOWN (NULL) values were grouped together, and therefore fell under the ELSE clause taking advantage of the three-value logic. You set it up so that TRUE and UNKNOWN needed to be grouped together, so you specifically had to test for NULL, thereby forcing the three-value logic into a two-value system.

    Drew

    Hi Drew,

    Thanks again, however, as the next poster pointed out - it does matter if the value is a NULL. NULL is not 0, it's NULL. So no value can be < or > it. That's the only point I was making. By using a case statement, due to the possible NULL values, we have to have an extra condition (IS NULL or IS NOT NULL) as well as =0 or <>0 specified, making the whole statement longer still. Using ISNULL(NULLIF(VAL1,0),VAL2) makes VAL1 evaluate to NULL if it's 0, causing the ISNULL to kick in and use VAL2.

    I got to the bottom of my problem in the end and it is now resolved, so thanks again for your input.

    You're missing my point. SQL uses three-value logic and you need to understand how that works in order to understand how the CASE statement works, so that you can leverage it. The ELSE is used when none of the previous expressions evaluates to TRUE, which is not the same as saying that all of the previous expressions evaluates to FALSE. Specifically, that means that when no previous expression can be evaluated to a definite value because of NULL values, they will be handled by the ELSE clause.

    For example, the following code does not specifically include a test for NULL values, but it still handles the NULL values correctly, because it leverages the three-value logic.

    SELECT val1, val2, CASE

    WHEN val1 <> 0 THEN val1 --Uses THEN clause: Val1

    ELSE val2 --Uses ELSE clause: Val2

    END

    FROM (

    VALUES(1, 2), (0, 3), (NULL, 4)) t(val1, val2)

    If you don't leverage the three-value logic, then you do need to include a test for NULL.

    -- without the test for null

    SELECT val1, val2, CASE

    WHEN val1 = 0 THEN val2 --Uses THEN clause: Val2

    ELSE val1 --Uses ELSE clause: Val1

    END

    FROM (

    VALUES(1, 2), (0, 3), (NULL, 4)) t(val1, val2)

    -- with the test for null

    SELECT val1, val2, CASE

    WHEN val1 = 0 OR val1 IS NULL

    THEN val2 --Uses THEN clause: Val1

    ELSE val1 --Uses ELSE clause: Val1

    END

    FROM (

    VALUES(1, 2), (0, 3), (NULL, 4)) t(val1, val2)

    In other words, you missed the key phrase in my statement, "... if you set up your CASE expression correctly."

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (10/31/2016)


    david.jack (10/31/2016)


    drew.allen (10/28/2016)


    david.jack (10/28/2016)


    ScottPletcher (10/27/2016)


    Seems more confusing than the natural CASE statement to me:

    SUM(CASE WHEN VAL1 <> 0 THEN VAL1 ELSE VAL2 END)

    VAL1 can return NULL unfortunately so using a comparison operator won't work. We did have the code saying:

    SUM(CASE WHEN (VAL1 = 0 or VAL1 IS NULL)....

    It doesn't matter if the field can be NULL if you set up your CASE expression correctly. Scott specifically set it up so that FALSE and UNKNOWN (NULL) values were grouped together, and therefore fell under the ELSE clause taking advantage of the three-value logic. You set it up so that TRUE and UNKNOWN needed to be grouped together, so you specifically had to test for NULL, thereby forcing the three-value logic into a two-value system.

    Drew

    Hi Drew,

    Thanks again, however, as the next poster pointed out - it does matter if the value is a NULL. NULL is not 0, it's NULL. So no value can be < or > it. That's the only point I was making. By using a case statement, due to the possible NULL values, we have to have an extra condition (IS NULL or IS NOT NULL) as well as =0 or <>0 specified, making the whole statement longer still. Using ISNULL(NULLIF(VAL1,0),VAL2) makes VAL1 evaluate to NULL if it's 0, causing the ISNULL to kick in and use VAL2.

    I got to the bottom of my problem in the end and it is now resolved, so thanks again for your input.

    You're missing my point. SQL uses three-value logic and you need to understand how that works in order to understand how the CASE statement works, so that you can leverage it. The ELSE is used when none of the previous expressions evaluates to TRUE, which is not the same as saying that all of the previous expressions evaluates to FALSE. Specifically, that means that when no previous expression can be evaluated to a definite value because of NULL values, they will be handled by the ELSE clause.

    For example, the following code does not specifically include a test for NULL values, but it still handles the NULL values correctly, because it leverages the three-value logic.

    SELECT val1, val2, CASE

    WHEN val1 <> 0 THEN val1 --Uses THEN clause: Val1

    ELSE val2 --Uses ELSE clause: Val2

    END

    FROM (

    VALUES(1, 2), (0, 3), (NULL, 4)) t(val1, val2)

    If you don't leverage the three-value logic, then you do need to include a test for NULL.

    -- without the test for null

    SELECT val1, val2, CASE

    WHEN val1 = 0 THEN val2 --Uses THEN clause: Val2

    ELSE val1 --Uses ELSE clause: Val1

    END

    FROM (

    VALUES(1, 2), (0, 3), (NULL, 4)) t(val1, val2)

    -- with the test for null

    SELECT val1, val2, CASE

    WHEN val1 = 0 OR val1 IS NULL

    THEN val2 --Uses THEN clause: Val1

    ELSE val1 --Uses ELSE clause: Val1

    END

    FROM (

    VALUES(1, 2), (0, 3), (NULL, 4)) t(val1, val2)

    In other words, you missed the key phrase in my statement, "... if you set up your CASE expression correctly."

    Drew

    I get you . Fair point well presented

  • david.jack (10/27/2016)


    Hi,

    I'm very grateful for any light that can be shed on this. My code was per the following:

    SUM( ISNULL( NULLIF(VAL1,0), VAL2) )

    My understanding is that this should return the SUM of (VAL1 if it's not 0, otherwise VAL2). However what we're getting if VAL1 is not 0, is VAL1 + VAL2!?

    I've got the correct values coming out with the following:

    ISNULL( NULLIF( SUM(VAL1),0), SUM(VAL2) )

    I really just want to understand what's happening here because until now, I thought I did :ermm:

    I know this problem is already solved, but when I read the various comments, it wasn't clear to me that anyone correctly assessed the original issue. If I correctly understand the problem, the objective is to SUM the VAL2 column only when the SUM of the VAL1 column is zero, which is what the poster effectively said by presenting the second set of code that produces the correct result. To explain the WHY behind it is fairly simple. When you use SUM, whatever expression appears as the argument to that function is evaluated for every record returned by the FROM and WHERE clauses, and only then, is summed. The incorrect value returned by the posters' first set of code was the sum of the individual values of the original expression that appeared inside the function. The correct value returned by the posters' second set of code performed TWO sums, and only then compared the SUM of VAL1 to 0, and nulled it out in favor of the SUM of VAL2.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply