COALESCE() - Does it calculate every value or does it stop when it gets a not NULL

  • title says it all really.

    I have a Coalesce() statement that has three fairly complex queries in it's array. One of these is extremely resource hungry and takes some 3 seconds per result, but it should not be required that often. The script seems to be taking an age. I am wondering, does the COALESCE() function evaluate each element in sequence and only move on to the next one if the previous one was NULL, or is each element being run each time and then the results coalesced?

    If it is the latter, what is the best way to re-write the script. I don't like running through the dataset multiple times but will do so if it is more efficient.

  • As it's required only for a small proportion of results, you could try a UNION

    select coalesce(a,b)

    from ...

    where coalesce(a,b) is not null

    union

    select [complex expression]

    from ...

    where coalesce(a,b) is null

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Thanks Phil,

    I'n not sure that construct will work for me as the Coalesce() is to determine a value to be updated.

    Using the Union syntax is still running down the dataset twice...

  • Try removing your third really complex piece and see if the performance improves greatly. If so, then you may need to rewrite your query. If not, then that is not the problem, and you will probably have to rewrite. 😛 It is really about figuring out where your bottleneck is.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Taking out all the left joins and replacing with joins solves the speed, but at the risk of losing records, But I think it is acceptable losses. If the users put the right data in in the first place, then all checks would work on the first rule......

  • aaron.reese (8/20/2011)


    Taking out all the left joins and replacing with joins solves the speed, but at the risk of losing records, But I think it is acceptable losses. If the users put the right data in in the first place, then all checks would work on the first rule......

    That does not sound right to me - either the query is right, or it isn't. If the query requires outer joins - changing to inner joins and 'losing' data is not the right solution.

    Relying on users to put the right data in the first place is also asking for trouble. The application should not let users put in data that is invalid - nor should the database. If a column is required and a valid value must be entered - then you either have a check constraint or a foreign key constraint that enforces the values are entered correctly.

    You also need to make sure columns do not allow nulls when that data is required.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • aaron.reese (8/19/2011)


    title says it all really.

    I have a Coalesce() statement that has three fairly complex queries in it's array. One of these is extremely resource hungry and takes some 3 seconds per result, but it should not be required that often. The script seems to be taking an age. I am wondering, does the COALESCE() function evaluate each element in sequence and only move on to the next one if the previous one was NULL, or is each element being run each time and then the results coalesced?

    If it is the latter, what is the best way to re-write the script. I don't like running through the dataset multiple times but will do so if it is more efficient.

    Hi Aaron,

    I don't know if it's actually a part of your given problem but I've seen people inadvertently cause huge performance problems by using Coalesce improperly and unwittingly. A hidden problem with Coalesce is that it will do implicit conversions as in any JOIN or WHERE clause and, if the data-types are just so, it will do the implicit conversion on whole columns of data causing table or index scans instead of index seeks. With that in mind, it's important to make sure your data-types in a Coalesce are the same or that the order of precedence will cause a scan only on the smallest of the tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @jeff Williams

    The left joins are (were) only for development where I don't want to lose any records so that I can see if there are any data issues that will cause the joins to fail. the joins are mainly between derived data sets and their original source data but using the same key or joining a table to itself because of duplicated data on different keys (don't ask.....)

    @jeff Moden

    I think this is a legitimate use of Coalesce(). Each query is a less robust fallback position from the previous one: Find the referral for the contact, Find a referral for the team, Find a referral for the service, find a referral for the organisation; hence the title of the thread. There is no point in calculating the last three if the referral is already attached to the contact record.

    There should not be any conversion between data types (except possibly varchar to nvarchar - would that cause an issue?)

    Obiron

  • aaron.reese (8/22/2011)


    There should not be any conversion between data types (except possibly varchar to nvarchar - would that cause an issue?)

    Obiron

    http://www.codersrevolution.com/index.cfm/2009/2/13/SQL-Server-Gotcha-Implicit-Unicode-Conversion

  • To answer your original question

    aaron.reese (8/19/2011)


    ...I am wondering, does the COALESCE() function evaluate each element in sequence and only move on to the next one if the previous one was NULL, or is each element being run each time and then the results coalesced?

    ...

    COALESCE() is the wrapper of the CASE WHEN and (unlike to ISNULL, which is in-build) is basicaly equivalent to:

    CASE

    WHEN (exp1 IS NOT NULL) THEN exp1

    WHEN (exp2 IS NOT NULL) THEN exp2

    ...

    ELSE expN

    END

    Therefore, as soon as SQLServer will find non-null value it will stop.

    BUT!

    SQL Server needs to determine the data type to be returned and its "nullability". It does return the data type of expression with the highest data type precedence and if all expressions are nonnullable, the result is typed as nonnullable. In order to do so, SQL Server still need to check datatype and nullability of every expression once.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • You could also try recoding the coalesce() within the where. So if you have

    WHERE A = coalesce(X, Y, Z)

    You can recode this as:

    WHERE

    X is not null and A = X or

    X is null and Y is not null and A = Y or

    X is null and Y is null and Z is not null and A = Z

    (Correct me if I'm wrong, but I don't think you actually need the "is not null" bits as they are implicit in the equality comparison. I'm just putting them there to show how to recode the coalesce.)

    SQL Server should "short circuit" the where clause evaluation where it can, so it may not need to evaluate every part of the comparisons.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply