Condition positioning in where clause

  • Hi.

    Can anyone please explain me this.

    When running a query on a table with 3 conditions in the where clause, if I change the position of the conditions in the where clause it could take 2 minutes while's if i leave the conditions positions it could take 2 seconds.

    For example:

    Table 1 contains a list of contacts.

    Table 2 contains a list of phone calls.

    Table 3 contains a list of email messages.

    When searching for a contact in table1 where not is (table2) and is in(table3) it could take 1 second for results.

    But when searching for the same a contact, where is in (table3) and is not in (table2) (the same query as above just now we search with table3 first) it could take 2 minutes.

    Please advice.

  • Good question.

    Can you post the 2 queries? Maybe there's something else different that you're missing.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • The position in the where clause isn't relevant (other than when it changes the semantics, e.g with AND/OR combinations), the optimiser will use statistics and indexes to work out the best order to apply the predicates.

    What's probably happening is some form of parameter sniffing or similar where the trivial change (switching the order of the predicates) is forcing a new plan to be generated.

    Is this a repeatable phenomenon? If so, actual execution plans for both queries might help to explain what's going on...

  • Please post table and index definitions and the full query AND the actual execution plan of both queries. Then, and only then, can we really figure out what is happening and why.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hi, Thanks for all of your replies.

    It is a simple CRM database, some clients (separate databases), when running the query one way will be painful for the other, though the same query will run, the position must be changed for it to run OK.

    The server needs to select the first ContactID from contacts

    with the following conditions

    1) the contact must have a phone number 1 or phone number 2.

    2) the contact must HAVE a specific tag (10).

    3) the contact must NOT HAVE a specific tag (20).

    4) the contact has not been contacted (does not exists in "calls" table)

    the sorting is done by CallsCount field (int) which indicates the contact we have least tried to call, since only answered phone calls enter the "calls" table, each attempt of dialing will raise the CallsCount field and therefor when requesting a contact he will bring the next with the least tries of calls attempts.

    Here is the query for database A:

    SELECT TOP 1 ContactID FROM Contacts

    WHERE

    (

    ([Phone01Number] IS NOT NULL AND [Phone01Number] <> '' AND (Phone01Active IS NULL OR Phone01Active = 1))

    OR

    ([Phone02Number] IS NOT NULL AND [Phone02Number] <> '' AND (Phone02Active IS NULL OR Phone02Active = 1))

    )

    AND --INCLUDE TAGS

    (

    ContactID IN(SELECT [ContactId] FROM [b]ContactTags[/b] WHERE TagID = 10)

    )

    AND --exclude TAGS

    (

    ContactID NOT IN(SELECT [ContactId] FROM [b]ContactTags[/b] WHERE TagID = 20)

    )

    AND

    (

    ContactID NOT IN(SELECT ContactID FROM Calls)

    )

    ORDER BY CallsCount

    In database B the code will change to :

    SELECT TOP 1 ContactID FROM Contacts

    WHERE

    (

    ContactID NOT IN(SELECT ContactID FROM Calls)

    )

    AND

    (

    ([Phone01Number] IS NOT NULL AND [Phone01Number] <> '''' AND (Phone01Active IS NULL OR Phone01Active = 1))

    OR

    ([Phone02Number] IS NOT NULL AND [Phone02Number] <> '''' AND (Phone02Active IS NULL OR Phone02Active = 1))

    )

    AND --INCLUDE TAGS

    (

    ContactID IN(SELECT [ContactId] FROM [b]ContactTags[/b] WHERE TagID = 10)

    )

    AND --exclude TAGS

    (

    ContactID NOT IN(SELECT [ContactId] FROM [b]ContactTags[/b] WHERE TagID = 20)

    )

    ORDER BY CallsCount

  • 1) Do you recall my telling you that we needed the table schema including indexes, as well as the actual execution plans for good and bad executions?

    2) The IS NULL OR constructs you are using will prevent good optimization. Those need to go.

    3) The IN/NOT INs also are likely causing optimization issues. I would try EXISTS and NOT EXISTS, but don't hold much hope.

    4) data value skew and states and data volumes can play into optimization too.

    Getting this truly fixed is well beyond a forum post. Best guess is that you need the flags you have a NULL to be NOT NULL and appropriate default values. And the external tables you IN/NOT IN need to have a flag on main table (kept in sync with triggers) to avoid the checks on other tables. If all the data necessary is on one table there can be good statistics/indexing done to make the SELECT efficient for any client's data size/value skew.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • So when you say that the performance depends on the order of the conditions, you are really talking about queries in two different databases.

    Since the optimizer determines the query plan by looking at statistics sampled about the data, there is no wonder that you get different performance in different databases, if the data have different distribution. Or for that matter different indexes.

    Note by the way, that the condition in B looks a little spooky:

    [Phone01Number] <> ''''

    shouldn't that be:

    [Phone01Number] <> ''

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply