Result set different after breaking up case expression in where

  • I'm back again with more dumb questions. This time I am dealing with refactoring a single select with nested case expressions in the where clause. I'm trying to optimize this, and in attempting to do so, I broke up the query into multiple selects w/ union alls. This does run quite a bit faster, but now I'm seeing more rows return than before, and this does not seem to be due to duplicate rows.

    There is more to this query, like creating the temp tables, etc. However, as far as I can tell, that has not had an effect on the result set. I have been able to determine that the second select of the refactored query (much of which is omitted because it is repetitive) is pulling in more results than before.

    So, where did I go wrong when pulling this apart? How does my new query/WHERE clause(s) differ logically from the original? I super appreciate your help in advance.

    /*ORIGINAL*/

    select
    /*some columns*/
    from
    #SmallerRange IPS
    inner join #LargerRange V on V.AccountNumber !=IPS.AccountNumber and V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum
    WHERE
    CASE
    WHEN IPS.InOrOut = 'I' THEN
    CASE
    WHEN IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID THEN 1
    WHEN IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID THEN 1
    WHEN IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate THEN 1
    WHEN IPS.ProgramID = V.ProgramID AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate THEN 1
    WHEN IPS.Name = V.Name AND IPS.Name <> 'N/A' AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate THEN 1
    ELSE 0
    END
    WHEN IPS.InOrOut = 'O' THEN
    CASE
    WHEN IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime THEN 1
    WHEN IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME THEN 1
    WHEN IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME THEN 1
    WHEN IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME THEN 1
    WHEN IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.Name = V.Name THEN 1
    ELSE 0
    END
    ELSE 0
    END = 1
    AND
    CASE
    WHEN V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name THEN 0
    WHEN V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A' THEN 0
    ELSE 1
    END = 1
    /*refactor*/

    select
    /*some columns*/
    from
    #SmallerRange IPS
    inner join #LargerRange V on V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum and V.AccountNumber != IPS.AccountNumber
    WHERE
    IPS.InOrOut = 'I' and
    IPS.EntryDate = V.EntryDate
    AND IPS.OriginDate=V.OriginDate
    AND IPS.ProgramID = V.ProgramID and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name and IPS.Name = 'N/A') and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A')

    union all

    select
    /*some columns*/
    from
    #SmallerRange IPS
    inner join #LargerRange V on V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum and V.AccountNumber != IPS.AccountNumber
    WHERE
    IPS.InOrOut = 'I' and
    IPS.EntryDate = V.EntryDate AND
    IPS.ExitDate = V.ExitDate AND
    IPS.ProgramID = V.ProgramID and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name and IPS.Name = 'N/A') and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A')
  • Not really sure, you seem to have used the logic from IPS.InOrOut = 'O'?

    I would probably start by converting the original CASE-based logic to something more standard, like

    select
    /*some columns*/
    from
    #SmallerRange IPS
    inner join #LargerRange V on V.AccountNumber !=IPS.AccountNumber and V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum
    WHERE
    ( IPS.InOrOut = 'I'
    AND ( IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID
    OR IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID
    OR IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate
    OR IPS.ProgramID = V.ProgramID AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate
    OR IPS.Name = V.Name AND IPS.Name <> 'N/A' AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate)

    OR IPS.InOrOut = 'O'
    AND ( IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime
    OR IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME
    OR IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME
    OR IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME
    OR IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.Name = V.Name)
    )

    AND NOT ( V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name
    OR V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A')

    Then you can start to split it up into more manageable pieces, or rewrite the logic.

  • scarr030 wrote:

    I'm back again with more dumb questions. This time I am dealing with refactoring a single select with nested case expressions in the where clause. I'm trying to optimize this, and in attempting to do so, I broke up the query into multiple selects w/ union alls. This does run quite a bit faster, but now I'm seeing more rows return than before, and this does not seem to be due to duplicate rows.

    There is more to this query, like creating the temp tables, etc. However, as far as I can tell, that has not had an effect on the result set. I have been able to determine that the second select of the refactored query (much of which is omitted because it is repetitive) is pulling in more results than before.

    So, where did I go wrong when pulling this apart? How does my new query/WHERE clause(s) differ logically from the original? I super appreciate your help in advance.

    /*ORIGINAL*/
    select
    /*some columns*/from
    #SmallerRange IPS
    inner join #LargerRange V on V.AccountNumber !=IPS.AccountNumber and V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum
    WHERE
    CASE
    WHEN IPS.InOrOut = 'I' THEN
    CASE
    WHEN IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID THEN 1
    WHEN IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID THEN 1
    WHEN IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate THEN 1
    WHEN IPS.ProgramID = V.ProgramID AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate THEN 1
    WHEN IPS.Name = V.Name AND IPS.Name <> 'N/A' AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate THEN 1
    ELSE 0
    END
    WHEN IPS.InOrOut = 'O' THEN
    CASE
    WHEN IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime THEN 1
    WHEN IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME THEN 1
    WHEN IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME THEN 1
    WHEN IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME THEN 1
    WHEN IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.Name = V.Name THEN 1
    ELSE 0
    END
    ELSE 0
    END = 1
    AND
    CASE
    WHEN V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name THEN 0
    WHEN V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A' THEN 0
    ELSE 1
    END = 1
    /*refactor*/
    select
    /*some columns*/from
    #SmallerRange IPS
    inner join #LargerRange V on V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum and V.AccountNumber != IPS.AccountNumber
    WHERE
    IPS.InOrOut = 'I' and
    IPS.EntryDate = V.EntryDate
    AND IPS.OriginDate=V.OriginDate
    AND IPS.ProgramID = V.ProgramID and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name and IPS.Name = 'N/A') and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A')

    union all

    select
    /*some columns*/from
    #SmallerRange IPS
    inner join #LargerRange V on V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum and V.AccountNumber != IPS.AccountNumber
    WHERE
    IPS.InOrOut = 'I' and
    IPS.EntryDate = V.EntryDate AND
    IPS.ExitDate = V.ExitDate AND
    IPS.ProgramID = V.ProgramID and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name and IPS.Name = 'N/A') and not
    (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A')

    The thing that's hanging you up, and causing you to loop, is the conversion and comparison of the term. All you need to do is clean those numbers up and then group by them.

    I'm not on a machine where i have SSMS right now, or I'd type you up a code sample, but consider this...you're trying to group by the left four numbers of the YYYYMMDD date, essentially. So just convert the date to varchar, and group by LEFT(termCode, 4).

    myccpay

    • This reply was modified 3 years, 1 month ago by  Finney558.
  • I now realize that my original response didn't fully appreciate that the originally case based select can't just be converted to a "normal" where clause where the criterias of the case structure are treated as OR cases with the same priority. After all the CASE structure breaks off processing subsequent WHEN evaluations once one is found that evaluates to true.

    Which no doubt was also driving your UNION approach, I now realize; even if I'm not sure the UNION ALL approach is the best one.

    So a more realistic WHERE clause than in my first response would then be something like

    select
    /*some columns*/
    from
    #SmallerRange IPS
    inner join #LargerRange V on V.AccountNumber !=IPS.AccountNumber and V.InOrOut = IPS.InOrOut and V.LocNum = IPS.LocNum
    WHERE
    ( IPS.InOrOut = 'I'
    AND ( IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID

    OR IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID)

    OR IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID)
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID)

    OR IPS.ProgramID = V.ProgramID AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID)
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID)
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate)

    OR IPS.Name = V.Name AND IPS.Name <> 'N/A' AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.ProgramID = V.ProgramID)
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ProgramID = V.ProgramID)
    AND NOT (IPS.EntryDateTime = V.EntryDateTime AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate)
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.ExitDateTime = V.ExitDateTime AND IPS.OriginDate=V.OriginDate)

    )

    OR IPS.InOrOut = 'O'
    AND ( IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime

    OR IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime)

    OR IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime)
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME)

    OR IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime)
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME)
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME)

    OR IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.Name = V.Name
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.DropbyDateTime = V.DropbyDateTime)
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.DropbyDateTime = V.DropbyDateTime AND IPS.ARRVLTIME = V.ARRVLTIME)
    AND NOT (IPS.ProgramID = V.ProgramID AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME)
    AND NOT (IPS.DropbyDateTime = V.DropbyDateTime AND IPS.OriginDate=V.OriginDate AND IPS.ARRVLTIME = V.ARRVLTIME)

    )
    )

    AND NOT (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name <> V.Name)
    AND NOT (V.ProgramID LIKE 'PC%' and V.ProgramID = IPS.ProgramID and IPS.Name = 'N/A')

    Now each CASE WHEN from the original is converted into an OR with the addition of a further exclusion of previously made evaluations. That is needed to make the evaluation equal that of the comparable CASE WHEN statement.

     

  • The key thing for performance is likely to be the indexing, and here specifically the clustered indexing.

    Both IPS and V should be clustered on ( LocNum, InOrOut, AccountNumber ), in that order.

    Hopefully you should then see a merge join in the plan for the query, albeit very likely one with a lazy spool, but that can't be avoided.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply