Case when inside of partition by?

  • Hi, I have the following situation:
    CREATE TABLE #TAB1(
    ID INT
    ,MY_STATE VARCHAR(10)
    ,EVENT_TS DATETIME
    )

    INSERT INTO #TAB1(ID, MY_STATE, EVENT_TS)
         VALUES (1, 'STATE_1', GETDATE())
                 ,(1, 'STATE_1', DATEADD(DAY,-1, GETDATE()))
                 ,(1, 'STATE_2', DATEADD(DAY,-2, GETDATE()))
                 ,(1, 'STATE_3', DATEADD(DAY,-5, GETDATE()))
                 ,(2, 'STATE_1', NULL)
                 ,(2, 'STATE_2', GETDATE())

                
    what should query do: WHEN STATE_1 has EVENT_TS NOT NULL THAN take its date, otherwise search for STATE_2 and take its date

    Expected result:
    1 STATE_1 GETDATE()
    2 STATE_2 GETDATE()

    I'm thinking about  partitioning over id  and then use case when to check if a state has a date but I can't write anything that works..

  • Why not:
    [Code]
    SELECT TOP 1 WITH TIES
           ID, MY_STATE, EVENT_TS
    FROM #TAB1
    ORDER BY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY EVENT_TS DESC);[/code]

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Deleted!

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • @thom-a uhm, if you insert also the following line:
    INSERT INTO #TAB1(ID, MY_STATE, EVENT_TS) VALUES (1, 'STATE_2', DATEADD(DAY,2, GETDATE()))               

    the solution you suggested returns STATE_2 also for the first id

    1 STATE_2 GETDATE()
    2 STATE_2 GETDATE()

    while I'd like it to return STATE_1

    Best

  • Data Cruncher - Wednesday, November 21, 2018 7:09 AM

    @thom-a uhm, if you insert also the following line:
    INSERT INTO #TAB1(ID, MY_STATE, EVENT_TS) VALUES (1, 'STATE_2', DATEADD(DAY,2, GETDATE()))               

    the solution you suggested returns STATE_2 also for the first id

    1 STATE_2 GETDATE()
    2 STATE_2 GETDATE()

    while I'd like it to return STATE_1

    Best

    Having sample data that explains the full scenario at the start is always helpful. 🙂

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Data Cruncher - Wednesday, November 21, 2018 7:09 AM

    @thom-a uhm, if you insert also the following line:
    INSERT INTO #TAB1(ID, MY_STATE, EVENT_TS) VALUES (1, 'STATE_2', DATEADD(DAY,2, GETDATE()))               

    the solution you suggested returns STATE_2 also for the first id

    1 STATE_2 GETDATE()
    2 STATE_2 GETDATE()

    while I'd like it to return STATE_1

    Best

    Try this:
    SELECT TOP 1 WITH TIES
       ID
    ,   MY_STATE
    ,   EVENT_TS
    FROM  #TAB1
    WHERE  NOT (
          MY_STATE = 'STATE_1'
          AND EVENT_TS IS NULL
         )
    ORDER BY ROW_NUMBER() OVER (PARTITION BY ID
              ORDER BY MY_STATE ASC
             );

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Give this one a go:

    SELECT TOP 1 WITH TIES
        ID, MY_STATE, EVENT_TS
    FROM #TAB1 T1
    WHERE T1.MY_STATE = 'STATE_1'
     OR NOT EXISTS (SELECT 1
           FROM #TAB1 e
          WHERE e.ID = T1.ID
           AND e.MY_STATE = 'STATE_1'
           AND e.EVENT_TS IS NOT NULL)
    ORDER BY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY EVENT_TS DESC);

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • I believe that this will give you what you want.  You can use either BINARY or CHAR as the intermediary data type.  BINARY is usually simpler, because you don't have to worry about getting the right format for datetimes or left padding numbers to get the sorts to work correctly, HOWEVER dates should be converted to datetime before being converted to binary.


    SELECT ID,
        CAST(SUBSTRING(MIN(CAST(MY_STATE AS BINARY(10)) + CAST(EVENT_TS AS BINARY(8))), 1, 10) AS VARCHAR(10)),
        CAST(SUBSTRING(MIN(CAST(MY_STATE AS BINARY(10)) + CAST(EVENT_TS AS BINARY(8))), 11, 8) AS DATETIME)
    FROM #TAB1
    GROUP BY ID

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I think Phil's solution works better in the case with more than one states but I'd really like to get the most of this example - for me it's so rare to find good instructive examples, like this one. Starting from Thom-A's solution (that I understand less) I'll try to explain the single steps and ask a few questions.

    a) query + subquery

    SELECT ID, MY_STATE, EVENT_TS
    FROM #TAB1 T1
    WHERE T1.MY_STATE = 'STATE_1' -- get all 'STATE_1'
     OR NOT EXISTS (SELECT 1     -- (return 1 if there is another row with 'STATE_1' and date NOT NULL)
          FROM #TAB1 e
          WHERE e.ID = T1.ID
           AND e.MY_STATE = 'STATE_1'
           AND e.EVENT_TS IS NOT NULL)

    Here I don't understand why/how the 'OR NOT EXISTS' part works? If there is a STATE_1 with a valid date I read the where condition as follows:
    "WHERE T1.MY_STATE = 'STATE_1' OR NOT EXISTS 1"
    This should return true and therefore return all rows with a valid date and state different than 'STATE_1', right?

    b) TOP (n) WITH TIES
    Didn't have a clue about this keyword. After a quick trip to MSDN I realize it's used here to get all ROW_NUMBERS() = 1

    So inside each ID you order the results by date in descending order and therefore return the state with a most recent date, is that correct?

    How could I modify the initial Thom-A's query in case I have an undefined number of states?

  • This at first seemed like there just wasn't any good description of the exact rule for choosing a given row.
    So I kept reading along, and then found what may be the key.   Try this and let me know if it works for
    all cases:
    CREATE TABLE #TAB1 (
        ID int,
        MY_STATE varchar(10),
        EVENT_TS datetime
    );
    INSERT INTO #TAB1 (ID, MY_STATE, EVENT_TS)
        VALUES    (1, 'STATE_1', GETDATE()),
                (1, 'STATE_1', DATEADD(DAY,-1, GETDATE())),
                (1, 'STATE_2', DATEADD(DAY,-2, GETDATE())),
                (1, 'STATE_3', DATEADD(DAY,-5, GETDATE())),
                (2, 'STATE_1', NULL),
                (2, 'STATE_2', GETDATE());

    WITH MAX_DATES AS (

        SELECT ID, MAX(EVENT_TS) AS MAX_EVENT_TS
        FROM #TAB1
        GROUP BY ID
    )
    SELECT    MD.ID,
            T.MY_STATE,
            MD.MAX_EVENT_TS AS EVENT_TS
    FROM MAX_DATES AS MD
        INNER JOIN #TAB1 AS T
            ON MD.ID = T.ID
            AND MD.MAX_EVENT_TS = T.EVENT_TS
    ORDER BY MD.ID;

    DROP TABLE #TAB1;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Hi smugson,
    I'm sorry but it doesn't do the trick:

    CREATE TABLE #TAB1(
    ID INT
    ,MY_STATE VARCHAR(10)
    ,EVENT_TS DATETIME
    )

    INSERT INTO #TAB1(ID, MY_STATE, EVENT_TS)
         VALUES (1, 'STATE_1', GETDATE())
                 ,(1, 'STATE_1', DATEADD(DAY,-1, GETDATE()))
                 ,(1, 'STATE_2', DATEADD(DAY,-2, GETDATE()))
                 ,(1, 'STATE_2', DATEADD(DAY,2, GETDATE()))
                 ,(1, 'STATE_3', DATEADD(DAY,-5, GETDATE()))
                 ,(2, 'STATE_1', NULL)
                 ,(2, 'STATE_2', GETDATE())
                 ,(2, 'STATE_3', DATEADD(DAY,2,GETDATE()))

    Your query returns STATE_3 for the ID 2, while I'd like to see STATE_2 paired with ID: 2

  • Data Cruncher - Wednesday, December 5, 2018 9:35 AM

    Hi smugson,
    I'm sorry but it doesn't do the trick:

    CREATE TABLE #TAB1(
    ID INT
    ,MY_STATE VARCHAR(10)
    ,EVENT_TS DATETIME
    )

    INSERT INTO #TAB1(ID, MY_STATE, EVENT_TS)
         VALUES (1, 'STATE_1', GETDATE())
                 ,(1, 'STATE_1', DATEADD(DAY,-1, GETDATE()))
                 ,(1, 'STATE_2', DATEADD(DAY,-2, GETDATE()))
                 ,(1, 'STATE_2', DATEADD(DAY,2, GETDATE()))
                 ,(1, 'STATE_3', DATEADD(DAY,-5, GETDATE()))
                 ,(2, 'STATE_1', NULL)
                 ,(2, 'STATE_2', GETDATE())
                 ,(2, 'STATE_3', DATEADD(DAY,2,GETDATE()))

    Your query returns STATE_3 for the ID 2, while I'd like to see STATE_2 paired with ID: 2

    Have you tried the other solutions that were posted?  I believe that mine does work, but it appears that you have not even considered it.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thanks Drew, actually your solution works well for the given dataset, but it assumes facts about data:
    If tomorrow my user decides to rename the states State_Z and State_A instead of State_1 and State_2, or if a user violates the assumption about the dates (State_1 date < State_2 date), I'm busted.

  • Data Cruncher - Friday, December 7, 2018 4:42 AM

    Thanks Drew, actually your solution works well for the given dataset, but it assumes facts about data:
    If tomorrow my user decides to rename the states State_Z and State_A instead of State_1 and State_2, or if a user violates the assumption about the dates (State_1 date < State_2 date), I'm busted.

    Which is why you should provide all necessary 'facts' up front instead of drip-feeding them. If you don't mention them, how is anyone (apart from you) going to know?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • @Phil
    You're right Phil. In an attempt to remove unnecessary details from a much more complex query, I forgot to mention some important aspects of the problem.
    I'll try to be more precise with next posts.

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply