How to write a query for this one?

  • Good day!

    I just need help in writing query.. I have records in a table below.. The condition would be no records should be displayed if the succeeding records' new_state was repeated from the previous records(new_state) and if is changed in the same date..

    here record_id 1 has gone through the ff states: 0->1->2->1->3->4->3 in the same day.. state 1 was changed to state 2 then back to state 1 again (id 2 & 3 would not be displayed).. same with state 3 (id 5 & 6 would not be displayed)..

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 | displayed

    2 | 1 | 2009-01-01 | 1 | 2 | not displayed

    3 | 1 | 2009-01-01 | 2 | 1 | not displayed

    4 | 1 | 2009-01-01 | 1 | 3 | displayed

    5 | 1 | 2009-01-01 | 3 | 4 | not displayed

    6 | 1 | 2009-01-01 | 4 | 3 | not displayed

    so the result would display only 2 records for record_id=1..

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    4 | 1 | 2009-01-01 | 1 | 3 |

    Any help would be greatly appreciated..

    Thanks

    -----------------------------------------------------

    Sorry for that, here's the code which contains the data: (updated: 12/2/2009)

    IF OBJECT_ID('TempDB..#table','U') IS NOT NULL

    DROP TABLE #table

    CREATE TABLE #table

    (

    id INT identity primary key,

    record_id INT,

    date_changed DATETIME,

    old_state INT,

    new_state INT

    )

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 1,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 1,'2009-01-01',1,2 UNION ALL --not displayed

    SELECT 1,'2009-01-01',2,1 UNION ALL --not displayed

    SELECT 1,'2009-01-01',1,3 UNION ALL --displayed

    SELECT 1,'2009-01-01',3,4 UNION ALL --not displayed

    SELECT 1,'2009-01-01',4,3 --not displayed

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 2,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 2,'2009-01-01',1,2 UNION ALL --not displayed

    SELECT 2,'2009-01-01',2,1 UNION ALL --not displayed

    SELECT 2,'2009-01-01',1,2 UNION ALL --displayed

    SELECT 2,'2009-01-01',2,4 UNION ALL --displayed

    SELECT 2,'2009-01-01',4,3 UNION ALL --not displayed

    SELECT 2,'2009-01-01',3,4 --not displayed

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 3,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 3,'2009-01-01',1,2 UNION ALL --not displayed

    SELECT 3,'2009-01-01',2,1 UNION ALL --not displayed

    SELECT 3,'2009-01-01',1,2 UNION ALL --not displayed

    SELECT 3,'2009-01-01',2,3 UNION ALL --not displayed

    SELECT 3,'2009-01-01',3,4 UNION ALL --not displayed

    SELECT 3,'2009-01-01',4,1 --not displayed

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 4,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 4,'2009-01-01',1,2 UNION ALL --displayed

    SELECT 4,'2009-01-01',2,3 UNION ALL --not displayed

    SELECT 4,'2009-01-01',3,2 UNION ALL --not displayed

    SELECT 4,'2009-01-01',2,3 UNION ALL --displayed

    SELECT 4,'2009-01-01',3,4 UNION ALL --not displayed

    SELECT 4,'2009-01-01',4,3 --not displayed

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 5,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 5,'2009-01-01',1,2 UNION ALL --not displayed

    SELECT 5,'2009-01-01',2,3 UNION ALL --not displayed

    SELECT 5,'2009-01-01',3,1 UNION ALL --not displayed

    SELECT 5,'2009-01-01',1,2 UNION ALL --displayed

    SELECT 5,'2009-01-01',2,3 --displayed

    SELECT * FROM #table

    Thanks =)

    -----------------------------------------------------

    Special case (updated:12/2/2009)

    id | record_id| date_changed | old_state | new_state |

    1 | 4 | 2009-01-01 | 0 | 1 | displayed

    2 | 4 | 2009-01-01 | 1 | 2 | displayed

    3 | 4 | 2009-01-01 | 2 | 3 | not displayed

    4 | 4 | 2009-01-01 | 3 | 2 | not displayed

    5 | 4 | 2009-01-01 | 2 | 3 | displayed

    6 | 4 | 2009-01-01 | 3 | 4 | not displayed

    7 | 4 | 2009-01-01 | 4 | 3 | not displayed

    where new_state 3 appears on id 3,5,7.. But id 3 will not be displayed since it is between id 2 and id 4 which have the same new_state 2.. So id 5 is the one that would be displayed to represent new_state 3.. I think the sequence is essential in building the result.. Is there a need for recursive CTE?

    Updated code is placed above..

    Thanks =)

    Is Shortest Path algorithm applicable for this one?

  • It needs more time to simluate the same data at our end to come up with a script,

    please post the table scripts and some sample records as per the following Best Practices article.

    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • Can the following happen? If so, what do you actually want displayed?

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    2 | 1 | 2009-01-01 | 1 | 2 | not displayed

    3 | 1 | 2009-01-01 | 2 | 1 | not displayed

    4 | 1 | 2009-01-01 | 1 | 2 |

    5 | 1 | 2009-01-01 | 2 | 4 | not displayed or displayed or????

    6 | 1 | 2009-01-01 | 4 | 3 | not displayed

    7 | 1 | 2009-01-01 | 3 | 4 |

    or... how about this? Can this happen, as well?

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    2 | 1 | 2009-01-01 | 1 | 2 | not displayed

    3 | 1 | 2009-01-01 | 2 | 1 | not displayed

    4 | 1 | 2009-01-01 | 1 | 2 | not displayed

    5 | 1 | 2009-01-01 | 2 | 3 | not displayed

    6 | 1 | 2009-01-01 | 3 | 4 | not displayed

    7 | 1 | 2009-01-01 | 4 | 1 | not displayed

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Also, I see you're brand new here. Take a look at the first link in my signature below. It helps you and us, a lot.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... wilst I was pondering the situation, I see Bru snuck one in on me. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I apologize for that.. I already posted the table creation script together with sample data.. Thanks!

  • Can the following happen? If so, what do you actually want displayed? Yes id5 would be displayed.. The states that would be included: 0->1->2->4.. Thanks for clarifying..

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    2 | 1 | 2009-01-01 | 1 | 2 | not displayed

    3 | 1 | 2009-01-01 | 2 | 1 | not displayed

    4 | 1 | 2009-01-01 | 1 | 2 |

    5 | 1 | 2009-01-01 | 2 | 4 | not displayed or displayed or????

    6 | 1 | 2009-01-01 | 4 | 3 | not displayed

    7 | 1 | 2009-01-01 | 3 | 4 |

    or... how about this? Can this happen, as well? Yes, this can happen also.. This is correct only first record will be displayed.. Those records where id>1 and id<=7 will not be displayed..

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    2 | 1 | 2009-01-01 | 1 | 2 | not displayed

    3 | 1 | 2009-01-01 | 2 | 1 | not displayed

    4 | 1 | 2009-01-01 | 1 | 2 | not displayed

    5 | 1 | 2009-01-01 | 2 | 3 | not displayed

    6 | 1 | 2009-01-01 | 3 | 4 | not displayed

    7 | 1 | 2009-01-01 | 4 | 1 | not displayed

  • Thanks for the update and the code to create the data... now all we need to do is come up with an answer for this. Good problem...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just a follow up in regarding this scenario (record_id=2 from the code above):

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    2 | 1 | 2009-01-01 | 1 | 2 | not displayed

    3 | 1 | 2009-01-01 | 2 | 1 | not displayed

    4 | 1 | 2009-01-01 | 1 | 2 |

    5 | 1 | 2009-01-01 | 2 | 4 | not displayed or displayed or????

    6 | 1 | 2009-01-01 | 4 | 3 | not displayed

    7 | 1 | 2009-01-01 | 3 | 4 |

    The last record would not be displayed since state 4 is already repeated(id=5).. For clarity, the output will display:

    id | record_id| date_changed | old_state | new_state |

    1 | 1 | 2009-01-01 | 0 | 1 |

    4 | 1 | 2009-01-01 | 1 | 2 |

    5 | 1 | 2009-01-01 | 2 | 4 |

    I appreciate your help..

  • I have used your #table

    select * into #table1 from

    (select record_id, date_changed, min(id) min_id, max(id) max_id, new_state

    from #table B

    group by record_id, date_changed, new_state) a

    select * from

    (

    select *,

    row_number() over(partition by record_id, date_changed, new_state order by id) as rowNumber

    from #table A

    where not exists

    (

    select 1 from #table1 b

    where A.record_id = B.record_id

    and A.date_changed = B.date_changed

    and (A.id > min_id and A.id < max_id)

    )

    ) a

    where rowNumber = 1

  • Sorry, Pren... I lost track of this one. Does khawaja.irfan's solution work for you?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks khawaja.irfan.. Your query works 100% for the data the I have provided.. Thanks for providing a great solution!:-)

    There is just a special case where the query would not produce expected result.. Example would be this data:

    id | record_id| date_changed | old_state | new_state |

    1 | 4 | 2009-01-01 | 0 | 1 | displayed

    2 | 4 | 2009-01-01 | 1 | 2 | displayed

    3 | 4 | 2009-01-01 | 2 | 3 | not displayed, --this is the current min(id) for new_state 3

    4 | 4 | 2009-01-01 | 3 | 2 | not displayed

    5 | 4 | 2009-01-01 | 2 | 3 | displayed --this should be the min(id) for new_state 3

    6 | 4 | 2009-01-01 | 3 | 4 | not displayed

    7 | 4 | 2009-01-01 | 4 | 3 | not displayed

    where new_state 3 appears on id 3,5,7.. But id 3 will not be displayed, so the min(id) for new_state 3 should be next one (id 5).. I think the sequence matters..

    running your query would result into this one:

    id | record_id| date_changed | old_state | new_state |rowNumber

    1 |4 |2009-01-01 |0|1 |1

    2 |4 |2009-01-01 |1|2 |1

    7 |4 |2009-01-01 |4|3 |1

    Here's the code snippet for table creation:

    IF OBJECT_ID('TempDB..#table','U') IS NOT NULL

    DROP TABLE #table

    CREATE TABLE #table

    (

    id INT identity primary key,

    record_id INT,

    date_changed DATETIME,

    old_state INT,

    new_state INT

    )

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 4,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 4,'2009-01-01',1,2 UNION ALL --displayed

    SELECT 4,'2009-01-01',2,3 UNION ALL --not displayed

    SELECT 4,'2009-01-01',3,2 UNION ALL --not displayed

    SELECT 4,'2009-01-01',2,3 UNION ALL --displayed

    SELECT 4,'2009-01-01',3,4 UNION ALL --not displayed

    SELECT 4,'2009-01-01',4,3 --not displayed

    Now I was thinking if using recursive CTE is appropriate..

    Thanks..

  • @jeff Moden.. The solution of khawaja.irfan is great! There's just special case that it would not display the result I need..

    God bless!

    Thanks..

  • Just with a very small change i was able to produce the result you wanted

    select * from

    (

    select *,

    row_number() over(partition by record_id, date_changed, new_state order by id) as rowNumber

    from #table A

    where not exists

    (

    select 1 from #table1 b

    where A.record_id = B.record_id

    and A.date_changed = B.date_changed

    and (A.id > min_id and A.id < max_id)

    and A.new_state != B.new_state -- This one is the change from the previous query. this one will prevent grouping on all transitional states

    )

    ) a

    where rowNumber = 1

  • khawaja.irfan.. Thanks for updating your query.. The idea of using Row_Number is great!

    However, there are some instances where it would not produce expected result.. In this case,

    id | record_id| date_changed | old_state | new_state |

    1 | 5 | 2009-01-01 | 0 | 1 | displayed

    2 | 5 | 2009-01-01 | 1 | 2 | not displayed

    3 | 5 | 2009-01-01 | 2 | 3 | not displayed

    4 | 5 | 2009-01-01 | 3 | 1 | not displayed

    5 | 5 | 2009-01-01 | 1 | 2 | displayed

    6 | 5 | 2009-01-01 | 2 | 3 | displayed

    Running the query would produce:

    id | record_id| date_changed | old_state | new_state | rowNumber

    1 | 5 | 2009-01-01 | 0 | 1 | 1

    6 | 5 | 2009-01-01 | 2 | 3 | 1

    because id 5 satisfied the where exists condition (id 5 > min_id (3) and < max_id (6) and new_state 2 != new_state 3).. so it was removed..

    Here's the code snippet:

    IF OBJECT_ID('TempDB..#table','U') IS NOT NULL

    DROP TABLE #table

    CREATE TABLE #table

    (

    id INT identity primary key,

    record_id INT,

    date_changed DATETIME,

    old_state INT,

    new_state INT

    )

    INSERT INTO #table(record_id,date_changed,old_state,new_state)

    SELECT 5,'2009-01-01',0,1 UNION ALL --displayed

    SELECT 5,'2009-01-01',1,2 UNION ALL --not displayed

    SELECT 5,'2009-01-01',2,3 UNION ALL --not displayed

    SELECT 5,'2009-01-01',3,1 UNION ALL --not displayed

    SELECT 5,'2009-01-01',1,2 UNION ALL --displayed

    SELECT 5,'2009-01-01',2,3 --displayed

    I really think there's a need to check the previous records to be displayed before progressing on the next one..

    Thanks..

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply