Finding Multiple Start and End Dates

  • I have a request where I have to find the start and End statuses for a member.  The requirement is to find the End Date and then go back and find when they first started.  The member can have multiple End Dates since they can start and stop as they see fit.  There are also modifications that can happen.  If a modification happens before a start then that is considered the Start.   Here is some sample data:

     

     

    declare @t table (ProcessID int, ID int, Status nvarchar(30), ChangeDate datetime)

     

    insert into @t

    select 1, 12011510, 'Start', '2008-10-12 21:52:00.000' union all

    select 2, 12011510, 'Modify1', '2009-04-06 21:52:00.000' union all

    select 3, 12011510, 'Modify1', '2009-05-13 20:52:00.000' union all

    select 4, 12011510, 'Modify1', '2009-07-22 21:52:00.000'  union all

    select 5, 12011510, 'End', '2009-11-04 20:52:00.000' union all

    select 6, 13547452, 'Start', '2010-01-01 20:43:00.000' union all

    select 7, 13547452, 'End', '2011-01-01 20:43:00.000' union all

    select 8, 13547452, 'Start', '2011-02-01 20:43:00.000' union all

    select 9, 13547452, 'End', '2011-03-01 20:43:00.000' union all

    select 10, 13547452, 'Start', '2011-04-01 20:43:00.000' union all

    select 11, 13547452, 'Modify', '2011-05-01 20:43:00.000' union all

    select 12, 18453159, 'End' , '2009-05-21 12:15:00.000' union all

    select 13, 18453159, 'Start' , '2010-03-25 15:12:00.000' union all

    select 14, 18453159, 'End' , '2010-06-21 13:01:00.000' union all

    select 15, 18453159, 'Modify' , '2010-08-19 12:02:00.000' union all

    select 16, 18453159, 'Start' , '2010-08-29 11:21:00.000' union all

    select 17, 18453159, 'End' , '2011-03-15 21:00:00.000'

     

    select * from @t

     

     

    and my required output is:

     

     

    ProcessID            ID                    Status    ChangeDate

    1                      12011510     Start           2008-10-12 21:52:00.000

    5                      12011510     End             2009-11-04 20:52:00.000

    6                      13547452     Start           2010-01-01 20:43:00.000

    7                      13547452     End             2011-01-01 20:43:00.000

    8                      13547452     Start           2011-02-01 20:43:00.000

    9                      13547452     End             2011-03-01 20:43:00.000

    10                   13547452     Start           2011-04-01 20:43:00.000

    12                   18453159     End             2009-05-21 12:15:00.000

    13                   18453159     Start           2010-03-25 15:12:00.000

    14                   18453159     End             2010-06-21 13:01:00.000

    15                   18453159     Modify    2010-08-19 12:02:00.000

    17                   18453159     End             2011-03-15 21:00:00.000

     

     

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • this looks kinda ugly, but the best way I could figure was to find the end date ranges and go from there:

    -- try partitioning by ends

    WITH endings AS

    (SELECT ProcessID, ID, Status, ChangeDate,

    ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ChangeDate) AS sort

    FROM @t e

    WHERE e.Status = 'End'),

    ranges AS

    (SELECT e1.sort, e1.ProcessID, e1.ID, e1.Status, e1.ChangeDate, ISNULL(e2.ChangeDate, '1900-01-01') AS PriorEndDate

    FROM endings e1

    LEFT OUTER JOIN endings e2 ON e1.ID = e2.ID AND e1.sort = e2.sort + 1),

    starts AS

    (SELECT t.ProcessID, t.ID, t.Status, t.ChangeDate,

    ROW_NUMBER() OVER (PARTITION BY t.ID, ISNULL(r.sort,2000000000) ORDER BY t.ChangeDate) AS earliness

    FROM @t t

    LEFT OUTER JOIN ranges r ON t.ID = r.ID AND t.ChangeDate BETWEEN r.PriorEndDate AND r.ChangeDate

    WHERE t.Status <> 'End')

    SELECT ProcessID, ID, Status, ChangeDate

    FROM starts WHERE earliness = 1

    UNION ALL

    SELECT ProcessID, ID, Status, ChangeDate

    FROM ranges

    ORDER BY ID, ChangeDate

  • i was looking at this and seeing your code just made me realize how much i need to learn to think about problems better.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • +1 for Chris' solution. Since you're desiring them to be in rows instead of associated on the same row, some other techniques I would usually use aren't as obviously applicable, but they're similar in methodology. I looked it over and the most I could do would be minor tweaks, but it'd be the same algorithm in general. That's the best choice since it avoid triangle joins and the like on large tables.

    Nice work. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • After I de-simplified(?) it to apply to my needs, the query worked flawlessly. Chris, thanks for your help.

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply