Finding longest period where dates are consecutive

  • Hi All,

    I have an interesting problem where I need to find the longest period of consecutive days from a group of dates. So, say we had the following dates:

    1 Jan 2009

    3 Jan 2009

    4 Jan 2009

    5 Jan 2009

    9 Jan 2009

    10 Jan 2009

    I'd like to end up with 3 result rows with the date ranges:

    1 Jan 2009 - 1 Jan 2009 (i.e. 1 day),

    3 Jan 2009 - 5 Jan 2009 (i.e. 3 days), and

    9 Jan 2009 - 10 Jan 2009 (2 days)

    I've already come up with a solution, but I'm not convinced it's the best way of doing it (see attached code).

    SET NOCOUNT ON

    CREATE TABLE test1 (id1 INT, DOS datetime)

    CREATE TABLE test2 (id1 INT, StartDate DATETIME, EndDate datetime, Days int)

    INSERT test1 VALUES (1, '1 Feb 2009')

    INSERT test1 VALUES (1, '3 Feb 2009')

    INSERT test1 VALUES (1, '4 Feb 2009')

    INSERT test1 VALUES (1, '5 Feb 2009')

    INSERT test1 VALUES (1, '9 Feb 2009')

    INSERT test1 VALUES (1, '10 Feb 2009')

    INSERT test1 VALUES (2, '7 Mar 2009')

    INSERT test1 VALUES (2, '3 Feb 2009')

    INSERT test1 VALUES (2, '4 Feb 2009')

    INSERT test1 VALUES (2, '8 Mar 2009')

    INSERT test1 VALUES (2, '9 Jun 2009')

    INSERT test1 VALUES (2, '11 Jun 2009')

    SET NOCOUNT OFF

    SELECT t1.id1, DATEDIFF(d, t1.dos, t2.dos) + 1 AS Days, t1.dos AS StartDate, t2.dos AS EndDate

    INTO #DayCalc

    FROM test1 AS t1 INNER JOIN test1 AS t2

    ON t1.id1 = t2.id1

    AND t1.DOS <= t2.DOS
    WHERE DATEDIFF(d, t1.dos, t2.dos) + 1 = (SELECT COUNT(*)
    FROM test1 AS a
    WHERE a.id1 = t1.id1
    AND a.dos BETWEEN t1.dos AND t2.dos)

    INSERT Test2
    SELECT id1, StartDate, MAX(EndDate) AS EndDate, MAX(Days) AS Days
    FROM #DayCalc
    GROUP BY id1, StartDate

    DROP TABLE #DayCalc

    DELETE FROM t1
    FROM test2 AS t1 INNER JOIN test2 AS t2
    ON t1.id1 = t2.id1
    AND t1.EndDate = t2.EndDate
    AND t1.StartDate > t2.StartDate

    SELECT *

    FROM test2

    ORDER BY id1, StartDate

    DROP TABLE test1

    DROP TABLE test2

    I know I haven't created any indexes on the above tables, but with such a small dataset it doesn't really matter. The real tables will have appropriate indexes. And the equivalent table to test1 in my example will also have around 20K rows, which is the reason I want to make sure it's a good way of doing what I need.

    So, my question is: can anyone see a better way of doing this?

    Thanks for any help you may provide,

    Frank

  • Hi ,

    see my blog here for an article on analysing contiguous ranges.



    Clear Sky SQL
    My Blog[/url]

  • Hi Dave,

    Yes, that's an elegant solution, however, I'm running this on SQL Server 2000, so...

    Frank

  • In that case you could try inserting to a temp table with an identity column rather than using row_number(). Havent tried it myself but no reason why it shouldnt be a fairly straight forward substitution.



    Clear Sky SQL
    My Blog[/url]

  • Try this

    SELECT a.id1,

    a.DOS AS StartDate,

    MIN(c.DOS) AS EndDate,

    DATEDIFF(d, a.DOS, MIN(c.DOS))+1 AS Days

    FROM test1 a

    INNER JOIN test1 c ON c.id1=a.id1 AND c.DOS>=a.DOS

    AND NOT EXISTS (SELECT * FROM test1 d WHERE d.id1=c.id1 AND d.DOS=c.DOS+1)

    WHERE NOT EXISTS (SELECT * FROM test1 b WHERE b.id1=a.id1 AND b.DOS=a.DOS-1)

    GROUP BY a.id1,a.DOS

    ORDER BY a.id1,a.DOS

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • I followed Dave blog and wrote this query.

    #1)

    create table Contig(Id integer)

    go

    insert into contig values(1)

    insert into contig values(2)

    insert into contig values(3)

    insert into contig values(5)

    insert into contig values(6)

    insert into contig values(7)

    insert into contig values(8)

    Select Id,identity(int,1,1) as seq

    into #contig

    from Contig

    order by Id desc

    Select Min(Id),Max(Id) from #contig

    group by Id+seq order by 1

    #2)

    Drop Table ContigDates

    go

    Create Table ContigDates

    (

    DateCol smalldatetime

    )

    go

    insert into ContigDates(DateCol) values('01jan2009')

    insert into ContigDates(DateCol) values('02jan2009')

    insert into ContigDates(DateCol) values('03jan2009')

    insert into ContigDates(DateCol) values('01feb2009')

    insert into ContigDates(DateCol) values('02feb2009')

    insert into ContigDates(DateCol) values('03feb2009')

    insert into ContigDates(DateCol) values('10feb2009')

    go

    select DateCol, identity(int,1,1) as Datecol_Group

    into #ContigDates

    from ContigDates

    order by 1 desc

    go

    Select Min(DateCol),Max(DateCol)

    from (select DateCol,DateCol+ Datecol_Group as grouping

    from #ContigDates)a

    group by grouping

    order by 1

    go

    karthik

  • Thanks for your help guys - I'll try out some of your suggestions and see which option works out to be the most efficient.

    Cheers,

    Frank

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply