T-SQL to find gaps in Date field of table

  • I'm inexperienced in asking these questions in a forum.  I'm sure there's a more elegant way to display this code, etc.  I hope I can explain what I'm after here.

    I have a simple table with a record ID and a date column in it.  You can run this simple code to create the table if you'd like.

    IF OBJECT_ID('tempdb..#tmptbl') IS NOT NULL

    BEGIN

    DROP TABLE #tmptbl

    END

    create table #tmptbl (recid int, docdate date)

    insert into #tmptbl

    values (1, '11/16/19'),(1, '11/15/19'),(1, '11/14/19'),(1, '11/13/19'),(1, '10/29/19'),(1, '10/27/19'),(1, '10/26/19'),(2, '10/31/19'),(2, '10/30/19'),(2, '10/29/19'),(2, '10/1/19'),(3, '11/16/19'),(3, '11/15/19'),(3, '11/13/19'),(3, '8/9/19'),(3, '8/8/19'),(3, '8/7/19')

    --select * from #tmptbl order by 1, 2 desc

    Here is a picture.  The highlighted rows are the rows I want to return in a query.

    sample table pic

    Logic for the select statement to return the rows needed:

    For each recid, determine if there is a record on 11/16/19 (this can be a passed parameter but it will always be just one particular date).  If the recid does not have a record with 11/16/19 on it, return no rows for that recid.  If it does, I need to return the consecutive dated rows up to that date.  When there is a gap in the date for the recid, I can omit the rest of the rows for that recid.  I've tried to explain the logic in comments in the picture.

    Can you help give me some examples of how to accomplish this using T-SQL?  ...Return only the consecutive dated rows for each recid up to the provided date (i.e. 11/16/19 in my example).

    Thank you.

  • drop table if exists #tmptbl;
    go
    create table #tmptbl(
    recidint,
    docdatedate,
    constraint unq_tmptbl_recid_dt unique(recid, docdate));
    go

    insert into #tmptbl values
    (1, '11/16/19'),(1, '11/15/19'),(1, '11/14/19'),(1, '11/13/19'),(1, '10/29/19'),(1, '10/27/19'),
    (1, '10/26/19'),(2, '10/31/19'),(2, '10/30/19'),(2, '10/29/19'),(2, '10/1/19'),(3, '11/16/19'),
    (3, '11/15/19'),(3, '11/13/19'),(3, '8/9/19'),(3, '8/8/19'),(3, '8/7/19');
    go

    --select * from #tmptbl order by 1, 2 desc

    declare
    @docdatedate='2019-11-16';

    with
    dt_cte(recid, docdate) as (
    select * from #tmptbl where docdate=@docdate),
    range_cte as (
    select
    t.*,
    lead(t.docdate, 1) over (partition by recid order by docdate desc) nxt_dt,
    datediff(dd, lead(t.docdate, 1) over (partition by t.recid order by t.docdate desc), t.docdate) nxt_dt_diff
    from
    #tmptbl t
    where
    t.docdate<=@docdate),
    max_cte as (
    select
    recid,
    max(docdate) max_dt
    from
    range_cte rc
    where
    docdate<@docdate
    and nxt_dt_diff<>1
    group by
    recid)
    select
    *
    from
    range_cte rc
    join
    max_cte mc on rc.recid=mc.recid
    join
    dt_cte dc on rc.recid=dc.recid
    where
    rc.docdate>=mc.max_dt;
    go

    • This reply was modified 5 years ago by  Steve Collins. Reason: Got rid of unnecessary declared variable
    • This reply was modified 5 years ago by  Steve Collins.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Thank  you.  As I was testing with it, when I run it for 11/15/19, I get the following, where the highlighted row is selected.  There is a gap before that date so it shouldn't be returned.

    nov15

  • Ok issue is the initial nxt_dt_diff is not equal to 1.  Or the code doesn't handle that properly now.  I'll update it.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Here is another option:

       With groupedDates
    As (
    Select t.recid
    , t.docdate
    , GroupID = dateadd(day, -row_number() over(Partition By t.recid Order By t.docdate), t.docdate)
    From @tmptbl t
    )
    Select gd2.recid
    , gd2.docdate
    , gd2.GroupID
    From groupedDates gd
    Inner Join groupedDates gd2 On gd2.recid = gd.recid
    And gd2.GroupID = gd.GroupID
    Where gd.docdate = '2019-11-16';

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • declare
    @docdatedate='2019-08-09';

    with
    range_cte(recid, docdate, nxt_dt, nxt_dt_diff) as (
    select
    t.*,
    lead(t.docdate, 1) over (partition by recid order by docdate desc) nxt_dt,
    datediff(dd, lead(t.docdate, 1) over (partition by t.recid order by t.docdate desc), t.docdate) nxt_dt_diff
    from
    #tmptbl t
    where
    t.docdate<=@docdate),
    dt_cte(recid, docdate) as (
    select
    recid, docdate
    from
    range_cte
    where
    docdate=@docdate
    and nxt_dt_diff=1),
    nxt_max_cte(recid, max_dt) as (
    select
    recid,
    max(docdate) max_dt
    from
    range_cte rc
    where
    docdate<@docdate
    and nxt_dt_diff<>1
    group by
    recid
    union all
    select
    recid,
    max(docdate) max_dt
    from
    range_cte rc
    where
    docdate<@docdate
    and nxt_dt_diff=1
    group by
    recid),
    max_cte(recid, max_dt) as (
    select
    recid,
    max(max_dt) max_dt
    from
    nxt_max_cte
    group by
    recid),
    union_cte(recid, docdate, nxt_dt, nxt_dt_diff, max_dt) as(
    select
    rc.*, mc.max_dt
    from
    range_cte rc
    left join
    max_cte mc on rc.recid=mc.recid
    join
    dt_cte dc on rc.recid=dc.recid
    where
    rc.docdate>=mc.max_dt
    union all
    select
    rc.*, mc.max_dt
    from
    range_cte rc
    left join
    max_cte mc on rc.recid=mc.recid
    where
    rc.docdate=@docdate
    and rc.nxt_dt_diff<>1)
    select
    *
    from
    union_cte
    order by
    1, 2 desc;
    go

    • This reply was modified 5 years ago by  Steve Collins. Reason: added sorted by recid, docdate descending
    • This reply was modified 5 years ago by  Steve Collins. Reason: Wasn't giving the correct output in some cases

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • This is great!  Both work great.  Thank you both.

  • These are perfect.  Thanks folks!

  • We can actually simplify this even further:

    Declare @endDate date = '2019-11-16';

    With groupedDates
    As (
    Select t.recid
    , t.docdate
    , GroupID = dateadd(day, dense_rank() over(Partition By t.recid Order By t.docdate desc) - 1, t.docdate)
    From #tmptbl t
    Where t.docdate <= @endDate
    )
    Select *
    From groupedDates gd
    Where gd.GroupID = @endDate;

    In this version - we limit the results from the source table to those rows less than or equal to our end date.  Then - we define the GroupID so it matches our end date and filter by that grouping.

    In the first version - if you add another row in either group with the date 2019-11-17, that row will be included because there are no gaps and we are not filtering out future dates.  In the second version - future dates would be excluded...

    If you want to be able to select all dates with no gaps - the first version can do that...the second version must be filtered in the CTE with the filter on GroupID matching the same criteria.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    Here is another option:

       With groupedDates
    As (
    Select t.recid
    , t.docdate
    , GroupID = dateadd(day, -row_number() over(Partition By t.recid Order By t.docdate), t.docdate)
    From @tmptbl t
    )
    Select gd2.recid
    , gd2.docdate
    , gd2.GroupID
    From groupedDates gd
    Inner Join groupedDates gd2 On gd2.recid = gd.recid
    And gd2.GroupID = gd.GroupID
    Where gd.docdate = '2019-11-16';

    Azure SQL doesn't like the criteria (gd.docdate = '2019-11-16') specified in the where clause.  If added to the join conditions it works.  Here's a variation that might work for some people:

    with 
    grp_dt_cte as (
    select
    t.recid,
    t.docdate,
    group_id = dateadd(day, -row_number() over(partition by t.recid order by t.docdate), t.docdate)
    from
    #tmptbl t)
    select
    t.*
    from
    grp_dt_cte g
    join
    grp_dt_cte t on g.recid=t.recid
    and g.group_id=t.group_id
    and g.docdate=@docdate;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Here's a similar way that's maybe simpler.

    with x_cte as (
    select
    *,
    row_number() over (partition by recid order by docdate desc) row_num
    from
    #tmptbl
    where
    docdate<=@docdate)
    select * from x_cte
    where
    datediff(dd, docdate, @docdate)+1=row_num
    order by
    1,2 desc;

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply