Overlapping datetime ranges.

  • ben.brugman (12/23/2016)


    Did take a better look at the code. I am not familiar with the "BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING" part.

    But I do think I do understand the code. (But still have to try this code)

    Remarks :

    I do not like the BETWEEN in SQL because between in general it is scale dependend. I like to write the code in such a way that the code should work if the starttime / endtime are done in days, minutes, seconds, months or any other scale. The BETWEEN rarely does work scale independend. (But still have to figure that out for this situation).

    The BETWEEN operator does not work well with continuous values like datetime, time, decimal, etc., but it works fine with discrete values like date or integer. In the case of the OVER clause, the BETWEEN is operating on discrete values. This is easy to see with ROWS, but it's also true of RANGE, because you're working with a well-defined subset of the rows being returned.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Jason A. Long (12/22/2016)


    This should be a simpler & faster solution as it only takes a single pass over the data...

    -- Setup some test data...

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL

    DROP TABLE #TestData;

    CREATE TABLE #TestData (

    ID INT NOT NULL PRIMARY KEY,

    BegDate DATE NOT NULL,

    EndDate DATE NOT NULL

    );

    INSERT #TestData (ID, BegDate, EndDate) VALUES

    (1, '2016-01-01', '2016-01-10'),

    (2, '2016-01-04', '2016-01-07'),

    (3, '2016-01-09', '2016-01-17'),

    (8, '2016-01-11', '2016-01-13'), -- <<<------ This line was added.

    (4, '2016-02-01', '2016-02-20'),

    (5, '2016-02-05', '2016-02-25'),

    (6, '2016-01-15', '2016-02-10');

    -- Create a "POC" index to get rid of the sort that would otherwise be

    -- generated by the window function.

    CREATE NONCLUSTERED INDEX ix_TestData_POC ON #TestData (BegDate, EndDate, ID);

    --============================================================================

    -- The actual solution...

    WITH

    cte_AlteredValues AS (

    SELECT

    td.ID,

    td.BegDate,

    td.EndDate,

    MaxPrevEnd = MAX(td.EndDate) OVER (ORDER BY td.BegDate, td.EndDate, td.ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM

    #TestData td

    )

    SELECT

    av.ID,

    av.BegDate,

    av.EndDate,

    [.] = '||',

    NewBeg = CASE WHEN av.BegDate <= av.MaxPrevEnd THEN DATEADD(dd, 1, av.MaxPrevEnd) ELSE av.BegDate END,

    NewEnd = CASE WHEN av.EndDate <= av.MaxPrevEnd THEN DATEADD(dd, 1, av.MaxPrevEnd) ELSE av.EndDate END

    FROM

    cte_AlteredValues av;

    Results...

    IDBegDateEndDate.NewBegNewEnd

    12016-01-012016-01-10||2016-01-012016-01-10

    22016-01-042016-01-07||2016-01-112016-01-11

    32016-01-092016-01-17||2016-01-112016-01-17

    82016-01-112016-01-12||2016-01-182016-01-18 -- Extra line. (Oops)

    62016-01-152016-02-10||2016-01-182016-02-10

    42016-02-012016-02-20||2016-02-112016-02-20

    52016-02-052016-02-25||2016-02-212016-02-25

    First of all have a Happy Newyear, to all of you.

    At home I only have access to a 2008R2 version of SQL-server. With the supplied code the message : 'Incorrect syntax near 'ROWS'.' is given. So didn't have the oppertunity actually run the code. So this has caused the 'delay', sorry for that.

    The given solution is much faster than the solution that I have supplied, but needs a more actual SQLserver version.

    Comparing my solution and the given solution, both solutions contained 'errors'. The enddate is 'fixed' and should not be altered. In the above solution this is very easily fixed, by removing the NewEnd.

    An error in both solutions was, that the NewBeg can be from three different 'columns';

    1. If the MaxPrevEnd was less then BegDate, BegDate should be used.

    2. Else if the MaxPrevEnd is less then the EndDate, the MaxPrevEnd should be used.

    3. Else the EndDate should be used.

    I'll adapt this solution for the given example. (I am working on that).

    Thanks for this solution:

    1. It did help to find an error in my solution.

    2. It is much faster. (But a newer version is needed).

    3. It does help me to understand the 'ROWS' clause in SQLserver.

    I'll post both solutions later on.

    Many thanks for your time and attention,

    Ben

  • --============================================================================

    -- The actual solution... (Based on the solution given by Jason A. Long, thanx).

    -- Adapted to the example.

    -- Corrected for : 1 Enddate does not change.

    -- 2 NewStartdate should not go beyond Enddate.

    -- 3 SubscriberID is now reflected in the code.

    --

    SET NOCOUNT ON;

    IF OBJECT_ID(N'ResultTable') IS NOT NULL DROP TABLE ResultTable

    ;

    WITH

    cte_AlteredValues AS (

    SELECT

    *

    , [.] = '||'

    , MaxPrevEnd = MAX(EndDate) OVER (PARTITION BY SubscriberID ORDER BY StartDate, nr, EndDate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM TestData

    )

    , B as (

    SELECT

    *

    , NewStartDate =

    CASE WHEN MaxPrevEnd is NULL or MaxPrevEnd < StartDate THEN StartDate

    WHEN MaxPrevEnd < EndDate THEN MaxPrevEnd

    ELSE EndDate

    END

    FROM cte_AlteredValues

    )

    SELECT * into ResultTable FROM B

    As Promissed the 'corrected' code, based on the previous solution.

    Fieldnames have been adapted to fit the example.

    Enddate does not alter.

    NewStartDate does not overlap with the previous highest enddate.

    As already explained in previous messages, an overlap is defined as a period of time that two events overlap with each other, this period can be very small. But when the start of an event is the same as the end of an event, it does not overlap. The event is defined as till the endtime. The endtime, does not matter if it is a minute, second, day or 'whatever' is not included.

    I hope the above example works because again I do not have access to a 'recent' version of SQL-server.

    Ben

    Improved code for 'older' systems:

    --============================================================================

    -- The solution

    -- Corrected for : 1 NewStartdate should not go beyond Enddate.

    SET NOCOUNT ON;

    IF OBJECT_ID(N'ResultTable') IS NOT NULL DROP TABLE ResultTable

    ;

    WITH

    --

    -- Use the prepared A for efficeiency. (Or use the commented out version).

    --

    --A AS (SELECT row_number() OVER (PARTITION BY SubscriberID ORDER BY StartDate) Nr, * FROM TestTable)

    --,

    CTE as (SELECT CONVERT(datetime, '1900-01-01') PreMaxDate, * FROM A where Nr = 1

    union all

    SELECT

    CASE WHEN s.PreMaxDate >= s.EndDate THEN s.PreMaxDate

    ELSE s.EndDate

    END PreMaxDate

    , E.* FROM CTE S join a E on S.Nr+1 = E.nr and S.SubscriberID = E.SubscriberID)

    , D as (SELECT

    CASE WHEN Nr= 1 THEN StartDate

    WHEN PreMaxDate >= StartDate THEN PreMaxDate

    ELSE StartDate

    END NewStartDate

    , * FROM CTE )

    SELECT * into ResultTable FROM D ORDER BY SubscriberID, StartDate OPTION (MAXRECURSION 0)

    -- Timing with a prepared A is :

    --

    -- Check Result, there should be no gapsizes less than 0.

    --

    ;

    With

    A AS (select * from resulttable where newstartdate <> enddate)

    , B as (select P.EndDate pENDDATE, R.* FROM A p RIGHT JOIN A R ON P.NR+1 = R.NR AND P.Subscriberid = R.Subscriberid)

    , C as (SELECT CONVERT(int, newstartdate-penddate) gapsize,* FROM B)

    SELECT * FROM c WHERE gapsize < 0

  • ben.brugman (1/5/2017)


    First of all have a Happy Newyear, to all of you.

    At home I only have access to a 2008R2 version of SQL-server. With the supplied code the message : 'Incorrect syntax near 'ROWS'.' is given. So didn't have the oppertunity actually run the code. So this has caused the 'delay', sorry for that.

    The given solution is much faster than the solution that I have supplied, but needs a more actual SQLserver version.

    Comparing my solution and the given solution, both solutions contained 'errors'. The enddate is 'fixed' and should not be altered. In the above solution this is very easily fixed, by removing the NewEnd.

    An error in both solutions was, that the NewBeg can be from three different 'columns';

    1. If the MaxPrevEnd was less then BegDate, BegDate should be used.

    2. Else if the MaxPrevEnd is less then the EndDate, the MaxPrevEnd should be used.

    3. Else the EndDate should be used.

    I'll adapt this solution for the given example. (I am working on that).

    Thanks for this solution:

    1. It did help to find an error in my solution.

    2. It is much faster. (But a newer version is needed).

    3. It does help me to understand the 'ROWS' clause in SQLserver.

    I'll post both solutions later on.

    Many thanks for your time and attention,

    Ben

    If I'm understanding correctly, the objection is that the "obliterated" rows are displaying the "MaxPrevEnd" values in the NewBeg & NewEnd columns, rather than the original EndDate.

    If that's correct, that's a very easy fix... See below...

    WITH

    cte_AlteredValues AS (

    SELECT

    td.ID,

    td.BegDate,

    td.EndDate,

    MaxPrevEnd = MAX(td.EndDate) OVER (ORDER BY td.BegDate, td.EndDate, td.ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM

    #TestData td

    )

    SELECT

    av.ID,

    av.BegDate,

    av.EndDate,

    [.] = '||',

    NewBeg = CASE WHEN nv.NewBeg = nv.NewEnd THEN av.EndDate ELSE nv.NewBeg END,

    NewEnd = CASE WHEN nv.NewBeg = nv.NewEnd THEN av.EndDate ELSE nv.NewEnd END

    FROM

    cte_AlteredValues av

    CROSS APPLY ( VALUES (

    CASE WHEN av.BegDate <= av.MaxPrevEnd THEN DATEADD(dd, 1, av.MaxPrevEnd) ELSE av.BegDate END,

    CASE WHEN av.EndDate <= av.MaxPrevEnd THEN DATEADD(dd, 1, av.MaxPrevEnd) ELSE av.EndDate END

    )

    ) nv (NewBeg, NewEnd)

    Results...

    ID BegDate EndDate . NewBeg NewEnd

    ----------- ---------- ---------- ---- ---------- ----------

    1 2016-01-01 2016-01-10 || 2016-01-01 2016-01-10

    2 2016-01-04 2016-01-07 || 2016-01-07 2016-01-07

    3 2016-01-09 2016-01-17 || 2016-01-11 2016-01-17

    8 2016-01-11 2016-01-13 || 2016-01-13 2016-01-13

    6 2016-01-15 2016-02-10 || 2016-01-18 2016-02-10

    4 2016-02-01 2016-02-20 || 2016-02-11 2016-02-20

    5 2016-02-05 2016-02-25 || 2016-02-21 2016-02-25

    If you don't care for the CROSS APPLY VALUES method, you can also use an additional CTE...

    WITH

    cte_AlteredValues AS (

    SELECT

    td.ID,

    td.BegDate,

    td.EndDate,

    MaxPrevEnd = MAX(td.EndDate) OVER (ORDER BY td.BegDate, td.EndDate, td.ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM

    #TestData td

    ),

    cte_NewBegEnd AS (

    SELECT

    av.ID,

    av.BegDate,

    av.EndDate,

    [.] = '||',

    NewBeg = CASE WHEN av.BegDate <= av.MaxPrevEnd THEN DATEADD(dd, 1, av.MaxPrevEnd) ELSE av.BegDate END,

    NewEnd = CASE WHEN av.EndDate <= av.MaxPrevEnd THEN DATEADD(dd, 1, av.MaxPrevEnd) ELSE av.EndDate END

    FROM

    cte_AlteredValues av

    )

    SELECT

    nbe.ID,

    nbe.BegDate,

    nbe.EndDate,

    nbe.[.],

    NewBeg = CASE WHEN nbe.NewBeg = nbe.NewEnd THEN nbe.EndDate ELSE nbe.NewBeg END,

    NewEnd = CASE WHEN nbe.NewBeg = nbe.NewEnd THEN nbe.EndDate ELSE nbe.NewEnd END

    FROM

    cte_NewBegEnd nbe;

    They both produce identical results and identical execution plans. It's just a matter of personal syntax preference.

    (I personally prefer the abbreviated syntax of the 1st one.)

  • Jason A. Long (1/5/2017)

    If I'm understanding correctly, the objection is that the "obliterated" rows are displaying the "MaxPrevEnd" values in the NewBeg & NewEnd columns, rather than the original EndDate.

    Sorry, that I wasn't clear.

    EndDate should remain as it is. So there is no NewEndDate.

    NewBeginDate so that there is no period overlap with previous rows. (If the duration is 0 there is no overlap).

    NewBeginDate can become; BeginDate, MaxPrevEnd, Enddate.

    For the actual code change see below.

    -----------------------------------------------------------------------------------------------------------------

    -- Setup some test data...

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL DROP TABLE #TestData;

    CREATE TABLE #TestData (

    ID INT NOT NULL PRIMARY KEY,

    BegDate DATEtime NOT NULL, -- A scale 'independend' type. (Month/day/hour/minute/second all works)

    EndDate DATEtime NOT NULL

    );

    --

    -- Example of testdate with periods of less than a day.

    --

    INSERT #TestData (ID, BegDate, EndDate) VALUES

    (1, '2016-01-01', '2016-01-10'),

    (2, '2016-01-04', '2016-01-07'),

    (3, '2016-01-09', '2016-01-17'),

    (4, '2016-01-11', '2016-01-13'), -- <<<------ This line was added.

    (5, '2016-02-01', '2016-02-20'),

    (6, '2016-02-05', '2016-02-25'),

    (7, '2016-01-15', '2016-02-10'),

    (8, '2017-01-06 14:00:00', '2017-01-06 15:30:00'), -- Code should also work on hours/minutes/seconds.

    (9, '2017-01-06 14:30:00', '2017-01-06 15:00:00'),

    (10, '2017-01-06 15:00:00', '2017-01-06 17:00:00')

    -----------------------------------------------------------------------------------------------------------------

    -- The preferred method of Ben

    -----------------------------------------------------------------------------------------------------------------

    IF OBJECT_ID('B1', 'U') IS NOT NULL DROP TABLE B1;

    ;WITH

    ---- From SQLserver 2012 and above.

    A AS (SELECT *

    , MaxPrevEnd = MAX(EndDate) OVER (ORDER BY BegDate, EndDate, ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM #TestData)

    , B AS (SELECT *,

    CASE WHEN MaxPrevEnd is NULL or MaxPrevEnd < BegDate THEN BegDate

    WHEN MaxPrevEnd < EndDate THEN MaxPrevEnd

    ELSE EndDate

    END NewBeg

    FROM A)

    SELECT * FROM B;

    -----------------------------------------------------------------------------------------------------------------

    Thanks for the solution with ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING.

    This is the actual solution to the problem.

    Then the NewBeg date is a choice between BegDate,MaxPrevEnd and EndDate.

    No Extra Selection or CROSS APPLY is needed.

    Solutions for SQLServer below version 2012, see below there a recursive method is used. (Number of recursions should be 'large').

    I like solutions which work with periods or with numbers to be scale independ. So that the end of a period or the end of a range is not included.

    See the above example where the last 3 rows contain a time part as well.

    Also for example distances, I like to say from 5 to 10 kilometers, or from 5000 to 10000 kilometers. (And not From 5 to 9 km).

    This is reflected in the code.

    See below the signature for a complete set of code, your (little bit changed) preverence, and my prefered solutions.

    All result in the same execution plan.

    My style is a bit different from your style, I try to be a bit more minimalistic with intermediate names and code. (Personal preverence).

    For my actual example as given a lot of messages back there was also a SubscriberID, in my actual script this is offcourse reflected.

    Thanks for your examples and help,

    Ben

    Long Long example containing the 'same' code more than once.

    With extra code to check if the results are identical.

    -----------------------------------------------------------------------------------------------------------------

    -- Setup some test data...

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL DROP TABLE #TestData;

    CREATE TABLE #TestData (

    ID INT NOT NULL PRIMARY KEY,

    BegDate DATEtime NOT NULL, -- A scale 'independend' type. (Month/day/hour/minute/second all works)

    EndDate DATEtime NOT NULL

    );

    --

    -- Example of testdate with periods of less than a day.

    --

    INSERT #TestData (ID, BegDate, EndDate) VALUES

    (1, '2016-01-01', '2016-01-10'),

    (2, '2016-01-04', '2016-01-07'),

    (3, '2016-01-09', '2016-01-17'),

    (4, '2016-01-11', '2016-01-13'), -- <<<------ This line was added.

    (5, '2016-02-01', '2016-02-20'),

    (6, '2016-02-05', '2016-02-25'),

    (7, '2016-01-15', '2016-02-10'),

    (8, '2017-01-06 14:00:00', '2017-01-06 15:30:00'), -- Code should also work on hours/minutes/seconds.

    (9, '2017-01-06 14:30:00', '2017-01-06 15:00:00'),

    (10, '2017-01-06 15:00:00', '2017-01-06 17:00:00')

    -----------------------------------------------------------------------------------------------------------------

    GO

    -----------------------------------------------------------------------------------------------------------------

    -- The preferred method of Jason A. Long

    IF OBJECT_ID('L1', 'U') IS NOT NULL DROP TABLE L1;

    ;

    WITH

    -- Version of SQLServer from 2012 and above should use the first cte_AlteredValues.

    cte_AlteredValues AS (

    SELECT

    td.ID,

    td.BegDate,

    td.EndDate,

    MaxPrevEnd = MAX(td.EndDate) OVER (ORDER BY td.BegDate, td.EndDate, td.ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM

    #TestData td

    )

    -- SQLServer version pre 2012

    -- cte_AlteredValues as (SELECT CONVERT(datetime, '1900-01-01') MaxPrevEnd, * FROM #TestData where id = 1

    -- union all

    -- SELECT

    -- CASE WHEN s.MaxPrevEnd >= s.EndDate THEN s.MaxPrevEnd

    -- ELSE s.EndDate

    -- END PreMaxDate

    -- , E.* FROM cte_AlteredValues S join #TestData E on S.id+1 = E.id )

    SELECT

    av.ID,

    av.BegDate,

    av.EndDate,

    [.] = '||',

    NewBeg = CASE WHEN nv.NewBeg = nv.NewEnd THEN av.EndDate ELSE nv.NewBeg END

    INTO L1

    FROM

    cte_AlteredValues av

    CROSS APPLY ( VALUES (

    CASE WHEN av.BegDate <= av.MaxPrevEnd THEN av.MaxPrevEnd ELSE av.BegDate END,

    CASE WHEN av.EndDate <= av.MaxPrevEnd THEN av.MaxPrevEnd ELSE av.EndDate END

    )

    ) nv (NewBeg, NewEnd)

    -----------------------------------------------------------------------------------------------------------------

    GO

    -----------------------------------------------------------------------------------------------------------------

    -- The alternate method of Jason A. Long

    IF OBJECT_ID('L2', 'U') IS NOT NULL DROP TABLE L2

    ;

    WITH

    -- Version of SQLServer from 2012 and above should use the first cte_AlteredValues.

    cte_AlteredValues AS (

    SELECT

    td.ID,

    td.BegDate,

    td.EndDate,

    MaxPrevEnd = MAX(td.EndDate) OVER (ORDER BY td.BegDate, td.EndDate, td.ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM

    #TestData td

    ),

    --cte_AlteredValues as (SELECT CONVERT(datetime, '1900-01-01') MaxPrevEnd, * FROM #TestData where id = 1

    -- union all

    -- SELECT

    -- CASE WHEN s.MaxPrevEnd >= s.EndDate THEN s.MaxPrevEnd

    -- ELSE s.EndDate

    -- END PreMaxDate

    -- , E.* FROM cte_AlteredValues S join #TestData E on S.id+1 = E.id ),

    cte_NewBegEnd AS (

    SELECT

    av.ID,

    av.BegDate,

    av.EndDate,

    [.] = '||',

    NewBeg = CASE WHEN av.BegDate <= av.MaxPrevEnd THEN av.MaxPrevEnd ELSE av.BegDate END,

    NewEnd = CASE WHEN av.EndDate <= av.MaxPrevEnd THEN av.MaxPrevEnd ELSE av.EndDate END

    FROM

    cte_AlteredValues av

    )

    SELECT

    nbe.ID,

    nbe.BegDate,

    nbe.EndDate,

    nbe.[.],

    NewBeg = CASE WHEN nbe.NewBeg = nbe.NewEnd THEN nbe.EndDate ELSE nbe.NewBeg END

    into L2

    FROM

    cte_NewBegEnd nbe;

    -----------------------------------------------------------------------------------------------------------------

    Go

    -----------------------------------------------------------------------------------------------------------------

    -- The preferred method of Ben

    IF OBJECT_ID('B1', 'U') IS NOT NULL DROP TABLE B1;

    ;

    WITH

    ---- From SQLserver 2012 and above.

    A AS (SELECT *,

    MaxPrevEnd = MAX(EndDate) OVER (ORDER BY BegDate, EndDate, ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM #TestData)

    , B AS (SELECT *,

    CASE WHEN MaxPrevEnd is NULL or MaxPrevEnd < BegDate THEN BegDate

    WHEN MaxPrevEnd < EndDate THEN MaxPrevEnd

    ELSE EndDate

    END NewBeg

    FROM A)

    SELECT * into B1 FROM B;

    GO

    -----------------------------------------------------------------------------------------------------------------

    -- Method Ben, 'equivalent' syntax to Jason A. Long

    --

    IF OBJECT_ID('B2', 'U') IS NOT NULL DROP TABLE B2;

    ;WITH

    -- Version of SQLServer from 2012 and above should use the first cte_AlteredValues.

    cte_AlteredValues AS (

    SELECT

    td.ID,

    td.BegDate,

    td.EndDate,

    MaxPrevEnd = MAX(td.EndDate) OVER (ORDER BY td.BegDate, td.EndDate, td.ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM

    #TestData td

    ),

    -- For versions before SQLServer 2012.

    -- cte_AlteredValues as (SELECT CONVERT(datetime, '1900-01-01') MaxPrevEnd, * FROM #TestData where id = 1

    -- union all

    -- SELECT

    -- CASE WHEN s.MaxPrevEnd >= s.EndDate THEN s.MaxPrevEnd

    -- ELSE s.EndDate

    -- END PreMaxDate

    -- , E.* FROM cte_AlteredValues S join #TestData E on S.id+1 = E.id ),

    cte_NewBegEnd AS (

    SELECT

    av.ID,

    av.BegDate,

    av.EndDate,

    [.] = '||',

    NewBeg =

    CASE WHEN av.MaxPrevEnd is NULL or av.MaxPrevEnd < av.BegDate THEN av.BegDate

    WHEN av.MaxPrevEnd < EndDate THEN av.MaxPrevEnd

    ELSE av.EndDate

    END

    FROM

    cte_AlteredValues av

    )

    SELECT * into B2 FROM cte_NewBegEnd;

    GO

    -----------------------------------------------------------------------------------------------------------------

    -- Check the results.

    -- The number of rows is the same, so the except should not give any rows.

    --

    select 'Next selection compares between L1 and L2 and should result in zero rows'

    select * from L1

    except

    select * from L2

    select 'Next selection compares between L1 and B1 and should result in zero rows'

    select * from L1

    except

    select * from B2

    select 'Next selection compares between B1 and B2 and should result in zero rows'

    SELECT ID,BegDate,EndDate,NewBeg FROM b1

    EXCEPT

    SELECT ID,BegDate,EndDate,NewBeg FROM b2

    -----------------------------------------------------------------------------------------------------------------

    --

    -- Test code

    --

    IF 1 = 2 BEGIN

    SELECT * FROM b1 order by id

    SELECT * FROM b2 order by id

    SELECT * FROM (

    (select 'new' version, ID,BegDate,EndDate,NewBeg from B1

    except

    select 'new' version,ID,BegDate,EndDate,NewBeg from B2)

    UNION

    (select 'old' version,ID,BegDate,EndDate,NewBeg from B2

    except

    select 'old' version,ID,BegDate,EndDate,NewBeg from B1)

    ) XXX ORDER BY ID,VERSION

    END

    --

    -- Clean up

    --

    IF 1 = 2 BEGIN

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL DROP TABLE #TestData;

    IF OBJECT_ID('L1', 'U') IS NOT NULL DROP TABLE L1;

    IF OBJECT_ID('L2', 'U') IS NOT NULL DROP TABLE L2;

    IF OBJECT_ID('B1', 'U') IS NOT NULL DROP TABLE B1;

    IF OBJECT_ID('B2', 'U') IS NOT NULL DROP TABLE B2;

    END

  • ben.brugman (1/6/2017)


    Jason A. Long (1/5/2017)

    If I'm understanding correctly, the objection is that the "obliterated" rows are displaying the "MaxPrevEnd" values in the NewBeg & NewEnd columns, rather than the original EndDate.

    Sorry, that I wasn't clear.

    EndDate should remain as it is. So there is no NewEndDate.

    NewBeginDate so that there is no period overlap with previous rows. (If the duration is 0 there is no overlap).

    NewBeginDate can become; BeginDate, MaxPrevEnd, Enddate.

    For the actual code change see below.

    -----------------------------------------------------------------------------------------------------------------

    -- Setup some test data...

    IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL DROP TABLE #TestData;

    CREATE TABLE #TestData (

    ID INT NOT NULL PRIMARY KEY,

    BegDate DATEtime NOT NULL, -- A scale 'independend' type. (Month/day/hour/minute/second all works)

    EndDate DATEtime NOT NULL

    );

    --

    -- Example of testdate with periods of less than a day.

    --

    INSERT #TestData (ID, BegDate, EndDate) VALUES

    (1, '2016-01-01', '2016-01-10'),

    (2, '2016-01-04', '2016-01-07'),

    (3, '2016-01-09', '2016-01-17'),

    (4, '2016-01-11', '2016-01-13'), -- <<<------ This line was added.

    (5, '2016-02-01', '2016-02-20'),

    (6, '2016-02-05', '2016-02-25'),

    (7, '2016-01-15', '2016-02-10'),

    (8, '2017-01-06 14:00:00', '2017-01-06 15:30:00'), -- Code should also work on hours/minutes/seconds.

    (9, '2017-01-06 14:30:00', '2017-01-06 15:00:00'),

    (10, '2017-01-06 15:00:00', '2017-01-06 17:00:00')

    -----------------------------------------------------------------------------------------------------------------

    -- The preferred method of Ben

    -----------------------------------------------------------------------------------------------------------------

    IF OBJECT_ID('B1', 'U') IS NOT NULL DROP TABLE B1;

    ;WITH

    ---- From SQLserver 2012 and above.

    A AS (SELECT *

    , MaxPrevEnd = MAX(EndDate) OVER (ORDER BY BegDate, EndDate, ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)

    FROM #TestData)

    , B AS (SELECT *,

    CASE WHEN MaxPrevEnd is NULL or MaxPrevEnd < BegDate THEN BegDate

    WHEN MaxPrevEnd < EndDate THEN MaxPrevEnd

    ELSE EndDate

    END NewBeg

    FROM A)

    SELECT * FROM B;

    -----------------------------------------------------------------------------------------------------------------

    Thanks for the solution with ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING.

    This is the actual solution to the problem.

    Then the NewBeg date is a choice between BegDate,MaxPrevEnd and EndDate.

    No Extra Selection or CROSS APPLY is needed.

    Solutions for SQLServer below version 2012, see below there a recursive method is used. (Number of recursions should be 'large').

    There is no need for any recursion.

    This works on any version of SQL Server:

    SELECT P2.ID, P2.BegDate, P2.EndDate,

    ISNULL(MAX(P1.EndDate), P2.BegDate) newBegDate,

    MAX(CASE WHEN P2.EndDate < P1.EndDate THEN P1.EndDate ELSE P2.EndDate END) newEndDate

    FROM #TestData p1

    RIGHT JOIN #TestData p2 ON P2.BegDate > P1.BegDate AND p2.BegDate < p1.EndDate

    GROUP BY P2.EndDate, P2.ID, P2.BegDate, P2.EndDate

    ORDER BY ID

    This solution was posted on the previous page.

    Apparently, you did not bother.

    This solution shifts both BegDate and EndDate of a collapsed period to the EndDate of the previous period.

    I find it more logical than having a zero-duration period somewhere in the middle.

    _____________
    Code for TallyGenerator

  • Sergiy - Tuesday, January 10, 2017 8:11 PM

    ben.brugman (1/6/2017)


    Jason A. Long (1/5/2017)If I'm understanding correctly, the objection is that the "obliterated" rows are displaying the "MaxPrevEnd" values in the NewBeg & NewEnd columns, rather than the original EndDate.

    Sorry, that I wasn't clear.EndDate should remain as it is. So there is no NewEndDate.NewBeginDate so that there is no period overlap with previous rows. (If the duration is 0 there is no overlap).NewBeginDate can become; BeginDate, MaxPrevEnd, Enddate.For the actual code change see below.------------------------------------------------------------------------------------------------------------------- Setup some test data...IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL DROP TABLE #TestData;CREATE TABLE #TestData (ID INT NOT NULL PRIMARY KEY,BegDate DATEtime NOT NULL, -- A scale 'independend' type. (Month/day/hour/minute/second all works)EndDate DATEtime NOT NULL );---- Example of testdate with periods of less than a day.-- INSERT #TestData (ID, BegDate, EndDate) VALUES (1, '2016-01-01', '2016-01-10'),(2, '2016-01-04', '2016-01-07'),(3, '2016-01-09', '2016-01-17'),(4, '2016-01-11', '2016-01-13'), -- <<<------ This line was added.(5, '2016-02-01', '2016-02-20'),(6, '2016-02-05', '2016-02-25'),(7, '2016-01-15', '2016-02-10'),(8, '2017-01-06 14:00:00', '2017-01-06 15:30:00'), -- Code should also work on hours/minutes/seconds.(9, '2017-01-06 14:30:00', '2017-01-06 15:00:00'),(10, '2017-01-06 15:00:00', '2017-01-06 17:00:00')------------------------------------------------------------------------------------------------------------------- The preferred method of Ben-----------------------------------------------------------------------------------------------------------------IF OBJECT_ID('B1', 'U') IS NOT NULL DROP TABLE B1;;WITH ---- From SQLserver 2012 and above. A AS (SELECT * , MaxPrevEnd = MAX(EndDate) OVER (ORDER BY BegDate, EndDate, ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) FROM #TestData) , B AS (SELECT *, CASE WHEN MaxPrevEnd is NULL or MaxPrevEnd < BegDate THEN BegDate WHEN MaxPrevEnd < EndDate THEN MaxPrevEnd ELSE EndDate END NewBeg FROM A)SELECT * FROM B;-----------------------------------------------------------------------------------------------------------------Thanks for the solution with ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING.This is the actual solution to the problem.Then the NewBeg date is a choice between BegDate,MaxPrevEnd and EndDate.No Extra Selection or CROSS APPLY is needed.Solutions for SQLServer below version 2012, see below there a recursive method is used. (Number of recursions should be 'large').

    There is no need for any recursion.This works on any version of SQL Server:SELECT P2.ID, P2.BegDate, P2.EndDate, ISNULL(MAX(P1.EndDate), P2.BegDate) newBegDate, MAX(CASE WHEN P2.EndDate < P1.EndDate THEN P1.EndDate ELSE P2.EndDate END) newEndDateFROM #TestData p1RIGHT JOIN #TestData p2 ON P2.BegDate > P1.BegDate AND p2.BegDate < p1.EndDateGROUP BY P2.EndDate, P2.ID, P2.BegDate, P2.EndDateORDER BY ID This solution was posted on the previous page.Apparently, you did not bother.This solution shifts both BegDate and EndDate of a collapsed period to the EndDate of the previous period.I find it more logical than having a zero-duration period somewhere in the middle.

    First of all thanks for your time and attention.

    But as written before, the GROUP BY clause causes information to be lost.(Rows are aggregated so that the result is less rows AND not all columns appear in the result).

    Other answers in this thread gave a complete solution (some only for newer versions of SQL-server). So I did not persue the given answer.

    When your anwser is used in the case of multiple rows containing the exactly the same startdate,enddate,subscriberid, there still is the problem of which rows should be changed. In most cases at least one of the rows in such a case is an exception to the change.

    The example code I have supplied (largely taken from another thread), does supply a large number of rows, also containing (almost) identical rows. It does represent the problems occuring with the actual data we are using, but is not the same. So I have to switch between that example another example and the real problem. So I did not persue all the given solutions. I did use the answers which I considered most promising.

    As for the recursion, no I am not totaly happy with that, but it does work and is fairly logical. And after the recursion, this code results in the same code as for the elegant 2012 solution.

    Given that there are now several complete solutions, This thread has answered my question.

    And shifting only the BegDate to a NewBegDate and not the EndDate, was the requirement for our situation. For our situation we preserve all fields and add one extra field the NewBegDate. For our situation this is the minimal solution and more logical.

    'Apparently, you did not bother'
    No I bothered for a great deal, but persued the other solutions.

    Thanks adding to this thread, I did read all your anwers and did run your code.
    Ben

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply