Convert Date List to Date Range

  • Hey Zidar, here is the function and check constraint to avoid overlap. Can't guarantee it's flawless, I just knocked it out during a break.

    I presume you are going to need a similar function to enforce the SUSUSU... pattern.

    Also, you keep referring to one of Joe Celko's articles, why don't you post a link to it here?

    Anyone who has followed this thread so far might be interested in reading it.

    Best regards.

    Create FUNCTION dbo.PreviousUnsubscribedDate

    (

    @subScriberID INT

    ,@subScriptionID int

    )

    RETURNS DATE

    AS

    BEGIN

    DECLARE @UnsubscribedDate DATE;

    WITH cte AS (-- get UnsubscribedDate from most recent prior row for the subscriber

    SELECT TOP (1) unSubscribedDate

    FROM dbo.Subscriptions

    WHERE SubScriberID = @subScriberID

    AND SubScriptionID < @subScriptionID

    ORDER BY SubscriberID,SubScriptionID DESC ) -- want an index to support this

    SELECT @UnsubscribedDate = ISNULL(UnsubscribedDate,'01/01/9999')

    FROM cte

    RETURN @UnsubscribedDate

    END

    ALTER TABLE dbo.Subscriptions

    ADD CONSTRAINT CH_Prior_Sub CHECK (SubscribedDate > dbo.PreviousUnsubscribedDate(SubscriberID, SubscriptionID))

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • J Livingston SQL (7/8/2016)


    Zidar (7/8/2016)


    Obviously, we need more constraints. What we have so far is OK, but we need more. Time permitting, new post is coming this afternoon, if not, then on Monday I guess.

    maybe someone will knock up a decent test harness ??

    THANK YOU FOR VOLUNTEERING !!! 😀

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Did not have time to check the function. Generally speaking, it should work. I trued some functions in CHECK constraints, and they do work. Never tested on huge size tables though, but it might work. If it doesn't, it can be fixed. The most important thing is to be aware of constraints. Once we know what has to be enforced, we'll find a way to enforce it somehow.

    I have to leave soon, so quickly what I was bale to achieve so far:

    We must create real table, for we will need to set up some foreign keys, which I am not sure we can do with #temp tables.

    -- All constraints we were able to set up so far:

    IF Object_ID('Subscriptions') IS NOT NULL DROP TABLE Subscriptions

    GO

    Create table Subscriptions

    ( SubscriptionID int identity(1,1) NOT NULL UNIQUE

    , SubscriberID int NOT NULL

    , SubscribedDate date NOT NULL

    , UnSubscribedDate date NULL

    , CONSTRAINT [PK_Subscriptions] PRIMARY KEY (SubscriberID,SubscribedDate)

    , CONSTRAINT [UNIQUE Subscriptions (SubscriberID,UnSubscribedDate)]

    UNIQUE (SubscriberID,UnSubscribedDate)

    , CONSTRAINT [CK Subscriptions (SubscribedDate < UnSubscribedDate)]

    CHECK (SubscribedDate < UnSubscribedDate)

    )

    ;

    GO

    -- Add some correct test data:

    Insert into Subscriptions(SubscriberID, SubscribedDate)

    values (1,'2/23/2015'), (2,'2/26/2015'), (3, '3/3/2015')

    ;

    UPDATE Subscriptions

    set UnSubscribedDate = '20151101'

    WHERE SubscriptionID = 2

    ;

    insert into Subscriptions(SubscriberID, SubscribedDate)

    values (4,'5/23/2016'), (2,'5/26/2016'), (5, '6/29/2016')

    ;

    UPDATE Subscriptions

    set UnSubscribedDate = '20151101'

    WHERE SubscriberID = 1 AND UnSubscribedDate IS NULL

    ;

    UPDATE Subscriptions

    set UnSubscribedDate = '2016-05-31'

    WHERE SubscriberID = 2 AND UnSubscribedDate IS NULL

    ;

    insert into Subscriptions(SubscriberID, SubscribedDate)

    VAlues (2,'2016-06-10')

    ;

    -- See what we have so far:

    SELECT *

    FROM Subscriptions

    ORDER BY SubscriberID, SubscribedDate

    ;

    /*--

    SubscriptionID SubscriberID SubscribedDate UnSubscribedDate PrevUnSubscribedDate

    -------------- ------------ -------------- ---------------- --------------------

    1 1 2015-02-23 2015-11-01 NULL

    2 2 2015-02-26 2015-11-01 NULL

    5 2 2016-05-26 2016-05-31 NULL

    7 2 2016-06-10 NULL NULL

    3 3 2015-03-03 NULL NULL

    4 4 2016-05-23 NULL NULL

    6 5 2016-06-29 NULL NULL

    (7 row(s) affected)

    --*/

    We don't want SubscribedDate to be before previous UnSubscribedDate, to avoid overlap between subscriptions, or starting new subscription before the previous one is finished.

    This means we have to compare SubscribedDate to previous(UnSubscribedDate) in the same row. There is no such function as Previous (ColumnName), so we have to be crafty here.

    We can add a column, like this :

    ALTER TABLE Subscriptions

    ADD PrevUnSubscribedDate date NULL

    ;-

    In the next step we can make sure that value in PrevUnSubscribedDate for row X1 exists in UnSubscribedDate for some existing row. Actually, it should be exactly one row before the current row X. What we want to see is this, for Customer 2, for example:

    SubscriptionID SubscriberID SubscribedDate UnSubscribedDate PrevUnSubscribedDate

    -------------- ------------ -------------- ---------------- --------------------

    2 2 2015-02-26 2015-11-01 NULL

    5 2 2016-05-26 2016-05-31 2015-11-01

    7 2 2016-06-10 NULL 2016-05-31

    (3 row(s) affected)

    Stop for the moment and carefully look at the last data set. First row has no [PrevUnSubscribedDate] , so we see NULL there. Each row after the firs, we have some value in [PrevUnSubscribedDate] , exactly the same as the value of [UnSubscribedDate] in previous row. That is what we want to see in the table, not what we currently have in the table. Before proceeding, I will update the data in order to be able to create more constraints. (Mechanics of the update itself is not that important, no stress if it looks complicated)

    -- Now we could say SubscribedDate > PrevUnSubscribedDate. Lets update

    -- the table and then add necessary constraints:

    ; WITH Cte AS

    (

    SELECT SubscriptionID, SubscriberID

    , SubscribedDate

    , UnSubscribedDate

    , PrevUnSubscribedDate

    , PrevUnSubDateToBe = LAG (UnSubscribedDate)

    OVER (PARTITION BY SubscriberID ORDER BY SubscriptionID)

    FROM Subscriptions

    )

    UPDATE Cte

    SET PrevUnSubscribedDate = PrevUnSubDateToBe

    ;

    -- (3 row(s) affected)

    -- Now we have:

    SELECT SubscriptionID, SubscriberID

    , PrevUnSubscribedDate

    , SubscribedDate

    , UnSubscribedDate

    FROM Subscriptions

    --WHERE SubscriberID = 2

    ORDER BY SubscriberID, SubscribedDate

    ;

    /*--

    SubscriptionID SubscriberID PrevUnSubscribedDate SubscribedDate UnSubscribedDate

    -------------- ------------ -------------------- -------------- ----------------

    1 1 NULL 2015-02-23 2015-11-01

    2 2 NULL 2015-02-26 2015-11-01

    5 2 2015-11-01 2016-05-26 2016-05-31

    7 2 2016-05-31 2016-06-10 NULL

    3 3 NULL 2015-03-03 NULL

    4 4 NULL 2016-05-23 NULL

    6 5 NULL 2016-06-29 NULL

    (7 row(s) affected)

    --*/

    Here is set of constraint we need in order to prevent subscription overlaps:

    ALTER TABLE Subscriptions

    ADD CONSTRAINT [Subscriptions UNIQUE (SubscriberID,PrevUnSubscribedDate)]

    UNIQUE (SubscriberID,PrevUnSubscribedDate)

    ;

    ALTER TABLE Subscriptions

    ADD CONSTRAINT [FK Subscriptions (PrevUnSubscribedDate) REF (UnSubscribedDate)]

    FOREIGN KEY (SubscriberID,PrevUnSubscribedDate)

    REFERENCES Subscriptions (SubscriberID,UnSubscribedDate)

    ;

    -- Command(s) completed successfully.

    ALTER TABLE Subscriptions

    ADD CONSTRAINT [CK Subscriptions PrevUnSubscribedDate < SubscribedDate]

    CHECK (PrevUnSubscribedDate < SubscribedDate)

    ;

    Let's see what we can and cannot do now:

    Can we add a brand new subscriber?

    Insert into Subscriptions(SubscriberID, SubscribedDate)

    values (6,'2/23/2014')

    ;-- (1 row(s) affected)

    ;

    --(1 row(s) affected), ok, it worked

    -- Try to close the subscription:

    UPDATE Subscriptions

    SET UnSubscribedDate = '3/23/2014'

    WHERE SubscriberID = 6 AND UnSubscribedDate IS NULL

    ; It all looks well. Since we have closed the subscription for customer 6, why not open a new one, the same way we did it before:

    -- Add new subscription forSubscreberID = 6

    Insert into Subscriptions(SubscriberID, SubscribedDate)

    values (6,'4/23/2014')

    ; No luck, this is what we get:

    Msg 2627, Level 14, State 1, Line 1

    Violation of UNIQUE KEY constraint

    'Subscriptions UNIQUE (SubscriberID,PrevUnSubscribedDate)'.

    Cannot insert duplicate key in object 'dbo.Subscriptions'.

    The duplicate key value is (6, <NULL>).

    The statement has been terminated.

    What is missing - [PrevUnSubscribedDate]. We already have one row where [PrevUnSubscribedDate] IS NULL, so constraint Dixie set up did excellent job. This however works:

    -- But this does work:

    Insert into Subscriptions

    (SubscriberID, SubscribedDate, PrevUnSubscribedDate)

    values (6,'4/23/2014','3/23/2014')

    ;

    SELECT

    SubscriptionID, SubscriberID

    --, PrevSubscribedDate

    , SubscribedDate

    , UnSubscribedDate

    , PrevUnSubscribedDate

    FROM Subscriptions

    ORDER BY SubscriberID, SubscribedDate

    ;

    /*-- what we see in the table now:

    SubscriptionID SubscriberID SubscribedDate UnSubscribedDate PrevUnSubscribedDate

    -------------- ------------ -------------- ---------------- --------------------

    1 1 2015-02-23 2015-11-01 NULL

    2 2 2015-02-26 2015-11-01 NULL

    5 2 2016-05-26 2016-05-31 2015-11-01

    7 2 2016-06-10 NULL 2016-05-31

    3 3 2015-03-03 NULL NULL

    4 4 2016-05-23 NULL NULL

    6 5 2016-06-29 NULL NULL

    8 6 2014-02-23 2014-03-23 NULL

    10 6 2014-04-23 NULL 2014-03-23

    (9 row(s) affected)

    --*/

    let's try to unsubscribe with a wrong date, before previous subscription finished:

    -- Unsubscribe, with bad date

    UPDATE Subscriptions

    SET UnSubscribedDate = '4/23/2014'

    WHERE SubscriberID = 6 AND UnSubscribedDate IS NULL

    ;

    /*Msg 547, Level 16, State 0, Line 1

    The UPDATE statement conflicted with the CHECK constraint "CK Subscriptions (SubscribedDate < UnSubscribedDate)".

    The statement has been terminated.*/ We need to do this:

    UPDATE Subscriptions

    SET UnSubscribedDate = '6/23/2014'

    WHERE SubscriberID = 6 AND UnSubscribedDate IS NULL

    -- What we have in the table now is:

    /*--

    SubscriptionID SubscriberID SubscribedDate UnSubscribedDate PrevUnSubscribedDate

    -------------- ------------ -------------- ---------------- --------------------

    1 1 2015-02-23 2015-11-01 NULL

    2 2 2015-02-26 2015-11-01 NULL

    5 2 2016-05-26 2016-05-31 2015-11-01

    7 2 2016-06-10 NULL 2016-05-31

    3 3 2015-03-03 NULL NULL

    4 4 2016-05-23 NULL NULL

    6 5 2016-06-29 NULL NULL

    8 6 2014-02-23 2014-03-23 NULL

    10 6 2014-04-23 2014-06-23 2014-03-23

    (9 row(s) affected)

    --*/

    ;

    It looks so far so good. I'll try few more bad cases, and see what happens:

    Insert into Subscriptions

    (SubscriberID, SubscribedDate, PrevUnSubscribedDate)

    values (6,'2014-06-23','2014-06-23')

    ;

    /*--

    Msg 547, Level 16, State 0, Line 1

    The INSERT statement conflicted with the

    CHECK constraint "CK Subscriptions PrevUnSubscribedDate < SubscribedDate".

    The statement has been terminated.

    --*/

    Insert into Subscriptions

    (SubscriberID, SubscribedDate, PrevUnSubscribedDate)

    values (6,'2014-06-23','2014-03-23 ')

    ;

    /*Msg 2627, Level 14, State 1, Line 1

    Violation of UNIQUE KEY constraint

    'Subscriptions UNIQUE (SubscriberID,PrevUnSubscribedDate)'. Cannot insert duplicate key in object 'dbo.Subscriptions'. The duplicate key value is (6, 2014-03-23).

    The statement has been terminated.*/

    --

    -- Can we add new subscription, with date before all other subscriptions?

    Insert into Subscriptions

    (SubscriberID, SubscribedDate)

    values (6,'2010-06-23')

    ;

    /*--

    Msg 2627, Level 14, State 1, Line 1

    Violation of UNIQUE KEY constraint

    'Subscriptions UNIQUE (SubscriberID,PrevUnSubscribedDate)'.

    Cannot insert duplicate key in object 'dbo.Subscriptions'.

    The duplicate key value is (6, <NULL>).

    The statement has been terminated.

    --*/

    -- We cannot!

    It was not a full 'test harness', but if you look the cases we tested, it is getting close.

    Here is perhaps the first article by Joe dealing with state transitions:

    https://www.simple-talk.com/sql/t-sql-programming/constraint-yourself!/ https://www.simple-talk.com/sql/t-sql-programming/discrete-and-continuous-data-in-sql/

    From Alex Kuznetsov:

    https://www.simple-talk.com/books/sql-books/defensive-database-programming/

    ,https://www.simple-talk.com/sql/t-sql-programming/modifying-contiguous-time-periods-in-a-history-table/

    by Dwaine CAmps:https://www.simple-talk.com/sql/t-sql-programming/self-maintaining,-contiguous-effective-dates-in-temporal-tables/

    See you all on Monday, 🙂

  • with Mycte

    as

    (

    select ROW_NUMBER()over(order by id)as c, t.* from ##TEST as t

    )

    select t1.DT as "startDate", t2.DT as "endDate"from Mycte as t1

    left join Mycte as t2

    on t1.ID =t2.ID and t1.c=t2.c-1

  • santiagoc93 (7/9/2016)


    with Mycte

    as

    (

    select ROW_NUMBER()over(order by id)as c, t.* from ##TEST as t

    )

    select t1.DT as "startDate", t2.DT as "endDate"from Mycte as t1

    left join Mycte as t2

    on t1.ID =t2.ID and t1.c=t2.c-1

    Just a few problems with this idea:

    The Row_Number() should (have order by id, DT) or there is no guarantee the start and end dates will be in the correct sequence.

    The join will join unsubscribe dates back to subscribe dates, but also subscribe dates back to unsubscribe dates. Not to mention if there are two subscription or unsubscriptions in a row. (Yes I agree with Zidar that we shouldn't allow things like this to happen.) But for this example, avoid nonsense data bye adding a WHERE clause that specifies that the starting flag is 'S'

    Finally, be aware that performance of a join on a computed column will be poor as the table gets larger. You could use LEAD or LAG to replace the JOIN and get faster results.

    ;WITH cte AS (

    SELECT ID, DT as StartDate, LEAD(DT,1,NULL) OVER(PARTITION BY ID ORDER BY DT) AS EndDate

    , FLAG AS StartFlag

    , LEAD(FLAG,1,NULL) OVER(PARTITION BY id ORDER BY DT) AS EndFlag

    , LEAD(ID,1,NULL) OVER (PARTITION BY id ORDER BY DT) AS EndID

    from ##TEST as t1

    )

    SELECT ID, StartDate,EndDate

    FROM cte

    WHERE StartFlag = 'S' AND EndFlag = 'U' and EndID = ID OR ENDID IS NULL

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply