Datediff between rows with unsequential ID

  • Hello! I have been searching the internet around for some good solution and I think I am almost there but still need your attention, guys 🙂

    My table has unsequential ID, Product, Start and End dates where End dates are NULL where the product is still current.

    IDPRODUCTSTART_DATEEND_DATE

    1A2010-06-04NULL

    1D2004-08-102011-02-01

    1D2011-02-02NULL

    5R2015-09-15NULL

    8R2015-09-15NULL

    3D2011-02-02NULL

    4M 2014-05-06NULL

    5M 2008-10-222013-08-24

    2M2013-09-17NULL

    9D 2011-02-02NULL

    I need to calculate DATEDIFF between START_DATE and End_DATE where it is not NULL for the same product and ID.

    SELECT ROW_NUMBER() OVER (PARTITION BY PRODUCT ORDER BY ID) Rn,

    ID,

    PRODUCT,

    START_DATE,

    END_DATE,

    DATEDIFF(DD, START_DATE, END_DATE) as diff

    FROM (SELECT ID,

    PRODUCT,

    START_DATE,

    ( SELECT min(END_DATE)

    FROM MYTABLE T2

    WHERE T2.PRODUCT = T1.PRODUCT

    AND T2.START_DATE > T1.START_DATE

    ) AS END_DATE

    FROM MYTABLE T1

    where END_DATE is not null

    ) AS T

    go

    [/CODE]

  • Here is a slightly less complicated version that gives you the difference in days: (Note the setup of table and data!)

    declare @t table

    (

    id int,

    product char(1),

    start_date date,

    end_date date

    );

    Insert @t (id, product, start_date, end_date)

    values

    (1,'A','2010-06-04',NULL)

    ,(1,'D','2004-08-10','2011-02-01')

    ,(1,'D','2011-02-02',NULL)

    ,(5,'R','2015-09-15',NULL)

    ,(8,'R','2015-09-15',NULL)

    ,(3,'D','2011-02-02',NULL)

    ,(4,'M', '2014-05-06',NULL)

    ,(5,'M', '2008-10-22','2013-08-24')

    ,(2,'M','2013-09-17',NULL)

    ,(9,'D', '2011-02-02',NULL)

    ;with cte as

    (

    Select id, product, start_date, end_date,

    Row_Number() over(Partition by id, product order by (Select null)) RowNum

    from @t

    )

    select *, datediff(dd, start_date, end_date)

    from cte

    where end_date is not null

    The cte was used just to make the code more readable. The Row_Number was used so that the order of ID and Product could be established with this data set. If you have dates on the same ID and Product and you only wanted the latest, then the RowNum comes into play with a slight modification to Row_Number 'order by' and adding RowNum = 1 to the where clause.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • ;with cte as

    (

    Select id, product, start_date, end_date,

    Row_Number() over(Partition by id, product order by (Select null)) RowNum

    from @t

    )

    select *, datediff(dd, start_date, end_date)

    from cte

    where end_date is not null

    This returns DATEDIFF from the startdate and end_date of the same row where End_date is not null, while I need to calculate a DATEDIFF between 2011-02-01 (end_date row 2) and 2011-02-02 (start_date row 3):

    idPRODUCT start_dateend_date

    9D 2003-07-082006-02-06

    9D 2006-02-072011-02-01

    9D 2011-02-022014-01-03

  • karine.gambarjan (1/8/2016)


    This returns DATEDIFF from the startdate and end_date of the same row where End_date is not null, while I need to calculate a DATEDIFF between 2011-02-01 (end_date row 2) and 2011-02-02 (start_date row 3):

    idPRODUCT start_dateend_date

    9D 2003-07-082006-02-06

    9D 2006-02-072011-02-01

    9D 2011-02-022014-01-03

    You did not make this clear in your original post.

    Please post consumable data, like I did in my previous reply, that contains a representative sample of your data. Also, you need to provide what your output should look like so there is no guessing involved.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • I'm pretty sure I have a solution. You need to provide a little more information. I need to know which start_date and which end_date you are using in the datediff function to get the days elapsed.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • What the heck. Here are two solutions! The first is a 2012 or better, the other for 2008. Of course, using lead provides much better performance in this case.

    I have also included the additional data that was provided since none of the original data satisfies the revised requirements.

    declare @t table

    (

    id int,

    product char(1),

    start_date date,

    end_date date

    );

    Insert @t (id, product, start_date, end_date)

    values

    (1,'A','2010-06-04',NULL)

    ,(1,'D','2004-08-10','2011-02-01')

    ,(1,'D','2011-02-02',NULL)

    ,(5,'R','2015-09-15',NULL)

    ,(8,'R','2015-09-15',NULL)

    ,(3,'D','2011-02-02',NULL)

    ,(4,'M', '2014-05-06',NULL)

    ,(5,'M', '2008-10-22','2013-08-24')

    ,(2,'M','2013-09-17',NULL)

    ,(9,'D', '2011-02-02',NULL)

    ,(9,'D', '2003-07-08','2006-02-06')

    ,(9,'D', '2006-02-07','2011-02-01')

    ,(9,'D', '2011-02-02','2014-01-03')

    -- 2012 using lead

    ;with cte as

    (

    select id, product, start_date, end_date,

    lead(end_date) over (Partition by id, product order by start_date) nxt_endDate

    from @t

    )

    select c.id, c.product, c.start_date, c.nxt_endDate, datediff(dd, c.start_date, c.nxt_endDate)

    from cte c

    where c.end_date is not null

    and c.nxt_endDate is not null

    -- 2008 version using a self join

    ;with cte as

    (

    select id, product, start_date, end_date,

    Row_Number() over (Partition by id, product order by start_date) rowNum

    from @t

    )

    select c.id, c.product, c.start_date, n.end_date, datediff(dd, c.start_date, n.end_date)

    from cte c

    left outer join cte n

    on c.id = n.id

    and c.product = n.product

    and c.rowNum = n.rowNum - 1

    where c.end_date is not null

    and n.end_date is not null

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Hi! Sorry for delayed answer, was OOO during weekend.

    Thank you for your efforts, I think you are close to the solution but something is still wrong with the output...

    I am calculating DATEDIFF between two rows for the same product to eliminate such rows with DATEDIFF = 1 in the final output.

    When I run your solution for SQL 2008 I got some of this rows eliminated but not all of them for the same Product.

    Insert @t (id, product, start_date, end_date)

    values

    (9,'D', '2003-07-08','2006-02-06')

    ,(9,'D', '2006-02-07','2011-02-01')

    ,(9,'D', '2011-02-02','2014-01-03')

    --For the id 9 above your solution returns following:

    (9,'D', '2003-07-08','2011-02-01',2765)

    ,(9,'D', '2006-02-07','2014-01-03', 2887)

    --and what I need in the final output for ID 9 is:

    (9, 'D', '2003-07-08', '2014-01-03', 3832)

    So basically I don't want to have any product renewals, just the whole period of the Product ownership in my output.

  • Is it not as simple as this?

    SELECT

    id

    ,product

    ,MIN(start_date) AS StartDate

    ,MAX(end_date) AS EndDate

    ,DATEDIFF(DAY,MIN(start_date),MAX(end_date)) AS Diff

    FROM @t

    WHERE end_date IS NOT NULL

    GROUP BY

    id

    ,product

    John

  • John Mitchell-245523 (1/11/2016)


    Is it not as simple as this?

    I don't know. The specs keep changing.

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • It would work but only for those Products which are not current and have END_DATE is not null. For current Product where END_DATE is null:

    9, D, 2004-08-10, 2011-02-01

    9, D, 2011-02-02, NULL

    -- it will return following:

    9, D, 2004-08-10, 2011-02-01,2366

    -- but the Product D is still current so I expect following in return to get it right

    9, D, 2004-08-10, NULL

  • Gosh, that's the first time you've mentioned that requirement. But not to worry - all you need is a UNION ALL and a further statement selecting all rows where end_date is null. The trouble with that, however, is that you're likely to get two scans on your table to produce the two result sets. Someone else may chip in with a way to avoid that.

    John

  • your rules appear to change as you go along.......please read this article and post back with some scripts that fully demonstrate what you require.

    http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum

    shot in the dark

    SELECT product,

    MIN(start_date),

    CASE WHEN MAX(ISNULL(end_date,'9999-01-01')) = '9999-01-01' THEN NULL ELSE MAX(end_date) END,

    CASE WHEN MAX(ISNULL(end_date,'9999-01-01')) = '9999-01-01' THEN NULL ELSE DATEDIFF(dd, MIN(start_date), MAX(end_date)) END

    FROM @t

    GROUP BY product;

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Sorry guys, I'm complete newbie on this forum!

    In the beginning I just wondered how do I define DATEDIFF =1 between two data rows for the same Product and ID. But as long as you guys posted the solutions I understood that I need to present you the whole picture of the problem, I am sorry that this caused you some xaos....

    So let's do it right:

    --Table Script and Sample data

    declare @t table

    (

    id int,

    product char(1),

    start_date date,

    end_date date

    );

    Insert @t (id, product, start_date, end_date)

    values

    (1,'A','2010-06-04',NULL)

    ,(1,'D','2004-08-10','2011-02-01')

    ,(1,'D','2011-02-02',NULL)

    ,(5,'R','2015-09-15',NULL)

    ,(5,'M','2008-10-22','2013-08-24')

    ,(5,'M','2013-08-25',NULL)

    ,(3,'D','2011-02-02',NULL)

    ,(4,'M', '2014-05-06',NULL)

    ,(2,'M','2013-09-17',NULL)

    ,(9,'D','2003-07-08','2006-02-06')

    ,(9,'D','2006-02-07','2011-02-01')

    ,(9,'D','2011-02-02','2014-01-03')

    -- Expected output

    (1,'A','2010-06-04',NULL)

    ,(1,'D','2004-08-10',NULL)

    ,(5,'R','2015-09-15',NULL)

    ,(5,'M','2008-10-22',NULL)

    ,(3,'D','2011-02-02',NULL)

    ,(4,'M','2014-05-06',NULL)

    ,(2,'M','2013-09-17',NULL)

    Hope ny intentions are much more clear now!

  • karine.gambarjan (1/11/2016)


    Sorry guys, I'm complete newbie on this forum!

    In the beginning I just wondered how do I define DATEDIFF =1 between two data rows for the same Product and ID. But as long as you guys posted the solutions I understood that I need to present you the whole picture of the problem, I am sorry that this caused you some xaos....

    So let's do it right:

    --Table Script and Sample data

    declare @t table

    (

    id int,

    product char(1),

    start_date date,

    end_date date

    );

    Insert @t (id, product, start_date, end_date)

    values

    (1,'A','2010-06-04',NULL)

    ,(1,'D','2004-08-10','2011-02-01')

    ,(1,'D','2011-02-02',NULL)

    ,(5,'R','2015-09-15',NULL)

    ,(5,'M','2008-10-22','2013-08-24')

    ,(5,'M','2013-08-25',NULL)

    ,(3,'D','2011-02-02',NULL)

    ,(4,'M', '2014-05-06',NULL)

    ,(2,'M','2013-09-17',NULL)

    ,(9,'D','2003-07-08','2006-02-06')

    ,(9,'D','2006-02-07','2011-02-01')

    ,(9,'D','2011-02-02','2014-01-03')

    -- Expected output

    (1,'A','2010-06-04',NULL)

    ,(1,'D','2004-08-10',NULL)

    ,(5,'R','2015-09-15',NULL)

    ,(5,'M','2008-10-22',NULL)

    ,(3,'D','2011-02-02',NULL)

    ,(4,'M','2014-05-06',NULL)

    ,(2,'M','2013-09-17',NULL)

    Hope ny intentions are much more clear now!

    ok...just to be sure ....you definitely dont want to return rows for the following ?

    ,(9,'D','2003-07-08','2006-02-06')

    ,(9,'D','2006-02-07','2011-02-01')

    ,(9,'D','2011-02-02','2014-01-03')

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • ok...just to be sure ....you definitely dont want to return rows for the following ?

    ,(9, 'D', '2003-07-08', '2006-02-06')

    ,(9, 'D', '2006-02-07', '2011-02-01')

    ,(9, 'D', '2011-02-02', '2014-01-03')

    Nope, definitly not as the Product is not active anymore in this case (END_DATE is not null)

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply