Consolidating dates from multiple rows

  • Hey everyone,

    I am having a bit of a problem over here. I am trying to consolidate dates from multiple records into a time line that has no date overlap. I'll give you an example to make things clear.

    Let's say I have 3 data records:

    RowID BeginDate EndDate Price ($)

    ----------------------------------------------------------

    1 01/01/2008 01/10/2008 1.00

    2 01/05/2008 01/15/2008 2.00

    3 12/20/2007 02/01/2008 1.50

    The result I would like to see should look like this:

    12/20/2007 - 12/31/2007 the price was 1.50

    01/01/2008 - 01/10/2008 the price was 2.50 because row 1 and 3 overlap.

    01/05/2008 - 01/15/2008 the price was 3.50 because row 2 and 3 overlap.

    01/16/2008 - 02/01/2008 the price was 1.50 because of the row 3.

    Any idea on the best way to automate this process?

    The solutions that I came up with add a lot of complications to the process.

    Any help would be appreciated!

  • [Quote]RowID BeginDate EndDate Price ($)

    1 01/01/2008 01/10/2008 1.00

    2 01/05/2008 01/15/2008 2.00

    3 12/20/2007 02/01/2008 1.50

    The result I would like to see should look like this:

    12/20/2007 - 12/31/2007 the price was 1.50

    01/01/2008 - 01/10/2008 the price was 2.50 because row 1 and 3 overlap.

    01/05/2008 - 01/15/2008 the price was 3.50 because row 2 and 3 overlap.

    01/16/2008 - 02/01/2008 the price was 1.50 because of the row 3.[/Quote]

    Your explanation is too vague for anyone to fathom how you are arriving at your values. Like why do you start with BeginDate from Row#3 and stop at month-end? Why are you adding up the prices (instead of averaging them)? What do you do when both BeginDates and EndDates overlap on some rows?


    Regards,

    goodguy

    Experience is a bad teacher whose exams precede its lessons

  • Sorry for that. I'll try to explain what's happening.

    Let's say I have 3 records in my Price table

    RowID BeginDate EndDate Price ($)

    ----------------------------------------------------

    1 01/01/2008 01/10/2008 1.00

    2 01/05/2008 01/15/2008 2.00

    3 12/20/2007 02/01/2008 1.50

    I need to start with the earliest BeginDate and work up till I hit the most recent BeginDate adding Prices for those records that overlap and only for the date range of the overlap by creating a new record for that.

    The earliest BeginDate would be 12/20/2007 for the RowID=3. Now we need to find the one that goes right after it which is RowID=1 with the BeginDate starting on 01/01/2008. Since the date range for the RowID=3 spans the date range for the RowID=1 I have to create 3 records in my new table where I am going to keep a flatten view of this records based on the date.

    The records in my new table should be looked like this:

    RowID BeginDate EndDate Price ($)

    ----------------------------------------------------

    1 12/20/2007 12/31/2008 1.50

    2 01/01/2008 01/10/2008 2.00 (= the above record's value 1.50 + the current 1)

    3 01/11/2008 02/01/2008 1.50

    Now I have to find the next date range that I need to compare to the ones sitting in my new table and apply it accordingly.

    Does it make sense?

  • This might very well suck performance-wise against your data, but here goes:

    drop table overlap

    drop table #mytable

    go

    create table overlap(rid int identity(1,1), st datetime, ed datetime,amt money)

    go

    insert overlap(st,ed,amt)

    select '01/01/2008' ,'01/10/2008' ,'1.00' UNION ALL

    select '01/05/2008', '01/15/2008' ,'2.00' UNION ALL

    select '12/20/2007','02/01/2008','1.50'

    go

    select

    dateadd(day,tally.N-1, firstdte) dte , cast(0 as money) CumAmt,0 as period

    into #myTable

    from

    (select min(st) as firstdte, max(ed) as lastdte from overlap) ol

    cross join Tally

    where dateadd(day,tally.N-1, firstdte) between firstdte and lastdte

    create clustered index ucmytable on #mytable(dte)

    update #myTable

    set CumAmt=(select sum(amt) from overlap where dte between st and ed)

    from #mytable

    declare @prevamt money

    declare @pid int

    select @prevamt=0,@pid=0

    update #mytable

    set @pid=period=case when cumamt=@prevamt then @pid else @pid+1 end,

    @prevamt=cumamt

    from #mytable with (index(ucmytable),tablockx)

    select * from

    (

    select period,min(dte) stp, max(dte) edp,cumamt

    from #mytable

    group by period,cumamt

    ) x

    order by stp

    indexes on the date fields would help substantially if the dataset is large.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hey Matt,

    Where's that Tally table you are cross joining on?

  • Ya gotta make one...

    --===== Create and populate the Tally table on the fly

    SELECT TOP 11000 --equates to more than 30 years of dates

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Allow the general public to use it

    GRANT SELECT ON dbo.Tally TO PUBLIC

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Either way, looks like a correlated sub-query is going to be involved which is why Matt suspects the performance is going to suck a bit...

    Here's another way (output could be directed into another table)...

    --drop table overlap

    go

    create table overlap(rid int identity(1,1), st datetime, ed datetime,amt money)

    go

    insert overlap(st,ed,amt)

    select '01/01/2008' ,'01/10/2008' ,1.00 UNION ALL

    select '01/05/2008', '01/15/2008' ,2.00 UNION ALL

    select '12/20/2007','02/01/2008',1.50

    go

    DECLARE @MinDate DATETIME

    DECLARE @MaxDate DATETIME

    DECLARE @Days INT

    SELECT @MinDate = MIN(st),

    @MaxDate = MAX(ed),

    @Days = DATEDIFF(dd,@MinDate,@MaxDate)+1

    FROM dbo.OverLap

    SELECT t.N AS RowNum,

    @MinDate+t.n-1 AS Date,

    (SELECT SUM(Amt) FROM dbo.OverLap o WHERE @MinDate+t.n-1 >= o.st AND @MinDate+t.n-1 <= o.ed) AS Amt

    FROM dbo.Tally t

    WHERE t.N <= @Days

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks a lot, I appreciate your help with this.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply