Consolidating dates from multiple rows

Question

Post reply

Consolidating dates from multiple rows

ghostrider

SSC Enthusiast

Points: 119
More actions
February 19, 2008 at 2:38 pm

#71318

Hey everyone,
I am having a bit of a problem over here. I am trying to consolidate dates from multiple records into a time line that has no date overlap. I'll give you an example to make things clear.
Let's say I have 3 data records:
RowID BeginDate EndDate Price ($)
----------------------------------------------------------
1 01/01/2008 01/10/2008 1.00
2 01/05/2008 01/15/2008 2.00
3 12/20/2007 02/01/2008 1.50
The result I would like to see should look like this:
12/20/2007 - 12/31/2007 the price was 1.50
01/01/2008 - 01/10/2008 the price was 2.50 because row 1 and 3 overlap.
01/05/2008 - 01/15/2008 the price was 3.50 because row 2 and 3 overlap.
01/16/2008 - 02/01/2008 the price was 1.50 because of the row 3.
Any idea on the best way to automate this process?
The solutions that I came up with add a lot of complications to the process.
Any help would be appreciated!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

goodguy SSCertifiable Points: 5456 More actions · Answer 1

[Quote]RowID BeginDate EndDate Price ($)

1 01/01/2008 01/10/2008 1.00

2 01/05/2008 01/15/2008 2.00

3 12/20/2007 02/01/2008 1.50

The result I would like to see should look like this:

12/20/2007 - 12/31/2007 the price was 1.50

01/01/2008 - 01/10/2008 the price was 2.50 because row 1 and 3 overlap.

01/05/2008 - 01/15/2008 the price was 3.50 because row 2 and 3 overlap.

01/16/2008 - 02/01/2008 the price was 1.50 because of the row 3.[/Quote]

Your explanation is too vague for anyone to fathom how you are arriving at your values. Like why do you start with BeginDate from Row#3 and stop at month-end? Why are you adding up the prices (instead of averaging them)? What do you do when both BeginDates and EndDates overlap on some rows?

Regards,

goodguy

Experience is a bad teacher whose exams precede its lessons

ghostrider SSC Enthusiast Points: 119 More actions · Answer 2

Sorry for that. I'll try to explain what's happening.

Let's say I have 3 records in my Price table

RowID BeginDate EndDate Price ($)

----------------------------------------------------

1 01/01/2008 01/10/2008 1.00

2 01/05/2008 01/15/2008 2.00

3 12/20/2007 02/01/2008 1.50

I need to start with the earliest BeginDate and work up till I hit the most recent BeginDate adding Prices for those records that overlap and only for the date range of the overlap by creating a new record for that.

The earliest BeginDate would be 12/20/2007 for the RowID=3. Now we need to find the one that goes right after it which is RowID=1 with the BeginDate starting on 01/01/2008. Since the date range for the RowID=3 spans the date range for the RowID=1 I have to create 3 records in my new table where I am going to keep a flatten view of this records based on the date.

The records in my new table should be looked like this:

RowID BeginDate EndDate Price ($)

----------------------------------------------------

1 12/20/2007 12/31/2008 1.50

2 01/01/2008 01/10/2008 2.00 (= the above record's value 1.50 + the current 1)

3 01/11/2008 02/01/2008 1.50

Now I have to find the next date range that I need to compare to the ones sitting in my new table and apply it accordingly.

Does it make sense?

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 3

This might very well suck performance-wise against your data, but here goes:

drop table overlap

drop table #mytable

go

create table overlap(rid int identity(1,1), st datetime, ed datetime,amt money)

go

insert overlap(st,ed,amt)

select '01/01/2008' ,'01/10/2008' ,'1.00' UNION ALL

select '01/05/2008', '01/15/2008' ,'2.00' UNION ALL

select '12/20/2007','02/01/2008','1.50'

go

select

dateadd(day,tally.N-1, firstdte) dte , cast(0 as money) CumAmt,0 as period

into #myTable

from

(select min(st) as firstdte, max(ed) as lastdte from overlap) ol

cross join Tally

where dateadd(day,tally.N-1, firstdte) between firstdte and lastdte

create clustered index ucmytable on #mytable(dte)

update #myTable

set CumAmt=(select sum(amt) from overlap where dte between st and ed)

from #mytable

declare @prevamt money

declare @pid int

select @prevamt=0,@pid=0

update #mytable

set @pid=period=case when cumamt=@prevamt then @pid else @pid+1 end,

@prevamt=cumamt

from #mytable with (index(ucmytable),tablockx)

select * from

(

select period,min(dte) stp, max(dte) edp,cumamt

from #mytable

group by period,cumamt

) x

order by stp

indexes on the date fields would help substantially if the dataset is large.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

ghostrider SSC Enthusiast Points: 119 More actions · Answer 4

Hey Matt,

Where's that Tally table you are cross joining on?

Jeff Moden SSC Guru Points: 1004434 More actions · Answer 5

Ya gotta make one...

--===== Create and populate the Tally table on the fly

SELECT TOP 11000 --equates to more than 30 years of dates

IDENTITY(INT,1,1) AS N

INTO dbo.Tally

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

--===== Add a Primary Key to maximize performance

ALTER TABLE dbo.Tally

ADD CONSTRAINT PK_Tally_N

PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

--===== Allow the general public to use it

GRANT SELECT ON dbo.Tally TO PUBLIC

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004434 More actions · Answer 6

Either way, looks like a correlated sub-query is going to be involved which is why Matt suspects the performance is going to suck a bit...

Here's another way (output could be directed into another table)...

--drop table overlap

go

create table overlap(rid int identity(1,1), st datetime, ed datetime,amt money)

go

insert overlap(st,ed,amt)

select '01/01/2008' ,'01/10/2008' ,1.00 UNION ALL

select '01/05/2008', '01/15/2008' ,2.00 UNION ALL

select '12/20/2007','02/01/2008',1.50

go

DECLARE @MinDate DATETIME

DECLARE @MaxDate DATETIME

DECLARE @Days INT

SELECT @MinDate = MIN(st),

@MaxDate = MAX(ed),

@Days = DATEDIFF(dd,@MinDate,@MaxDate)+1

FROM dbo.OverLap

SELECT t.N AS RowNum,

@MinDate+t.n-1 AS Date,

(SELECT SUM(Amt) FROM dbo.OverLap o WHERE @MinDate+t.n-1 >= o.st AND @MinDate+t.n-1 <= o.ed) AS Amt

FROM dbo.Tally t

WHERE t.N <= @Days

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ghostrider SSC Enthusiast Points: 119 More actions · Answer 7

ghostrider

SSC Enthusiast

Points: 119

February 20, 2008 at 9:57 pm

#780501

Thanks a lot, I appreciate your help with this.