August 4, 2009 at 7:19 am
Hello all,
I have a rather difficult problem and I was hoping to get some input from you all. The problem is: I have claims that begin and end at varying times. I need to give the appropriate amount of claim amount to each month.
That is, if I had a claim for $1000 and it started on 1/15/2009 and ended 3/15/2009. This is a total of 59 days. I want to know how much should be billed for each month. The breakdown for the above would be 16 in January, 28 in February, and 15 in March. Thus, (16/59) * 1000 is the amount for January, (28/59) * 1000 for February, and (15/59)*1000 for March.
Not only this, but the desired number of months is variable sometimes its 3 months or less (current quarter) sometimes it's 24 months.
Any help you all can give me is appreciated.
aktikt
August 4, 2009 at 8:33 am
First read Jeff Moden's excellent article on Tally tables here[/url]. Create one, then try this:
DECLARE
@startdate datetime,
@enddate datetime
SELECT
@startdate='20090115',
@enddate = '20090315'
SELECT
month = MONTH(date),
year = YEAR(date),
amount = SUM(dailyamount)
FROM
(
SELECT
N, date = dateadd(dd,N-1,@startdate), dailyamount = 1000.0/datediff(dd,@startdate,@enddate)
FROM
Tally
WHERE
N <= datediff(dd,@startdate,@enddate)
) s1
GROUP BY
MONTH(date),YEAR(date)
ORDER BY
YEAR(date),MONTH(date)
Hope this gets you started.
August 4, 2009 at 9:54 am
Nigel,
Thanks for your response. I have read Jeff's article before. I hadn't started down the tally table road for this problem. You're right on with using this method.
My thought is I am going to need to nest tally tables in order to do what you have done for many date ranges. I'm not sure how to do it though. I'm still trying to wrap my head around this solution.
aktikt
August 4, 2009 at 11:50 am
Hello all,
I have come up with a solution that works, but I don't like it. 🙂
What I have done is use the above tally solution and have used a WHILE loop and
a temp table. The while loop allows me to populate new start and end dates into the @startdate and @enddate variables and the temp table stores the results of each tally table select. I then group by the month and year and get a sum for the amount.
I really don't like this, and it's not that fast. For 3000 loops it took 10 seconds. I imagine for what could be 10s of thousands of records this is going to be pretty slow.
The only way I've seen the tally table work is against variables. What if the value is in a table instead?
I'm all ears for anyone who can solve this.
August 4, 2009 at 9:22 pm
aktikt (8/4/2009)
The only way I've seen the tally table work is against variables. What if the value is in a table instead?I'm all ears for anyone who can solve this.
If you'd provide a table creation statement and some data in a readily consumable format, a couple of us would be all over it. Read the first link in my signature below and come back.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 5, 2009 at 7:41 am
Jeff,
Ok, Jeff here is some data and the code I have used to accomplish the result I need. For 10
records this is no problem, but I found out yesterday there are potentially 12 million claim records that need to be processed. At 3000 records it took 10 seconds.
-- Start Code
--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#Claims','U') IS NOT NULL
DROP TABLE #Claims
--===== Create the test table
CREATE TABLE #Claims
(
IDINT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
StartDateDATETIME,
EndDateDATETIME,
Amount FLOAT,
)
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #Claims ON
INSERT INTO #Claims
(ID, StartDate, EndDate, Amount)
SELECT '1','Jan 1 2006 12:00AM','Jun 30 2007 12:00AM',100.0 UNION ALL
SELECT '2','Feb 15 2006 12:00AM','Mar 31 2007 12:00AM',75.42 UNION ALL
SELECT '3','Mar 10 2006 12:00AM','Mar 31 2007 12:00AM',250.0 UNION ALL
SELECT '4','Apr 1 2006 12:00AM','Mar 31 2007 12:00AM',300.31 UNION ALL
SELECT '5','May 4 2006 12:00AM','Aug 31 2006 12:00AM',1200.0 UNION ALL
SELECT '6','Feb 18 2006 12:00AM','Sep 29 2006 12:00AM',550.75 UNION ALL
SELECT '7','Sep 1 2006 12:00AM','Apr 25 2007 12:00AM',300.25 UNION ALL
SELECT '8','Mar 1 2006 12:00AM','Jun 30 2006 12:00AM',250.55 UNION ALL
SELECT '9','Jan 1 2006 12:00AM','Dec 31 2006 12:00AM',40.23 UNION ALL
SELECT '10','May 7 2006 12:00AM','Mar 31 2007 12:00AM',78.0 UNION ALL
SELECT '11','May 7 2006 12:00AM','May 8 2006 12:00AM',50.23
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #Claims OFF
-- The length of time under question for the report
DECLARE @PeriodBegin datetime, @PeriodEnd datetime
SET @PeriodBegin = '1/1/05'
SET @PeriodEnd = '12/31/07'
SET NOCOUNT ON;
-- Table to store the Tally table inserts
DECLARE @Results as Table
(
DateMonth tinyint,
DateYear smallint,
Amount float
)
-- Loop variables
DECLARE
@Count int,
@Counter int
DECLARE
@startdate datetime,
@enddate datetime,
@amount float
SELECT @Count = COUNT(*) FROM #Claims
-- Initialize While loop variable
SET @Counter = 1
-- Begin while loop
WHILE @Count > @Counter
BEGIN
-- Get next row of data from #Claims
SELECT
@startdate = StartDate,
@enddate = EndDate,
@amount = Amount
FROM #Claims
WHERE ID = @Counter
INSERT INTO @Results
SELECT
Month = MONTH(date),
Year = YEAR(date),
Amount = SUM(dailyamount)
FROM
(
SELECT
N, date = dateadd(dd,N-1,@startdate), dailyamount = @amount/datediff(dd,@startdate,@enddate)
FROM
RecycleBin.Tally
WHERE
N = Year(@PeriodBegin) AND DateYear <= Year(@PeriodEnd)
GROUP BY DateMonth, DateYear
-- End code --
Thanks,
aktikt
August 5, 2009 at 8:18 am
Well, here's my stab at it. I am sure there is better way to do this using the APPLY operator
Jim
DECLARE @Table1 TABLE (id int,Date datetime)
DECLARE @Table2 TABLE (id int, Amount float)
DECLARE @Tally Table (Date Datetime)
INSERT INTO @Table1
SELECT 1,'01/15/2009' UNION
SELECT 1,'03/15/2009' UNION
SELECT 2,'03/15/2009' UNION
SELECT 2,'07/01/2009'
INSERT INTO @table2
SELECT 1,1000 UNION
SELECT 2,5000
INSERT INTO @Tally
SELECT DATEADD(month,spt.Number,'01/01/2005')
FROM
master..spt_values spt
WHERE spt.Type = 'P'
;With cte_MinMax
AS
(
select id,[MinDate] = min(date),[maxDate] =max(date)
from @table1
group by id
)
, cte_Dates
AS
(
SELECT t1.id,t.date
FROM @tally t
INNER JOIN
(
select id,[MinDate] = min(date),[maxDate] =max(date)
from @table1
group by id
) t1
ON
t.date >= t1.MinDate and month(t.date) month(t1.MinDate )
and t.date <= t1.Maxdate and month(t.date) month(t1.MaxDate )
UNION
SELECT id,date from @table1
)
select
dates.id,dates.date, dateadd(month,datediff(month,0,dates.date) +1 ,0)
,datediff(day,dates.date,dateadd(month,datediff(month,0,dates.date) +1 ,0))* t1.Amount
/datediff(day,mm.Mindate,mm.maxdate)
from
cte_dates dates
inner join
cte_MinMax mm
on
dates.id = mm.id
inner join
@table2 t1
on
dates.id = t1.id
August 5, 2009 at 8:19 am
aktikt,
Try this on your data and let us know how it goes (you will need the Tally table):
SELECT
ID,
Amount = SUM(dailyAmount),
Year = YEAR(theDay),
Month = MONTH(theDay)
FROM
(
SELECT
ID,
dailyAmount = Amount/DATEDIFF(dd,StartDate,EndDate),
theDay= DATEADD(dd, N - 1,StartDate )
FROM
#Claims
JOIN
Tally t on t.N <= DATEDIFF(dd,StartDate,EndDate)
) s1
GROUP BY ID,YEAR(theDay), MONTH(theDay)
ORDER BY ID,YEAR(theDay), MONTH(theDay)
Hope it helps,
Nigel
August 5, 2009 at 8:25 am
aktikt,
Apologies, just noticed that you don't want it grouped by ClaimID, just the Month and Year.
Try this slightly modified version instead:
SELECT
Amount = SUM(dailyAmount),
Year = YEAR(theDay),
Month = MONTH(theDay)
FROM
(
SELECT
dailyAmount = Amount/DATEDIFF(dd,StartDate,EndDate),
theDay= DATEADD(dd, N - 1,StartDate )
FROM
#Claims
JOIN
Tally t on t.N <= DATEDIFF(dd,StartDate,EndDate)
) s1
GROUP BY YEAR(theDay), MONTH(theDay)
ORDER BY YEAR(theDay), MONTH(theDay)
Nigel
August 5, 2009 at 9:03 am
nigel (8/5/2009)
aktikt,Apologies, just noticed that you don't want it grouped by ClaimID, just the Month and Year.
Try this slightly modified version instead:
SELECT
Amount = SUM(dailyAmount),
Year = YEAR(theDay),
Month = MONTH(theDay)
FROM
(
SELECT
dailyAmount = Amount/DATEDIFF(dd,StartDate,EndDate),
theDay= DATEADD(dd, N - 1,StartDate )
FROM
#Claims
JOIN
Tally t on t.N <= DATEDIFF(dd,StartDate,EndDate)
) s1
GROUP BY YEAR(theDay), MONTH(theDay)
ORDER BY YEAR(theDay), MONTH(theDay)
Nigel
Well done Nigel! No temp table, no loops, no cross apply, 100% set based, and can be used in 2k, tk5, or 2k8.
You too, Aktikt. See what I mean? If you post data where people don't have to setup a test and the data and.... well, they just jump all over a problem. Thanks for doing that. What's really cool is that you also took it one step further... you made your code work with the test setup so we could absolutely see what the end result was supposed to be. Very well done.
--Jeff Moden
Change is inevitable... Change for the better is not.
August 5, 2009 at 9:14 am
Nigel,
I tried your second solution on 35923 records but I added the
HAVING clause to restrict the rows to 2006 and 2007 with this:
HAVING YEAR(theDay) > 2005 AND YEAR(theDay) < 2008
and obtained the results in 27 seconds.
aktikt
August 5, 2009 at 9:39 am
aktikt (8/5/2009)
Nigel,I tried your second solution on 35923 records but I added the
HAVING clause to restrict the rows to 2006 and 2007 with this:
HAVING YEAR(theDay) > 2005 AND YEAR(theDay) < 2008
and obtained the results in 27 seconds.
aktikt
aktikt,
Am I to assume that 27 seconds is good then (or not)?
Nigel
August 5, 2009 at 9:43 am
Jeff Moden (8/5/2009)
Well done Nigel! No temp table, no loops, no cross apply, 100% set based, and can be used in 2k, tk5, or 2k8.
Thank you Jeff.
Oooh! This is like getting an A from teacher, and I didn't even have to bring in an apple 😉
August 5, 2009 at 9:51 am
Hi all,
Jeff, thanks for your patronizing complements... Just kidding. I sincerely thank you. I received some helpful replies particularly Nigel's.
Nigel, this is an excellent solution. I wonder though how it will perform on 12 million records. I think 4 - 5 minutes is probably an acceptable amount of time.
Perhaps we could make this execute faster if we restrict the time frame to the 2 year period at the top in the WHERE clause instead of the HAVING clause. I'm not sure how to do this, though
aktikt
August 5, 2009 at 10:02 am
aktikt (8/5/2009)
Hi all,Jeff, thanks for your patronizing complements... Just kidding. I sincerely thank you. I received some helpful replies particularly Nigel's.
Nigel, this is an excellent solution. I wonder though how it will perform on 12 million records. I think 4 - 5 minutes is probably an acceptable amount of time.
Perhaps we could make this execute faster if we restrict the time frame to the 2 year period at the top in the WHERE clause instead of the HAVING clause. I'm not sure how to do this, though
aktikt
Ummm.... I'm thinking that the solution will bog a bit on 12 million rows because of all the individual date rows that the Tally table spawns. It'll also need a pretty good size TempDB. I believe we'll need to break it down by month instead of breaking it down by day and aggregating the days. I can't get to it right now (on my way to a funeral for a friend) but I'll see if I can come up with something later on... heh... Nigel will probably beat me to it, though.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 27 total)
You must be logged in to reply to this topic. Login to reply