August 24, 2006 at 10:14 am
I have a problem I can’t figure out and I’m sure someone here can enlighten me. First, let me setup my scenario.
I have the following table:
CREATE TABLE dbo.table1(
Acct_nbr int,
Month_ind int,
end_balance decimal(15,2),
credit_amt decimal(15,2),
debit_amt decimal(15,2),
amt_change decimal(15,2))
I insert data using the following:
INSERT INTO dbo.table1 VALUES(123, 1, 0, 10, -5, 5)
INSERT INTO dbo.table1 VALUES(123, 4, 0, 7, 0, 7)
INSERT INTO dbo.table1 VALUES(123, 5, 0, 12, -5, 7)
INSERT INTO dbo.table1 VALUES(123, 17, 0, 3, -3, 0)
INSERT INTO dbo.table1 VALUES(789, 7, 0, 9, 0, 9)
INSERT INTO dbo.table1 VALUES(789, 12, 0, 53, -10, 43)
INSERT INTO dbo.table1 VALUES(789, 13, 0, 25, 0, 25)
INSERT INTO dbo.table1 VALUES(789, 14, 0, 5, -17, -12)
INSERT INTO dbo.table1 VALUES(789, 19, 0, 38, -17, 21)
INSERT INTO dbo.table1 VALUES(789, 25, 0, 14, -11, 3)
Now the data looks like the following:
Acct_nbr | Month_ind | End_balance | Credit_amt | Debit_amt | Amt_change |
123 | 1 | 0 | 10 | -5 | 5 |
123 | 4 | 0 | 7 | 0 | 7 |
123 | 5 | 0 | 12 | -5 | 7 |
123 | 17 | 0 | 3 | -3 | 0 |
789 | 7 | 0 | 9 | 0 | 9 |
789 | 12 | 0 | 53 | -10 | 43 |
789 | 13 | 0 | 25 | 0 | 25 |
789 | 14 | 0 | 5 | -17 | -12 |
789 | 19 | 0 | 38 | -17 | 21 |
789 | 25 | 0 | 14 | -11 | 3 |
I want to do two things. One, I need to fill in the gaps in the month_ind field for each account. These entries will have zeroes in all of the fields except acct_nbr and month_ind. Two, I then want to update the end_balance by taking the previous month’s end_balance plus the current month’s amt_change and keep this running total specific to each account
Then final result will look like the following:
Acct_nbr | Month_ind | End_balance | Credit_amt | Debit_amt | Amt_change |
123 | 1 | 5 | 10 | -5 | 5 |
123 | 2 | 5 | 0 | 0 | 0 |
123 | 3 | 5 | 0 | 0 | 0 |
123 | 4 | 12 | 7 | 0 | 7 |
123 | 5 | 19 | 12 | -5 | 7 |
123 | 6 | 19 | 0 | 0 | 0 |
123 | 7 | 19 | 0 | 0 | 0 |
123 | 8 | 19 | 0 | 0 | 0 |
123 | 9 | 19 | 0 | 0 | 0 |
123 | 10 | 19 | 0 | 0 | 0 |
123 | 11 | 19 | 0 | 0 | 0 |
123 | 12 | 19 | 0 | 0 | 0 |
123 | 13 | 19 | 0 | 0 | 0 |
123 | 14 | 19 | 0 | 0 | 0 |
123 | 15 | 19 | 0 | 0 | 0 |
123 | 16 | 19 | 0 | 0 | 0 |
123 | 17 | 19 | 3 | -3 | 0 |
789 | 7 | 9 | 9 | 0 | 9 |
789 | 8 | 9 | 0 | 0 | 0 |
789 | 9 | 9 | 0 | 0 | 0 |
789 | 10 | 9 | 0 | 0 | 0 |
789 | 11 | 9 | 0 | 0 | 0 |
789 | 12 | 52 | 53 | -10 | 43 |
789 | 13 | 77 | 25 | 0 | 25 |
789 | 14 | 65 | 5 | -17 | -12 |
789 | 15 | 65 | 0 | 0 | 0 |
789 | 16 | 65 | 0 | 0 | 0 |
789 | 17 | 65 | 0 | 0 | 0 |
789 | 18 | 65 | 0 | 0 | 0 |
789 | 19 | 76 | 38 | -17 | 21 |
789 | 20 | 76 | 0 | 0 | 0 |
789 | 21 | 76 | 0 | 0 | 0 |
789 | 22 | 76 | 0 | 0 | 0 |
789 | 23 | 76 | 0 | 0 | 0 |
789 | 24 | 76 | 0 | 0 | 0 |
789 | 25 | 79 | 14 | -11 | 3 |
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
August 24, 2006 at 11:03 am
Being a programmer, I like to tackle problems like this with cursors first and then massage it into a set-based (a.k.a. "good" or "real") solution.
With that in mind, here you go:
--loop the accounts
DECLARE acct CURSOR LOCAL FAST_FORWARD FOR
SELECT Acct_nbr FROM dbo.table1
DECLARE @acctid INT
DECLARE @month_min INT, @month_max INT
DECLARE @m INT
DECLARE @prev_bal decimal(15,2)
OPEN acct
FETCH NEXT FROM acct INTO @acctid
WHILE @@FETCH_STATUS = 0
BEGIN
--get the start and end months for the account
SELECT
@month_min = MIN(Month_ind)
, @month_max = MAX(Month_ind)
FROM dbo.table1
WHERE Acct_nbr = @acctid
--starting month
SET @m = @month_min
--loop each month
WHILE @m <= @month_max
BEGIN
--get the previous end_balance
-- SELECT @prev_bal = ISNULL(SUM(credit_amt + debit_amt), 0)
SELECT @prev_bal = ISNULL(SUM(amt_change), 0)
FROM dbo.table1
WHERE
Acct_nbr = @acctid AND
Month_ind < @m
--does this month exist for this account?
IF EXISTS (
SELECT Acct_nbr
FROM dbo.table1
WHERE
Acct_nbr = @acctid AND
Month_ind = @m
)
--update the end_balance
UPDATE dbo.table1 SET
end_balance = @prev_bal
WHERE
Acct_nbr = @acctid AND
Month_ind = @m
ELSE --insert the month
INSERT INTO dbo.table1 (
Acct_nbr,
Month_ind,
end_balance,
credit_amt,
debit_amt,
amt_change
) VALUES (
@acctid,
@m,
@prev_bal,
0,
0,
0
)
--next month
END
--next account
FETCH NEXT FROM acct INTO @acctid
END
CLOSE acct
--all done
SELECT * FROM dbo.table1 ORDER BY Acct_nbr, Month_ind
August 24, 2006 at 11:13 am
Thanks for the response. I too am a programmer at heart so the procedural approach is the easiest but, and I forgot to mention this, the problem there is that this job will have to run every night and it will be dealing with millions of records. Right now it stands at 3.2 million records (without the records I need to add). This number is going to grow dramatically as we start loading our new data and keeping historical data.
With that said, I am hoping to find some usage of set-based queries, as much as possible, to solve this problem. I don't think I can avoid some kind of looping (whether cursor or a WHILE loop) but I am hoping to make this as fast as possible.
hawg
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
August 24, 2006 at 11:17 am
Will you really have a month indicator > 12...last time I looked at the calendar...December finished the year...you have 17 + 25. What would be the max month in your sequence...if the gap was at the end of the sequence?
I think the best way would be to have a tally/numbers table with the numbers 1-12 are sperate records...and join to it trying to fin the records which don't exist....and creating empty ones, and then resolve the movement records afterewards.
On a large dataset or with a repeatable exercise....it's best to stay away from cursors....unless absolutely necessary.
August 24, 2006 at 11:20 am
Oops, my bad. I was trying to setup a sample of what I am doing because the actual tables and data has many fields and combinations. I was trying to put together a simple sample to illustrate my problem.
Ignore the fact that I called it month_ind and just assume it is some kind of counter. The title, for this sample, really is insignificant.
This was just my error.
hawg
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
August 24, 2006 at 11:22 am
Are you going to need to process the "end_balance" every time? Can you initialize with NULL and then ignore on subsequent runs, or can it change?
August 24, 2006 at 11:25 am
My initial data is in a staging table and I am having to write the data to a new table.
I have to truncate the new table every time and reload so, yes, I have to re-calculate the end_balance every time.
----------------------------------------------------------01010011010100010100110000100000010100110110010101110010011101100110010101110010001000000101001001101111011000110110101101110011
August 24, 2006 at 12:13 pm
Aw yeah!!! Well, I took care of the inserting missing rows dealio. With some freakin' crazy idea that hit me upside the head. *The update calculations have dependencies on other rows in the same table, so you will have to incrementally update these. I used a cursor for each month, but updated all account records for the month.
SET NOCOUNT ON; CREATE TABLE #table1 ( Acct_nbr INT NOT NULL, Month_ind INT NOT NULL, end_balance DECIMAL(15, 2) NOT NULL DEFAULT(0), credit_amt DECIMAL(15, 2) NOT NULL DEFAULT(0), debit_amt DECIMAL(15, 2) NOT NULL DEFAULT(0), amt_change DECIMAL(15, 2) NOT NULL DEFAULT(0), PRIMARY KEY ( Acct_nbr, Month_ind ) ); --populate the test table INSERT INTO #table1 ( Acct_nbr, Month_ind, end_balance, credit_amt, debit_amt, amt_change ) SELECT 123, 1, 0, 10, -5, 5 UNION ALL SELECT 123, 4, 0, 7, 0, 7 UNION ALL SELECT 123, 5, 0, 12, -5, 7 UNION ALL SELECT 123, 17, 0, 3, -3, 0 UNION ALL SELECT 789, 7, 0, 9, 0, 9 UNION ALL SELECT 789, 12, 0, 53, -10, 43 UNION ALL SELECT 789, 13, 0, 25, 0, 25 UNION ALL SELECT 789, 14, 0, 5, -17, -12 UNION ALL SELECT 789, 19, 0, 38, -17, 21 UNION ALL SELECT 789, 25, 0, 14, -11, 3; --per account the smallest and largest months in the set SELECT Acct_nbr, MIN(Month_ind) AS Month_ind_min, MAX(Month_ind) AS Month_ind_max INTO #acct_month_range FROM #table1 GROUP BY Acct_nbr; DECLARE @min_Month_ind INT, @max_Month_ind INT; --get the smallest and largest months in the set SELECT @min_Month_ind = MIN(Month_ind_min), @max_Month_ind = MAX(Month_ind_max) FROM #acct_month_range; --store all of the possible months CREATE TABLE #all_months ( Month_ind INT NOT NULL PRIMARY KEY ); DECLARE @Month_ind INT; SET @Month_ind = @min_Month_ind; --start at the min WHILE @Month_ind prev.Month_ind GROUP BY curr.Acct_nbr, curr.Month_ind, curr.amt_change ) calc WHERE #table1.Acct_nbr = calc.Acct_nbr AND #table1.Month_ind = calc.Month_ind; SELECT * FROM #table1; --clean up DROP TABLE #acct_month_range; DROP TABLE #all_months; DROP TABLE #table1;
August 24, 2006 at 1:01 pm
I happen to read this articles on "Joy of Number" and had some explaination about avoiding cursors. So Here is how we can avoid cursors.
1.Create a table to hold Numbers. All this holds is incremental numbers.
2.Load the table with Numbers till you reach the max of month
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[Numbers_Loop]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[Numbers_Loop]
GO
CREATE TABLE [dbo].[Numbers_Loop] (
[Num] [int] NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Numbers_Loop] WITH NOCHECK ADD
CONSTRAINT [PK_Numbers_Loop] PRIMARY KEY CLUSTERED
([Num]) ON [PRIMARY]
GO
Declare @i int;
Declare @i_Max int
Select @i_Max = max(Month_ind) from table1
Select @i = 1;
While @i <= @i_Max
Begin
If Not exists (Select 1 from Numbers_Loop where Num = @i)
Begin
Insert INTO Numbers_Loop(Num)
select @i
End
Select @i = @i + 1;
End;
3.Using the "Numbers" table as the starting point will get us the "Missing Months". So load the data along with "Missing Months" into temp table.
select MinMax.Acct_Nbr,Numbers_1.Num as Month_ind,--MinMax.MinID,MinMax.MaxID,
isnull(table1.credit_amt,0) as credit_amt,
isnull(table1.debit_amt,0) as debit_amt,
isnull(table1.amt_change,0) as amt_change
into #tmpFillUp
from Numbers_1 JOIN ( select Acct_Nbr,min(Month_ind) as MinID,max(Month_ind) as MaxID
from table1
group by Acct_Nbr) MinMax
On Numbers_1.Num between MinID and MaxID
LEFT OUTER JOIN table1 ON Numbers_1.Num = table1.Month_ind and MinMax.Acct_Nbr = table1.Acct_Nbr
order by 1,2
4.Run a query to summarize the ending balance.
select Acct_Nbr,Month_ind,
(Select sum(amt_change) from #tmpFillUp EndBalance
where #tmpFillUp.Acct_Nbr = EndBalance.Acct_Nbr
and EndBalance.Month_ind <= #tmpFillUp.Month_ind) as end_balance,
Credit_amt,debit_amt,amt_change
from #tmpFillUp
Hope this helps.
Thanks
Sreejith
August 24, 2006 at 1:15 pm
Step #4 is an excellent set-based method for determining the end_balance for the current month. May I also suggest that you eliminate the column "amt_change" and use "SUM(credit_amt + debit_amt)" if possible? If other queries depend on "amt_change", then may I also suggest using a VIEW?
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply