May 31, 2012 at 8:37 am
SQL Kiwi (5/31/2012)
Lynn Pettis (5/31/2012)
It happens to be the one that Paul found with his solution.Yes I highlighted that difference in my first post. I wonder if Greg missed it.
Nope, I saw it, but got tied up with something. I hope my description of that record to Lynn will clarify things a bit. Thanks for the great links too.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
May 31, 2012 at 8:37 am
Set-based iteration solution:
-- Important index
CREATE UNIQUE INDEX
[UQ dbo.PAYMENTS ID_CASH_TRAN, ACCT_NUM, TS_PMT (AT_PMT, CD_PMT_SRC)]
ON dbo.PAYMENTS
(ID_CASH_TRAN, ACCT_NUM, TS_PMT)
INCLUDE (AT_PMT, CD_PMT_SRC);
GO
-- Set-based iteration holding table
DECLARE @Results AS TABLE
(
RankID bigint NOT NULL,
ACCT_NUM decimal(10,0) NOT NULL,
TS_PMT datetime2(6) NOT NULL DEFAULT '0001-01-01',
AT_PMT decimal(11,2) NOT NULL DEFAULT 0.00,
CD_PMT_SRC char(2) NOT NULL DEFAULT 'XX',
TS_NSF datetime2(6) NOT NULL,
AT_NSF decimal(11,2) NOT NULL,
CD_PMT_SRC_NSF char(2) NOT NULL,
PRIMARY KEY CLUSTERED (RankID, ACCT_NUM)
);
-- Add all NSF records
INSERT @Results
(
RankID,
ACCT_NUM,
TS_NSF,
AT_NSF,
CD_PMT_SRC_NSF
)
SELECT
RANK() OVER (
PARTITION BY nsf.ACCT_NUM
ORDER BY nsf.TS_PMT),
nsf.ACCT_NUM,
nsf.TS_PMT,
nsf.AT_PMT,
nsf.CD_PMT_SRC
FROM dbo.PAYMENTS AS nsf
WHERE
nsf.ID_CASH_TRAN = '5008';
-- Set-based iteration variable
DECLARE
@Rank bigint = 1;
-- Find PMT for the first NSF
-- for each account (rank 1)
UPDATE r
SET
TS_PMT = pmt.TS_PMT,
AT_PMT = pmt.AT_PMT,
CD_PMT_SRC = pmt.CD_PMT_SRC
FROM @Results AS r
CROSS APPLY
(
-- Find the PMT
SELECT TOP (1)
pmt.TS_PMT,
pmt.AT_PMT,
pmt.CD_PMT_SRC
FROM dbo.PAYMENTS AS pmt
WHERE
pmt.ID_CASH_TRAN = '5000'
AND pmt.ACCT_NUM = r.ACCT_NUM
AND pmt.AT_PMT = 0.0 - r.AT_NSF
AND pmt.TS_PMT <= r.TS_NSF
AND pmt.TS_PMT >= DATEADD(DAY,-60, r.TS_NSF)
ORDER BY
pmt.TS_PMT ASC
) AS pmt
WHERE
r.RankID = @Rank;
-- Loop: fills in all PMTs
-- for Rank 2, then Rank 3...
-- Note all accounts at the same
-- rank processed on each iteration
WHILE 1 = 1
BEGIN
SET @Rank += 1;
UPDATE r
SET
TS_PMT = pmt.TS_PMT,
AT_PMT = pmt.AT_PMT,
CD_PMT_SRC = pmt.CD_PMT_SRC
FROM @Results AS r
JOIN @Results AS prev ON
prev.RankID = @Rank - 1
AND prev.ACCT_NUM = r.ACCT_NUM
CROSS APPLY
(
-- PMT associated with the NSF
SELECT TOP (1)
pmt.TS_PMT,
pmt.AT_PMT,
pmt.CD_PMT_SRC
FROM dbo.PAYMENTS AS pmt
WHERE
pmt.ID_CASH_TRAN = '5000'
AND pmt.ACCT_NUM = r.ACCT_NUM
AND pmt.AT_PMT = 0.0 - r.AT_NSF
AND pmt.TS_PMT > prev.TS_PMT
AND pmt.TS_PMT <= r.TS_NSF
AND pmt.TS_PMT >= DATEADD(DAY,-60, r.TS_NSF)
ORDER BY
pmt.TS_PMT ASC
) AS pmt
WHERE
r.RankID = @Rank;
IF @@ROWCOUNT = 0 BREAK;
END;
SELECT
r.ACCT_NUM,
r.TS_PMT,
r.AT_PMT,
r.CD_PMT_SRC,
r.TS_NSF,
r.AT_NSF,
r.CD_PMT_SRC_NSF
FROM @Results AS r
ORDER BY
r.ACCT_NUM,
r.TS_PMT;
The logic here should be the same as the recursive CTE, but may be easier to understand. It is based on Hugo Kornelis' set-based iteration method shown in the original MVP Deep Dives book. It will be interesting to hear how it performs compared to the rCTE. Be sure not to have actual execution plan on when you run this!
ACCT_NUMTS_PMTAT_PMTCD_PMT_SRCTS_NSFAT_NSFCD_PMT_SRC_NSF
371728062010-12-23 20:21:44.728266-500.00CW2010-12-28 18:50:45.769860500.00GC
371728062010-12-23 20:21:44.809691-500.00CW2011-01-04 18:18:20.206875500.00GC
371728062010-12-23 20:21:44.816891-500.00CW2011-01-04 18:18:20.990343500.00GC
371728062010-12-23 20:21:44.823681-3236.81TP2011-01-11 20:23:32.7808913236.81TP
932999722011-09-22 20:22:23.081475-250.00CW2011-09-26 18:14:24.812516250.00GC
932999722011-11-04 20:24:28.590740-500.00CW2011-11-09 18:12:13.357006500.00GC
932999722011-11-04 20:24:28.872289-500.00CW2011-11-09 18:12:13.843398500.00GC
932999722011-11-04 20:24:28.886868-142.00CW2011-11-09 18:12:13.884036142.00GC
932999722011-12-01 20:22:07.649355-500.00CW2011-12-05 18:16:01.951365500.00GC
932999722011-12-01 20:22:07.818174-500.00CW2011-12-05 18:16:02.528679500.00GC
1147397902011-04-11 20:22:56.769630-379.25CW2011-04-15 18:14:55.295335379.25GC
1147397902011-04-13 20:22:27.659510-500.00CW2011-04-20 18:16:57.222716500.00GC
1147397902011-04-13 20:22:27.885458-500.00CW2011-04-20 18:16:57.795456500.00GC
2324123202011-10-24 20:23:16.712385-600.00CW2011-11-02 18:18:20.336650600.00GC
2324123202011-10-24 20:23:16.809497-600.00CW2011-11-02 18:18:21.192792600.00GC
2324123202011-10-24 20:23:16.818424-381.00CW2011-11-02 18:18:21.295221381.00GC
6928094112010-12-17 18:22:02.569107-190.00CF2010-12-28 19:01:05.482708190.00CF
6928094112011-03-11 18:26:00.625220-190.00CF2011-03-21 18:27:31.321997190.00CF
6928094112011-07-01 18:24:40.453690-190.00CF2011-07-12 18:33:10.543816190.00CF
6928094112011-07-29 18:29:23.825201-190.00CF2011-08-08 18:27:35.606194190.00CF
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 31, 2012 at 8:44 am
Greg Snidow (5/31/2012)
Lynn Pettis (5/31/2012)
Greg,Last night while working n this I did find a problem with your solution. Based on your requirements, one of the records you return is incorrect. It happens to be the one that Paul found with his solution.
Your Results:
ACCT_NUMTS_PMTAT_PMTCD_PMT_SRCTS_NSFAT_NSFCD_PMT_SRC
6928094112010-12-20 18:20:41.767741-190.00CF2010-12-28 19:01:05.482708190.00CF
Your sample data:
ACCT_NUMTS_PMTAT_PMT ID_CASH_TRANCD_PMT_SRCRowNum
6928094112010-12-17 18:22:02.569107-190.005000CF1 <-- Should return this
6928094112010-12-20 18:20:41.767741-190.005000CF2 <-- You return this this
6928094112010-12-28 19:01:05.482708 190.005008CF1
Ahh, and therein may lie the problem. By looking at the above, the customer made two $190 payments, one on 12/17, and one on 12/20, then had a NSF on 12/28. One of the assumptions is that a NSF must be tied to the most recent instance of a payment in that amount. So, sinced the payment of $190 on 12/20 is more recent than the one made on 12/17, the NSF must be matched to the payment made on 12/20. It is easy to be confused by the multiple payments in the same dollar amount, but we have many customers who are on a levelized payment plan, so they pay the exact same amount every month. We then have customers who take their monthly amount and break it into several equal parts they send in throughout the month, sometimes just days apart.
Doesn't match with this from your original post:
Also, the NSF transactions are assumed to occur in the same order as the payments. For example, if customer 1 has two $500 payments on the same day, then has two NSF records a week later, the first NSF is to be matched with the first payment, and the second NSF is to be matched with the second payment.
Edit: Nevermind, the above says for payments made the same day. Sorry, just got out of the Dentists office and not thinking clearly.
May 31, 2012 at 8:46 am
SQL Kiwi (5/31/2012)
Greg Snidow (5/31/2012)
I don't get it Paul. I took 10 of the accounts that had records in my result set and not yours, and ran both of our code on only those accounts, and the results were identical.There will be a logical reason (or perhaps you have ghosts, can't say for sure from here). Did you persist your 60,114 rows and my 59,252 rows in a table? It ought to be possible to quickly find differences from those, and check manually whether the rows belong in the result set or not. I'm sure it's not as easy as that, though. Oh, and I agree lids are vital.
Paul
I did put the results into tables and compared the differences. It's very strange in that when I take some of the accounts missing from your results, then feed them through your solution separately, it works fine. In fact, I was going to load the data from some of them here for you to see for yourself, but then I thought I ought to try it first to see what happens, and indeed there was no problem with your output. I'm flumoxed, but should have more time later today or tomorrow. I'll have to completely break down you code to see if I can re-create it before I fully understand it. Like Dwain said, I'm very recursively challenged. I thought there might be a way with this one, but I did not really expect such great errort. Thank you all so much.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
May 31, 2012 at 8:49 am
Greg Snidow (5/31/2012)
I'll have to completely break down you code to see if I can re-create it before I fully understand it. Like Dwain said, I'm very recursively challenged.
Have a look at the set-based iteration alternative I just posted. The concepts are very similar, but you might find it easier to grok that before fully tackling the rCTE head-on.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 31, 2012 at 8:53 am
Lynn Pettis (5/31/2012)
Doesn't match with this from your original post:
Also, the NSF transactions are assumed to occur in the same order as the payments. For example, if customer 1 has two $500 payments on the same day, then has two NSF records a week later, the first NSF is to be matched with the first payment, and the second NSF is to be matched with the second payment.
Edit: Nevermind, the above says for payments made the same day. Sorry, just got out of the Dentists office and not thinking clearly.
Yes, I realize now I could have been way more clear in my requirements. I hope now it is more clear. Thank you, and I apologize for the confusion.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
May 31, 2012 at 9:06 am
Greg Snidow (5/31/2012)
Lynn Pettis (5/31/2012)
Doesn't match with this from your original post:
Also, the NSF transactions are assumed to occur in the same order as the payments. For example, if customer 1 has two $500 payments on the same day, then has two NSF records a week later, the first NSF is to be matched with the first payment, and the second NSF is to be matched with the second payment.
Edit: Nevermind, the above says for payments made the same day. Sorry, just got out of the Dentists office and not thinking clearly.
Yes, I realize now I could have been way more clear in my requirements. I hope now it is more clear. Thank you, and I apologize for the confusion.
See, we aren't that much different from our users. :w00t: 😛
By the way, not a problem.
May 31, 2012 at 9:40 am
Greg Snidow (5/31/2012)
I hope my description of that record to Lynn will clarify things a bit.
Aha, ok this is an easy tweak - we just need to traverse the records in the reverse order. Updated set-based iteration code below:
-- Set-based iteration holding table
DECLARE @Results AS TABLE
(
RankID bigint NOT NULL,
ACCT_NUM decimal(10,0) NOT NULL,
TS_PMT datetime2(6) NOT NULL DEFAULT '0001-01-01',
AT_PMT decimal(11,2) NOT NULL DEFAULT 0.00,
CD_PMT_SRC char(2) NOT NULL DEFAULT 'XX',
TS_NSF datetime2(6) NOT NULL,
AT_NSF decimal(11,2) NOT NULL,
CD_PMT_SRC_NSF char(2) NOT NULL,
PRIMARY KEY CLUSTERED (RankID, ACCT_NUM)
);
-- Add all NSF records
INSERT @Results
(
RankID,
ACCT_NUM,
TS_NSF,
AT_NSF,
CD_PMT_SRC_NSF
)
SELECT
RANK() OVER (
PARTITION BY nsf.ACCT_NUM
ORDER BY nsf.TS_PMT DESC),
nsf.ACCT_NUM,
nsf.TS_PMT,
nsf.AT_PMT,
nsf.CD_PMT_SRC
FROM dbo.PAYMENTS AS nsf
WHERE
nsf.ID_CASH_TRAN = '5008';
-- Set-based iteration variable
DECLARE
@Rank bigint = 1;
-- Find PMT for the last NSF
-- for each account (rank 1)
UPDATE r
SET
TS_PMT = pmt.TS_PMT,
AT_PMT = pmt.AT_PMT,
CD_PMT_SRC = pmt.CD_PMT_SRC
FROM @Results AS r
CROSS APPLY
(
-- Find the PMT
SELECT TOP (1)
pmt.TS_PMT,
pmt.AT_PMT,
pmt.CD_PMT_SRC
FROM dbo.PAYMENTS AS pmt
WHERE
pmt.ID_CASH_TRAN = '5000'
AND pmt.ACCT_NUM = r.ACCT_NUM
AND pmt.AT_PMT = 0.0 - r.AT_NSF
AND r.TS_NSF BETWEEN pmt.TS_PMT AND DATEADD(DAY,60, pmt.TS_PMT)
ORDER BY
pmt.TS_PMT DESC
) AS pmt
WHERE
r.RankID = @Rank;
-- Loop: fills in all PMTs
-- for Rank 2, then Rank 3...
-- Note all accounts at the same
-- rank processed on each iteration
WHILE 1 = 1
BEGIN
SET @Rank += 1;
UPDATE r
SET
TS_PMT = pmt.TS_PMT,
AT_PMT = pmt.AT_PMT,
CD_PMT_SRC = pmt.CD_PMT_SRC
FROM @Results AS r
JOIN @Results AS prev ON
prev.RankID = @Rank - 1
AND prev.ACCT_NUM = r.ACCT_NUM
CROSS APPLY
(
-- PMT associated with the NSF
SELECT TOP (1)
pmt.TS_PMT,
pmt.AT_PMT,
pmt.CD_PMT_SRC
FROM dbo.PAYMENTS AS pmt
WHERE
pmt.ID_CASH_TRAN = '5000'
AND pmt.ACCT_NUM = r.ACCT_NUM
AND pmt.AT_PMT = 0.0 - r.AT_NSF
AND pmt.TS_PMT < prev.TS_PMT
AND r.TS_NSF BETWEEN pmt.TS_PMT AND DATEADD(DAY,60, pmt.TS_PMT)
ORDER BY
pmt.TS_PMT DESC
) AS pmt
WHERE
r.RankID = @Rank;
IF @@ROWCOUNT = 0 BREAK;
END;
SELECT
r.ACCT_NUM,
r.TS_PMT,
r.AT_PMT,
r.CD_PMT_SRC,
r.TS_NSF,
r.AT_NSF,
r.CD_PMT_SRC_NSF
FROM @Results AS r
ORDER BY
r.ACCT_NUM,
r.TS_PMT;
ACCT_NUMTS_PMTAT_PMTCD_PMT_SRCTS_NSFAT_NSFCD_PMT_SRC_NSF
371728062010-12-23 20:21:44.728266-500.00CW2010-12-28 18:50:45.769860500.00GC
371728062010-12-23 20:21:44.809691-500.00CW2011-01-04 18:18:20.206875500.00GC
371728062010-12-23 20:21:44.816891-500.00CW2011-01-04 18:18:20.990343500.00GC
371728062010-12-23 20:21:44.823681-3236.81TP2011-01-11 20:23:32.7808913236.81TP
932999722011-09-22 20:22:23.081475-250.00CW2011-09-26 18:14:24.812516250.00GC
932999722011-11-04 20:24:28.590740-500.00CW2011-11-09 18:12:13.357006500.00GC
932999722011-11-04 20:24:28.872289-500.00CW2011-11-09 18:12:13.843398500.00GC
932999722011-11-04 20:24:28.886868-142.00CW2011-11-09 18:12:13.884036142.00GC
932999722011-12-01 20:22:07.649355-500.00CW2011-12-05 18:16:01.951365500.00GC
932999722011-12-01 20:22:07.818174-500.00CW2011-12-05 18:16:02.528679500.00GC
1147397902011-04-11 20:22:56.769630-379.25CW2011-04-15 18:14:55.295335379.25GC
1147397902011-04-13 20:22:27.659510-500.00CW2011-04-20 18:16:57.222716500.00GC
1147397902011-04-13 20:22:27.885458-500.00CW2011-04-20 18:16:57.795456500.00GC
2324123202011-10-24 20:23:16.712385-600.00CW2011-11-02 18:18:20.336650600.00GC
2324123202011-10-24 20:23:16.809497-600.00CW2011-11-02 18:18:21.192792600.00GC
2324123202011-10-24 20:23:16.818424-381.00CW2011-11-02 18:18:21.295221381.00GC
6928094112010-12-20 18:20:41.767741-190.00CF2010-12-28 19:01:05.482708190.00CF *** fixed! ***
6928094112011-03-11 18:26:00.625220-190.00CF2011-03-21 18:27:31.321997190.00CF
6928094112011-07-01 18:24:40.453690-190.00CF2011-07-12 18:33:10.543816190.00CF
6928094112011-07-29 18:29:23.825201-190.00CF2011-08-08 18:27:35.606194190.00CF
It's late/early. I'll fix up the rCTE tomorrow.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 31, 2012 at 9:48 am
SQL Kiwi (5/31/2012)
Greg Snidow (5/31/2012)
I hope my description of that record to Lynn will clarify things a bit.Aha, ok this is an easy tweak - we just need to traverse the records in the reverse order. Updated set-based iteration code below:
Thank you Paul. That is why in my ROW_NUMBER() I ordered by TS_PMT DESC. When I did it ASC I got screwy results. I'll let you know how it comes out a little later. Thank you for sticking with it. I don't know what I would do if it were not for SSC and all the good folks here willing to help with other's issues.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
June 1, 2012 at 1:03 am
Updated recursive solution:
WITH LastNSF AS
(
-- Last NSF per account
SELECT
nsf.ACCT_NUM,
nsf.AT_PMT,
nsf.TS_PMT,
nsf.CD_PMT_SRC
FROM dbo.PAYMENTS AS nsf
WHERE
nsf.ID_CASH_TRAN = '5008'
AND nsf.TS_PMT =
(
SELECT
MAX(nsf2.TS_PMT)
FROM dbo.PAYMENTS AS nsf2
WHERE
nsf2.ID_CASH_TRAN = nsf.ID_CASH_TRAN
AND nsf2.ACCT_NUM = nsf.ACCT_NUM
)
), rCTE AS
(
-- Anchor: last NSF and matching PMT per account
SELECT
pmt.ACCT_NUM,
pmt.TS_PMT,
pmt.AT_PMT,
pmt.CD_PMT_SRC,
TS_NSF = LastNSF.TS_PMT,
AT_NSF = LastNSF.AT_PMT,
CD_PMT_SRC_NSF = LastNSF.CD_PMT_SRC
FROM LastNSF
CROSS APPLY
(
-- PMT associated with the first NSF
SELECT TOP (1)
pmt.ACCT_NUM,
pmt.TS_PMT,
pmt.AT_PMT,
pmt.CD_PMT_SRC
FROM dbo.PAYMENTS AS pmt
WHERE
pmt.ID_CASH_TRAN = '5000'
AND pmt.ACCT_NUM = LastNSF.ACCT_NUM
AND pmt.AT_PMT = 0.0 - LastNSF.AT_PMT
AND LastNSF.TS_PMT BETWEEN pmt.TS_PMT AND DATEADD(DAY, 60, pmt.TS_PMT)
ORDER BY
pmt.TS_PMT DESC
) AS pmt
UNION ALL
-- Recusrive bit
SELECT
PMT.ACCT_NUM,
PMT.TS_PMT,
PMT.AT_PMT,
PMT.CD_PMT_SRC,
TS_NSF = PreviousNSF.TS_PMT,
AT_NSF = PreviousNSF.AT_PMT,
CD_PMT_SRC_NSF = PreviousNSF.CD_PMT_SRC
FROM
(
-- Previous NSF in sequence
SELECT * FROM
(
SELECT
PreviousNSF.ACCT_NUM,
AT_PMT = PreviousNSF.AT_PMT,
PreviousNSF.CD_PMT_SRC,
PreviousNSF.TS_PMT,
Recusrive_TS_PMT = rCTE.TS_PMT,
PNSF60 = DATEADD(DAY, -60, PreviousNSF.TS_PMT),
rn = ROW_NUMBER() OVER (ORDER BY PreviousNSF.TS_PMT DESC)
FROM dbo.PAYMENTS AS PreviousNSF
JOIN rCTE ON
rCTE.ACCT_NUM = PreviousNSF.ACCT_NUM
AND PreviousNSF.TS_PMT < rCTE.TS_NSF
WHERE
PreviousNSF.ID_CASH_TRAN = '5008'
) AS PreviousNSF
WHERE
PreviousNSF.rn = 1
) AS PreviousNSF
CROSS APPLY
(
-- PMT associated with the current NSF
SELECT *
FROM
(
SELECT
PMT.ACCT_NUM,
PMT.TS_PMT,
PMT.AT_PMT,
PMT.CD_PMT_SRC,
rn = ROW_NUMBER() OVER (ORDER BY PMT.TS_PMT DESC)
FROM dbo.PAYMENTS AS PMT
WHERE
PMT.ID_CASH_TRAN = '5000'
AND PMT.ACCT_NUM = PreviousNSF.ACCT_NUM
AND PMT.AT_PMT = 0.0 - PreviousNSF.AT_PMT
AND PMT.TS_PMT < PreviousNSF.Recusrive_TS_PMT
AND PMT.TS_PMT <= PreviousNSF.TS_PMT
AND PMT.TS_PMT >= PreviousNSF.PNSF60
) AS PMT
WHERE
PMT.rn = 1
) AS PMT
)
SELECT
ACCT_NUM,
TS_PMT,
AT_PMT,
CD_PMT_SRC,
TS_NSF,
AT_NSF,
CD_PMT_SRC_NSF
FROM rCTE
ORDER BY
ACCT_NUM,
TS_PMT
OPTION (MAXRECURSION 0);
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
June 1, 2012 at 5:27 am
SQL Kiwi (6/1/2012)
Updated recursive solution:
Thank you so much Paul. I get everything down to the recursive bit. It's going to take me some time to digest the rest. I'm still trying to wrap my head around APPLY. Last night I messed around with using it in some simple queries a bit, and I can see it will require a whole new way of thinking. Thank you for introducing this to me.
Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
Viewing 11 posts - 31 through 40 (of 40 total)
You must be logged in to reply to this topic. Login to reply