determine which transactions add up to a specific value?

Question

Post reply

determine which transactions add up to a specific value?

Lowell

SSC Guru

Points: 323518
More actions
October 2, 2012 at 2:56 pm

#282606

I swear I've done this before, but I'm drawing a blank today.
in this example, one or more detail transactions should add up together to be a specific value; but it's not known which transactions should be included.
so in the example below, i have 23 individual transactions, and some subset of them will add up to exactly 244,194.49
can anyone point me to an example where they determine the transactions involved, since they would add up to a specific value?
With MySampleData(RID,VAL)
AS
(
SELECT 2172, 429.00 UNION ALL
SELECT 2173, 20555.00 UNION ALL
SELECT 2174, 5855.71 UNION ALL
SELECT 2175, 13892.89 UNION ALL
SELECT 2194, 36175.10 UNION ALL
SELECT 2188, 5451.32 UNION ALL
SELECT 2189, 566.00 UNION ALL
SELECT 2190, 706.40 UNION ALL
SELECT 2191, 19344.21 UNION ALL
SELECT 2192, 16600.00 UNION ALL
SELECT 2193, 52775.10 UNION ALL
SELECT 2182, 13320.34 UNION ALL
SELECT 2183, 1454.00 UNION ALL
SELECT 2184, 15960.00 UNION ALL
SELECT 2185, 18365.00 UNION ALL
SELECT 2186, 18646.00 UNION ALL
SELECT 2187, 21279.72 UNION ALL
SELECT 2176, 17525.00 UNION ALL
SELECT 2177, 19796.00 UNION ALL
SELECT 2178, 20573.00 UNION ALL
SELECT 2179, 21064.00 UNION ALL
SELECT 2180, 21252.00 UNION ALL
SELECT 2181, 7503.11
)
--theoretically, multiple val items add up to $244,194.49
--23 rows max, since there's 23 entries in this example
SELECT row_number() OVER (ORDER BY RID) AS RW,
RID,VAL
FROM MySampleData
Lowell
--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!
Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

Jared SSC Guru Points: 61793 More actions · Answer 1

Will they be sequential, or can it be any combination of rows?

Jared
CE - Microsoft

Lowell SSC Guru Points: 323518 More actions · Answer 2

SQLKnowItAll (10/2/2012)
Will they be sequential, or can it be any combination of rows?

definitely non sequential/gaps are involved;

i'm trying a mini tally table and n-1 transactions, but i keep thinking i need to joint he table agaisnt itself 23 times(for 23 row?), and i know that's not right;

i'm feeling dumb today, for sure.

the actual biz problem is one system shows the transactions were applied at a certain dollar amount, but the otehr system ahs them as unapplied;

trying to determine which details are involved is a bit of a pain, and trying the eyeball method didn't work so far.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

mister.magoo SSC-Forever Points: 47174 More actions · Answer 3

There are two solutions to this amount:

2172,2173,2174, 2176,2177,2178,2179,2180,2181,2182,2183,2184,2185,2186,2187, 2189,2190,2191

2172,2173,2174,2175,2176,2177,2178,2179,2180,2181,2182,2183,2184,2185,2186,2187,2188,2189,2190

I have formatted them like that to highlight the differences.

The query is damn ugly and runs like a pig with all four legs roughly hacked away at the knee, taking over 2 minutes to find the solutions...

IF OBJECT_ID('tempdb..#data') is null

BEGIN

;WITH MySampleData(RID,VAL )

AS

(

SELECT 2172, 429.00 UNION ALL

SELECT 2173, 20555.00 UNION ALL

SELECT 2174, 5855.71 UNION ALL

SELECT 2175, 13892.89 UNION ALL

SELECT 2194, 36175.10 UNION ALL

SELECT 2188, 5451.32 UNION ALL

SELECT 2189, 566.00 UNION ALL

SELECT 2190, 706.40 UNION ALL

SELECT 2191, 19344.21 UNION ALL

SELECT 2192, 16600.00 UNION ALL

SELECT 2193, 52775.10 UNION ALL

SELECT 2182, 13320.34 UNION ALL

SELECT 2183, 1454.00 UNION ALL

SELECT 2184, 15960.00 UNION ALL

SELECT 2185, 18365.00 UNION ALL

SELECT 2186, 18646.00 UNION ALL

SELECT 2187, 21279.72 UNION ALL

SELECT 2176, 17525.00 UNION ALL

SELECT 2177, 19796.00 UNION ALL

SELECT 2178, 20573.00 UNION ALL

SELECT 2179, 21064.00 UNION ALL

SELECT 2180, 21252.00 UNION ALL

SELECT 2181, 7503.11

)

SELECT RID,VAL

INTO #data

FROM MySampleData

CREATE UNIQUE CLUSTERED INDEX ix1 ON #data (RID);

END

-- little hack - i ran the query including only items with non-zero "pennies/cents" - looking for combinations that ended in ".49"

-- this told me that nothing higher than 2181 could be involved in this particular solution, so I could include a WHERE RID<=2181 in this case.

;WITH rec AS

(

SELECT RID,CONVERT(DECIMAL(30,2),VAL) VAL,CONVERT(VARCHAR(4000),'') as items,RID as RID2

FROM #data

WHERE RID<=2181

UNION ALL

SELECT rec.RID,CONVERT(DECIMAL(30,2),rec.VAL + MySampleData.VAL),CONVERT(VARCHAR(4000),items+','+CONVERT(VARCHAR(4),MySampleData.RID)),MySampleData.RID

FROM rec

JOIN #data MySampleData

ON MySampleData.RID> rec.RID2 -- don't go over items you already covered

AND rec.VAL + MySampleData.VAL<=244194.49 -- no point adding up stuff that exceeds the target!

)

SELECT *

FROM rec

WHERE VAL = 244194.49 -- just in case it was out by a small amount...

OPTION(MAXRECURSION 0)

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Jared SSC Guru Points: 61793 More actions · Answer 4

Well then, it is possible to have 23! combinations, I believe. You will have to have a pretty sophisticated algorithm to figure this out. I would think this is a CLR thing and not an SQL thing.

Jared
CE - Microsoft

Dwain Camps SSC Guru Points: 86978 More actions · Answer 5

Lowell - Mister Magoo has pointed you in the right direction and his analysis to reduce the initial rowset based on the cents was really good.

His approach is similar to the second script in this article: http://www.sqlservercentral.com/articles/sql+n-Tuples/89809/ (for future reference).

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

mister.magoo SSC-Forever Points: 47174 More actions · Answer 6

Thanks for the validation dwain, and having read your thread, I thought that Jeff's preference for the WHILE loop must have some mileage, so I tried to speed things up and took the best of both worlds....a recursive CTE inside a while loop...

Now, this optimisation is pretty specific to this particular problem, so it is by no means meant as a general solution.

That said, it does bring the solution duration down to zero seconds, from over 2 minutes, so it's not bad...:-D

IF OBJECT_ID('tempdb..#data') is null

BEGIN

;WITH MySampleData(RID,VAL )

AS

(

SELECT 2172, 429.00 UNION ALL

SELECT 2173, 20555.00 UNION ALL

SELECT 2174, 5855.71 UNION ALL

SELECT 2175, 13892.89 UNION ALL

SELECT 2194, 36175.10 UNION ALL

SELECT 2188, 5451.32 UNION ALL

SELECT 2189, 566.00 UNION ALL

SELECT 2190, 706.40 UNION ALL

SELECT 2191, 19344.21 UNION ALL

SELECT 2192, 16600.00 UNION ALL

SELECT 2193, 52775.10 UNION ALL

SELECT 2182, 13320.34 UNION ALL

SELECT 2183, 1454.00 UNION ALL

SELECT 2184, 15960.00 UNION ALL

SELECT 2185, 18365.00 UNION ALL

SELECT 2186, 18646.00 UNION ALL

SELECT 2187, 21279.72 UNION ALL

SELECT 2176, 17525.00 UNION ALL

SELECT 2177, 19796.00 UNION ALL

SELECT 2178, 20573.00 UNION ALL

SELECT 2179, 21064.00 UNION ALL

SELECT 2180, 21252.00 UNION ALL

SELECT 2181, 7503.11

)

SELECT RID,VAL

INTO #data

FROM MySampleData

CREATE UNIQUE CLUSTERED INDEX ix1 ON #data (RID);

END

IF OBJECT_ID('tempdb..#bits') IS NOT NULL

DROP TABLE #bits;

DECLARE

@val DECIMAL(30,2)

,@MaxRID BIGINT

,@val2 DECIMAL(30,2)

,@id int

,@items VARCHAR(4000)

SET @val = 244194.49

-- little hack - run the query including only items with non-zero "pennies/cents" - looking for combinations that end in the required "pennies/cents"

-- store the results so we can reduce the number of items on the next run

;WITH rec AS

(

SELECT RID,CONVERT(DECIMAL(30,2),VAL) VAL,CONVERT(VARCHAR(4000),'') as items,RID as RID2

FROM #data

-- only interested in non-whole numbers here

WHERE VAL>FLOOR(VAL)

UNION ALL

SELECT rec.RID,CONVERT(DECIMAL(30,2),rec.VAL + MySampleData.VAL),CONVERT(VARCHAR(4000),items+','+CONVERT(VARCHAR(4),MySampleData.RID)),MySampleData.RID

FROM rec

JOIN #data MySampleData

-- don't go over items you already covered

ON MySampleData.RID> rec.RID2

-- no point adding up stuff that exceeds the target!

AND rec.VAL + MySampleData.VAL<=@val

-- only interested in non-whole numbers here

WHERE MySampleData.VAL>FLOOR(MySampleData.VAL)

)

SELECT identity(int,1,1) as id,RID,VAl,items

INTO #bits

FROM rec

WHERE (VAL-FLOOR(VAL)) = (@val-FLOOR(@val))

OPTION(MAXRECURSION 0)

select * from #bits

IF OBJECT_ID('tempdb..#results') IS NOT NULL

DROP TABLE #results;

CREATE TABLE #results(matched varchar(4000));

-- Now use a loop to calculate each possible set based on the knowledge we already have about non-whole numbers

SELECT TOP 1 @id = id, @MaxRID = RID,@val2=VAL,@items=CONVERT(VARCHAR(10),RID)+items+',' FROM #bits ORDER BY id

WHILE @@ROWCOUNT=1

BEGIN

-- find combinations of whole numbers that satisfy the remaining requirement

;WITH rec AS

(

SELECT RID,CONVERT(DECIMAL(30,2),VAL) VAL,CONVERT(VARCHAR(4000),'') as items,RID as RID2

FROM #data d

WHERE 1=1

-- No point checking RIDs that come after the one we are interested in, is there?

AND RID<=@MaxRID

-- whole numbers only please

AND VAL=FLOOR(VAL)

UNION ALL

SELECT rec.RID,CONVERT(DECIMAL(30,2),rec.VAL + MySampleData.VAL),CONVERT(VARCHAR(4000),items+','+CONVERT(VARCHAR(4),MySampleData.RID)),MySampleData.RID

FROM rec

JOIN #data MySampleData

-- don't go over items you already covered

ON MySampleData.RID> rec.RID2

-- no point adding up stuff that exceeds the target!

AND (MySampleData.VAL+rec.VAL)<=(@val-@val2)

-- whole numbers only please

AND MySampleData.VAL=FLOOR(MySampleData.VAL)

)

INSERT #results(matched)

SELECT CONVERT(VARCHAR(10),r.RID)+r.items+','+@items

FROM rec r

WHERE r.VAL + @val2 =@val

OPTION(MAXRECURSION 0, FAST 1)

SELECT TOP 1 @id = id, @MaxRID = RID,@val2=VAL,@items=CONVERT(VARCHAR(10),RID)+items+',' FROM #bits WHERE id>@id ORDER BY id

END

SELECT @val as TargetAmount,matched

FROM #results

Edit: Included a mistake made while tidying the code...now fixed.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Dwain Camps SSC Guru Points: 86978 More actions · Answer 7

MM,

I don't know if you're familiar with the "Sweden Redistricting" problem, but it is mentioned at the bottom of the page in this thread: http://www.sqlservercentral.com/Forums/Topic1318149-392-5.aspx.

I spent two weeks working on a solution to it that ran in what I considered a reasonable period of time, using exactly the approach you used in your latest post to solve Lowell's problem. Essentially, both are bin packing problems:

1. Sweden re-districting: pack the bins, first with doublets and then with triplets

2. Lowell's problem: pack the bins with values containing fractional amounts and then pack them with the remaining values that are whole numbers.

I didn't post my final solution to the Sweden problem. If you're interested, PM me.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Lowell SSC Guru Points: 323518 More actions · Answer 8

ahh "bin packing" was the keyword; I've got a lot of saved scripts, could't remeber the label for it.

I wrote a script i wrote long ago, but my version involved while loops instead of rCTE's

thank you everyone for the help and the articles again!

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

mister.magoo SSC-Forever Points: 47174 More actions · Answer 9

Dwain,

I did have a read of some of the Swedish districts thread, but not in that much detail.

I guess the general approach with these multi-level recursion problems is to reduce the size of the sets as often as possible, however suits the problem - in your case doublets first ,in this case fractional amounts first.

The key is in spotting how you can do that reduction and finding a way to apply it in a multi-level solution.

I was on the brink of looking at a solution for the Swedish District problem, because it looked interesting, but it may have to wait a while 🙂

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

Dwain Camps SSC Guru Points: 86978 More actions · Answer 10

I had a bit of an epiphany today that I thought I would share. I figure, if Mr. Magoo can use tricks to speed up the solution, I can use a trick to the solution approach. So instead of thinking of this as a bin packing problem, let's think of it conversely as a bin unpacking problem.

Consider the following (set up data omitted but courtesy of Mr. Magoo):

DECLARE @ReferenceTotal MONEY = 244194.49

,@Delimiter CHAR(1) = ','

,@ActualTotal MONEY = (

SELECT SUM(VAL)

FROM #data)

DECLARE @TargetTotal MONEY = @ActualTotal - @ReferenceTotal

;WITH UNIQUEnTuples (n, Tuples, ID, value) AS (

SELECT DISTINCT 1, CAST(RID AS VARCHAR(8000)), RID, CAST(VAL AS MONEY)

FROM #data

WHERE VAL <= @TargetTotal

UNION ALL

SELECT 1 + n.n, CAST(t.RID AS VARCHAR(8000)) + @Delimiter + n.Tuples, RID

,CAST(n.value + t.VAL AS MONEY)

FROM UNIQUEnTuples n

CROSS APPLY (

SELECT RID, VAL

FROM #data t

WHERE t.RID < n.ID) t

WHERE n.value + t.VAL <= @TargetTotal

)

SELECT [Soln No]=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

,ExcludedRIDs=Tuples, value

INTO #Temp

FROM UNIQUEnTuples

WHERE value = @ActualTotal - @ReferenceTotal

-- Display the solutions to the converse problem

SELECT * FROM #Temp

;WITH MyData AS (

SELECT [Soln No], RID

FROM #Temp

CROSS APPLY #data

EXCEPT

SELECT [Soln No], Item

FROM #Temp

CROSS APPLY dbo.DelimitedSplit8K(Tuples, @Delimiter)

)

SELECT [Soln No]

,IncludedRIDs=STUFF((

SELECT ',' + CAST(RID AS VARCHAR(10))

FROM MyData b

WHERE a.[Soln No] = b.[Soln No]

ORDER BY RID

FOR XML PATH('')), 1, 1, '')

FROM MyData a

GROUP BY [Soln No]

What I have done is to break the problem into 2 parts:

- Identify the RIDs that should be excluded to match the reference total provided by Lowell.

- Unpack those items from the full bin using EXCEPT.

- What is left is then STUFF/FOR XML PATH grouped into the two resulting records as the IncludedRIDs.

The solution may be a little more general than Mr. Magoo's (not sure), possibly a bit "prettier" (beauty is in the eye of the beholder) and certainly not as speedy (runs in about 17 seconds for me) but it was fun intellectually. The second step proved a bit more challenging than I expected until I switched my approach to using EXCEPT (before that I was doing something utterly stupid).

It would probably run faster if it didn't need to recurse to 5 levels to identify the 5 excluded RIDs, so is an option to consider when there's only a handful of RIDs you expect to be excluded.

UNIQUEnTuples is based on the article I quoted earlier with a couple of speed improvements that I posted today: http://www.sqlservercentral.com/Forums/Topic1301485-3122-5.aspx.

I need to thank you guys for the speed improvements. It was this problem and the slow run time of Mr. Magoo's original effort that made me go back and try some of the things I'd been thinking about.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St