determine which transactions add up to a specific value?

  • I swear I've done this before, but I'm drawing a blank today.

    in this example, one or more detail transactions should add up together to be a specific value; but it's not known which transactions should be included.

    so in the example below, i have 23 individual transactions, and some subset of them will add up to exactly 244,194.49

    can anyone point me to an example where they determine the transactions involved, since they would add up to a specific value?

    With MySampleData(RID,VAL)

    AS

    (

    SELECT 2172, 429.00 UNION ALL

    SELECT 2173, 20555.00 UNION ALL

    SELECT 2174, 5855.71 UNION ALL

    SELECT 2175, 13892.89 UNION ALL

    SELECT 2194, 36175.10 UNION ALL

    SELECT 2188, 5451.32 UNION ALL

    SELECT 2189, 566.00 UNION ALL

    SELECT 2190, 706.40 UNION ALL

    SELECT 2191, 19344.21 UNION ALL

    SELECT 2192, 16600.00 UNION ALL

    SELECT 2193, 52775.10 UNION ALL

    SELECT 2182, 13320.34 UNION ALL

    SELECT 2183, 1454.00 UNION ALL

    SELECT 2184, 15960.00 UNION ALL

    SELECT 2185, 18365.00 UNION ALL

    SELECT 2186, 18646.00 UNION ALL

    SELECT 2187, 21279.72 UNION ALL

    SELECT 2176, 17525.00 UNION ALL

    SELECT 2177, 19796.00 UNION ALL

    SELECT 2178, 20573.00 UNION ALL

    SELECT 2179, 21064.00 UNION ALL

    SELECT 2180, 21252.00 UNION ALL

    SELECT 2181, 7503.11

    )

    --theoretically, multiple val items add up to $244,194.49

    --23 rows max, since there's 23 entries in this example

    SELECT row_number() OVER (ORDER BY RID) AS RW,

    RID,VAL

    FROM MySampleData

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Will they be sequential, or can it be any combination of rows?

    Jared
    CE - Microsoft

  • SQLKnowItAll (10/2/2012)


    Will they be sequential, or can it be any combination of rows?

    definitely non sequential/gaps are involved;

    i'm trying a mini tally table and n-1 transactions, but i keep thinking i need to joint he table agaisnt itself 23 times(for 23 row?), and i know that's not right;

    i'm feeling dumb today, for sure.

    the actual biz problem is one system shows the transactions were applied at a certain dollar amount, but the otehr system ahs them as unapplied;

    trying to determine which details are involved is a bit of a pain, and trying the eyeball method didn't work so far.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • There are two solutions to this amount:

    2172,2173,2174, 2176,2177,2178,2179,2180,2181,2182,2183,2184,2185,2186,2187, 2189,2190,2191

    2172,2173,2174,2175,2176,2177,2178,2179,2180,2181,2182,2183,2184,2185,2186,2187,2188,2189,2190

    I have formatted them like that to highlight the differences.

    The query is damn ugly and runs like a pig with all four legs roughly hacked away at the knee, taking over 2 minutes to find the solutions...

    IF OBJECT_ID('tempdb..#data') is null

    BEGIN

    ;WITH MySampleData(RID,VAL )

    AS

    (

    SELECT 2172, 429.00 UNION ALL

    SELECT 2173, 20555.00 UNION ALL

    SELECT 2174, 5855.71 UNION ALL

    SELECT 2175, 13892.89 UNION ALL

    SELECT 2194, 36175.10 UNION ALL

    SELECT 2188, 5451.32 UNION ALL

    SELECT 2189, 566.00 UNION ALL

    SELECT 2190, 706.40 UNION ALL

    SELECT 2191, 19344.21 UNION ALL

    SELECT 2192, 16600.00 UNION ALL

    SELECT 2193, 52775.10 UNION ALL

    SELECT 2182, 13320.34 UNION ALL

    SELECT 2183, 1454.00 UNION ALL

    SELECT 2184, 15960.00 UNION ALL

    SELECT 2185, 18365.00 UNION ALL

    SELECT 2186, 18646.00 UNION ALL

    SELECT 2187, 21279.72 UNION ALL

    SELECT 2176, 17525.00 UNION ALL

    SELECT 2177, 19796.00 UNION ALL

    SELECT 2178, 20573.00 UNION ALL

    SELECT 2179, 21064.00 UNION ALL

    SELECT 2180, 21252.00 UNION ALL

    SELECT 2181, 7503.11

    )

    SELECT RID,VAL

    INTO #data

    FROM MySampleData

    CREATE UNIQUE CLUSTERED INDEX ix1 ON #data (RID);

    END

    -- little hack - i ran the query including only items with non-zero "pennies/cents" - looking for combinations that ended in ".49"

    -- this told me that nothing higher than 2181 could be involved in this particular solution, so I could include a WHERE RID<=2181 in this case.

    ;WITH rec AS

    (

    SELECT RID,CONVERT(DECIMAL(30,2),VAL) VAL,CONVERT(VARCHAR(4000),'') as items,RID as RID2

    FROM #data

    WHERE RID<=2181

    UNION ALL

    SELECT rec.RID,CONVERT(DECIMAL(30,2),rec.VAL + MySampleData.VAL),CONVERT(VARCHAR(4000),items+','+CONVERT(VARCHAR(4),MySampleData.RID)),MySampleData.RID

    FROM rec

    JOIN #data MySampleData

    ON MySampleData.RID> rec.RID2 -- don't go over items you already covered

    AND rec.VAL + MySampleData.VAL<=244194.49 -- no point adding up stuff that exceeds the target!

    )

    SELECT *

    FROM rec

    WHERE VAL = 244194.49 -- just in case it was out by a small amount...

    OPTION(MAXRECURSION 0)

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Well then, it is possible to have 23! combinations, I believe. You will have to have a pretty sophisticated algorithm to figure this out. I would think this is a CLR thing and not an SQL thing.

    Jared
    CE - Microsoft

  • Lowell - Mister Magoo has pointed you in the right direction and his analysis to reduce the initial rowset based on the cents was really good.

    His approach is similar to the second script in this article: http://www.sqlservercentral.com/articles/sql+n-Tuples/89809/ (for future reference).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks for the validation dwain, and having read your thread, I thought that Jeff's preference for the WHILE loop must have some mileage, so I tried to speed things up and took the best of both worlds....a recursive CTE inside a while loop...

    Now, this optimisation is pretty specific to this particular problem, so it is by no means meant as a general solution.

    That said, it does bring the solution duration down to zero seconds, from over 2 minutes, so it's not bad...:-D

    IF OBJECT_ID('tempdb..#data') is null

    BEGIN

    ;WITH MySampleData(RID,VAL )

    AS

    (

    SELECT 2172, 429.00 UNION ALL

    SELECT 2173, 20555.00 UNION ALL

    SELECT 2174, 5855.71 UNION ALL

    SELECT 2175, 13892.89 UNION ALL

    SELECT 2194, 36175.10 UNION ALL

    SELECT 2188, 5451.32 UNION ALL

    SELECT 2189, 566.00 UNION ALL

    SELECT 2190, 706.40 UNION ALL

    SELECT 2191, 19344.21 UNION ALL

    SELECT 2192, 16600.00 UNION ALL

    SELECT 2193, 52775.10 UNION ALL

    SELECT 2182, 13320.34 UNION ALL

    SELECT 2183, 1454.00 UNION ALL

    SELECT 2184, 15960.00 UNION ALL

    SELECT 2185, 18365.00 UNION ALL

    SELECT 2186, 18646.00 UNION ALL

    SELECT 2187, 21279.72 UNION ALL

    SELECT 2176, 17525.00 UNION ALL

    SELECT 2177, 19796.00 UNION ALL

    SELECT 2178, 20573.00 UNION ALL

    SELECT 2179, 21064.00 UNION ALL

    SELECT 2180, 21252.00 UNION ALL

    SELECT 2181, 7503.11

    )

    SELECT RID,VAL

    INTO #data

    FROM MySampleData

    CREATE UNIQUE CLUSTERED INDEX ix1 ON #data (RID);

    END

    IF OBJECT_ID('tempdb..#bits') IS NOT NULL

    DROP TABLE #bits;

    DECLARE

    @val DECIMAL(30,2)

    ,@MaxRID BIGINT

    ,@val2 DECIMAL(30,2)

    ,@id int

    ,@items VARCHAR(4000)

    SET @val = 244194.49

    -- little hack - run the query including only items with non-zero "pennies/cents" - looking for combinations that end in the required "pennies/cents"

    -- store the results so we can reduce the number of items on the next run

    ;WITH rec AS

    (

    SELECT RID,CONVERT(DECIMAL(30,2),VAL) VAL,CONVERT(VARCHAR(4000),'') as items,RID as RID2

    FROM #data

    -- only interested in non-whole numbers here

    WHERE VAL>FLOOR(VAL)

    UNION ALL

    SELECT rec.RID,CONVERT(DECIMAL(30,2),rec.VAL + MySampleData.VAL),CONVERT(VARCHAR(4000),items+','+CONVERT(VARCHAR(4),MySampleData.RID)),MySampleData.RID

    FROM rec

    JOIN #data MySampleData

    -- don't go over items you already covered

    ON MySampleData.RID> rec.RID2

    -- no point adding up stuff that exceeds the target!

    AND rec.VAL + MySampleData.VAL<=@val

    -- only interested in non-whole numbers here

    WHERE MySampleData.VAL>FLOOR(MySampleData.VAL)

    )

    SELECT identity(int,1,1) as id,RID,VAl,items

    INTO #bits

    FROM rec

    WHERE (VAL-FLOOR(VAL)) = (@val-FLOOR(@val))

    OPTION(MAXRECURSION 0)

    select * from #bits

    IF OBJECT_ID('tempdb..#results') IS NOT NULL

    DROP TABLE #results;

    CREATE TABLE #results(matched varchar(4000));

    -- Now use a loop to calculate each possible set based on the knowledge we already have about non-whole numbers

    SELECT TOP 1 @id = id, @MaxRID = RID,@val2=VAL,@items=CONVERT(VARCHAR(10),RID)+items+',' FROM #bits ORDER BY id

    WHILE @@ROWCOUNT=1

    BEGIN

    -- find combinations of whole numbers that satisfy the remaining requirement

    ;WITH rec AS

    (

    SELECT RID,CONVERT(DECIMAL(30,2),VAL) VAL,CONVERT(VARCHAR(4000),'') as items,RID as RID2

    FROM #data d

    WHERE 1=1

    -- No point checking RIDs that come after the one we are interested in, is there?

    AND RID<=@MaxRID

    -- whole numbers only please

    AND VAL=FLOOR(VAL)

    UNION ALL

    SELECT rec.RID,CONVERT(DECIMAL(30,2),rec.VAL + MySampleData.VAL),CONVERT(VARCHAR(4000),items+','+CONVERT(VARCHAR(4),MySampleData.RID)),MySampleData.RID

    FROM rec

    JOIN #data MySampleData

    -- don't go over items you already covered

    ON MySampleData.RID> rec.RID2

    -- no point adding up stuff that exceeds the target!

    AND (MySampleData.VAL+rec.VAL)<=(@val-@val2)

    -- whole numbers only please

    AND MySampleData.VAL=FLOOR(MySampleData.VAL)

    )

    INSERT #results(matched)

    SELECT CONVERT(VARCHAR(10),r.RID)+r.items+','+@items

    FROM rec r

    WHERE r.VAL + @val2 =@val

    OPTION(MAXRECURSION 0, FAST 1)

    SELECT TOP 1 @id = id, @MaxRID = RID,@val2=VAL,@items=CONVERT(VARCHAR(10),RID)+items+',' FROM #bits WHERE id>@id ORDER BY id

    END

    SELECT @val as TargetAmount,matched

    FROM #results

    Edit: Included a mistake made while tidying the code...now fixed.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • MM,

    I don't know if you're familiar with the "Sweden Redistricting" problem, but it is mentioned at the bottom of the page in this thread: http://www.sqlservercentral.com/Forums/Topic1318149-392-5.aspx.

    I spent two weeks working on a solution to it that ran in what I considered a reasonable period of time, using exactly the approach you used in your latest post to solve Lowell's problem. Essentially, both are bin packing problems:

    1. Sweden re-districting: pack the bins, first with doublets and then with triplets

    2. Lowell's problem: pack the bins with values containing fractional amounts and then pack them with the remaining values that are whole numbers.

    I didn't post my final solution to the Sweden problem. If you're interested, PM me.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • ahh "bin packing" was the keyword; I've got a lot of saved scripts, could't remeber the label for it.

    I wrote a script i wrote long ago, but my version involved while loops instead of rCTE's

    thank you everyone for the help and the articles again!

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Dwain,

    I did have a read of some of the Swedish districts thread, but not in that much detail.

    I guess the general approach with these multi-level recursion problems is to reduce the size of the sets as often as possible, however suits the problem - in your case doublets first ,in this case fractional amounts first.

    The key is in spotting how you can do that reduction and finding a way to apply it in a multi-level solution.

    I was on the brink of looking at a solution for the Swedish District problem, because it looked interesting, but it may have to wait a while 🙂

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • I had a bit of an epiphany today that I thought I would share. I figure, if Mr. Magoo can use tricks to speed up the solution, I can use a trick to the solution approach. So instead of thinking of this as a bin packing problem, let's think of it conversely as a bin unpacking problem.

    Consider the following (set up data omitted but courtesy of Mr. Magoo):

    DECLARE @ReferenceTotal MONEY = 244194.49

    ,@Delimiter CHAR(1) = ','

    ,@ActualTotal MONEY = (

    SELECT SUM(VAL)

    FROM #data)

    DECLARE @TargetTotal MONEY = @ActualTotal - @ReferenceTotal

    ;WITH UNIQUEnTuples (n, Tuples, ID, value) AS (

    SELECT DISTINCT 1, CAST(RID AS VARCHAR(8000)), RID, CAST(VAL AS MONEY)

    FROM #data

    WHERE VAL <= @TargetTotal

    UNION ALL

    SELECT 1 + n.n, CAST(t.RID AS VARCHAR(8000)) + @Delimiter + n.Tuples, RID

    ,CAST(n.value + t.VAL AS MONEY)

    FROM UNIQUEnTuples n

    CROSS APPLY (

    SELECT RID, VAL

    FROM #data t

    WHERE t.RID < n.ID) t

    WHERE n.value + t.VAL <= @TargetTotal

    )

    SELECT [Soln No]=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    ,ExcludedRIDs=Tuples, value

    INTO #Temp

    FROM UNIQUEnTuples

    WHERE value = @ActualTotal - @ReferenceTotal

    -- Display the solutions to the converse problem

    SELECT * FROM #Temp

    ;WITH MyData AS (

    SELECT [Soln No], RID

    FROM #Temp

    CROSS APPLY #data

    EXCEPT

    SELECT [Soln No], Item

    FROM #Temp

    CROSS APPLY dbo.DelimitedSplit8K(Tuples, @Delimiter)

    )

    SELECT [Soln No]

    ,IncludedRIDs=STUFF((

    SELECT ',' + CAST(RID AS VARCHAR(10))

    FROM MyData b

    WHERE a.[Soln No] = b.[Soln No]

    ORDER BY RID

    FOR XML PATH('')), 1, 1, '')

    FROM MyData a

    GROUP BY [Soln No]

    What I have done is to break the problem into 2 parts:

    - Identify the RIDs that should be excluded to match the reference total provided by Lowell.

    - Unpack those items from the full bin using EXCEPT.

    - What is left is then STUFF/FOR XML PATH grouped into the two resulting records as the IncludedRIDs.

    The solution may be a little more general than Mr. Magoo's (not sure), possibly a bit "prettier" (beauty is in the eye of the beholder) and certainly not as speedy (runs in about 17 seconds for me) but it was fun intellectually. The second step proved a bit more challenging than I expected until I switched my approach to using EXCEPT (before that I was doing something utterly stupid).

    It would probably run faster if it didn't need to recurse to 5 levels to identify the 5 excluded RIDs, so is an option to consider when there's only a handful of RIDs you expect to be excluded.

    UNIQUEnTuples is based on the article I quoted earlier with a couple of speed improvements that I posted today: http://www.sqlservercentral.com/Forums/Topic1301485-3122-5.aspx.

    I need to thank you guys for the speed improvements. It was this problem and the slow run time of Mr. Magoo's original effort that made me go back and try some of the things I'd been thinking about.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Viewing 11 posts - 1 through 10 (of 10 total)

    You must be logged in to reply to this topic. Login to reply