Packing Joins

  • Hi, I need to join two tables in what I think is a 'Packing Join' as described by Joe Celko. I have his book, I read the code, but I'm pretty much lost.

    Can anyone point me to any other resourses.

    Essentially I have a table of credits that I want to join to debits based on equality of absolute value, but I only want each credit to be used once, I want to avoid a cursor, does that sound like a 'Packing Join'?

    Thanks

    Allen

  • I don't know what Celko's "Packing Join" is, but I think you're gonna need another column like a date/time or and IDENTITY column even if you were to use a cursor.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    I'm not sure if 'Packing Join' is the correct name but it's all I have right now. Being as I got your interest I'll be cheeky and post some DDL.

    CREATE TABLE #Debit

    (AccountNumber INT,

    DebitID INT,

    DebitAmount MONEY)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,1,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,2,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,4,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,6,10.00)

    CREATE TABLE #Credit

    (AccountNumber INT,

    CreditID INT,

    CreditAmount MONEY)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,3,-10.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,5,-10.00)

    --ID is unique accross both tables, match lowest allocate lowest CreditID to lowest DebitID

    /* Desired output

    AccountNumberDebitIDDebitAmmountCreditIDCreditAmount

    10110.003-10.00

    10210.005-10.00

    */

    Allen

  • I can do this

    SELECT c.AccountNumber, MIN(d.debitID) AS debitID, c.CreditID

    FROM #Credit c JOIN #Debit d

    ON ABS(c.CreditAmount) = d.DebitAmount

    WHERE d.debitID < c.CreditID

    GROUP BY c.AccountNumber, c.CreditID

    and get this but the second row is matching DebitID 1 which has already been joined on the previous row, it needs to match DebitID 2

    AccountNumberDebitIDDebitAmmountCreditIDCreditAmount

    10110.003-10.00

    10110.005-10.00

  • Assuming your using SQL 2005, here is one of ways using ROW_NUMBER()....

    SELECTC.AccountNumber, D.DebitID, D.DebitAmount, C.CreditID, C.CreditAmount

    FROM(

    SELECTROW_NUMBER() OVER( PARTITION BY AccountNumber ORDER BY CreditID ) AS RowNum,

    AccountNumber, CreditID, CreditAmount

    FROM#Credit

    ) C

    INNER JOIN

    (

    SELECTROW_NUMBER() OVER( PARTITION BY AccountNumber ORDER BY DebitID ) AS RowNum,

    AccountNumber, DebitID, DebitAmount

    FROM#Debit

    ) D ON C.AccountNumber = D.AccountNumber AND C.RowNum = D.RowNum

    --Ramesh


  • Doh. Thank you Ramesh. I had'nt thought of that. It works a treat plus I even understand it !

    Allen

  • Allen,

    I'm not 100% sure because I don't have 2k5, but I don't believe that Ramesh's wonderfully easy to read code is going to do the trick for you because it makes no comparison to the amount values themselves. It works for now because all the values are 10, but think it will fail if the values change and there is more data for each account...

    Ramesh, nice job. It gives the correct answer for what the data condition currently is, but like I said above, I'm not sure it'll stand for "different data" in larger amounts.

    And, I appologize to you both... I just wanted to see what SSCC sent me in the mail and I don't have time to work on this just now... I'm on my way to work...

    But, (here's a horrible thought) this may be one of those very rare instances where procedural code is required in the form of a double nested loop. I may have a way to make a very fast one and it won't look like a loop, but it will still be procedural, none the less.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    I don't think we need to compare the amounts too, as i can see that this is an accounting related information.:)

    ID is unique accross both tables, match lowest allocate lowest CreditID to lowest DebitID

    Here is the simple test with a couple of accounts added..

    IF ( OBJECT_ID( 'tempdb..#Debit' ) IS NOT NULL )

    DROP TABLE #Debit

    IF ( OBJECT_ID( 'tempdb..#Credit' ) IS NOT NULL )

    DROP TABLE #Credit

    CREATE TABLE #Debit

    (

    AccountNumber INT,

    DebitID INT,

    DebitAmount MONEY

    )

    CREATE TABLE #Credit

    (

    AccountNumber INT,

    CreditID INT,

    CreditAmount MONEY

    )

    IF ( OBJECT_ID( 'tempdb..#Debit' ) IS NOT NULL )

    DROP TABLE #Debit

    IF ( OBJECT_ID( 'tempdb..#Credit' ) IS NOT NULL )

    DROP TABLE #Credit

    CREATE TABLE #Debit

    (

    AccountNumber INT,

    DebitID INT PRIMARY KEY CLUSTERED,

    DebitAmount MONEY

    )

    CREATE TABLE #Credit

    (

    AccountNumber INT,

    CreditID INT PRIMARY KEY CLUSTERED,

    CreditAmount MONEY

    )

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,1,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,2,15.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,4,89.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,6,96.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (20,14,586.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (20,17,6.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (20,24,56.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (20,28,38.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (30,31,81.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (30,37,77.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (30,47,988.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,3,-97.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,5,-36.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,9,-97.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (20,13,-297.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (20,15,-736.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (20,19,-917.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (20,25,-977.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (30,29,-27.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (30,35,-136.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (30,45,-47.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (30,55,-7.00)

    SELECTC.AccountNumber, D.DebitID, D.DebitAmount, C.CreditID, C.CreditAmount

    FROM(

    SELECTROW_NUMBER() OVER( PARTITION BY AccountNumber ORDER BY CreditID ) AS RowNum,

    AccountNumber, CreditID, CreditAmount

    FROM#Credit

    ) C

    INNER JOIN

    (

    SELECTROW_NUMBER() OVER( PARTITION BY AccountNumber ORDER BY DebitID ) AS RowNum,

    AccountNumber, DebitID, DebitAmount

    FROM#Debit

    ) D ON C.AccountNumber = D.AccountNumber AND C.RowNum = D.RowNum

    I'm curious to see your version of the same.:cool:

    --Ramesh


  • Hi Both,

    I probably did not give you the full story. I'm only matching credits & debits where the absolute values are the same so I think I can include the amount (debitamount or creditamount) in the PARTITION BY clause and JOIN on the absolute amounts.

    DROP TABLE #Debit

    DROP TABLE #Credit

    CREATE TABLE #Debit

    (AccountNumber INT,

    DebitID INT,

    DebitAmount MONEY)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,1,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,2,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,4,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,6,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,8,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,9,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,10,10.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,12,20.00)

    INSERT INTO #Debit (AccountNumber, DebitID, DebitAmount) VALUES (10,13,20.00)

    CREATE TABLE #Credit

    (AccountNumber INT,

    CreditID INT,

    CreditAmount MONEY)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,3,-10.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,5,-10.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,7,-10.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,11,-10.00)

    INSERT INTO #Credit (AccountNumber, CreditID, CreditAmount) VALUES (10,14,-20.00)

    -- from Ramesh

    SELECTC.AccountNumber, D.DebitID, D.DebitAmount, C.CreditID, C.CreditAmount

    FROM( SELECTROW_NUMBER() OVER( PARTITION BY AccountNumber, CreditAmount ORDER BY CreditID ) AS RowNum,

    AccountNumber,

    CreditID,

    CreditAmount FROM#Credit ) C

    INNER JOIN

    ( SELECTROW_NUMBER() OVER( PARTITION BY AccountNumber, DebitAmount ORDER BY DebitID ) AS RowNum,

    AccountNumber,

    DebitID,

    DebitAmountFROM#Debit ) D

    ON C.AccountNumber = D.AccountNumber AND C.RowNum = D.RowNum AND ABS(C.CreditAmount) = D.DebitAmount

    Allen

  • Alen,

    You proved me wrong on what i said. Thats why we people say post exactly what do you want. Anyways, it that solved all of your queries?

    Jeff,

    My apologies to you for incorrect comments.:crying: Anyhow, im still curious to see your version of it.

    --Ramesh


  • Hi Ramesh,

    Thanks - I'm completely sorted on this.

    Completely understand about painting the full picture but sometimes I want to precis the info to make it digestible (or no one will read it !).

    I did write

    a table of credits that I want to join to debits based on equality of absolute value, but I only want each credit to be used once

    PS The 'Packing Join' was a complete red herring.

    Allen

  • Jeff,

    My apologies to you for incorrect comments. Anyhow, im still curious to see your version of it.

    No, no... not a problem, Ramesh. And, the RowNumber Over Partition to get repeating sequences was a great idea. Real pain in the hiney to do in 2k unless you use a triangular join which, as you know, can be a real performance killer in the face of scalability.

    My solution was going to be to put debits and credits in two separate tables and use the proprietary form of UPDATE SET @variable=Column=Formula to work. The RowNumber solution you guys came up with does exactly the same thing except it does it all in one nice query.

    The "procedural code" I was talking about was the proprietary form of UPDATE I talk about in the paragraph above.

    Great job, guys. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thats why its easy to work when you have the latest technology.:)

    May be its time to update your box too.....:D:D

    --Ramesh


Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply