Matching two Data Sets

  • I have two tables - one with sales and another with payments against those. The payment may not match the exact amount of sales and I have to use FIFO method to apply payments. The payment month must >= sales month.

    How can i write a query to do this? Examples are as below. Thanks in advance for help.

    Table 1

    Sales Sale DateSale Amt

    1Jun-141200

    2Oct-142400

    3Dec-14600

    4Feb-1512000

    Table 2

    Pay Month Pay YearPay Amount

    5 2014 300

    6 2014 1000

    10 2014 500

    11 2014 2000

    12 2014 300

    1 2015 900

    create table tbl1

    (

    saleNo int

    ,saleDate date

    ,saleAmt float

    )

    insert into tbl1 (saleNo, saleDate, saleAmt)

    values (1, '2014-06-01',1200)

    insert into tbl1 (saleNo, saleDate, saleAmt)

    values (2, '2014-10-01',2400)

    insert into tbl1 (saleNo, saleDate, saleAmt)

    values (3, '2014-12-01',600)

    insert into tbl1 (saleNo, saleDate, saleAmt)

    values (4, '2015-02-01',12000)

    create table tbl2

    (

    payMonth int

    ,payYear int

    ,payAmt float

    )

    insert into tbl2 (payMonth, payYear, payAmt)

    values (5, 2014,300)

    insert into tbl2 (payMonth, payYear, payAmt)

    values (6, 2014,1000)

    insert into tbl2 (payMonth, payYear, payAmt)

    values (10, 2014,500)

    insert into tbl2 (payMonth, payYear, payAmt)

    values (11, 2014,2000)

    insert into tbl2 (payMonth, payYear, payAmt)

    values (12, 2014,300)

    insert into tbl2 (payMonth, payYear, payAmt)

    values (1, 2014,900)

  • Using the DDL and example data that you provided can you provide an example of what the correct output would look like?

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • This might do it for you.

    create table #tbl1

    (

    saleNo int

    ,saleDate date

    ,saleAmt float

    )

    insert into #tbl1 (saleNo, saleDate, saleAmt)

    values (1, '2014-06-01',1200)

    insert into #tbl1 (saleNo, saleDate, saleAmt)

    values (2, '2014-10-01',2400)

    insert into #tbl1 (saleNo, saleDate, saleAmt)

    values (3, '2014-12-01',600)

    insert into #tbl1 (saleNo, saleDate, saleAmt)

    values (4, '2015-02-01',12000)

    create table #tbl2

    (

    payMonth int

    ,payYear int

    ,payAmt float

    -- Combine date parts to get a real date to work with

    ,paydate AS (CAST(DATEADD(month, payMonth-1, DATEADD(year, payYear-1900, 0)) AS DATE))

    );

    insert into #tbl2 (payMonth, payYear, payAmt)

    values (5, 2014,300)

    insert into #tbl2 (payMonth, payYear, payAmt)

    values (6, 2014,1000)

    insert into #tbl2 (payMonth, payYear, payAmt)

    values (10, 2014,500)

    insert into #tbl2 (payMonth, payYear, payAmt)

    values (11, 2014,2000)

    insert into #tbl2 (payMonth, payYear, payAmt)

    values (12, 2014,300)

    insert into #tbl2 (payMonth, payYear, payAmt)

    values (1, 2015,900);

    CREATE TABLE #tbl3

    (

    SaleNo INT

    ,SaleDate DATE

    ,SaleAmt FLOAT

    ,paydate DATE

    ,paymentAmt FLOAT

    ,RemainingSale FLOAT

    ,RemainingPmt FLOAT

    );

    DECLARE @Dummy INT = 1; -- To set @@ROWCOUNT to 1 (first pass)

    WHILE @@ROWCOUNT > 0

    BEGIN

    WITH NewTbl1 AS

    (

    SELECT saleNo, saleDate=ISNULL(paydate, SaleDate)

    ,saleAmt=ISNULL(RemainingSale, saleAmt)

    FROM #tbl1 a

    CROSS APPLY

    (

    SELECT TOP 1 RemainingSale, paydate

    FROM #tbl3 b

    WHERE a.SaleNo = b.SaleNo

    ORDER BY saleno DESC, paydate DESC

    ) b

    WHERE b.RemainingSale > 0

    UNION ALL

    SELECT saleNo, saleDate, saleAmt

    FROM #tbl1

    WHERE saleNo NOT IN (SELECT saleNo FROM #tbl3)

    ),

    NewTbl2 AS

    (

    SELECT b.paydate, payAmt=RemainingPmt

    FROM #tbl2 a

    CROSS APPLY

    (

    SELECT TOP 1 RemainingPmt, paydate

    FROM #tbl3 b

    WHERE a.paydate = b.paydate

    ORDER BY saleno DESC, paydate DESC

    ) b

    WHERE b.RemainingPmt > 0

    UNION ALL

    SELECT paydate, payAmt

    FROM #tbl2

    WHERE paydate NOT IN (SELECT paydate FROM #tbl3)

    )

    INSERT INTO #tbl3

    SELECT TOP 1 saleNo, saleDate, saleAmt, b.paydate, paymentAmt=payAmt

    ,RemainingSale = saleAmt - paidAmt

    ,RemainingPmt = payAmt - paidAmt

    FROM NewTbl1 a

    CROSS APPLY

    (

    SELECT TOP 1 paydate, payAmt

    FROM NewTbl2 b

    WHERE b.paydate > a.SaleDate

    ORDER BY b.paydate

    ) b

    CROSS APPLY

    (

    SELECT paidAmt=CASE WHEN payAmt >= saleAmt THEN saleAmt ELSE payAmt END

    ) c

    ORDER BY saleNo;

    END

    SELECT * FROM #tbl3;

    GO

    DROP TABLE #tbl1;

    DROP TABLE #tbl2;

    DROP TABLE #tbl3;

    I once saw Paul White come up with a brilliant recursive CTE for matching payments like this. Too bad I can't find that thread.

    Edit: Alan - this would be an example of a set-based loop were you to do it by customer.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I forgot that I wanted to point out a few problems with the requirements:

    - You have no primary keys on your tables.

    - Your payments table has the date split into year and month parts. You can see where I added a computed column to have a real DATE datatype to facilitate the lookups. That could probably be made persisted and indexed accordingly.

    - FLOAT is not a good choice for amount of payment. I use MONEY a lot, but DECIMAL(10,2) or something like it should work well also --- Added on edit ---

    Had I put a PRIMARY KEY on #tbl3, this would have probably resulted in the query plans adding Halloween protection. So I'd look into ways of avoiding that.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks Dwain for the solution and suggestions. I am working on understanding and adapting the solution to my data.

  • tinausa (4/29/2015)


    Thanks Dwain for the solution and suggestions. I am working on understanding and adapting the solution to my data.

    I hope it works for you. I failed to mention, there is no guarantee of performance.

    I played around yesterday with some other methods (even found the thread by Paul White I mentioned, which ended up not being relevant), but this is a tricky problem in SQL.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply