I want to left join a single time...

Question

I want to left join a single time...

Tim Howell

Grasshopper

Points: 13
More actions
October 21, 2006 at 12:55 am

#114890

Part of my database is structured as follows:
REG table has a column called REG_INVOICE_ID, which is a unique key, and REG_PID, which identifies which user owns that REG record.
INVOICE table has a primary key (INVOICE_KEY) and several other columns, including INVOICE_ID (references REG_INVOICE_ID) and INVOICE_AMOUNT.
PAYMENT table has a primary key and several other columns, including PAYMENT_INVOICE_ID (references REG_INVOICE_ID/INVOICE_ID) and PAYMENT_AMOUNT.
I want a query that given a user (REG_PID) returns INVOICE_ID, SUM(INVOICE_AMOUNT), and SUM(INVOICE_AMOUNT) - SUM(PAYMENT_AMOUNT) for each invoice for that user. I've tried the following, but it doesn't work because a single payment gets joined (and therefore summed) multiple times if there are multiple records with the same INVOICE_ID.
I'd really appreciate any help, =)
--TWH
SELECT INVOICE_ID,SUM(INVOICE_AMOUNT),(SUM(INVOICE_AMOUNT) - (CASE WHEN SUM(PAYMENT_AMOUNT) IS NULL THEN 0 ELSE SUM(PAYMENT_AMOUNT) END)) AS invoice_balance
FROM REG
JOIN INVOICE
ON REG_INVOICE_ID=INVOICE_ID
LEFT JOIN PAYMENT
ON INVOICE_ID=PAYMENT_INVOICE_ID
WHERE REG_PID=some_pid
GROUP BY INVOICE_ID;

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Michael Meierruth SSChampion Points: 10051 More actions · Answer 1

Sometime the best way to see what you are selecting is to remove the SUM and GROUP BY stuff. Thus when I throw some data at it (see the end of this note) and run the query

SELECT REG_PID,INVOICE_ID,INVOICE_AMOUNT,PAYMENT_AMOUNT

FROM REG

JOIN INVOICE ON REG_INVOICE_ID=INVOICE_ID

LEFT JOIN PAYMENT ON INVOICE_ID=PAYMENT_INVOICE_ID

WHERE REG_PID=1

I get

1 1 100 400

1 1 100 500

1 1 200 400

1 1 200 500

1 1 300 400

1 1 300 500

which is obviously wrong because there is a kind of serial effect going on with these JOINS, i.e. for every REG-INVOICE link there is an INVOICE-PAYMENT link. I don't know if this is the best approach, but I would summarize first and then join as in:

SELECT INVOICE_ID,SUM(INVOICE_AMOUNT),

(SUM(INVOICE_AMOUNT) - (CASE WHEN SUM(PAYMENT_AMOUNT) IS NULL THEN 0 ELSE SUM(PAYMENT_AMOUNT) END)) AS invoice_balance

FROM REG

JOIN

(

select INVOICE_ID,sum(INVOICE_AMOUNT) INVOICE_AMOUNT

from INVOICE

group by INVOICE_ID

) i on REG_INVOICE_ID=i.INVOICE_ID

RIGHT JOIN

(

select PAYMENT_INVOICE_ID,sum(PAYMENT_AMOUNT) PAYMENT_AMOUNT

from PAYMENT

group by PAYMENT_INVOICE_ID

) p on REG_INVOICE_ID=p.PAYMENT_INVOICE_ID

WHERE REG_PID=1

GROUP BY INVOICE_ID

Hope this gets you on the right track.

Data used:

drop table REG

create table REG

(

REG_INVOICE_ID int,

REG_PID int

)

insert into REG values(1,1)

drop table INVOICE

create table INVOICE

(

INVOICE_KEY int,

INVOICE_ID int,

INVOICE_AMOUNT int

)

insert into INVOICE values(1,1,100)

insert into INVOICE values(2,1,200)

insert into INVOICE values(3,1,300)

drop table PAYMENT

create table PAYMENT

(

PAYMENT_KEY int,

PAYMENT_INVOICE_ID int,

PAYMENT_AMOUNT int

)

insert into PAYMENT values(1,1,400)

insert into PAYMENT values(2,1,500)

Michael Meierruth SSChampion Points: 10051 More actions · Answer 2

Oops, I meant to say LEFT JOIN and not RIGHT JOIN up there. You might also want to join the sub-selects to REG in order to be able to say WHERE REG_PID=1.

But a totally different approach making use of the UNION operator is the following:

SELECT INVOICE_ID,SUM(INVOICE_AMOUNT),

(SUM(INVOICE_AMOUNT) - (CASE WHEN SUM(PAYMENT_AMOUNT) IS NULL THEN 0 ELSE SUM(PAYMENT_AMOUNT) END)) AS invoice_balance

FROM REG

JOIN

(

select INVOICE_ID,sum(INVOICE_AMOUNT) INVOICE_AMOUNT,0 PAYMENT_AMOUNT

from REG,INVOICE

where REG_INVOICE_ID=INVOICE_ID and REG_PID=1

group by INVOICE_ID

union

select PAYMENT_INVOICE_ID,0,sum(PAYMENT_AMOUNT) PAYMENT_AMOUNT

from REG,PAYMENT

where REG_INVOICE_ID=PAYMENT_INVOICE_ID and REG_PID=1

group by PAYMENT_INVOICE_ID

) a on REG_INVOICE_ID=INVOICE_ID

WHERE REG_PID=1

GROUP BY INVOICE_ID

Michael Meierruth SSChampion Points: 10051 More actions · Answer 3

By the way, when checking for nulls you are better off using the ISNULL function instead of CASE. And in the previous example it's no longer even needed.

Vladan SSC-Insane Points: 22072 More actions · Answer 4

Hmmm... can there be several invoices with the same INVOICE_ID? I would rather suppose that this does not happen - but on the other hand this could be a reason of rows multiplying in the result.

If yes, how do you know to which of the several invoices with the same INVOICE ID which payment belongs? Or don't you care about it?

Michael Meierruth SSChampion Points: 10051 More actions · Answer 5

I think the INVOICE table contains a row for each invoice line item just like PAYMENT contains a row for each payment (possibly more than one) agains a single invoice without indicating the invoice line item against which payment is made.

This creates two one-to-many links on the same key (INVOICE_ID) and will play havoc.

Vladan SSC-Insane Points: 22072 More actions · Answer 6

Yeah, that would explain it. I just didn't think of the table INVOICE as of table of invoice lines(details), supposed that it is a header because of the direct link to REG table. But maybe there isn't header table at all, or the example was simplified.

Sergiy SSC Guru Points: 110209 More actions · Answer 7

Again, wrong database design returns problems.

Each entity must have a table. YOu don't have table for Invoices. Only for Invoice Rows.

You must create it.

Will you create it once as a static table or every time you are doing SELECT as a deruived table - it's up to you.

If you can afford not to care about performance you may not bother to think about permanent solution.

_____________
Code for TallyGenerator

Michael Meierruth SSChampion Points: 10051 More actions · Answer 8

Bad database design or not, the problem remains the same. It would be easy to come up with a good database design of an arbitrary problem where you will run into, what I call, the dual one-to-many problem, i.e. table 1 has a one-to-many link to table 2 and table 1 has a one-to-many link to table 3.