August 15, 2006 at 1:00 pm
Hi,
I'm trying to only return all customers that have had a payment in the last two years. The problem is that my query is returning the customer multiple times for every payment received. How do I only return the customer once if they have a payment received in the past two years:
Customer Table: CustID, CustNamePayment Table: DatePayment My query:select customer.custID, customer.CustNamefrom customer join payment oncustomer.CustID = payment.CustIDwhere payment.DatePayment > '2004-08-01'and (customer.CustID like '10%' or customer.CustID like '20%')My results are:
100288 Bell Products 2004-08-03100288 Bell Products 2004-08-10100288 Bell Products 2004-08-15And I only want 1 row per customer to return.
Any help would be greatly appreaciated!
Isabelle
Thanks!
Bea Isabelle
August 15, 2006 at 1:09 pm
it is because every row in your join meets the condition in the where clause, not just the most recent payment. You need to either use DISTINCT, or change your grouping to something like below, so you're only looking at the most recent payment from each customer.
select c.custID, max(c.custname), max(p.DatePayment)
from customer c
join payment p on p.custID = c.custID
group by c.custID
having max(p.DatePayment) > '2004-08-01'
August 15, 2006 at 1:16 pm
Or you could do this:
select
cust.custID,
cust.CustName
from
dbo.customer cust
inner join dbo.payment pay
on (cust.CustID = pay.CustID)
where
pay.DatePayment > '2004-08-01'
and (cust.CustID like '10%' or cust.CustID like '20%')
and pay.DatePayment = (select max(p.DatePayment) from dbo.payment p where p.CustID = pay.CustID)
August 15, 2006 at 1:22 pm
Or another option that should perform well if column payment.CustID is the 1st column in an index:
SELECT c.custID, c.CustName
FROM customer As c
WHERE (c.CustID like '10%' or c.CustID like '20%')
AND EXISTS (
SELECT *
FROM payment As p
WHERE p.DatePayment > '2004-08-01'
AND p.CustID = c.CustID
)
August 15, 2006 at 1:27 pm
True. But that also depends on the actual query and what is wanted. We only see two columns in the query on-line, but what if they wanted to most recent payment.
August 15, 2006 at 2:46 pm
Well, I tried it
(and a.date_payment = (select max(a.date_payment) from dbo.appayrh
where p.vendor_code = a.vendor_code)
and got an error:
Server: Msg 147, Level 16, State 2, Line 1
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
I used a smaller version of my actual query because I have alot of other columns that are being selected. I thought about using distinct, but because of all the other columns, I was getting weird results.
Here is my full statement:
select p.vendor_code, p.vendor_name, p.vendor_addr_l1, p.vendor_addr_l2, p.vendor_addr_l3, p.vendor_addr_l4, p.vendor_addr_l5,
p.vendor_addr_l6, p.vendor_addr_l7, p.vendor_addr_l8, p.vendor_phone, p.vendor_remit_addr_l1, p.vendor_remit_addr_l2,
p.vendor_remit_addr_l3, p.vendor_remit_addr_l4, p.vendor_remit_addr_l5,
a.date_payment
from dbo.pocvnms p inner join dbo.appayrh a on
(p.vendor_code = a.vendor_code)
where p.vendor_code not in
(select p.vendor_code
where p.vendor_addr_l1 is null
and p.vendor_addr_l2 is null
and p.vendor_addr_l3 is null
and p.vendor_addr_l4 is null
and p.vendor_addr_l5 is null
and p.vendor_addr_l6 is null
and p.vendor_addr_l7 is null
and p.vendor_addr_l8 is null
and p.vendor_phone is null
and p.vendor_remit_addr_l1 is null
and p.vendor_remit_addr_l2 is null
and p.vendor_remit_addr_l3 is null
and p.vendor_remit_addr_l4 is null
and p.vendor_remit_addr_l5 is null)
and (p.vendor_code like '10%' or p.vendor_code like '20%')
and a.date_payment > '2004-08-01'
and a.date_payment = (select max(a.date_payment) from dbo.appayrh
where p.vendor_code = a.vendor_code)
Isabelle
Thanks!
Bea Isabelle
August 15, 2006 at 2:58 pm
You have included a.date_payment in the resultset.
It wasn't there in your original question. Which means you wasted effort on providing you the wrong solution.
When you put a.date_payment in the resultset, which date do you want if a customer has 2 or more payments ? The earliest date ? The most recent date ? A random date ?
August 15, 2006 at 3:02 pm
Sorry, that was a typo. I do not need the a.date_payment in the result set. I used the max(a.payment_date) and did a group by for all the other columns and it seems to have returned the correct data. I am in the process of checking it to see if it's valid.
Thanks so much for your help!
Isabelle
Thanks!
Bea Isabelle
August 15, 2006 at 3:06 pm
Reformatted your code and found what I think is the problem inthe subquery at the end. Check it out:
select
p.vendor_code,
p.vendor_name,
p.vendor_addr_l1,
p.vendor_addr_l2,
p.vendor_addr_l3,
p.vendor_addr_l4,
p.vendor_addr_l5,
p.vendor_addr_l6,
p.vendor_addr_l7,
p.vendor_addr_l8,
p.vendor_phone,
p.vendor_remit_addr_l1,
p.vendor_remit_addr_l2,
p.vendor_remit_addr_l3,
p.vendor_remit_addr_l4,
p.vendor_remit_addr_l5,
a.date_payment
from
dbo.pocvnms p inner
inner join dbo.appayrh a
on (p.vendor_code = a.vendor_code)
where
p.vendor_code not in
(select p.vendor_code
where p.vendor_addr_l1 is null
and p.vendor_addr_l2 is null
and p.vendor_addr_l3 is null
and p.vendor_addr_l4 is null
and p.vendor_addr_l5 is null
and p.vendor_addr_l6 is null
and p.vendor_addr_l7 is null
and p.vendor_addr_l8 is null
and p.vendor_phone is null
and p.vendor_remit_addr_l1 is null
and p.vendor_remit_addr_l2 is null
and p.vendor_remit_addr_l3 is null
and p.vendor_remit_addr_l4 is null
and p.vendor_remit_addr_l5 is null)
and (p.vendor_code like '10%' or p.vendor_code like '20%')
and a.date_payment > '2004-08-01'
and a.date_payment = ( select
max(a1.date_payment)
from
dbo.appayrh a1
where
p.vendor_code = a1.vendor_code)
hth
August 15, 2006 at 3:07 pm
Careful with my code above, I just noticed the inner on the inner join typed twice!
August 15, 2006 at 3:13 pm
Now I'm getting two different results
When I run yours, I get 3,147 rows. When I run mine:
select p.vendor_code, p.vendor_name, p.vendor_addr_l1, p.vendor_addr_l2, p.vendor_addr_l3, p.vendor_addr_l4, p.vendor_addr_l5,
p.vendor_addr_l6, p.vendor_addr_l7, p.vendor_addr_l8, p.vendor_phone, p.vendor_remit_addr_l1, p.vendor_remit_addr_l2,
p.vendor_remit_addr_l3, p.vendor_remit_addr_l4, p.vendor_remit_addr_l5,
max(a.date_payment)
from dbo.pocvnms p inner join dbo.appayrh a on
(p.vendor_code = a.vendor_code)
where p.vendor_code NOT in
(select p.vendor_code
where p.vendor_addr_l1 is null
and p.vendor_addr_l2 is null
and p.vendor_addr_l3 is null
and p.vendor_addr_l4 is null
and p.vendor_addr_l5 is null
and p.vendor_addr_l6 is null
and p.vendor_addr_l7 is null
and p.vendor_addr_l8 is null
and p.vendor_phone is null
and p.vendor_remit_addr_l1 is null
and p.vendor_remit_addr_l2 is null
and p.vendor_remit_addr_l3 is null
and p.vendor_remit_addr_l4 is null
and p.vendor_remit_addr_l5 is null)
and (p.vendor_code like '10%' or p.vendor_code like '20%')
and a.date_payment > '2004-08-01'
group by p.vendor_code, p.vendor_name, p.vendor_addr_l1, p.vendor_addr_l2, p.vendor_addr_l3, p.vendor_addr_l4, p.vendor_addr_l5,
p.vendor_addr_l6, p.vendor_addr_l7, p.vendor_addr_l8, p.vendor_phone, p.vendor_remit_addr_l1, p.vendor_remit_addr_l2,
p.vendor_remit_addr_l3, p.vendor_remit_addr_l4, p.vendor_remit_addr_l5
order by p.vendor_code
I get 3,113 rows. Mine is doing a max(a.date_payment) instead of another subquery the way you set it up. Why the difference? Sorry for these simple questions, but I'm a newbie in this area and I'm trying to get up to speed.
Thanks!
Isabelle
Thanks!
Bea Isabelle
August 15, 2006 at 3:20 pm
I noticed at looking at the output when I ran your code and I'm seeing duplicates, which would explain the additional rows:
100032 NID PTY LTD 18 STROKES AVENUE 2006-07-31 00:00:00
100032 NID PTY LTD 18 STROKES AVENUE 2006-07-31 00:00:00
100032 NID PTY LTD 18 STROKES AVENUE 2006-07-31 00:00:00
100032 NID PTY LTD 18 STROKES AVENUE 2006-07-31 00:00:00
Isabelle
Thanks!
Bea Isabelle
August 15, 2006 at 3:33 pm
That's why the sub-query on Max() is not the correct solution. It doesn't handle customers who made 2 or more payments on the same date.
What does this do with no FROM ?
where p.vendor_code NOT in
(select p.vendor_code
where p.vendor_addr_l1 is null
August 15, 2006 at 3:43 pm
I see. The query:
where p.vendor_code NOT in
(select p.vendor_code
where p.vendor_addr_l1 is null
omits any customer who has all those fields NULL. It works the same wether I put 'from dbo.pocvnms' or not, because I already specified it in the above lines. Is that not correct?
Isabelle
Thanks!
Bea Isabelle
August 15, 2006 at 3:46 pm
>>Is that not correct?
Nope, not correct. You've structured it as a NOT IN which creates an unnecessary sub-query.
If you've already selected FROM the required table, and simply need to filter based on columns in that table, then just place the condition in the WHERE, without the IN.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply