September 27, 2022 at 1:17 pm
Guys! For the three customers in Table A I want to get the amount of the nearest date from Table B, which is based on the date in Table A. Not really sure how to describe it but I hope the example and table "Expected output" will clarify what I want to do.
Example: For the Customer 1 in Table A I want return the amount 10 000 because 2020-11-28 is closer to 2020-10-12 than what the date of 2020-01-05 is.
Anyone who knows?
Table A:
Table B:
Expected output Table C:
September 27, 2022 at 2:01 pm
select a.CustNo,t.Amount
from TableA a
outer apply(select top 1 b.Amount
from TableB b
where b.CustNo = a.CustNo
order by abs(datediff(day,a.[Date],b.[Date]))) t(Amount)
order by a.CustNo;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 27, 2022 at 3:18 pm
Thanks, works fine!
But.... I realized that I want to do a group by on departments below - is this possible? See additional column in Table A. I didn't manage to make it by using the group by-clause. I also tried to do a subquery of your code but then I received the message 'SQL The ORDER BY clause is invalid in subqueries.
Any idea how to solve the same situation by ending up with a group by summary on departments?
Table A:
Table B:
Expected Output Table C:
September 27, 2022 at 3:52 pm
Try this
select a.Dept,SUM(t.Amount) AS Amount
from TableA a
outer apply(select top 1 b.Amount
from TableB b
where b.CustNo = a.CustNo
order by abs(datediff(day,a.[Date],b.[Date]))) t(Amount)
group by a.Dept
order by a.Dept;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 28, 2022 at 10:59 am
Works perfectly, thanks! If I want to make this code work as a subquery - is it possible?
September 28, 2022 at 1:55 pm
Just remove the 'order by'
select t.* from (
select a.Dept,SUM(t.Amount) AS Amount
from TableA a
outer apply(select top 1 b.Amount
from TableB b
where b.CustNo = a.CustNo
order by abs(datediff(day,a.[Date],b.[Date]))) t(Amount)
group by a.Dept) t;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 28, 2022 at 5:38 pm
Just to play the devil's advocate here and make you think: 🙂
How do you want to handle if the date in Table A is right in the middle between two dates in table B, i.e. both rows in table B are equally close to the date in table A?
The current solution just arbitrarily chooses one of them. Is this what you want to happen in that case?
Or should the row with the lowest or the highest amount be chosen? Or both (sum of both amounts)? Or something else?
Can table B have more than one row with the same date? If yes, how should that be handled?
September 28, 2022 at 9:57 pm
Just to play the devil's advocate here and make you think: 🙂
How do you want to handle if the date in Table A is right in the middle between two dates in table B, i.e. both rows in table B are equally close to the date in table A?
The current solution just arbitrarily chooses one of them. Is this what you want to happen in that case?
Or should the row with the lowest or the highest amount be chosen? Or both (sum of both amounts)? Or something else?
Can table B have more than one row with the same date? If yes, how should that be handled?
+1000 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
September 29, 2022 at 4:09 pm
Here is a different approach. It's better on some metrics, but worse on others. I'm not sure which will play out better in the long run.
WITH TableBRanges AS
(
SELECT *
, COALESCE(DATEADD(DAY, -DATEDIFF(DAY, LAG(tb.Dt) OVER(PARTITION BY tb.CustNo ORDER BY tb.Dt), tb.Dt)/2, tb.Dt), '19000101') AS RangeStart
, COALESCE(DATEADD(DAY, -DATEDIFF(DAY, LEAD(tb.Dt) OVER(PARTITION BY tb.CustNo ORDER BY tb.Dt), tb.Dt)/2, tb.Dt), '99991231') AS RangeEnd
FROM #TableB AS tb
)
SELECT tb.CustNo, tb.Amount
FROM #TableA AS ta
INNER JOIN TableBRanges AS tb
ON ta.CustNo = tb.CustNo
AND ta.Dt >= tb.RangeStart
AND ta.Dt < tb.RangeEnd
The main difference is that the midpoints are calculated before the join.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 3, 2022 at 2:33 pm
I am on an old Chrome book wthout any SQL on it for testing, but could we use MIN( DATEDIFF(DAY, ..) OVER OVER(PARTITION BY tb.CustNo ..) ? People forget the OVER() works with all the aggregate functions.
Please post DDL and follow ANSI/ISO standards when asking for help.
October 3, 2022 at 4:36 pm
In the beginning - I wanted to get a understanding of the basic principle, but the more I work with it I face a significant problem.
Example: For the Customer 1 in Table A I want return the amount 10 000 because 2020-11-28 is closer to 2020-10-12 than what the date of 2020-01-05 is.
For the customer nr 2 though (which basically has raised my concern), is that it has two hits in Table A. In this case, I want to start from the one which has the latest purchase date (column PurDate), which for customer nr 2 is 2018-09-18, and then return the amount 8 000 from Table B.
Someone of the experts out there who could help me with this?
Table A
Table B
Expected Output Table C
October 3, 2022 at 5:13 pm
This was removed by the editor as SPAM
October 3, 2022 at 9:25 pm
I'm still interested in the question the Kaj asked above. Basically, what do you want to do if the there's a tie of dates at one end or the other and what if there's a tie across all 3 dates. Which row to select from amongst the ties?
--Jeff Moden
Change is inevitable... Change for the better is not.
October 4, 2022 at 5:44 am
OK, I will try to make this clear. The dates in my tables have timestamps including milliseconds which makes it almost impossible that one date in Table A would get right in the middle of two dates in Table B. But if this was suppose to happen, I want to pick the amount from the latest date of the possible ones in Table B. If the latest date has exactly the same date, it won’t matter which I got returned because this will have the same Amount.
Not really sure what you mean with “across all three dates”, hope I explained above.
Worth to mention is that I basically want to use the latest purchase date from table A to start from, the possible other ones for the same customer is irrelevant.
Let me know if you have some further questions that you want me to clarify.
October 4, 2022 at 3:26 pm
OK, I will try to make this clear. The dates in my tables have timestamps including milliseconds which makes it almost impossible that one date in Table A would get right in the middle of two dates in Table B. But if this was suppose to happen, I want to pick the amount from the latest date of the possible ones in Table B. If the latest date has exactly the same date, it won’t matter which I got returned because this will have the same Amount.
Not really sure what you mean with “across all three dates”, hope I explained above.
Worth to mention is that I basically want to use the latest purchase date from table A to start from, the possible other ones for the same customer is irrelevant.
Let me know if you have some further questions that you want me to clarify.
From what you say in the first paragraph above, it sounds like you're counting on a condition that's not enforced by a unique index/constraint. It also doesn't match the data example that you presented, which uses only whole dates instead of dates and times. That was the big reason for the question. You can save folks some time in the future and get better answers more quickly by taking a little extra time to post "readily consumable data", which would also include the CREATE TABLE statements for your example tables. Please see the first link in my signature line below for one of many ways to do that.
What I meant by the 3 dates is, you have 1 date that you're using as a "base date" and you want to find the row with the nearest possibility date-wise. That means there are two dates you need to compare to... the maximum date in the table that's less than the "base date" and the minimum date in the table that's greater than the "base date".
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply