April 11, 2007 at 9:44 am
Morning all,
I think that (subject) describes this scenario???
task - for a given post date range, find variance in days (post vs dos), and sum(volume), sum(charges), datediff(days, dos,post)
schema -
pid (int) , dos (date), post (date) , part (varchar), volume (int) , charges (money)
data -
1 , 1/1/2007 , 1/2/2007, parta , 2, $40.00
1 , 1/1/2007 , 1/2/2007, parta , 2, $40.00
1 , 1/1/2007 , 1/4/2007, parta , 1, $20.00
1 , 1/1/2007 , 1/4/2007, parta , 1, $20.00
1 , 1/1/2007 , 1/4/2007, parta , 4, $80.00
1 , 1/1/2007 , 1/6/2007, parta , 3, $60.00
1 , 1/1/2007 , 1/7/2007, parta , -3, -$60.00
1 , 1/3/2007 , 1/4/2007, parta , 1, $20.00
1 , 1/3/2007 , 1/4/2007, parta , 1, $20.00
output
pid , dos , post , lag(days post-dos ) , part , sum(volume), sum(charges)
1 , 1/1/2007 , 1/2/2007 , 1, parta , 4 , $ 80.00
1 , 1/1/2007 , 1/4/2007 , 3, parta , 6 , $100.00
1 , 1/3/2007 , 1/4/2007 , 1, parta , 2 , $ 40.00
Easy? Just group by pid, dos, post, part and sum the volume,charges, right? If so, you wouldn't be reading this. I have to drop out post of 1/6 and 1/7 because they cancel. I have the logic to include only when having sum(charges) > $0 and that used to work when reversal posted same day. Now when the pair are offset by a day, The reversal on 1/7 drops out, sum(charges) > 0, out but leaves the post of 1/6 in.
Can't use temp tables as this ends up in Crystal pivot also, as well as SQL CSV. Any ideas on how to eliminate the reversal while still including the post date in the grouping?
April 11, 2007 at 10:35 am
This will work on the data given but any other alternative may not
SELECT a.pid,a.dos,a.post,DATEDIFF(day,a.dos,a.post) AS [lag],a.part,SUM(a.volume),SUM(a.charges)
FROM
a
LEFT JOIN
b
ON b.pid=a.pid
AND b.dos=a.dos
AND b.part=a.part
AND (b.post=a.post+1 OR b.post=a.post-1)
AND (b.volume+a.volume)=0
AND (b.charges+a.charges)=0
WHERE b.pid IS NULL
GROUP BY a.pid,a.dos,a.post,DATEDIFF(day,a.dos,a.post),a.part
HAVING SUM(a.charges) <> 0
Far away is close at hand in the images of elsewhere.
Anon.
April 11, 2007 at 12:07 pm
David;
Did you want to Left Join
b , then placing
b on LEFT side of each pair in clause, and still ask when
b is null?I'm not familiar with using Left Join that way.
April 12, 2007 at 2:05 am
Yes, the left join is intentional.
I used to find if there was a corresponding entry for the previous/following day with the oposite values. So for the data you posted each row would have null values for b except for
1 , 1/1/2007 , 1/6/2007, parta , 3, $60.00
1 , 1/1/2007 , 1/7/2007, parta , -3, -$60.00
where each would have the corresponding other row.
By using 'WHERE b.pid IS NULL' I exclude these rows as they are not required as you requested
As I also stated this will work on the data posted but may not if the data varies from this
p.s.
Run this query on the above data
SELECT a.*,b.*
FROM
a
LEFT JOIN
b
ON b.pid=a.pid
AND b.dos=a.dos
AND b.part=a.part
AND (b.post=a.post+1 OR b.post=a.post-1)
AND (b.volume+a.volume)=0
AND (b.charges+a.charges)=0
and you will see
Far away is close at hand in the images of elsewhere.
Anon.
April 12, 2007 at 6:58 am
David;
Ok, I'm running validations to understand. I've always used 'left join' with the table I want all data from (a) on the left side of the '=' as (a.filed = b.filed) , and check for right side (b.filed = null). Placing the b on the left interested me.
Would simply changing the (b.post = a.post+1 or b.post = a.post -1) to (b.post <> a.post) work to catch more that one day span?
April 12, 2007 at 7:10 am
...with the table I want all data from (a) on the left side of the '='... |
My preference is to put the JOIN'd table alias first but to SQL it makes no difference
(a.filed = b.filed) is the same as (b.filed = a.filed)
catch more that one day span |
Yes, providing the rest of the columns match
Far away is close at hand in the images of elsewhere.
Anon.
April 12, 2007 at 8:28 am
David;
Thanks. I know I read somewhere in SQL book about the left/right side of the join syntax, and took for granted that impacted the parser.
I've made those change and validating and it's looking great! I can use this logic elsewhaere and thats the bonus. This pulls out corrections to mistakes that impact financial reports. Upto now the reversal was always posted same day. Now there can be days/months for the reversal to post.
Again, thanks for the insight.
Randy.
April 13, 2007 at 6:20 am
Validation turned up missing record that should have been included.
ID PART DOS POST LAG VOL CHARGES
5 198 03/22/2006 04/03/2007 377 -1.0000 -115.00
5 198 03/22/2006 04/04/2007 378 -1.0000 -115.00
5 198 03/22/2006 04/03/2007 377 -1.0000 -115.00
5 198 03/22/2006 04/06/2007 380 1.0000 115.00
5 198 03/22/2006 04/06/2007 380 1.0000 115.00
5 206 03/22/2006 04/04/2007 378 -1.0000 -191.00
5 206 03/22/2006 04/03/2007 377 1.0000 191.00
5 206 03/22/2006 04/03/2007 377 1.0000 191.00
The first 5 lines for part 198 all have DOS of 3/22, even though there are 2 positive posts, but three reversals, the extra reversal should not count and not be included. The last three lines, part 206, have same DOS with 2 posts and 1 reversal, leaving one positive post, so this is a wanted record.
The code eliminated this from the results. Too bad, so close. If you take the POST date out of the select and group, having only >0 results it's perfect, just can't calculate the lag and show the POST.
April 13, 2007 at 7:17 am
You could try this
SELECT pid,part,dos,MAX(post) AS [post],DATEDIFF(day,dos,MAX(post)) AS [lag],SUM(volume),SUM(charges)
FROM
GROUP BY pid,dos,part
HAVING SUM(charges) <> 0
It will give you the greatest lag for pid,part,dos and since you have no ordering for the data (unless you failed to mention this) then the lag is imaterial as you cannot guarantee which one anyway
Far away is close at hand in the images of elsewhere.
Anon.
April 13, 2007 at 8:20 am
David;
Were you listening in? We just discussed this in my team meeting an hour ago and I'm off running a preview to get signed off, based on max lag!
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply