December 3, 2018 at 11:25 pm
An example of correctness is: 1Z3V1F0359168993, with one tracking_number sharing the same ship_date.
Rule:
(1) if > 1 tracking_number has >1 ship_dates, then take the ship_date having earliest Date_submitted_to_carrier .
(2) if > 1 tracking_number has >1 ship_dates and earliest Date_submitted_to_carrier are same, take the earliest ship_date.
(3) if claim type for same tracking number are CCNR and Late, use CCNR.
(4) if claim type for same tracking number are all PDMM, use PDMM
And
(5) If ship_dates are more than 15 days apart for same tracking_number then they are different orders (tracking_numbers were reused) eg. 1Z90134200324274. Only update those tracking_numbers having different ship_dates, if ship_dates are within 15 days of each other ( don't know how to do this )
sample dataset (DDL below)
Select 'tracking_Number', 'ship_date', 'claim_type', 'date_submitted_to_carrier' UNION ALL
select '1Z90124202029501','2018-09-10 00:00:00.000','Late','2018-10-30 00:00:00.000' UNION ALL
select '1Z90124202029501','2018-09-07 00:00:00.000','CCNR','2018-09-26 00:00:00.000' UNION ALL
select '1Z90134200323640','2018-09-30 00:00:00.000', 'CCNR', '2018-10-17 00:00:00.000' UNION ALL
select '1Z90134200324274','2018-01-30 00:00:00.000','CCNR', '2018-02-17 00:00:00.000' UNION ALL
select '1Z90134200324274','2018-10-01 00:00:00.000','Late','2018-10-30 00:00:00.000' UNION ALL
select '1Z3V1F0359146828','2017-12-31 00:00:00.000','PDMM','2018-01-15 00:00:00.000' UNION ALL
select '1Z3V1F0359146828','2018-01-10 00:00:00.000','CCNR','2018-01-16 00:00:00.000' UNION ALL
select '1Z3V1F0359168993','2018-01-03 00:00:00.000','CCNR','2018-01-13 00:00:00.000' UNION ALL
select '1Z3V1F0359168993','2018-01-03 00:00:00.000','CCNR','2018-01-13 00:00:00.000' UNION ALL
select '1Z3V1F0359168055', '2018-01-02 00:00:00.000','PDMM', '2018-01-13 00:00:00.000' UNION ALL
select '1Z3V1F0359168055', '2018-01-03 00:00:00.000', 'PDMM', '2018-01-13 00:00:00.000'
DDL
use testdb
go
create table Claims (tracking_Number varchar(30) not null, ship_date datetime, claim_type varchar(4), date_submitted_to_carrier datetime)
INSERT INTO [TestDB].[dbo].[Claims] ([tracking_Number],[ship_date],[claim_type],[date_submitted_to_carrier])
VALUES ('1Z90124202029501','2018-09-10 00:00:00.000','Late','2018-10-30 00:00:00.000'),
('1Z90124202029501','2018-09-07 00:00:00.000','CCNR','2018-09-26 00:00:00.000'),
('1Z90134200323640','2018-10-01 00:00:00.000','Late','2018-10-30 00:00:00.000'),
('1Z90134200323640','2018-09-30 00:00:00.000', 'CCNR', '2018-10-17 00:00:00.000'),
('1Z90134200324274','2018-01-30 00:00:00.000','CCNR', '2018-02-17 00:00:00.000'),
('1Z90134200324274','2018-10-01 00:00:00.000','Late','2018-10-30 00:00:00.000'),
('1Z3V1F0359146828','2017-12-31 00:00:00.000','PDMM','2018-01-15 00:00:00.000'),
('1Z3V1F0359146828','2018-01-10 00:00:00.000','CCNR','2018-01-16 00:00:00.000'),
('1Z3V1F0359168993','2018-01-03 00:00:00.000','CCNR','2018-01-13 00:00:00.000'),
('1Z3V1F0359168993','2018-01-03 00:00:00.000','CCNR','2018-01-13 00:00:00.000'),
('1Z3V1F0359168055', '2018-01-02 00:00:00.000','PDMM', '2018-01-13 00:00:00.000'),
('1Z3V1F0359168055', '2018-01-03 00:00:00.000', 'PDMM', '2018-01-13 00:00:00.000');
My long solution
--(3) in case of matching date_submitted_to_carrier in prior step, take min ship_date
--(4) update the original dataset
--Quote me
December 4, 2018 at 12:07 am
Why not use LAG() to get the previous ship date for a tracking number? This is ugly, but I think it'll work:SELECT x.tracking_number
, x.ship_date
, x.PrevShipDate
, DATEDIFF(day,x.PrevShipDate,x.ship_date) AS DaysBetween
FROM (
SELECT c.tracking_number
, c.ship_date
, LAG(c.ship_date,1) OVER (PARTITION BY tracking_number ORDER BY ship_date) AS PrevShipDate
FROM Claims c
) x
WHERE x.PrevShipDate IS NOT NULL;
December 4, 2018 at 12:49 am
pietlinden - Tuesday, December 4, 2018 12:06 AMWhy not use LAG() to get the previous ship date for a tracking number? This is ugly, but I think it'll work:SELECT x.tracking_number
, x.ship_date
, x.PrevShipDate
, DATEDIFF(day,x.PrevShipDate,x.ship_date) AS DaysBetween
FROM (
SELECT c.tracking_number
, c.ship_date
, LAG(c.ship_date,1) OVER (PARTITION BY tracking_number ORDER BY ship_date) AS PrevShipDate
FROM Claims c
) x
WHERE x.PrevShipDate IS NOT NULL;
no, because the difference between ship_dates for same tracking number can be up to 15 days apart. If more than 15 days apart it is indicator of possibly different order. I updated DDL so that ship_date differences are not always shown as 1 day apart.
I thought something like this, but I get error:
select distinct c.tracking_number
from Claims c join
(select DISTINCT ship_date, Tracking_Number from Claims) s on c.tracking_number = s.tracking_number
where c.ship_date <> s.ship_date
AND claim_type in ('late','ccnr')
AND MAX(c.ship_date) >= DATEADD(day, -15, c.ship_date)
AND MIN(c.ship_date) <= DATEADD(day, 15, c.ship_date)
order by c.tracking_number;
error
Msg 147, Level 15, State 1, Line 4
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
--Quote me
December 13, 2018 at 1:03 am
pietlinden, I appreciate the help and intro to LAG. It would have worked if ship_dates were never more than one day apart, as I had presented my DDL.
In the end I found that there was another column that I could use as a way to distinguish between two different use phases of the same tracking number, called order_id (which I did not have in the DDL).
I was then able to join to correlated subquery on both tracking_number and order_id to make sure I was identifying the ones with issue and also taking the min(ship_date) for the right set of tracking numbers:
IF OBJECT_ID('tempdb..#bad_shipdate1') IS NOT NULL DROP TABLE #bad_shipdate1
select distinct c.tracking_number, c.order_id
into #bad_shipdate1
from Claims c join
(select DISTINCT ship_date, Tracking_Number, order_id from Claims) s on c.tracking_number = s.tracking_number and c.order_id = s.order_id
where c.ship_date <> s.ship_date
order by c.tracking_number;
go
select * from #bad_shipdate1;
IF OBJECT_ID('tempdb..#good_shipdate1') IS NOT NULL DROP TABLE #good_shipdate1
select DISTINCT c1.Tracking_Number, c1.Ship_Date, c1.order_id
INTO #good_shipdate1
FROM Claims c1 join #bad_shipdate1 bad on c1.Tracking_Number = bad.Tracking_Number and c1.order_id = bad.order_id
WHERE c1.Date_Submitted_to_Carrier = (select TOP 1 Date_Submitted_to_Carrier
FROM Claims c2 where c1.Tracking_Number = c2.Tracking_Number and c1.order_id = c2.order_id
ORDER BY Tracking_Number, order_id, Date_Submitted_to_Carrier ASC);
select * from #good_shipdate1
IF OBJECT_ID('tempdb..#final_shipdate1') IS NOT NULL DROP TABLE #final_shipdate1
Select tracking_number, order_id, min(ship_date) as ship_date
into #final_shipdate1
from #good_shipdate1
group by tracking_Number, order_id;
select * from #final_shipdate1
Update sc
set sc.ship_date = f.ship_date
from #final_shipdate1 f join
#sc sc on f.tracking_Number = sc.tracking_Number and f.order_id = sc.order_id
select * from #sc
Thanks for taking time to reply!
--Quote me
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply