June 2, 2003 at 4:01 pm
if given a table A
(Customer_ID, Cost, Trip_EndDate)
sorted by Customer_ID, Trip_EndDate
Customer_ID COST TRIP_ENDDATE
------------------------------------
001 200.00 01/14/1997
001 300.00 03/06/1997
001 300.00 09/06/1997
001 300.00 09/22/1997
001 300.00 09/26/1997
determine if they are the same trip by assigning a trip ID in new table called it B such that
Customer ID COST TRIP_ENDDATE TRIP_ID
------------------------------------------------
001 200.00 01/14/1997 1
001 300.00 03/06/1997 2
001 300.00 09/16/1997 3
001 300.00 09/22/1997 3
001 300.00 09/26/1997 3
determine whether each record is the same trip by
if the difference between first the records' trip_enddate with the same customer_ID is within 7 days
(e.g. 09/16/1997 and 09/22/1997 is within 7 days and then 09/22/1997 and 09/26/1997 is within 7 days) then assign same trip ID
how do you do this in SQL? does a CURSOR has to be implemented?
June 2, 2003 at 5:22 pm
Ng,
You present a unique question. I need a little more info:
You said that the CustomerID records need to match within 7 days of each other. It looks like the Cost column helps to better identify the trip too. For any given trip, would the cost be different on one of the lines?
June 3, 2003 at 12:21 am
I believe this is what you are looking for
use pubs
go
-------------------------------------------------
Create table taba(
Customer_ID int
,COST money
,TRIP_ENDDATE datetime
)
Insert into taba select 001, 200.00, '01/14/1997'
Insert into taba select 001, 300.00, '03/06/1997'
Insert into taba select 001, 300.00, '09/06/1997'
Insert into taba select 001, 300.00, '09/22/1997'
Insert into taba select 001, 300.00, '09/26/1997'
--determine if they are the same trip by assigning a trip ID in new table called it B such that
----------------------------------------------------
Create table tabb(
Customer_ID int
,COST money
,TRIP_ENDDATE datetime
,TRIP_ID int
)
------------------------------------------------
INsert into tabB select 001, 200.00, '01/14/1997', 1
INsert into tabB select 001, 300.00, '03/06/1997', 2
INsert into tabB select 001, 300.00, '09/16/1997', 3
INsert into tabB select 001, 300.00, '09/22/1997', 3
INsert into tabB select 001, 300.00, '09/26/1997', 3
Select b.*, a.*
from tabb as b
Inner join taba as a
on a.Customer_ID = b.Customer_ID
AndAbs( datediff( day, a.TRIP_ENDDATE, b.TRIP_ENDDATE ) ) <= 7
DRop table taba
DRop table tabb
June 3, 2003 at 4:12 am
You will need to use at least a WHILE loop or a CURSOR to get the data properly. However, if you need to perform this often I would add a column to Table A and create a Trigger to do this each time a record is created to keep up.
June 3, 2003 at 4:14 am
I don't see a solution to your problem in a single query. The problem lies in the assignment of a unique Trip_ID in the new table.
However, you could do it in two goes. First insert one record for each trip using an identity field for Trip_id. After that remove the identity (or 'set IDENTITY_INSERT OFF') and insert all records that are not yet present.
The query that was posted by AnzioBake is a good start to identify records belonging to the same trip.
June 3, 2003 at 12:16 pm
Here is another stab at it. I am assuming you will not have a customer with two trips ending on the same day. If so you need some sort of unique key on this table to help with the id generation.
drop table test1
go
create table test1 (customer_id int, cost float, trip_enddate datetime)
go
insert into test1
select1,200.00,'1/14/1997'
union all
select1,300.00,'3/06/1997'
union all
select1,300.00,'9/16/1997'
union all
select1,300.00,'9/22/1997'
union all
select1,300.00,'9/26/1997'
union all
select1,300.00,'10/26/1997'
union all
select1,300.00,'10/27/1997'
union all
select1,300.00,'10/30/1997'
union all
select1,300.00,'11/26/1997'
go
drop table test2
go
create table test2 (customer_id int, cost float, trip_enddate datetime, trip_id int not null)
go
insert into test2 (customer_id, cost, trip_enddate, trip_id)
select customer_id,
cost,
trip_enddate,
(select count(*) from test1 as b where a.customer_id = b.customer_id and a.trip_enddate >= b.trip_enddate)
from test1 a
go
update t2
set t2.trip_id = dt.dup_trip_id
from test2 t2
join (select a.customer_id,
a.cost,
a.trip_enddate,
a.trip_id as current_trip_id,
(select min(trip_id)
from test2 c
where c.customer_id = a.customer_id
and datediff(day,c.trip_enddate, b.trip_enddate) <= 7
and c.trip_enddate < a.trip_enddate) as dup_trip_id
from test2 a
join test2 b
onb.customer_id = a.customer_id
anddatediff(day,b.trip_enddate, a.trip_enddate) <= 7
and b.trip_enddate < a.trip_enddate
) as dt
on dt.customer_id = t2.customer_id
and dt.current_trip_id = t2.trip_id
go
select * from test2
go
June 3, 2003 at 12:47 pm
Another quick note about my previous post. The business rule my script follows is that trips after the first duplicate must be within seven days of the first duplicate to also be considered duplicates. In other words, trip #1 is on 01/01, trip#2 is on 01/05 (duplicate), trip #3 is on 01/10 (duplicate), trip #4 is on 1/12 (duplicate) trip #5 is on 1/13 (not a duplicate since it is seven days from first dup). If this is no good, you are going to have to loop!
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply