How to do this in SQL?

Question

Post reply

How to do this in SQL?

ngkh

Mr or Mrs. 500

Points: 531
More actions
June 2, 2003 at 4:01 pm

#357792

if given a table A
(Customer_ID, Cost, Trip_EndDate)
sorted by Customer_ID, Trip_EndDate
Customer_ID COST TRIP_ENDDATE
------------------------------------
001 200.00 01/14/1997
001 300.00 03/06/1997
001 300.00 09/06/1997
001 300.00 09/22/1997
001 300.00 09/26/1997
determine if they are the same trip by assigning a trip ID in new table called it B such that
Customer ID COST TRIP_ENDDATE TRIP_ID
------------------------------------------------
001 200.00 01/14/1997 1
001 300.00 03/06/1997 2
001 300.00 09/16/1997 3
001 300.00 09/22/1997 3
001 300.00 09/26/1997 3
determine whether each record is the same trip by
if the difference between first the records' trip_enddate with the same customer_ID is within 7 days
(e.g. 09/16/1997 and 09/22/1997 is within 7 days and then 09/22/1997 and 09/26/1997 is within 7 days) then assign same trip ID
how do you do this in SQL? does a CURSOR has to be implemented?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

eljeffo Ten Centuries Points: 1040 More actions · Answer 1

Ng,

You present a unique question. I need a little more info:

You said that the CustomerID records need to match within 7 days of each other. It looks like the Cost column helps to better identify the trip too. For any given trip, would the cost be different on one of the lines?

AnzioBake SSCarpal Tunnel Points: 4134 More actions · Answer 2

I believe this is what you are looking for

use pubs

go

-------------------------------------------------

Create table taba(

Customer_ID int

,COST money

,TRIP_ENDDATE datetime

)

Insert into taba select 001, 200.00, '01/14/1997'

Insert into taba select 001, 300.00, '03/06/1997'

Insert into taba select 001, 300.00, '09/06/1997'

Insert into taba select 001, 300.00, '09/22/1997'

Insert into taba select 001, 300.00, '09/26/1997'

--determine if they are the same trip by assigning a trip ID in new table called it B such that

----------------------------------------------------

Create table tabb(

Customer_ID int

,COST money

,TRIP_ENDDATE datetime

,TRIP_ID int

)

------------------------------------------------

INsert into tabB select 001, 200.00, '01/14/1997', 1

INsert into tabB select 001, 300.00, '03/06/1997', 2

INsert into tabB select 001, 300.00, '09/16/1997', 3

INsert into tabB select 001, 300.00, '09/22/1997', 3

INsert into tabB select 001, 300.00, '09/26/1997', 3

Select b.*, a.*

from tabb as b

Inner join taba as a

on a.Customer_ID = b.Customer_ID

AndAbs( datediff( day, a.TRIP_ENDDATE, b.TRIP_ENDDATE ) ) <= 7

DRop table taba

DRop table tabb

Antares686 SSC Guru Points: 125444 More actions · Answer 3

You will need to use at least a WHILE loop or a CURSOR to get the data properly. However, if you need to perform this often I would add a column to Table A and create a Trigger to do this each time a record is created to keep up.

NPeeters SSChampion Points: 12541 More actions · Answer 4

I don't see a solution to your problem in a single query. The problem lies in the assignment of a unique Trip_ID in the new table.

However, you could do it in two goes. First insert one record for each trip using an identity field for Trip_id. After that remove the identity (or 'set IDENTITY_INSERT OFF') and insert all records that are not yet present.

The query that was posted by AnzioBake is a good start to identify records belonging to the same trip.

tkbr0wn Ten Centuries Points: 1092 More actions · Answer 5

Here is another stab at it. I am assuming you will not have a customer with two trips ending on the same day. If so you need some sort of unique key on this table to help with the id generation.

drop table test1

go

create table test1 (customer_id int, cost float, trip_enddate datetime)

go

insert into test1

select1,200.00,'1/14/1997'

union all

select1,300.00,'3/06/1997'

union all

select1,300.00,'9/16/1997'

union all

select1,300.00,'9/22/1997'

union all

select1,300.00,'9/26/1997'

union all

select1,300.00,'10/26/1997'

union all

select1,300.00,'10/27/1997'

union all

select1,300.00,'10/30/1997'

union all

select1,300.00,'11/26/1997'

go

drop table test2

go

create table test2 (customer_id int, cost float, trip_enddate datetime, trip_id int not null)

go

insert into test2 (customer_id, cost, trip_enddate, trip_id)

select customer_id,

cost,

trip_enddate,

(select count(*) from test1 as b where a.customer_id = b.customer_id and a.trip_enddate >= b.trip_enddate)

from test1 a

go

update t2

set t2.trip_id = dt.dup_trip_id

from test2 t2

join (select a.customer_id,

a.cost,

a.trip_enddate,

a.trip_id as current_trip_id,

(select min(trip_id)

from test2 c

where c.customer_id = a.customer_id

and datediff(day,c.trip_enddate, b.trip_enddate) <= 7

and c.trip_enddate < a.trip_enddate) as dup_trip_id

from test2 a

join test2 b

onb.customer_id = a.customer_id

anddatediff(day,b.trip_enddate, a.trip_enddate) <= 7

and b.trip_enddate < a.trip_enddate

) as dt

on dt.customer_id = t2.customer_id

and dt.current_trip_id = t2.trip_id

go

select * from test2

go

tkbr0wn Ten Centuries Points: 1092 More actions · Answer 6

Another quick note about my previous post. The business rule my script follows is that trips after the first duplicate must be within seven days of the first duplicate to also be considered duplicates. In other words, trip #1 is on 01/01, trip#2 is on 01/05 (duplicate), trip #3 is on 01/10 (duplicate), trip #4 is on 1/12 (duplicate) trip #5 is on 1/13 (not a duplicate since it is seven days from first dup). If this is no good, you are going to have to loop!