Records with 2 dates 48 hours from each other

Question

Post reply

Records with 2 dates 48 hours from each other

cidr

Ten Centuries

Points: 1353
More actions
September 1, 2010 at 4:31 am

#237503

Hi folks and thanks in advance for any help.
This is boggling me a bit. I have a table that is to do with fault callouts for a company. If there is a fault in a shop store, an Engineer is called out and this is recorded in the table.
I've been asked to identify when a callout for the same store has been made with 48 hours of the last one. This is to show if an engineer has not corrected the problem and has had to come back out to fix it. This will evaluate Engineers performance.
Here is sample data that contains the storeID, the callout dates and always different faultIDs
FAultID StoreID Callout Date
6279170142222010-04-26 00:00:00.000
6418740142222010-06-28 00:00:00.000
10002780142222010-06-28 16:35:00.000
10003010142222010-07-01 17:08:00.000
10003950142222010-07-05 12:00:00.000
10003220142222010-07-06 07:00:00.000
10003980142222010-07-06 08:44:00.000
As you can see, there are three sets of dates that are within 48 hours of each other:
FaultIDs 641874 and 1000278 are 16.35 hours from each other.
faultIDs 1000395, 1000322 are around 19 hours from each other.
Also, 1000322 and 1000398 are aound 1.44 hours from each other.
These five records would be shown with the callout notes to determine if an engineer has not corrected the fault properly and has had to come back out.
It should look like this
FAultID StoreID Callout Date
6418740142222010-06-28 00:00:00.000
10002780142222010-06-28 16:35:00.000
10003950142222010-07-05 12:00:00.000
10003220142222010-07-06 07:00:00.000
10003980142222010-07-06 08:44:00.000
FaultID 627917 and 1000301 are not included in this report because they're outwith 48 hours of any other dates in the column for that particular store number.
I'm very unsure how to do grab only records that fit the above criteria. Is there anyone that can help me?
Many thanks

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

matt6288 SSCertifiable Points: 5058 More actions · Answer 1

I haven't checked performance on this but it should work:

CREATE TABLE #temp

(

faultID INT,

storeID VARCHAR(6),

calloutDate DATETIME

)

INSERT INTO #temp

SELECT 627917, '014222', '04/26/10'

UNION ALL

SELECT 641874, '014222','06/28/10'

UNION ALL

SELECT 1000278, '014222', '06/28/10 16:35'

UNION ALL

SELECT 1000307, '014222','07/01/10 17:08'

UNION ALL

SELECT 1000395, '014222','07/05/10 12:00'

UNION ALL

SELECT 1000322, '014222','07/06/10 7:00'

UNION ALL

SELECT 1000398, '014222','07/06/10 8:44'

SELECT * FROM #temp T

WHERE EXISTS (SELECT * FROM #temp T2

WHERE t.storeID = t2.storeID

AND t2.calloutDate > t.calloutDate

AND t2.calloutDate < DATEADD(hh,48,t.calloutDate))

union

SELECT * FROM #temp T

WHERE EXISTS (SELECT * FROM #temp T2

WHERE t.storeID = t2.storeID

AND t2.calloutDate < t.calloutDate

AND t2.calloutDate > DATEADD(hh,-48,t.calloutDate))

ORDER BY T.calloutDate

DROP TABLE #temp

cidr Ten Centuries Points: 1353 More actions · Answer 2

Hey Matt, very smart, thanks a lot for this. It seems to work and performance seems ok

🙂

WayneS SSC Guru Points: 95461 More actions · Answer 3

This form of a query is known as a "triangular join". You might want to check out this article[/url] for clues on why this query just might be the worse thing you've ever done to your server. If you are running this on a large table, you can quite literally bring the server to it's knees.

Edit: Just an FYI: I ran this query against those 7 rows of sample data... it had to touch 98 rows to get the answer.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

WayneS SSC Guru Points: 95461 More actions · Answer 4

Okay, so now that I blasted the previous query, here's one that will perform better.

Remarks are within the code.

if OBJECT_ID('tempdb..#temp') IS NOT NULL DROP TABLE #temp;

CREATE TABLE #temp

(-- need to sequential number the rows, and have a clustered index on it

RowID INT IDENTITY PRIMARY KEY CLUSTERED,

faultID INT,

storeID VARCHAR(6),

calloutDate DATETIME);

INSERT INTO #temp

SELECT 627917, '014222', '04/26/10'

UNION ALL

SELECT 641874, '014222','06/28/10'

UNION ALL

SELECT 1000278, '014222', '06/28/10 16:35'

UNION ALL

SELECT 1000307, '014222','07/01/10 17:08'

UNION ALL

SELECT 1000395, '014222','07/05/10 12:00'

UNION ALL

SELECT 1000322, '014222','07/06/10 7:00'

UNION ALL

SELECT 1000398, '014222','07/06/10 8:44'

-- get the results

-- this first query gets the first record if it's within 48 hours of the next

SELECT t1.*

FROM #temp t1

JOIN #temp t2

ON t2.RowID = t1.RowID + 1

AND DATEDIFF(hour, t1.calloutDate, t2.calloutDate) <= 48

UNION -- to eliminate any duplicates, do not use UNION ALL

-- this second query gets the second record if it's within 48 hours of the previous

SELECT t1.*

FROM #temp t1

JOIN #temp t2

ON t2.RowID +1= t1.RowID

AND DATEDIFF(hour, t2.calloutDate, t1.calloutDate) <= 48

For a comparison, this has to touch just 20 records. Almost 20% of the other query.

From the IO stats: 2 scans vs 4.

Edit: I forgot to note that there is probably an even more efficient way to do this, but I can't quite figure it out right now.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

matt6288 SSCertifiable Points: 5058 More actions · Answer 5

Wayne,

Thanks for pointing out the triangular join I had going on. I never see them until someone points them out. I can definitely see the benefits you show in your query but it looks like that assumes order in the table and only 1 storeID exisiting. Since this is in the SQL 7 or 2000 section would the next thing to use would be a numbers table to join to the previous/next row.

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 6

This recent article:

Linking To Previous Row

If you're in 2k5 or greater, this is an elegant solution to your self referencing issues.

[edit] Sorry, wrong section for that solution. [/edit]

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

WayneS SSC Guru Points: 95461 More actions · Answer 7

matt6288 (9/1/2010)
Wayne,
Thanks for pointing out the triangular join I had going on. I never see them until someone points them out.

No problem. (Just look for < or >)

I can definitely see the benefits you show in your query but it looks like that assumes order in the table and only 1 storeID exisiting. Since this is in the SQL 7 or 2000 section would the next thing to use would be a numbers table to join to the previous/next row.

The code I used should be SQL 7/2000 compatible. However, you are right - I don't think I considered different storeIDs. You guarantee the order by the identity column in the temp table - you just need to ensure that the select that populates it is ordered by storeID/Date.

I first came up with a "quirky update" solution, but then realized that this was only getting me the rows that were 48 hours after the previous - it wasn't getting the previous row unless it also was within 48 hours of it's previous row. And this did handle multiple storeIDs. Looks like I might have to revisit it... I'll be back with a better version shortly.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

cidr Ten Centuries Points: 1353 More actions · Answer 8

Just to let you know. I tried this one from another forum and it touched 56 rows. I've not tried your method yet Wayne, but I wil let you know how I get on.

create table#Sample

(

FaultID INT NOT NULL,

StoreID INT NOT NULL,

CallOut SMALLDATETIME NOT NULL

)

INSERT#Sample

SELECT 627917, 14222, '2010-04-26 00:00' UNION ALL

SELECT 641874, 14222, '2010-06-28 00:00' UNION ALL

SELECT1000278, 14222, '2010-06-28 16:35' UNION ALL

SELECT1000301, 14222, '2010-07-01 17:08' UNION ALL

SELECT1000395, 14222, '2010-07-05 12:00' UNION ALL

SELECT1000322, 14222, '2010-07-06 07:00' UNION ALL

SELECT1000398, 14222, '2010-07-06 08:44'

-- Solution here

SELECT DISTINCTx.FaultID,

x.StoreID,

x.CallOut

FROM(

SELECTFaultID,

StoreID,

CallOut,

DATEADD(HOUR, -48, CallOut) AS FromTime,

DATEADD(HOUR, 48, CallOut) AS ToTime

FROM#Sample

) AS s

INNER JOIN#Sample AS x ON x.StoreID = s.StoreID

WHEREx.CallOut BETWEEN s.FromTime AND s.ToTime

AND x.FaultID <> s.FaultID

Mark Cowne One Orange Chip Points: 26990 More actions · Answer 9

cidr (9/2/2010)
Just to let you know. I tried this one from another forum and it touched 56 rows. I've not tried your method yet Wayne, but I wil let you know how I get on.
create table#Sample
(
FaultID INT NOT NULL,
StoreID INT NOT NULL,
CallOut SMALLDATETIME NOT NULL
)
INSERT#Sample
SELECT 627917, 14222, '2010-04-26 00:00' UNION ALL
SELECT 641874, 14222, '2010-06-28 00:00' UNION ALL
SELECT1000278, 14222, '2010-06-28 16:35' UNION ALL
SELECT1000301, 14222, '2010-07-01 17:08' UNION ALL
SELECT1000395, 14222, '2010-07-05 12:00' UNION ALL
SELECT1000322, 14222, '2010-07-06 07:00' UNION ALL
SELECT1000398, 14222, '2010-07-06 08:44'
-- Solution here
SELECT DISTINCTx.FaultID,
x.StoreID,
x.CallOut
FROM(
SELECTFaultID,
StoreID,
CallOut,
DATEADD(HOUR, -48, CallOut) AS FromTime,
DATEADD(HOUR, 48, CallOut) AS ToTime
FROM#Sample
) AS s
INNER JOIN#Sample AS x ON x.StoreID = s.StoreID
WHEREx.CallOut BETWEEN s.FromTime AND s.ToTime
AND x.FaultID <> s.FaultID

This may be slightly better

SELECT x.FaultID,

x.StoreID,

x.CallOut

FROM #Sample x

WHERE EXISTS (SELECT * FROM #Sample s

WHERE x.FaultID <> s.FaultID

AND x.CallOut BETWEEN DATEADD(HOUR, -48, s.CallOut) AND DATEADD(HOUR, 48, s.CallOut))

____________________________________________________

Deja View - The strange feeling that somewhere, sometime you've optimised this query before

How to get the best help on a forum

http://www.sqlservercentral.com/articles/Best+Practices/61537

cidr Ten Centuries Points: 1353 More actions · Answer 10

mark-101232 02/09/10
This may be slightly better
SELECT x.FaultID,
x.StoreID,
x.CallOut
FROM #Sample x
WHERE EXISTS (SELECT * FROM #Sample s
WHERE x.FaultID <> s.FaultID
AND x.CallOut BETWEEN DATEADD(HOUR, -48, s.Call

I tried this with the whole table which ouputs 5233 rows and it uses a lazy spool. I'm not really sure about lazy spools and will have to look it up but it seems Actual number of rows was 10726980 amonst other activity.

cidr Ten Centuries Points: 1353 More actions · Answer 11

cidr

Ten Centuries

Points: 1353

September 2, 2010 at 5:54 am

#1215527

Repeat post mistake