September 1, 2010 at 4:31 am
Hi folks and thanks in advance for any help.
This is boggling me a bit. I have a table that is to do with fault callouts for a company. If there is a fault in a shop store, an Engineer is called out and this is recorded in the table.
I've been asked to identify when a callout for the same store has been made with 48 hours of the last one. This is to show if an engineer has not corrected the problem and has had to come back out to fix it. This will evaluate Engineers performance.
Here is sample data that contains the storeID, the callout dates and always different faultIDs
FAultID StoreID Callout Date
6279170142222010-04-26 00:00:00.000
6418740142222010-06-28 00:00:00.000
10002780142222010-06-28 16:35:00.000
10003010142222010-07-01 17:08:00.000
10003950142222010-07-05 12:00:00.000
10003220142222010-07-06 07:00:00.000
10003980142222010-07-06 08:44:00.000
As you can see, there are three sets of dates that are within 48 hours of each other:
FaultIDs 641874 and 1000278 are 16.35 hours from each other.
faultIDs 1000395, 1000322 are around 19 hours from each other.
Also, 1000322 and 1000398 are aound 1.44 hours from each other.
These five records would be shown with the callout notes to determine if an engineer has not corrected the fault properly and has had to come back out.
It should look like this
FAultID StoreID Callout Date
6418740142222010-06-28 00:00:00.000
10002780142222010-06-28 16:35:00.000
10003950142222010-07-05 12:00:00.000
10003220142222010-07-06 07:00:00.000
10003980142222010-07-06 08:44:00.000
FaultID 627917 and 1000301 are not included in this report because they're outwith 48 hours of any other dates in the column for that particular store number.
I'm very unsure how to do grab only records that fit the above criteria. Is there anyone that can help me?
Many thanks
September 1, 2010 at 6:24 am
I haven't checked performance on this but it should work:
CREATE TABLE #temp
(
faultID INT,
storeID VARCHAR(6),
calloutDate DATETIME
)
INSERT INTO #temp
SELECT 627917, '014222', '04/26/10'
UNION ALL
SELECT 641874, '014222','06/28/10'
UNION ALL
SELECT 1000278, '014222', '06/28/10 16:35'
UNION ALL
SELECT 1000307, '014222','07/01/10 17:08'
UNION ALL
SELECT 1000395, '014222','07/05/10 12:00'
UNION ALL
SELECT 1000322, '014222','07/06/10 7:00'
UNION ALL
SELECT 1000398, '014222','07/06/10 8:44'
SELECT * FROM #temp T
WHERE EXISTS (SELECT * FROM #temp T2
WHERE t.storeID = t2.storeID
AND t2.calloutDate > t.calloutDate
AND t2.calloutDate < DATEADD(hh,48,t.calloutDate))
union
SELECT * FROM #temp T
WHERE EXISTS (SELECT * FROM #temp T2
WHERE t.storeID = t2.storeID
AND t2.calloutDate < t.calloutDate
AND t2.calloutDate > DATEADD(hh,-48,t.calloutDate))
ORDER BY T.calloutDate
DROP TABLE #temp
September 1, 2010 at 8:04 am
Hey Matt, very smart, thanks a lot for this. It seems to work and performance seems ok
🙂
September 1, 2010 at 9:56 am
This form of a query is known as a "triangular join". You might want to check out this article[/url] for clues on why this query just might be the worse thing you've ever done to your server. If you are running this on a large table, you can quite literally bring the server to it's knees.
Edit: Just an FYI: I ran this query against those 7 rows of sample data... it had to touch 98 rows to get the answer.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
September 1, 2010 at 11:39 am
Okay, so now that I blasted the previous query, here's one that will perform better.
Remarks are within the code.
if OBJECT_ID('tempdb..#temp') IS NOT NULL DROP TABLE #temp;
CREATE TABLE #temp
(-- need to sequential number the rows, and have a clustered index on it
RowID INT IDENTITY PRIMARY KEY CLUSTERED,
faultID INT,
storeID VARCHAR(6),
calloutDate DATETIME);
INSERT INTO #temp
SELECT 627917, '014222', '04/26/10'
UNION ALL
SELECT 641874, '014222','06/28/10'
UNION ALL
SELECT 1000278, '014222', '06/28/10 16:35'
UNION ALL
SELECT 1000307, '014222','07/01/10 17:08'
UNION ALL
SELECT 1000395, '014222','07/05/10 12:00'
UNION ALL
SELECT 1000322, '014222','07/06/10 7:00'
UNION ALL
SELECT 1000398, '014222','07/06/10 8:44'
-- get the results
-- this first query gets the first record if it's within 48 hours of the next
SELECT t1.*
FROM #temp t1
JOIN #temp t2
ON t2.RowID = t1.RowID + 1
AND DATEDIFF(hour, t1.calloutDate, t2.calloutDate) <= 48
UNION -- to eliminate any duplicates, do not use UNION ALL
-- this second query gets the second record if it's within 48 hours of the previous
SELECT t1.*
FROM #temp t1
JOIN #temp t2
ON t2.RowID +1= t1.RowID
AND DATEDIFF(hour, t2.calloutDate, t1.calloutDate) <= 48
For a comparison, this has to touch just 20 records. Almost 20% of the other query.
From the IO stats: 2 scans vs 4.
Edit: I forgot to note that there is probably an even more efficient way to do this, but I can't quite figure it out right now.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
September 1, 2010 at 12:26 pm
Wayne,
Thanks for pointing out the triangular join I had going on. I never see them until someone points them out. I can definitely see the benefits you show in your query but it looks like that assumes order in the table and only 1 storeID exisiting. Since this is in the SQL 7 or 2000 section would the next thing to use would be a numbers table to join to the previous/next row.
September 1, 2010 at 1:04 pm
This recent article:
If you're in 2k5 or greater, this is an elegant solution to your self referencing issues.
[edit] Sorry, wrong section for that solution. [/edit]
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
September 1, 2010 at 2:31 pm
matt6288 (9/1/2010)
Wayne,Thanks for pointing out the triangular join I had going on. I never see them until someone points them out.
No problem. (Just look for < or >)
I can definitely see the benefits you show in your query but it looks like that assumes order in the table and only 1 storeID exisiting. Since this is in the SQL 7 or 2000 section would the next thing to use would be a numbers table to join to the previous/next row.
The code I used should be SQL 7/2000 compatible. However, you are right - I don't think I considered different storeIDs. You guarantee the order by the identity column in the temp table - you just need to ensure that the select that populates it is ordered by storeID/Date.
I first came up with a "quirky update" solution, but then realized that this was only getting me the rows that were 48 hours after the previous - it wasn't getting the previous row unless it also was within 48 hours of it's previous row. And this did handle multiple storeIDs. Looks like I might have to revisit it... I'll be back with a better version shortly.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
September 2, 2010 at 2:30 am
Just to let you know. I tried this one from another forum and it touched 56 rows. I've not tried your method yet Wayne, but I wil let you know how I get on.
create table#Sample
(
FaultID INT NOT NULL,
StoreID INT NOT NULL,
CallOut SMALLDATETIME NOT NULL
)
INSERT#Sample
SELECT 627917, 14222, '2010-04-26 00:00' UNION ALL
SELECT 641874, 14222, '2010-06-28 00:00' UNION ALL
SELECT1000278, 14222, '2010-06-28 16:35' UNION ALL
SELECT1000301, 14222, '2010-07-01 17:08' UNION ALL
SELECT1000395, 14222, '2010-07-05 12:00' UNION ALL
SELECT1000322, 14222, '2010-07-06 07:00' UNION ALL
SELECT1000398, 14222, '2010-07-06 08:44'
-- Solution here
SELECT DISTINCTx.FaultID,
x.StoreID,
x.CallOut
FROM(
SELECTFaultID,
StoreID,
CallOut,
DATEADD(HOUR, -48, CallOut) AS FromTime,
DATEADD(HOUR, 48, CallOut) AS ToTime
FROM#Sample
) AS s
INNER JOIN#Sample AS x ON x.StoreID = s.StoreID
WHEREx.CallOut BETWEEN s.FromTime AND s.ToTime
AND x.FaultID <> s.FaultID
September 2, 2010 at 2:50 am
cidr (9/2/2010)
Just to let you know. I tried this one from another forum and it touched 56 rows. I've not tried your method yet Wayne, but I wil let you know how I get on.
create table#Sample
(
FaultID INT NOT NULL,
StoreID INT NOT NULL,
CallOut SMALLDATETIME NOT NULL
)
INSERT#Sample
SELECT 627917, 14222, '2010-04-26 00:00' UNION ALL
SELECT 641874, 14222, '2010-06-28 00:00' UNION ALL
SELECT1000278, 14222, '2010-06-28 16:35' UNION ALL
SELECT1000301, 14222, '2010-07-01 17:08' UNION ALL
SELECT1000395, 14222, '2010-07-05 12:00' UNION ALL
SELECT1000322, 14222, '2010-07-06 07:00' UNION ALL
SELECT1000398, 14222, '2010-07-06 08:44'
-- Solution here
SELECT DISTINCTx.FaultID,
x.StoreID,
x.CallOut
FROM(
SELECTFaultID,
StoreID,
CallOut,
DATEADD(HOUR, -48, CallOut) AS FromTime,
DATEADD(HOUR, 48, CallOut) AS ToTime
FROM#Sample
) AS s
INNER JOIN#Sample AS x ON x.StoreID = s.StoreID
WHEREx.CallOut BETWEEN s.FromTime AND s.ToTime
AND x.FaultID <> s.FaultID
This may be slightly better
SELECT x.FaultID,
x.StoreID,
x.CallOut
FROM #Sample x
WHERE EXISTS (SELECT * FROM #Sample s
WHERE x.FaultID <> s.FaultID
AND x.CallOut BETWEEN DATEADD(HOUR, -48, s.CallOut) AND DATEADD(HOUR, 48, s.CallOut))
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 2, 2010 at 4:32 am
mark-101232 02/09/10
This may be slightly better
SELECT x.FaultID,
x.StoreID,
x.CallOut
FROM #Sample x
WHERE EXISTS (SELECT * FROM #Sample s
WHERE x.FaultID <> s.FaultID
AND x.CallOut BETWEEN DATEADD(HOUR, -48, s.Call
I tried this with the whole table which ouputs 5233 rows and it uses a lazy spool. I'm not really sure about lazy spools and will have to look it up but it seems Actual number of rows was 10726980 amonst other activity.
September 2, 2010 at 5:54 am
Repeat post mistake
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply