August 29, 2008 at 5:47 am
Hello,
Situation: I've got a situation with one of our apps that approximately 50% of the time a certain query runs very long (5-10 min.). However, when I run the query on my machine it runs in under 30 seconds. When I run the query thru the app on my machine it runs in under 30 seconds. I have optimized the query as best as I can...but the main issue is why it runs long sometimes and not others. I have traced the DB on several occasions and have concrete examples of it running long and examples of it running fast.
Question: What is the best strategy for debugging this kind of intermittent issue. How do I figure in/measure other factors in my testing (i.e., db load (other queries running at same time), server load, etc.)
I mainly have experience optimizing queries that always run slow rather than this intermittent kind.
Any strategical hints or tips would be greatly appreciated.
George H.
August 29, 2008 at 6:09 am
I understand that you are running the query with the same parameters, as I remeber having performance problems with different values being passed for the query, and having different execution times. The problem was solved by manually updating statiscitcs for particular table.
Is the memory utilization on you server high?
Any chance of posting your query?
August 29, 2008 at 6:10 am
George,
it does sound like there could be execution plan issues, as I guess on your machine you have a fairly clean procedure cache, whilst the 'live' server has a relatively busy proc cache.
Having said that I don't know what your query is doing.....
A similar tale here?http://www.sqlservercentral.com/articles/Performance+Tuning/2863/
Kev
August 29, 2008 at 6:24 am
kevriley,
Interesting. So I can test this using the examples from the traces I have? If I inspect the execution plan of the fast running query and compare it to the slow running version, it might tell me something. Then try to figure out what about the slow execution plan is making it slow?
I'll give that a try and post the results...
Thanks,
George H.
August 29, 2008 at 6:27 am
Max,
Unfortunately we're a split shop here so I don't have access to the server and OS. I even have to request traces to be run for me. It definitely makes debugging harder and I have a request into the Server guys for some insight.
The query does take different parameters and values...I'll start in that area (I don't know why I didn't think of that myself)...thanks for the idea!
I'm not sure if posting the query would really help...I'm looking more for a hi-level plan to follow or debug strategy rather than specific fixes for this one query. Especially since I have other queries with the same issue (teach a man to fish...)
Thanks again,
George H.
August 29, 2008 at 6:33 am
George,
sometimes a good indication of having a bad plan cached is that the query miraculously works fine after a reboot (or a DBCC FREEPROCCACHE), but I'm guessing from your earlier posts that you have neither the access nor the inclination to do these to a live server!!
Kev
August 29, 2008 at 8:19 am
Sounds like it could be parameter sniffing.
http://sqlinthewild.co.za/index.php/2007/11/27/parameter-sniffing/
http://sqlinthewild.co.za/index.php/2008/02/25/parameter-sniffing-pt-2/
http://sqlinthewild.co.za/index.php/2008/05/22/parameter-sniffing-pt-3/
If you post the query, we could point out potential problems with it, give you an idea what to look for, and why with the other procs.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 29, 2008 at 8:54 am
OK, I'll post the query...but it's not pretty. I've made a little progress adding one index, but it's still running great and then it runs long. I've been running it at around 1 second this morning, but I just kicked it off again and it took 2 min. 21 seconds...
select
tab.TreatmenRegimenNumber,
tab.RevisionNumber,
tab.Protocol,
'8/29/2008' as 'Treatment Date',
tab.PIName 'PI Name',
tab.Species,
tab.ABSL,
tab.CageCardid 'Cage Card',
case tab.usdacovered
when 0 then tab.secondarycagecardID
when 1 then convert(varchar,tab.AnimalID)
end AS 'ID',
'No Action' as 'status',
tab.Description AS Treatment
from
(
select
sp.usdacovered,
a.secondarycagecardID,
a.AnimalID,
a.TreatmenRegimenNumber,
a.RevisionNumber,
a.ProtocolNumber AS Protocol ,
C.LastName + ', ' + C.FirstName AS PIName,
Sp.SpeciesDescription AS Species,
a.CageCardID ,
a.CreatedDate,
a.VivariumCode,
a.RoomCardID,
a.TreatmentSheetType,
a.ClinicalChargeCode,
a.NumberOfTimes,
a.Instructions,
a.NoCharge,
a.ModifiedBy,
a.ModifiedDate,
isnull( A.NextInstanceDate ,T.FutureDate) as 'TreatMentDate' ,
A.startdate ,
A.enddate,
V.VivariumName,
S.ULARCodeDescription ,
R.RoomCode,
CH.DESCRIPTION ,
a.ProtocolNumber,
P.ABSLCode 'ABSL',
a.holidayrule,
S.ULARCode,
SpeciesOrderCode
from TreatmentRegimen A
Left Join CLINICALCHARGECODES CH
on A.CLINICALCHARGECODE=CH.CLINICALCHARGECODE
left join TreatmentRegimenFutureDates T
on A.TreatmenRegimenNumber = T.TreatmenRegimenNumber
And A.RevisionNumber = T.RevisionNumber
and T.futuredate =
(
select max(futuredate)
from TreatmentRegimenFutureDates
where futuredate ='2008-08-29 00:00:000'
)
and A.endDate > ='2008-08-29 00:00:000'
Left Join Protocols P
on a.ProtocolNumber=P.ProtocolNumber
Left Join Contacts C
on P.PIPennID=C.PennID
Left Join Species Sp
on a.SpeciesCode=SP.SpeciesCode
left join vivarium v
on a.VivariumCode=V.VivariumCode
join
(
select ULARCode,ULARCodeDescription
from ularcodes
where ULARCodeType=7
) as s
on A.TreatmentSheetType = s.ULARCode
left join rooms R
on A.RoomCardID =R.RoomCardID
where
a.RevisionNumber=
(
select max(b.RevisionNumber)
from TreatmentRegimen b
group by B.TreatmenRegimenNumber
having b.TreatmenRegimenNumber=a.TreatmenRegimenNumber
)
and A.regimenStatus is null
and
isnull( A.NextInstanceDate ,T.FutureDate) >= a.startdate
) as tab
where
(Tab.TreatMentDate ='2008-08-29 00:00:000')
and
convert(varchar,tab.TreatmenRegimenNumber)+ ','+convert(varchar,tab.RevisionNumber)+',' + convert(varchar,Tab.TreatMentDate)
not in /* REG/REV/DATE COMBO CAN'T ALREADY BE IN TREATMENT INSTANCES */
(
select convert(varchar,TreatmenRegimenNumber)+ ','+convert(varchar,RevisionNumber)+','+convert(varchar,TreatMentDate)
from Treatmentinstances
where treatmentdate = '2008-08-29 00:00:000'
)
and convert(varchar,tab.TreatmenRegimenNumber)+ ','+convert(varchar,tab.RevisionNumber)
not in
(
select convert(varchar,TreatmenRegimenNumber)+ ','+ convert(varchar,RevisionNumber)
from TreatmentRegimen TR
join holidays h
on TR.scheduleinstancedate = h.[date]
and TR.holidayrule = 6
)
and convert(varchar,tab.TreatmenRegimenNumber)+ ','+convert(varchar,tab.RevisionNumber)
not in
(
select convert(varchar,TreatmenRegimenNumber)+ ','+ convert(varchar,RevisionNumber)
from TreatmentRegimenfuturedates
where futuredate in
(
select date from holidays
where futuredate = '2008-08-29 00:00:000'
)
and tab.holidayrule=6
)
and tab.enddate>='2008-08-29 00:00:000'
/*NOTE: after this point additional restrictions can be added based on the search criteria from the app */
And (tab.TreatmentDate < = '2008-08-29 00:00:000')
And tab.ULARCode = '1'
And tab.VivariumName = 'Richards'
You'll notice several joins on the inner clause that don't have corresponding select columns on the outer clause - these inner joins are used by the concatenating WHERE clause based on the search criteria from the app. For example, the vivarium (v) table doesn't have any outer clause columns in the select clause...but a vivarium where clause can be added if someone limits their search by a vivarium name or code.
August 29, 2008 at 9:12 am
One thing that may help.
convert(varchar,tab.TreatmenRegimenNumber)+ ','+convert(varchar,tab.RevisionNumber)
not in
(
select convert(varchar,TreatmenRegimenNumber)+ ','+ convert(varchar,RevisionNumber)
from TreatmentRegimen TR
join holidays h
on TR.scheduleinstancedate = h.[date]
and TR.holidayrule = 6
)
The convert and concatenation means no index seeks. You may get better performance changing that to an if exists. I say may, because the exists requires a correlated subquery... Test carefully.
EXISTS (
select 1
from TreatmentRegimen TR
join holidays h
on TR.scheduleinstancedate = h.[date]
and TR.holidayrule = 6
WHERE TR.TreatmenRegimenNumber = a.TreatmenRegimenNumber
AND TR.RevisionNumber = a.RevisionNumber
)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 29, 2008 at 9:14 am
I assume this is in a stored proc. Can you post the rest of the proc please?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 29, 2008 at 9:50 am
Gila,
Ha ha!!! If only it were in an SP. This is a system I was hired to "fix" just 10 months ago. There are numerous issues with the system including limited usage of SPs. No...it resides in a VB.NET form and gets passed to the DB via ExecuteNonQuery.
The entire process actually consists of over 10 large SQL statements. I have traced the whole process numerous times and this query is the longest, which is why I'm attacking it first.
I, personally, would move all of these into SPs but I haven't been allocated the time.
However, my primary issue is finding out how to debug a query that runs fast sometimes and slow other times....I'm looking more for the theoretical steps one would take rather than anything specifically related to this query.
August 29, 2008 at 10:13 am
Ah. Fun.
How is the query constructed on the front end? Parmaterised (I somehow doubt) or concatenated strings?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 29, 2008 at 10:47 am
The worst way possible! They didn't even take the time to use parameters, which isn't hard at all to do in .Net.
I could go on a rant for hours...but hey! this is why they're paying me money, right? I just wish they'd let me rewrite a little of it.
FYI...I did find part of the query listed above that was a duplicate and removed it...sped it up from 2 minutes to less than 1 second.
I also added an index that converted 2 clustered index scans into index seeks.
August 29, 2008 at 12:57 pm
Right now I'm thinking it's poossibly got something to do with compile time. Perhaps try getting the admins to run a trace on Cache hit, cache miss, cache insert, RPC complete and TSQL: Batch complete. Get, at the minimum, the start time and the duration (not all the events have duration), and then run the query from the front end app (or get someone to). Filter the trace on the machine name that you're testing from
Do you get the sometimes fast, sometimes slow behaviour from query analyser with a single query? Is the sometimes fast, sometimes slow with the same query?
The thing with the proc cache, around adhoc code, is that two queries must match identically, down to the white space, for SQL to consider them the same and reuse the plan. SQL probably won't auto-parameterise something that complex, so there's no chance of a benefit there.
I'll dig up a query for the 2000 plan cache for you over the weekend, so you can see what's in the cache.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 2, 2008 at 6:27 am
Gail,
Thanks for the help.
Whenever I run query...whether in QA or from the front-end app itself...it runs fast. I haven't been able to get the query to run slow...but I have traces that show it running slow.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply