May 2, 2007 at 1:51 pm
We have few instances that occur from time to time of a peculiar problem. Many of our apps run SQL queries using ADO from Delphi code. Some of these run fine most of the time, but occasionally start timing out. For example, there are a couple of queries which are essentially regular jobs run by a scheduler that usually run well within the timeout limit set for them (e.g. in one case a job that usually runs in about 120 - 140 seconds against a 240 second timeout suddenly starts timing out).
Sometimes, this problem fixes itself on a subsequent invocation. However, the strange thing is that if I take the SQL that it executes and run it in a Query Analyser window immediately after it has just failed, it usually runs within the expected duration, and if I cause it to run again under the control of the Delphi app, it again starts behaving properly, as if just running it once in the Query Analyser has changed something.
Obviously, it might result in some caches being populated, and I guess it will be running with a server-side cursor in Query Analyser rather than client-side as in the app. However, I don't see how this can explain this behaviour.
I've fastened onto ADO being a possible cause of this, although I could be convinced otherwise. Does anyone have any idea what is going on with this?
Thanks,
Tony
May 2, 2007 at 4:15 pm
Keep in mind Tony that the amount of time a query takes is not solely dependant on that query alone; rather, on the cumulative load that all concurrent queries are placing on the SQL Server. It is possible that when the query ran the first time and timed out, that there were other queries running at the same time against the same resources and this caused the query in question to hit the timeout threshold. You are also correct in saying that the buffer cache plays a part in this as well in that any subsequent run of the same query, whether through ADO or QA, may benefit by the data pages already existing in cache. So I would have to disagree that the problem is just ADO. It would help for you to know more about what other processes are jockeying for resources at the time of the timeout.
On another note, I couldn't help but notice that you mentioned this process may be a cursor. This could be part of the problem as cursors are in-efficient by nature. You may stand to benefit by posting your cursor to this or another thread. If it can be re-written in a set based way, someone here will help you.
May 3, 2007 at 1:01 am
> So I would have to disagree that the problem is just ADO.
Let me amplify on the evidence for this. Another instance of this same type of issue was the problem I had last night. One of the users complained that a particular report, which runs under interactive control, again from an ADO-based app, was repeatedly timing out. This report usually runs inside 30 seconds against a 2 minute timeout. At around 10pm (the quietest time of day on our SQL Server as it happens) I repeated her actions using the app and reproduced the problem. I then took the exact SQL that the application was running and ran it in a Query Analyser window - it ran in under 30 seconds. I then went back to the application, and, hey presto, the report was running in under 30 seconds again. In addition, when it is having problems, I run my script for identifying blocked SPID chains (using sysprocesses..master.blocked) and it doesn't show anything.
This is a pattern that repeats itself with this problem (which actually only seems to occur every few weeks) - when it's broken, it seems to stay broken when I run it from within the app. First time I run it in Query Analyser, it's fine, and then it works thereafter in the app. All of this seems to point very strongly away from it being a straightforward locking contention problem.
> you mentioned this process may be a cursor
I'm not talking about T-SQL "DECLARE CURSOR" type things here - the cursor location is an ADO concept really I guess - e.g. see http://www.sqlteam.com/item.asp?ItemID=11842
Queries running in QA seem to be more akin to running with a server side cursor (although in a sense I guess the concept perhaps doesn't apply) - it is just one possible source of difference.
Once I've cleared the problem, it doesn't recur till the next time. I regret not doing a detailed profiler trace on it when it was failing last night, and I think I might put a "hidden" option in the app to switch to an infinite timeout, to see if the process will ever finish.
Tony
May 3, 2007 at 2:25 am
It is not clear whether your Delphi apps are local or remote, but since they are scheduled runs could there be occasional clashes with
And maybe you could you recast the jobs as Stored Procedures. With the Delphi apps I inherited I stripped out all the inline SQL, replacing it with calls to SPs. Apart from running at the server end it allows you to alter some procedural logic without recompiling.
May 3, 2007 at 2:44 am
Not sure what you mean by "local or remote" - they're not running on the actual SQL server box, but they're all on machine with low latency network links via a LAN.
I come back to the actual symptoms - run from the app repeatedly, and it fails repeatedly; run it once in QA, works first time, and thereafter it's fine in the app. This pattern recurs for a selection of different scheduled jobs and interactive reports perhaps every couple of weeks over several months. Of course, this could be sheer coincidence, and it could by down to "occasional clashes" of the nature you mention, but in view of these symptoms, I'd put the mortgage on it not being that!
The jobs are all stored procedures.
It's a really strange symptom, and I have to admit I'm fairly amazed that despite fairly extensive searching (on message boards like this, Usenet etc.), I've not come across anyone who reports seeing anything similar.
Tony
May 3, 2007 at 3:05 am
Next time the problem occurs you could try not to run the procedure in QA but run "exec sp_recompile procedure". Then run the procedure from Delphi at least two times. If it works again you might do a recompile now and then...
regards karl
Best regards
karl
May 3, 2007 at 3:15 am
That's an interesting idea, that it might having something to do with the query plan, and maybe it gets implicitly recompiled when using QA for some reason? I guess next time it happens I'll enable recompile events in the profiler. Not sure why the query plan should suddenly deteriorate to such an extent, but certainly something to explore.
Tony
May 3, 2007 at 6:42 am
Tony may be on to something....a typical problem where something works fine, and then slows down over time often points to out of date statistics on the database; this can happen if auto statistics were turned off, and the data is inserted/updated more frequently than the scheduled job updates statistics manually(if that job even exists)
I worked at a place for a while where a new dba came on board, turned off statistics without telling anyone (in theory to improve performance) and all the stored procs failed with timeouts 3 days later, as the statistics went out of date and the query plans were no longer valid. that guy got fired for it, which might have been an overreaction, but noone knew what happened, and the issue was found only after profiling to see where the problem was with performance.
the actual load for updating statistics automatically is minor compared to an application's timing out, that's for sure .
Enterprise Manager>>Right click a database for properties>>Options>>Auto Update Statistics should be checked
Lowell
May 3, 2007 at 10:24 am
Is your application using DTC to manage transactions? We have a problem from time-to-time in which DTC basically refuses to close a transaction and release all of it's locks. We have a somewhat crazy environment here and the source code was written alomst a decade ago, but we find that a transaction is opened and even though the code closes the connection, DTC leaves things open. We are in the process of replacing the application that is doing this to us, so we are not invensting in replacing it, so I cannot give you direction to fixing it.
Specifically, we have found that DTC can create a deadlock that SQL does not recognize as a deadlock (why would it? - it is two different connections.) We were able to get to this point by waiting for it to happen and then stopping and starting the DTC services and the process would complete.
When building a delphi com application, DTC is really easy to use for transaction management, but it seems to cause some strange problems now and then.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply