October 27, 2005 at 3:19 pm
We recently had a problem performing inserts on a table via a stored procedure that had been working well for about 10 months. The interesting thing is that about 10 months ago we had the same problem, as described below.
We have multiple applications that call a Windows service, which in turn calls a stored procedure that inserts data into a table that basically acts as a queue. The table isn't extremely active, but it does get about 15-30 inserts a minute.
When we ran into the problem the other day we didn't have any locking or blocking, but we did encounter large I/O wait times, latch waits, and higher CPU usage than we typically see, especially the I/O wait times. As soon as the problem was encountered, all insert attempts by the applications were failing. We stopped and restarted the Windows service thinking it could be having a problem, but that had no effect. The database server was running at only 10% CPU and memory wasn't a problem. In addition, we had ample hard drive space, and the data file was about 500 MB from the threshold at which it would need to have its size increased. The problem was limited to the one table, so we didn't focus on the database beyond that initial space check.
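For reference, these are roughly the checks involved, sketched with standard SQL 2000 commands (the database name is a placeholder; SERVICE_QUEUE is the table whose DDL is posted further down):
-- Cumulative wait statistics since the last reset (SQL 2000)
DBCC SQLPERF(WAITSTATS)
-- What each connection in the database is waiting on right now
SELECT spid, blocked, waittype, waittime, lastwaittype, cpu, physical_io, cmd
FROM master..sysprocesses
WHERE dbid = DB_ID('OurQueueDB')   -- placeholder database name
-- Space used by the queue table (reserved / data / index / unused)
EXEC sp_spaceused 'SERVICE_QUEUE'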
We corrected the issue by doing the same thing we did the last time we had the problem: we created an insert statement similar to the one in the stored procedure and ran it in Query Analyzer. After about 2 minutes it completed, and then everything started working again.
I can't understand how that would magically make it start working, but it's worked twice now. Here are my questions:
1. Why would all of those attempts to insert a record into the table fail? We estimate that over 500 attempts were made by the applications during this time and not one made it into the table. We get that number because the table has an auto-incremented ID column, which we noticed had jumped by over 500 on the first successful entry after the failure (see the quick repro after these questions).
2. Why would successfully inserting a record through Query Analyzer make everything work again? Even the stored procedure being called by the apps started working without any intervention.
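As a side note, the gap itself makes sense to us: an identity value isn't handed back when an insert fails or rolls back, which a quick throwaway test seems to confirm (the temp table is just for illustration, not our schema):
-- Throwaway demo: identity values are not reclaimed on rollback/failure
CREATE TABLE #IdentityGapDemo (ID int IDENTITY(1,1), Val int)
BEGIN TRAN
INSERT INTO #IdentityGapDemo (Val) VALUES (1)   -- takes ID = 1
ROLLBACK TRAN                                   -- the row goes away, the identity value does not come back
INSERT INTO #IdentityGapDemo (Val) VALUES (2)   -- gets ID = 2, leaving a gap at 1
SELECT ID, Val FROM #IdentityGapDemo
DROP TABLE #IdentityGapDemo
So the jump of 500+ at least suggests that roughly that many attempts got as far as generating an identity value, even though none of them stuck.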
I would be happy to supply more information if anyone is willing to provide some insight to this matter.
We're on SQL 2000 SP3a. The table in question currently has about 350,000 rows and is about 4.6 GB in size.
I'd appreciate any thoughts.
Todd
October 27, 2005 at 4:04 pm
What was the error encountered when the inserts failed?
October 28, 2005 at 6:31 am
Could this be related to an index with a high fill factor? If an index is created with a high fill factor, say 90%, but you have a lot of inserts into the table, won't the pages reserved for inserted data eventually fill to the point where the index needs to be rebuilt? I would suspect that the table would be locked while that operation is occurring, and then of course the problem goes away/automatically corrects itself after SQL Server has rebuilt the index.
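If you want to see whether the indexes are actually getting beat up, DBCC SHOWCONTIG would show it; something like this (substitute your table name):
-- Rough check for fragmentation on the table and all of its indexes (SQL 2000)
DBCC SHOWCONTIG ('YourQueueTable') WITH ALL_INDEXES
-- low "Scan Density" or high "Logical Scan Fragmentation" would point at heavy page splitting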
As for why it works in QA and not from a web page/application command? I'd bet that a page times out after 30 seconds of waiting, but QA waits forever for the operation to complete.
just a guess, but HTH
Lowell
October 28, 2005 at 8:59 am
Lowell,
Thanks for the feedback
We do have a couple of indexes on the table, and they are rebuilt on a weekly basis, so I thought they might somehow play into this. But I didn't think indexes would just rebuild themselves without being explicitly told to via reindexing commands. Is that an incorrect assumption?
As far as the application having problems, I agree that it could be a timeout issue, but the timeout is set to 90 seconds, and it still doesn't explain why having a transaction complete, even if it was through QA, would cause everything to start working fine again. That's probably the most perplexing issue for me at this point.
October 28, 2005 at 9:49 am
I am confused! You are guessing it is timing out; is there no log or runtime error being thrown? For all you know, you are inserting duplicate records and a PK violation is being thrown.
October 28, 2005 at 10:00 am
He didn't specifically mention any errors, so I jumped to the conclusion that it must be a hardware issue. Kory is right: are you getting any errors back, before we assume something more exotic?
An index (or multiple indexes) on a 4.6 GB table would be pretty big, and if it caused a page split on the table so that it could organize/structure the data for availability (due to the high insert volume), I figured that was a reasonable guess as to why he might have to wait a long time for the server to respond again.
Lowell
October 28, 2005 at 10:11 am
How large is the table (# of rows)? Could you post the DDL for the table and indexes? It is possible that the index(es) were 'splitting' ...
Regards,
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."
October 28, 2005 at 10:21 am
Kory, we aren't getting any errors thrown, at least none that are being noticed or captured that I am aware of.
The table has about 350,000 rows.
Here's the DDL for the table and indexes:
CREATE TABLE [dbo].[SERVICE_QUEUE] (
[Service_Queue_ID] [numeric](18, 0) IDENTITY (1, 1) NOT NULL ,
[Data1] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Data2] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[VBComponent] [varchar] (256) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResponseText] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CurrentStatus] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DateAdded] [datetime] NULL ,
[AddedBy] [numeric](18, 0) NULL ,
[DateCompleted] [datetime] NULL ,
[DateProcessed] [datetime] NULL ,
[MachineName] [varchar] (128) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[UUID] [uniqueidentifier] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
CREATE INDEX [IX_SERVICE_QUEUE] ON [dbo].[SERVICE_QUEUE]([Service_Queue_ID], [CurrentStatus]) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
CREATE INDEX [IX_CurrentStatus] ON [dbo].[SERVICE_QUEUE]([CurrentStatus]) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
CREATE INDEX [IX_UUID] ON [dbo].[SERVICE_QUEUE]([UUID]) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
October 28, 2005 at 11:34 am
Kory,
I just wanted to add some info about the other part of your response. The primary key field is an incremented identity field, and none of our indexes enforce uniqueness, so a PK violation shouldn't be a problem, should it?
October 28, 2005 at 11:55 am
All, my PK violation was just an example to get us thinking! I think we should take a moment and step through this, and not get ahead of ourselves by guessing what may be taking place with the table (which doesn't mean it is not the table).
Let's establish some facts:
1) We have applications calling a service.
2) The service calls a stored procedure.
3) No blocking locks are present.
4) No lingering locks are present during an insert, but we do see some I/O wait at times.
5) Your quote: "As soon as the problem was encountered, all insert attempts by the applications were failing" - or maybe they were never attempted.
6) We have the table structure and a lot of info about what may be happening with the table.
7) The table contains 350,000 rows. It is small as far as row count, but each row can be large due to the 3 text data types (see the quick check after this list).
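If you want to see where that 4.6 GB actually lives, a quick (if slow) sanity check on the text columns would be something along these lines:
-- How big the text payloads actually are on average (this scans all 350,000 rows, so run it off-hours)
SELECT COUNT(*) AS TotalRows,
       AVG(DATALENGTH(Data1)) AS AvgData1Bytes,
       AVG(DATALENGTH(Data2)) AS AvgData2Bytes,
       AVG(DATALENGTH(ResponseText)) AS AvgResponseBytes
FROM SERVICE_QUEUE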
Questions:
How do we know the inserts are failing, or even being attempted? It may be the service calling the proc that's at fault! Do you have a log for the service?
If we know the service is calling the proc without flaw, then why are the inserts failing? We should get an error: deadlock, PK violation, FK violation, invalid syntax, something (assuming you are testing for and raising errors via @@ERROR in the proc; see the sketch at the end of this post).
Let's see the proc.
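For clarity, the kind of check I mean inside the proc would look roughly like this (the message text is just an example, and the INSERT itself is elided):
-- Example only: capture @@ERROR immediately after the INSERT and surface it to the caller
DECLARE @Err int
-- INSERT INTO Service_Queue (...) VALUES (...)
SELECT @Err = @@ERROR
IF @Err <> 0
BEGIN
    RAISERROR ('Insert into SERVICE_QUEUE failed with error %d', 16, 1, @Err)
    RETURN @Err
END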
October 28, 2005 at 12:24 pm
Index splitting, anybody?
Regards,
Rudy Komacsar
Senior Database Administrator
"Ave Caesar! - Morituri te salutamus."
October 28, 2005 at 2:30 pm
Rudy,
Could you elaborate on that? When an index page splits, my understanding is that it only creates one additional page at a time and only moves some of the rows from the page that's "full" to the newly created page. It doesn't physically move all the data in the entire index around on disk.
Kory,
I have attached the sp code below. Sadly, the service has no logs available, but I know that attempts were being made, because our monitoring solution was able to capture that this sp had been attempted many times after the incident started.
CREATE Procedure sp_ADD_SERVICE_QUEUE
(
@Data1 text,
@Data2 text,
@VBComponent varchar (256),
@AddedBy numeric,
@Service_Queue_ID numeric OUTPUT,
@UUID uniqueidentifier = null
)
As
declare @LastError int
INSERT INTO
Service_Queue
(
Data1,
Data2,
VBComponent,
AddedBy,
DateAdded,
UUID
)
values
(
@Data1,
@Data2,
@VBComponent,
@AddedBy,
GetDate(),
@UUID
)
select @LastError = @@ERROR
If @LastError = 0
SELECT @Service_Queue_ID = @@IDENTITY
Return @LastError
GO
October 28, 2005 at 2:51 pm
If there are no blocking locks or lingering locks (during insert/indexing), what is happening? It took two minutes to run this procedure manually? Is the database option 'auto close' set, and the service is timing out before the database can open (long shot)?
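(If you want to double-check that setting, something like this would do it; the database name is a placeholder:)
-- Returns 1 if AUTO_CLOSE is on, 0 if it is off
SELECT DATABASEPROPERTYEX('YourDatabase', 'IsAutoClose') AS IsAutoClose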
Questions about the service:
Does it process the inserts to the queue serially?
What is the timeout?
What I'd hope for in how the service uses the procedure code below: per process, log success/failure; a) success, with the output variable (identity) and return status (obviously 0 for no error); b) failure, with a timeout/runtime error or with a non-zero return status.
select @LastError = @@ERROR
If @LastError = 0
SELECT @Service_Queue_ID = @@IDENTITY
Return @LastError
October 28, 2005 at 3:04 pm
We don't have auto close set on any of our databases, so it's not that.
Yeah, the inserts are processed by the service serially. The timeout is set to 90 seconds.
October 28, 2005 at 3:46 pm
My issue is that, based on all the previous info about the table, I don't see an insert taking more than 90 seconds (which would cause the service to time out).
You're sure the service does nothing with the procedure's output value or return value below?
select @LastError = @@ERROR
If @LastError = 0
SELECT @Service_Queue_ID = @@IDENTITY
Return @LastError
The only thing I can think of would be to add logging to the service with the proc call info, start time, end time, output parameter, and return value. This would definitely allow you to monitor the loading of the queue.
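Roughly the sort of thing I mean, sketched in T-SQL (the log table and the sample parameter values are made up; in practice the service itself would record this):
-- Illustrative only: call the proc, capture the return status and the output parameter, and log both
-- SERVICE_QUEUE_CALL_LOG is a hypothetical table, not something that exists today
DECLARE @rc int, @NewID numeric(18,0), @Start datetime
SET @Start = GETDATE()
EXEC @rc = sp_ADD_SERVICE_QUEUE
     @Data1            = 'sample payload 1',
     @Data2            = 'sample payload 2',
     @VBComponent      = 'SomeComponent',
     @AddedBy          = 1,
     @Service_Queue_ID = @NewID OUTPUT,
     @UUID             = NULL
INSERT INTO SERVICE_QUEUE_CALL_LOG (StartTime, EndTime, ReturnStatus, NewQueueID)
VALUES (@Start, GETDATE(), @rc, @NewID)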