December 17, 2009 at 12:16 am
HI,
Our Production DB is in SQl 2000 Sp4 , recently we had numerious performance issues , we use to get process id error from spid 4 (lock monitor) , I have increased the memory from 12 to 16 gb for sql , but still i noticed few pagelatch_sh and application timeouts, the following is the data from SQlperf , please post your thoughts.
MISCELLANEOUS
848.00.00.0LCK_M_SCH_S
0.00.00.0LCK_M_SCH_M
0.00.00.0LCK_M_S
8592.04.8018608E+71379.0LCK_M_U
571.031487.016.0LCK_M_X
157.056499.020000.0LCK_M_IS
5982.06.3662284E+715210.0LCK_M_IU
5.019264.00.0LCK_M_IX
714.06128549.0379.0LCK_M_SIU
0.00.00.0LCK_M_SIX
0.00.00.0LCK_M_UIX
0.00.00.0LCK_M_BU
0.00.00.0LCK_M_RS_S
0.00.00.0LCK_M_RS_U
0.00.00.0LCK_M_RIn_NL
0.00.00.0LCK_M_RIn_S
0.00.00.0LCK_M_RIn_U
0.00.00.0LCK_M_RIn_X
0.00.00.0LCK_M_RX_S
0.00.00.0LCK_M_RX_U
0.00.00.0LCK_M_RX_X
0.00.00.0SLEEP
163718.09.4931856E+79.392788E+7IO_COMPLETION
31212.0359625.058841.0ASYNC_IO_COMPLETION
26.01921.00.0RESOURCE_SEMAPHORE
0.00.00.0DTC
0.00.00.0OLEDB
74705.01.6345974E+93.7003581E+9FAILPOINT
0.00.00.0RESOURCE_QUEUE
901846.01.9207293E+84.8216068E+7ASYNC_DISKPOOL_LOCK
605.00.00.0UMS_THREAD
0.00.00.0PIPELINE_INDEX_STAT
737.024204.04093.0PIPELINE_LOG
0.00.00.0PIPELINE_VLM
0.00.00.0WRITELOG
511603.07539372.038251.0LOGBUFFER
1691.0983766.0625.0PSS_CHILD
0.00.00.0EXCHANGE
5599.07441.0532.0XCB
0.00.00.0DBTABLE
2.017328.00.0EC
0.00.00.0TEMPOBJ
0.00.00.0XACTLOCKINFO
0.00.00.0LOGMGR
0.00.00.0CMEMTHREAD
3415678.0805722.0104205.0CXPACKET
2964140.08.2519504E+7377315.0PAGESUPP
281440.0240487.027569.0SHUTDOWN
0.00.00.0WAITFOR
0.00.00.0CURSOR
0.00.00.0EXECSYNC
132.00.00.0LATCH_NL
0.00.00.0LATCH_KP
0.00.00.0LATCH_SH
90.048.016.0LATCH_UP
13013.029995.03321.0LATCH_EX
4.0914988E+72.7929955E+86408823.0LATCH_DT
0.00.00.0PAGELATCH_NL
0.00.00.0PAGELATCH_KP
50565.061519.057946.0PAGELATCH_SH
2189192.02161577.0790007.0PAGELATCH_UP
6449710.06.4527844E+72227482.0PAGELATCH_EX
582509.0132978.070621.0PAGELATCH_DT
0.00.00.0PAGEIOLATCH_NL
0.00.00.0PAGEIOLATCH_KP
0.00.00.0PAGEIOLATCH_SH
7616121.04.1792989E+8202022.0PAGEIOLATCH_UP
876.036066.062.0PAGEIOLATCH_EX
122434.03279873.03941.0PAGEIOLATCH_DT
0.00.00.0TRAN_MARK_NL
0.00.00.0TRAN_MARK_KP
0.00.00.0TRAN_MARK_SH
0.00.00.0TRAN_MARK_UP
0.00.00.0TRAN_MARK_EX
0.00.00.0TRAN_MARK_DT
0.00.00.0NETWORKIO
240008.0148651.00.0Total
6.6549508E+72.8996262E+93.8529142E+9
December 17, 2009 at 11:10 am
Our Production DB is in SQl 2000 Sp4 , recently we had numerious performance issues , we use to get process id error from spid 4 (lock monitor) , I have increased the memory from 12 to 16 gb for sql , but still i noticed few pagelatch_sh and application timeouts, the following is the data from SQlperf , please post your thoughts.
What's error r u getting?Can u post that.
What's the version of sql/OS.
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
December 19, 2009 at 7:34 am
Error 1229: Process ID %d:%d owns resources that are blocking processes on scheduler %d.
I have refered the following article but its of no use , currently i have disabled the error by enabling the traceflags 1261
http://support.microsoft.com/kb/319892
When ever the above issue occurs clients are getting disconnected and the DB is not accessable by clients .
Steps taken:
Increased sql memory from 12 gb to 16 GB
Increased the Parallel query execution from 5 to 8
This is SQl 2000 Sp4 . and OS is windows 2003 with latesh sp's
I tried full trace to capture it but its huge , i have noticed latch_ex and a particualr delete statement is causing the lock and its clearing on its own .
Any advise on the issue is greatly appreciated.
December 19, 2009 at 10:52 am
sqlnsg (12/19/2009)
Error 1229: Process ID %d:%d owns resources that are blocking processes on scheduler %d.I have refered the following article but its of no use , currently i have disabled the error by enabling the traceflags 1261
Did u ckcek this higlighted lines.
The new error messages include the following series.
Extended Lock Detection: 1223, 1229
Scheduler Hung Detection: 17881, 17883
All Schedulers Hung Detection: 17882, 17884
Warning Note that a health related problem is often the result of a condition that was experienced previously. You must study the SQL Server error log and the system event logs carefully to determine the actual root cause.
For example, a 17883 error message may indicate a scheduler problem. However, the error log may show a previous exception that incorrectly left the SQL Server process in a poor state, or the application might have caused a severe blocking condition.
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
December 19, 2009 at 11:09 am
Hi,
Run the following query & find the blocking,opentran...
select open_tran,* from master.dbo.sysprocesses where spid >=51 and blocked<>0
Steps taken:
Increased the Parallel query execution from 5 to 8
Did u adjust the maxdop setting?
Chek the following URL might be help
http://www.sqlnewsgroups.net/sqlserver/t159-sp3a-concurrency-errors.aspx
Increased sql memory from 12 gb to 16 GB
AWE enabled on the box
This is SQl 2000 Sp4 . and OS is windows 2003 with latesh sp's
post the sql version (OS also)
select @@version
Edited:For add the url
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
December 21, 2009 at 11:19 pm
The following is the sql 2000 version
Microsoft SQL Server 2000 - 8.00.2282 (Intel X86) Dec 30 2008 02:22:41 Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 2)
And AWE is enabled .
There is no specific error messages , I use to get the following message in sql logs, When is verify the spid with Dbcc input buffer (No Event 0NULL) some times i get Delete statement .
Error: 17805, Severity: 20, State: 3 Invalid buffer received from client.
Apart from this we get sql agent warning messages in applog and no errors on Syslogs .
Thanks for your time and feedback.
December 22, 2009 at 2:16 am
sqlnsg (12/21/2009)
The following is the sql 2000 versionMicrosoft SQL Server 2000 - 8.00.2282 (Intel X86) Dec 30 2008 02:22:41 Copyright (c) 1988-2003 Microsoft Corporation Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 2)
And AWE is enabled .
There is no specific error messages , I use to get the following message in sql logs, When is verify the spid with Dbcc input buffer (No Event 0NULL) some times i get Delete statement .
Error: 17805, Severity: 20, State: 3 Invalid buffer received from client.
Apart from this we get sql agent warning messages in applog and no errors on Syslogs .
Thanks for your time and feedback.
Check the MS link Error: 17805, Severity: 20, State: 3
Also,Is there any N/W problem
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
December 22, 2009 at 2:26 am
Hi,
There is no specific error messages , I use to get the following message in sql logs, When is verify the spid with Dbcc input buffer (No Event 0 NULL) some times i get Delete statement .
Did u check the delete statement ?
select open_tran,* from master.dbo.sysprocesses where spid >=51 and blocked<>0
select open_tran,* from master.dbo.sysprocesses where spid >=51 and blocked <> 0
Apart from this we get sql agent warning messages in applog and no errors on Syslogs .
Also double check ur windows error log
Run-->Eventvwr.msc
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
January 4, 2010 at 4:40 am
Hi,
Happy new year ! I have run the select statementand uploaded the output.txt , PAGEIOLATCH_SH and PARALLEL_PAGE_SUPPLIER (43D71784) , recently we have upgraded the storage firmware as well to overcome any i/o related issues , increased sql memory as well, some times during peak hours we have db un-availability issues , appreciate your feedback on this ouput .
Best regards
January 5, 2010 at 10:14 am
Please check the following:
1. You have to check the IO of the database you are using most. -- try to load balance.
2. What time is this issue happening. if its specific time monitor on that time closely. --could be other activity might be happening during that time (maintainance job/other)
3. From how long is it happening? any specific changes happened --db/application/number of user?
4. Check the application is any changes into that?
5. Check the queries which are taking longer time/resource. try to tune it. -- include (nolock) and check execution plan.
6. Include H/W and network team to check the system during peak/impacted time.
7. Lastly... as it looks like huge database, plan to upgrade to sql server 2005/2008 🙂
My understanding is:
PAGEIOLATCH_SH :generally due to IO and
Latch_EX: poor query.
http://www.mombu.com/microsoft/sql-server/t-long-waits-on-latch-ex-and-cxpacket-294445.html
Generally both should not impact much.
HTH
Vinay
Thanx.
Vinay
http://rdbmsexperts.com/Blogs/
http://vinay-thakur.spaces.live.com/
http://twitter.com/ThakurVinay
January 6, 2010 at 8:19 am
sqlnsg (1/4/2010)
Hi,Happy new year ! I have run the select statementand uploaded the output.txt , PAGEIOLATCH_SH and PARALLEL_PAGE_SUPPLIER (43D71784) , recently we have upgraded the storage firmware as well to overcome any i/o related issues , increased sql memory as well, some times during peak hours we have db un-availability issues , appreciate your feedback on this ouput .
Best regards
Hi,
Thanks same to u.
I saw ur attachment and please check what Vinay said
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
January 6, 2010 at 8:22 am
For example, the following conditions may occur:
1. SPID 55 wants to read a data page that does not exist in the buffer pool.
2. SPID 55 acquires an EX latch on the page. Because the page does not exist yet in memory, the requested latch mode is EX. The EX latch mode forces other SPIDs that may also want to access the page to wait for the I/O request to finish. The EX latch mode also prevents other SPIDs from issuing a duplicate I/O request for the same page.
3. SPID 55 issues the I/O request to read the page from disk.
4. Because SPID 55 wants to read the page, SPID 55 must wait for the I/O request to finish. To wait for the I/O request to finish, SPID 55 tries to acquire another latch that has the shared (SH) latch mode on the same page. Because an EX latch has already been acquired, the SH latch request is blocked, and the SPID is suspended. Because the EX latch that blocks the SH latch request was also acquired by SPID 55, the SPID is temporarily reported as blocking itself.
5. When the I/O request finishes, the EX latch on the page is released.
6. The release of the EX latch gives the SH latch to SPID 55.
7. SPID 55 can now read the page.
Between step 4 and step 5, the sysprocesses table indicates that SPID 55 is blocked by itself together with a waittype of PAGEIOLATCH_XX. In this waittype, XX may be SH, UP, or EX. This behavior indicates that SPID 55 issued an I/O request and SPID 55 is waiting for the I/O request to finish.
http://support.microsoft.com/kb/906344
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
January 6, 2010 at 8:34 am
recently we have upgraded the storage firmware as well to overcome any i/o related issues , increased sql memory as well, some times during peak hours we have db un-availability issues , appreciate your feedback on this ouput .
What do u mean db un-availability?Is there any corruption.
what is the disk configuration r u using?
http://www.sqlnewsgroups.net/group/microsoft.public.sqlserver.server/topic4167.aspx%5B/url%5D
I'd recommended run the perfmon.msc during the peak hours and collect the data and analyze it.
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
January 6, 2010 at 8:39 am
sqlnsg increased sql memory as well
Check the server memory using the following query
select * from master.dbo.sysperfinfo
where counter_name like 'Target Server Memory (KB)%'
or counter_name like 'Total Server Memory (KB)%'
Muthukkumaran Kaliyamoorthy
https://www.sqlserverblogforum.com/
January 8, 2010 at 4:17 am
Thanks a lot for all your inputs and feedback. Ok basically the issue here is SQl dose'nt respond to client request in particulat , its diffcult to capture the trace as its a heavy transactions , Client is reporting slow response , some sometime timeout issue , . I am attaching fe of the perfmon counter snapshot for your reference . And yester we received the following I/O error as well.
SQL Server has encountered 32 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [G:\Program Files\Microsoft SQL Server\MSSQL\data\tempdb.mdf] in database [tempdb] (2). The OS file handle is 0x00000700. The offset of the latest long IO is: 0x000000af9b2000
Request to post your feedback. Appreciate all your time .
Best regards,
SQLNSG
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply