May 28, 2003 at 4:51 pm
We recently put a Dell 6600 w/ 4 CPU's into production and have had some problems with it. It has locked up solid twice in 3 weeks. There was no response from the OS, keyboard, mouse, screen, etc. We were forced to do a hard boot from the front panel. The system logs show nothing other than "the system shutdown unexpectedly at 4:53pm." The SQL logs show nothing...they just end. The first time it happened, the system logs were corrupt and had to be wiped. This 6600 has Intel Xeon hyperthreading CPUs. I'm wondering if that could be causing the problem. The SQL server(2000 Ent. Edition, SP1) reports 8 CPUs, as does the OS (Windows Advanced Server). The MS documentation says Ent. Edition with SP1 supports HyperThreading.
In a possibly related event, the following query will always deadlock with itself if run on this server:
SELECT
ic.Code,
s.Descr,
Cnt = COUNT(DISTINCT a.Key1)
FROM Industry_Codes ic WITH (NOLOCK)
INNER JOIN Address a WITH (NOLOCK) on ic.Key1 = a.Key1
INNER JOIN Sic s WITH (NOLOCK) ON ic.code = CONVERT(INT,s.sic)
WHERE a.Primary_Flag = 2
AND ic.Code IN
3341,3571,3572,3575,3577,3579,3661,3669,3672,3679,3911,4813,4822,
5044,5045,5065,5093,5734,5932,6221,6512,6514,7349,7359,7373,7376,
7377,7378,7379,8211,8221,8249,9199,9431,9511)
GROUP BY ic.Code, s.Descr
ORDER BY ic.Code
The following error is returned:
Server: Msg 1205, Level 13, State 2, Line 2
Transaction (Process ID 55) was deadlocked on {communication buffer} resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
I think we may just see if we can disable the HyperThreading...Any idea how?
-Dan
Edited by - dj_meier on 05/28/2003 5:06:32 PM
Edited by - dj_meier on 05/28/2003 5:07:20 PM
-Dan
May 29, 2003 at 12:54 am
Email the base table creates and I will try on my hyperthreaded machine..
Chris Kempster
www.chriskempster.com
Author of "SQL Server Backup, Recovery & Troubleshooting"
Author of "SQL Server 2k for the Oracle DBA"
May 29, 2003 at 10:03 am
I'm not sure if having the empty tables will help...I created empty versions of the tables on my machine and the query works fine. On our server, the address table has about 17 million rows and the industry_codes table has about 32 million rows and the sic table has only a few thousand.
The query not working is less problematic than the lock-ups. Does anyone know how to disable the Hyperthreading? Is it in the BIOS?
-Dan
-Dan
June 3, 2003 at 9:46 pm
Hi
Im not sure you can! take a look at the affinity options for CPU alloc in sql server, you can set it up in such a way (from memory here) that you can force the use of the same cpu's and not the hyper threaded ones.
Btw, have a read of this for some general info on hyper threading.
http://www.arstechnica.com/paedia/h/hyperthreading/hyperthreading-1.html
Cheers
Ck
Chris Kempster
www.chriskempster.com
Author of "SQL Server Backup, Recovery & Troubleshooting"
Author of "SQL Server 2k for the Oracle DBA"
June 4, 2003 at 10:07 am
in c:\boot.ini you can specify /NUMPROC=4
this will tell Windows to use the first 4 CPUs which are your "real CPUs"
hth,
joe
June 12, 2003 at 6:49 pm
Thanks for the post Joe...
Cheers
Ck
Chris Kempster
www.chriskempster.com
Author of "SQL Server Backup, Recovery & Troubleshooting"
Author of "SQL Server 2k for the Oracle DBA"
June 12, 2003 at 8:18 pm
Have you tried changing the value for Lock_Timeout sys. var
Also you can refer to "Troubleshooting Deadlocks" on BOL
MW
MW
June 13, 2003 at 12:19 pm
The funny thing is the same query run on a non-hyperthreaded machine has no problems. Our @@LOCK_TIMEOUT is set to -1.
-Dan
-Dan
June 13, 2003 at 6:24 pm
I tried this query, which is similar to yours and the execution plan seem very efficient
I don't know, but you can try the query to see if you get same error msg.
Select a.au_id, c.title_id, CNT = COUNT(DISTINCT a.au_id)
From Pubs..Authors a With (NOLOCK)
INNER JOIN Pubs..TitleAuthor b With (NOLOCK) on a.au_id = b.au_id
INNER JOIN Pubs..Titles c With (NOLOCK) on b.title_id = c.title_id
Where b.au_ord = 2 AND
a.au_id IN('172-32-1176','213-46-8915','238-95-7766','267-41-2394',
'274-80-9391','341-22-1782','409-56-7008','427-17-2319',
'472-27-2349','486-29-1786','527-72-3246','648-92-1872',
'672-71-3249','712-45-1867','722-51-5454','724-08-9931',
'724-80-9391','756-30-7391','807-91-6654','846-92-7186',
'893-72-1158','899-46-2035','998-72-3567')
Group By a.au_id, c.title_id
Order By a.au_id
MW
MW
June 19, 2003 at 1:46 pm
Well....I think we isolated the problem with lock-ups to a hardware problem. We had a Dell service-tech replace the backplane, cpu board, power-distribution board, SCSI card and all the internal SCSI cables. Previous to this, we had a lock-up occur with hyperthreading off. So the lock-ups mose likely were NOT caused by HyperThreading. The query I listed before still causes a deadlock every time after about 3-5 seconds. I'll post any new developments.
-Dan
-Dan
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply