August 15, 2006 at 11:39 am
I've got a 4-CPU Xeon with HT turned on, showing up as 8 CPUs according to Windows. SS2K Std has SP4 applied. Max DOP defaulted to 4, but I kept getting CXPACKET waits lasting hundreds of thousands of milliseconds. According to BOL, turning down Max DOP can help with this, so I set it to 2. I'm still getting the CX waits, not quite as bad, but still enough to cause problems for users.
Would turning off HT help, or what other suggestions does anyone have? I thought SS2K SP4 was supposed to recognize that HT procs were present and utilize them appropriately, so I haven't tried that yet. Also, it's a major change process to get the server guys to switch HT off, which is a serious hassle, so I'm trying to avoid it. It's not a casual test.
PS - The CXPACKET waits are almost always accompanied by PAGEIOLATCH-type waits, although those only last a few dozen ms at a time. Why do the CXPACKETs last so dang long?
Thanks for any help
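For reference, here's a minimal sketch of how the server-wide setting change mentioned above is typically made on SQL Server 2000 (the value 2 is just the example from this post):

```sql
-- 'max degree of parallelism' is an advanced option, so expose it first
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
-- cap parallel plans at 2 worker threads per parallel operator, server-wide
EXEC sp_configure 'max degree of parallelism', 2
RECONFIGURE
GO
```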
August 16, 2006 at 8:29 am
I expect you'll get shed loads of posts about applying maxdop in the server setup. I'd suggest you let your server use all procs - by capping max degree of parallelism you are restricting the multithreading of your index builds, DBCC commands and lots of other SQL things.
I have worked with multiproc boxes all my years with SQL Server ( from 6.0 ), including 8-way boxes with HT ( = 16 procs ), and to date I've never made this setting on a box. OK, sermon over!!
Your CXPACKET waits are bad news and show you have queries for which the optimiser decides a parallel plan is best - if the query doesn't actually benefit ( basically because it is a poor query ) then the threads block each other, and that's bad news.
To solve your problem, find out which procs/SQL are causing the parallelism and optimise them to remove it or make it work correctly. SELECT * table scans on large tables, missing indexes, poorly written views, inappropriate use of functions - all these are likely candidates.
Once you've isolated your problem queries - if you can't resolve the parallelism, add the MAXDOP hint to that query ONLY - this avoids crippling your whole server.
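As a minimal sketch of the per-query hint (using the table and column names from the query posted later in this thread, with a made-up WHERE clause):

```sql
-- OPTION (MAXDOP 1) suppresses parallelism for this one statement only;
-- the server-wide 'max degree of parallelism' setting is left untouched
SELECT sales_office_id, invoice_id
FROM data_warehouse
WHERE invoice_dt BETWEEN '19991201' AND '20000201'  -- hypothetical criteria
OPTION (MAXDOP 1)
```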
The best analogy I can suggest is removing the spark plugs from six of the eight cylinders of your V8 car so it can't exceed the 30 mph speed limit in a built-up area ( sorry, this is a UK thing ) - you wouldn't do this, would you?
I never had HT problems in w2000, and yes, I'm told w2k3 supports HT much better. I've had CXPACKET problems but I've always tuned them out of the queries, not crippled the server.
Oh yes, you can use Profiler to capture the parallel plan stuff too. I have a process which I run on the server to capture the information - too large to post, sorry.
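For a quick look without Profiler, a sketch like this (valid on SQL Server 2000) shows which sessions most recently waited on a parallel exchange:

```sql
-- sessions whose most recent wait was a CXPACKET (parallel exchange) wait
SELECT spid, lastwaittype, waittime, cmd
FROM master..sysprocesses
WHERE lastwaittype = 'CXPACKET'
```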
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
August 16, 2006 at 8:30 am
Oh yes, I was going to add: as you've discovered, altering the server setting doesn't make it go away ( unless you set it to 1 ), which shows it is code-related, not hardware-related.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
August 16, 2006 at 9:34 am
Thanks for the thoughts; I should have mentioned the query. It doesn't seem poorly written, and the plan shows an index seek on every table. It looks like this (simplified):
Select sales_office_id, invoice_id from data_warehouse
where (date range, order type, misc criteria)
and not exists ( select sales_office_id, invoice_id from data_warehouse_stage_valid where (date range, order type, misc criteria))
and not exists (select sales_office_id, invoice_id from data_warehouse_stage_rejects)
Data_warehouse has 15M rows, data_warehouse_stage_valid has 3M rows, and data_warehouse_stage_rejects has 200K rows. All three are index seeks according to the plan. The clustered indexes are already claimed by other columns that serve more important considerations.
It's an attempt to find sales_office_id and invoice_id that have mysteriously disappeared from the sales office feeds. This happens a lot with our source systems, alas we can't do anything about that from here. We need to identify these records for the data analysts to track down what happened.
Is there some better way to write this than NOT EXISTS (subquery)? There are multiple columns, so the simpler NOT IN doesn't work.
August 16, 2006 at 9:50 am
You can try loading the results of the outer SELECT (without the subqueries) into a temp table:
INSERT #TempTbl(sales_office_id, invoice_id)
Select sales_office_id, invoice_id from data_warehouse
where (date range, order type, misc criteria)
Then remove the existing records from the temp table:
DELETE A FROM #TempTbl A INNER JOIN data_warehouse_stage_valid B ON ...
DELETE A FROM #TempTbl A INNER JOIN data_warehouse_stage_rejects B ON ...
Now the temp table has your results.
August 16, 2006 at 3:10 pm
This is the query which shows parallelism in the query plan? You should be able to see it. Just add a MAXDOP 1 hint to the query; this will stop the parallelism. Do a comparison to see what happens. What tends to happen with parallelism ( when it goes wrong ) is this: say your query does 100K I/O; if the plan splits 8 ways you can end up with 8 x 100K I/O, which is a real nightmare.
NOT EXISTS is a real pain - I can't give a blanket answer to resolving it, but you have to find a way around the coding.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
August 16, 2006 at 6:45 pm
NOT IN is also bad. Perhaps trying a LEFT JOIN and testing for NULL in the WHERE clause would be faster than using NOT EXISTS. If there are a lot of columns in the table and the columns required are not covered by an index, then derived tables or temp tables can reduce the IO. A GROUP BY or DISTINCT is not required to prevent duplicates - you're looking for no matches. If there are a lot of duplicates in the derived table, then perhaps a GROUP BY would reduce IO. You can join back to the original tables to get the other columns. I've seen query times cut from several hours to a few minutes doing stuff like this. It can take a lot of trial and error to get good results.
I was looking at some CXPACKET waits from a DBCC DBREINDEX in our maintenance plan. Is there any way to MAXDOP this? We have a dual-CPU box with HT enabled. I set the global max DOP to 2, but it appears the active threads can still end up on the same CPU. I still see CXPACKET waits, but not on as many threads. Would I still see the CXPACKET waits if we had 4 physical processors for the 4 logical processors?
Randy
RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉
August 17, 2006 at 7:38 am
You mention that the query is doing index seeks, but what kinds of joins is it doing: Loop, Hash, or Merge? If it is doing a loop join and returning a relatively large result set, that could be the problem. I would also look at the IO stats (SET STATISTICS IO ON) and see which tables are getting the most logical reads. Try the different join hints to force loop, hash, and merge joins.
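As a minimal sketch of the two ways to hint a join in T-SQL (using hypothetical two-column tables):

```sql
-- compare logical reads between variants
SET STATISTICS IO ON

-- force a hash join on this one join only
SELECT w.office, w.order_id
FROM #w w
INNER HASH JOIN #s s
    ON w.office = s.office AND w.order_id = s.order_id

-- alternatively, hint every join in the statement at once
SELECT w.office, w.order_id
FROM #w w
JOIN #s s
    ON w.office = s.office AND w.order_id = s.order_id
OPTION (LOOP JOIN)
```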
Tom
August 17, 2006 at 3:43 pm
I finally figured out a rewrite using LEFT JOIN instead of NOT EXISTS, and it runs in much less time. If anyone is interested, here's the proof-of-concept query I used before fiddling with the real thing:
create table #w (office int, order_id char(1), order_dt datetime)
create table #s (office int, order_id char(1), order_dt datetime)
create table #r (office int, order_id char(1), order_dt datetime)

insert #w (office, order_id, order_dt) values (1,'a','1/1/2000')
insert #w (office, order_id, order_dt) values (1,'b','1/1/2000')
insert #w (office, order_id, order_dt) values (1,'c','1/1/2000')
insert #w (office, order_id, order_dt) values (1,'d','1/1/2000')
insert #s (office, order_id, order_dt) values (1,'a','2/1/2000')
insert #s (office, order_id, order_dt) values (1,'b','2/1/2000')
insert #r (office, order_id, order_dt) values (1,'c','2/1/2000')

select w.office, w.order_id
from #w w
left join #r r on w.office = r.office and w.order_id = r.order_id and r.order_dt >= '2/1/2000'
left join #s s on w.office = s.office and s.order_id = w.order_id and s.order_dt >= '2/1/2000'
where r.office is null and s.office is null
and w.order_dt between '12/1/1999' and '2/1/2000'

drop table #w
drop table #s
drop table #r
August 18, 2006 at 8:00 am
With regard to DBCC and parallelism - Enterprise Edition is optimised to use multiple threads for DBCC commands; limiting the procs is akin to going back to Standard Edition.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
August 18, 2006 at 8:24 am
I actually am stuck with std ed. The 2GB RAM limit is brutal.
August 18, 2006 at 10:21 am
Magarity, I'm curious. Can you tell us the actual performance improvement?
RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉
August 18, 2006 at 11:49 am
5 minutes cut down to 1 minute on my test server with less data. No reason not to hope for a similar ratio on production, but it has to wait in line for the QA people before getting deployed, so I won't know for sure for a while yet...
August 19, 2006 at 9:10 am
btw HT is allegedly correctly supported in w2k3, although I never had problems with w2k.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
August 19, 2006 at 10:27 am
On the point about HT, you may find the following link interesting. It's to one of the SQL Server team's blogs, where he discusses whether or not to use HT.
http://blogs.msdn.com/slavao/archive/2005/11/12/492119.aspx
David