February 19, 2014 at 8:59 am
our production server has 2 numa nodes with 8 core each. MAXDOP is set to default, which is 0.
A SQL Agent job that extracts 13m records and then load to other tables constantly run over 5-8 hours.
Strange Scenario 1:
We take a snapshot of the production databases and restore to a much much smaller box (not NUMA), it only runs 40 minutes - the same codes against the same data with the same statistics!
Strange Scenario 2:
Once a week, the antivirus scan (Trend Micro officescan) will wake up on this production server (I know we probably shouldn't do this, but anyways) at approximately the same time as the job, and when the scan runs, the job surprisingly takes only 40 minutes. It's counter-intuitive because you would think when there is additional load on the server, the job should run slower!
I have experimented this:
added a hint to the extract query - (option maxdop 1), the query runs much faster (from 32 minutes to 12 minutes).
I have read some articles about NUMA, and strange things can happen....
but can anyone explain what is going on here?
Thank you!
February 20, 2014 at 9:25 am
For diagnosis, try a few runs with different query hints - I'd usually try MAXDOP 1, 2, 4, and 8, then perhaps even 5 (just enough to guarantee crossing nodes).
Also, exactly how many and which model CPU's do you have? I'm guessing at least some of your "cores" are actually just hyperthreads, but I don't know how old they are.
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply