January 10, 2013 at 2:17 pm
Hi,
I have a curious case here: there is a SQL Query that is causing 90% of CPU on one of the SQL servers also it run in 14 seconds, on other server with similar specs the query executes in less than a second and no cpu spikes also the server configurations are the same:
What can be happening here?
select distinct
spid = cast(p.spid as int),
program = p.program_name,
host = p.hostname,
login = p.loginame,
databasename = replace(db_name(ISNULL(convert(int,l.rsc_dbid),-1)),char(39),char(39)+char(39)),
object = ISNULL(l.rsc_objid,-1),
request_mode = substring (v.name, 1, 8),
blockedwait = b.waittime,
inputbuffer = case when isnull(p.net_address, '') = '' then null else (select case when dmv.objectid is null then left(max(text),4000) else null end from sys.dm_exec_sql_text(p.sql_handle) dmv group by dmv.objectid) end,
starttime = dateadd(ss,cast(b.waittime as float)/1000 * -1,getutcdate())
from
master..sysprocesses p (nolock)
left join master..sysprocesses b (nolock)
on p.spid = b.blocked
left outer join master..syslockinfo l (nolock)
on b.spid = l.req_spid
and l.req_status in (2,3)
left join master.dbo.spt_values v
on convert(int,l.[req_mode]) + 1 = v.number
where
p.blocked = 0
and p.program_name not like 'SQLAgent%'
and p.program_name not in ('SQL PerfMon')
and b.blocked <> 0
and b.waittime > 166000
and v.type = 'L'
Thanks for your help
January 10, 2013 at 7:05 pm
Check if actual execution plans are different. Are those virtual machines?
January 10, 2013 at 8:12 pm
Please provide the actual execution plan for both servers for the query.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 10, 2013 at 9:06 pm
How can I send the execution plan on an xml format?
Thanks
January 10, 2013 at 9:13 pm
See the first link in my signature. The instructions for posting the exec plan are there.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 10, 2013 at 11:54 pm
APA0876 (1/10/2013)
there is a SQL Query that is causing 90% of CPU on one of the SQL servers also it run in 14 seconds, on other server with similar specs the query executes in less than a second and no cpu spikes also the server configurations are the same:
did you run the query at the same time?
are both(or other servers ) returning same records for this query ?
do both servers have same load (like job or background prcosses)
do they have same equal no of connections (at the time of query) ?
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
January 11, 2013 at 8:10 am
Bhuvnesh (1/10/2013)
APA0876 (1/10/2013)
there is a SQL Query that is causing 90% of CPU on one of the SQL servers also it run in 14 seconds, on other server with similar specs the query executes in less than a second and no cpu spikes also the server configurations are the same:did you run the query at the same time?
are both(or other servers ) returning same records for this query ?
do both servers have same load (like job or background prcosses)
do they have same equal no of connections (at the time of query) ?
Yes I run the query at the same time
Yes 0 records
different Load the server with the spikes is 20 % avg CPU and the other 5% Avg CPU
On the server having the CPU spike there are 1700 connections and the other 10 connections.
But what should I do, this query is exected every 6 minutes because is part of a diagnostic Tool.
Thanks
Ailyn
January 11, 2013 at 8:18 am
I would completly re-write it all, for one all the tables used are depreciated and are in the product for SQL 2000 compatability only, you really should be using the DMV's.
January 11, 2013 at 8:18 am
APA0876 (1/11/2013)
Bhuvnesh (1/10/2013)
APA0876 (1/10/2013)
there is a SQL Query that is causing 90% of CPU on one of the SQL servers also it run in 14 seconds, on other server with similar specs the query executes in less than a second and no cpu spikes also the server configurations are the same:did you run the query at the same time?
are both(or other servers ) returning same records for this query ?
do both servers have same load (like job or background prcosses)
do they have same equal no of connections (at the time of query) ?
Yes I run the query at the same time
Yes 0 records
different Load the server with the spikes is 20 % avg CPU and the other 5% Avg CPU
On the server having the CPU spike there are 1700 connections and the other 10 connections.
But what should I do, this query is exected every 6 minutes because is part of a diagnostic Tool.
Thanks
Ailyn
Post the execution plans so we can better see what is happening and provide better answers.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 11, 2013 at 8:40 am
APA0876 (1/10/2013)
How can I send the execution plan on an xml format?Thanks
I posted two SQLPlans one for the server with the spike and the other sql for server that is OK
January 11, 2013 at 8:52 am
That diagnostic tool you use seems dangerous. I've seen people clogged-up their servers with third-party diagnostic tools, and also with SQL Profiler. Use lighter methods instead, like dmv's and highly filtered server-side trace or extended events.
First-aid would be to try sp_whoisactive. See what is in WAIT column. It also has option to display execution plan of running query.
See perfmon counters, especially System->Processor Queue Length. It should be 0 most of the time, and if it is above 2 for more than 1 second, you have a serious CPU problem.
See also SQLServer: Batch Statistics-> Compilations / sec and Recompilations / sec.
January 11, 2013 at 1:21 pm
anthony.green (1/11/2013)
I would completly re-write it all, for one all the tables used are depreciated and are in the product for SQL 2000 compatability only, you really should be using the DMV's.
Hi,
Based of further investigation, it does not matter that I re-write the query using the DMV as I execute a select to the equivalent view SELECT * FROM sys.dm_tran_locks and I get the cpu spike too.
The issue is related to the following views:syslockinfo, sys.dm_tran_locks apparently when the server have 12000 sessions which is our case, the query to those views is causing CPU spike as we have another server with similar specs but the difference is the number of sessions and does have the spike. We are talking here of a very powerfull machine , 80 CPUs, 256 GB RAM
Can anyone offer some feedback regarding this, this is just happening with SQL Server 2008 R2 SP2,
Thanks
January 11, 2013 at 2:10 pm
How many of this 12 000 connections is active (actually executing something, not just sitting there opened)?
How many NUMA nodes that machine have? Did you messed with CPU affinity mask setting?
Do you have any HA solution involved (cluster, mirror, replication) ?
Do you have compressed tables/indexes,
How many sql instances do you have on that machine? How many databases are there per instance?
What are overall waiting stats on the server?
What was the output of the sp_whoisactive (WAIT column) ?
It's very hard to solve the problem with little or no information.
This requires serious investigation. Consider hiring an expert to solve this.
January 12, 2013 at 9:35 am
Vedran Kesegic (1/11/2013)
How many of this 12 000 connections is active (actually executing something, not just sitting there opened)?How many NUMA nodes that machine have? Did you messed with CPU affinity mask setting?
Do you have any HA solution involved (cluster, mirror, replication) ?
Do you have compressed tables/indexes,
How many sql instances do you have on that machine? How many databases are there per instance?
What are overall waiting stats on the server?
What was the output of the sp_whoisactive (WAIT column) ?
It's very hard to solve the problem with little or no information.
This requires serious investigation. Consider hiring an expert to solve this.
1- those connections are just opened we can have just 20 or 15 active
2-We haven't changed the affinity mask, we haven't change any configuration regarding NUMA,
3-We have transactional replication, but just some tables not the whole database and it is an Active/Passive Cluster
4-No compressed Tables or Indexes
5- Just one SQL Instance and 3 databases but just one database is the one in Production with at least 1500 users connected
6- top 5 waits of the server
wait_typewaiting_tasks_countwait_time_msmax_wait_time_mssignal_wait_time_ms
SLEEP_TASK36852490152.61E+09125491.44E+09
CXPACKET30237645401.09E+1024060371.35E+09
OLEDB20844140503.43E+087685720
SOS_SCHEDULER_YIELD800830129858699881318084942124
LAZYWRITER_SLEEP7985025838.53E+09540594122082789
7- This only happens when there is more than 12000 sessions opened and it is happening on this SQL version 2008 r2 enterprise as in SQL server 2005 enterprise I executed the same query under the same conditions and no CPU spike, apparently this is something related with the way Microsoft build this view on this SQL version
January 12, 2013 at 6:58 pm
This should not spike your CPU:
SELECT count(*) FROM sys.dm_tran_locks WITH(NOLOCK)
Run it on both machines and compare the numbers. Almost idle system is a very different situation than a busy server.
SQL server will have much more locks to wait for, much more memory allocated by connections (and thus memory will be probably spread across several NUMA nodes that additionally slows down the CPU when it wants to access that memory), different execution plans, different IO pressure, memory pressure, basically everything is different if the load is different.
Other thoughts:
If you only have 20 active connections for 12 000 opened connections - that is overwhelming number of idle connections and you have to reduce that number. You can do that by adjusting connection strings to open less connections initially, and all the clients should use one (or very few) connection strings. Because each connection string that is different by other connection strings even by one letter or one space - will start a new connection pool with initially opened connections that are idle.
Here are the connection string parameters for System.Data.SqlClient namespace, SqlConnection class's ConnectionString property: http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectionstring.aspx
CXPACKET wait means you have parallelism involved, and one thread has finished before others and waits for others to finish. It can be controlled at system level by max degree of parallelism parametar, or on query level with MAXDOP hint. In DWH it makes sense that maxdop is 0 (auto) or more than 1, but in OLTP you probably want set that parameter to 1!
Can you describe that CPU spike little more? Is it really a "sipke" (a very short, sub-second period of time with very high activity)?
By "90% CPU" you mean:
a) one CPU usage climbed to 90% for a very short period of time
b) all 80 CPU's climbed to 90% (for a very short period of time)
c) 72 CPU's climbed to 100% and 8 CPUs were idle (which gives total cpu 90%)
Viewing 15 posts - 1 through 15 (of 28 total)
You must be logged in to reply to this topic. Login to reply