January 26, 2020 at 6:11 pm
Hello. Does anybody have a way to identify what transactions ran in an interval (e.g. a 1 minute or 5 minute window)?
I have Solarwinds telling me that the baseline transaction rate is increasing 10-fold every second minute - i.e. the Transactions/second follow this pattern: 100 / 1300 / 100 / 1300...
Obviously this looks like an application running on a timer (it isn't a SQL Agent Job) - however this is a cluster shared by various applications and nobody will admit to running something on a 2 minute interval.
Although I can pull various counts, I am unable to identify what is causing this. I would like to identify a DB, user, server, SQL statement - whatever - and I am struggling to see why this is proving so difficult.
Regards - Steve O.
January 26, 2020 at 6:47 pm
Set up some Extended Events to capture all SQL alongside the host it originated from and the username - let it run for a while and that will tell you what is running and who is doing it.
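Something along these lines would do as a starting point - a minimal sketch (the session name and file path are placeholders, and you may want a where clause on database_id to cut the noise):
-- example session: capture completed batches and RPC calls with host/user context
create event session [capture_sql] on server
add event sqlserver.sql_batch_completed (
action (sqlserver.client_hostname, sqlserver.client_app_name, sqlserver.database_name, sqlserver.username, sqlserver.sql_text)),
add event sqlserver.rpc_completed (
action (sqlserver.client_hostname, sqlserver.client_app_name, sqlserver.database_name, sqlserver.username, sqlserver.sql_text))
add target package0.event_file (set filename = N'C:\temp\capture_sql.xel', max_rollover_files = (5))
with (max_dispatch_latency = 30 seconds, startup_state = off);
alter event session [capture_sql] on server state = start; -- and state = stop when done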
January 27, 2020 at 12:26 pm
Solarwinds should have a way to tell you what queries were being run, I thought. The product it used to be had that capability.
However, agree, if you need to capture the query metrics yourself, extended events is your best bet in this situation.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 29, 2020 at 7:54 pm
Thanks for the responses.
I was digging into the DMVs and managed to replicate the sort of info DPA gave me.
I got nowhere with Solarwinds, other than that I now have a graph showing transaction counts.
Solarwinds gives me the most-executed transaction in an hour, but I cannot correlate this with the jump in transaction rate - i.e. the highest values were similarly high before the issue manifested.
I was going to investigate how to set up extended events (from scratch) when I got back home last night, but then discovered that things had reverted to normal at around 4:30 yesterday (after 10+ days). Today, however, I found that it started again at around 7:30 this morning. As you may expect, nobody can account for this behaviour.
I am looking at this as an aside to another reported problem (as is usual for me, multiple anomalies present themselves when I am looking at something else). We do not have the issues with CPU, memory or blocking that I normally associate with a problem (IO is a bit variable, but that is something else).
What I am seeing is a spike in the transaction rate alone, and this spike seems to be constrained by something, in that it never seems to exceed around 8K transactions/second however many users and however much activity there is on the system. I am now questioning what is constraining it - we don't seem to be anywhere near exhausting worker threads, we have an ever-increasing PLE, CPU averages way below 50%, and there is no blocking.
I had assumed that the maximum transaction rate for a server would be determined as a function of the HW components available to that server, assuming no blocking issues - but there appears to be some other limiting factor in play here. Is a high transaction rate in and of itself a problem, as logically some resources are being used to satisfy it? Our main Enterprise Edition server only ever hits about 1K transactions/second as opposed to the 8K I am seeing on this server.
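For reference, the counter itself can be sampled directly; something along these lines will show the rate over a short window (Transactions/sec in sys.dm_os_performance_counters is cumulative, so you take two samples and diff them):
-- sample the cumulative Transactions/sec counter twice, 10 seconds apart
declare @t1 bigint, @t2 bigint;
select @t1 = cntr_value
from sys.dm_os_performance_counters
where object_name like '%Databases%'
and counter_name = 'Transactions/sec'
and instance_name = '_Total';
waitfor delay '00:00:10';
select @t2 = cntr_value
from sys.dm_os_performance_counters
where object_name like '%Databases%'
and counter_name = 'Transactions/sec'
and instance_name = '_Total';
-- average transactions/second over the 10 second window
select (@t2 - @t1) / 10 as avg_transactions_per_sec;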
Regards
Steve O.
January 29, 2020 at 9:57 pm
Our main Enterprise Edition server only ever hits about 1K transactions/second as opposed to the 8K I am seeing on this server.
What is the server with the problems being used for? Is it strictly within the domain or can it be reached from outside the organization?
Alan H
MCSE - Data Management and Analytics
Senior SQL Server DBA
Best way to ask a question: http://www.sqlservercentral.com/articles/Best+Practices/61537/
January 29, 2020 at 10:29 pm
Hello.
The Enterprise server has multiple large databases, each with one or more web portals plus services and a large employee user base hitting it.
The server in question is used mainly in-house, with one moderately sized application and a handful of smaller applications. There is little or no external web traffic to this server. It has always run Standard Edition, and in fact applications have migrated to another instance over the past year, reducing the load on the server, so we never have PLE issues (as we do on the other) and the workload does not trouble the CPUs or cause IO issues.
As it happened, I logged on from home and saw that the issue stopped again tonight at 6 PM.
I will endeavour to read up on extended events, if I can, in preparation for when it recurs...
Regards
Steve O.
January 30, 2020 at 9:12 am
What about wait statistics? Have you compared before and after there to see if it gives you a clue as to what's up?
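Something as simple as snapshotting sys.dm_os_wait_stats before a spike and diffing it against a capture taken during one can narrow things down. A rough sketch (run the first part in a quiet period and the second during the spike, in the same session since it uses a temp table):
-- baseline snapshot of the cumulative wait stats
if object_id('tempdb..#baseline') is not null drop table #baseline;
select wait_type, waiting_tasks_count, wait_time_ms
into #baseline
from sys.dm_os_wait_stats;
-- ...later, during the spike: diff against the baseline
select w.wait_type
, w.wait_time_ms - b.wait_time_ms as wait_ms_delta
, w.waiting_tasks_count - b.waiting_tasks_count as tasks_delta
from sys.dm_os_wait_stats w
join #baseline b on b.wait_type = w.wait_type
where w.wait_time_ms > b.wait_time_ms -- only waits that moved; still discount benign ones (e.g. SLEEP_%, BROKER_%) when reading
order by wait_ms_delta desc;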
As to Extended Events: they're not terribly difficult to set up. Be sure you filter them well or you'll end up with a lot of data. Unfortunately the GUI, which is actually quite powerful, is not intuitive at all. I have a video that may help. Also, check my blog for a whole bunch of posts on the topic (not all of them are rants about how bad Profiler is, promise; some are helpful). Plan to output to a file. Then either use the GUI to munge the data, or steal some of the XQuery code from my blog or others. Either way, you should be able to identify what's happening and, largely, where it's coming from.
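Once the session has written to a file, even a quick aggregate over the captured events can show where the load is coming from. A rough sketch, assuming the session collected the client_hostname action (the file path is a placeholder):
-- count captured events per originating host
select client_hostname, count(*) as events
from (
select cast(event_data as xml).value('(event/action[@name="client_hostname"]/value)[1]', 'nvarchar(128)') as client_hostname
from sys.fn_xe_file_target_read_file(N'C:\temp\capture_sql*.xel', null, null, null)
) t
group by client_hostname
order by events desc;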
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 2, 2020 at 12:48 pm
Thanks Grant - I will take a look; I have gleaned some clues already.
I have identified a number of issues, but the Transactions/sec count does not seem to be related - it just comes and goes, and DPA flags it as an issue vs the baseline. I did run the GUI and thought I could capture a minute or two of events (200K+), but it won't let me copy more than 1K rows out to analyse in Excel.
Regards
Steve O.
February 2, 2020 at 1:43 pm
Ideally you would load the events into a SQL Server table and do the analysis and initial filtering there - easier with high volumes.
Below is a sample that I use to do this kind of analysis - you may be interested in some of the fields that are commented out, so play around as required.
This requires that you have access to the .xel files - if you can't read them on the prod server, ask for the files to be copied to another location; it does not matter whether you load/process them on the same server where they were generated or elsewhere.
Below are both the create session command that I normally use to debug parts of our applications, and the "load and shred" of the output event files.
On the create session I name the sessions according to the application, and I output the trace to a shared folder on my "DBA" server.
-- create the trace session (fill in both placeholders before running)
create event session [-- add session name here --] on server
add event sqlserver.exec_prepared_sql,
add event sqlserver.existing_connection (
action (package0.callstack, package0.collect_system_time, sqlserver.client_app_name, sqlserver.client_hostname
, sqlserver.database_name, sqlserver.nt_username, sqlserver.query_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.username)),
add event sqlserver.login (set collect_options_text = (1)
action (package0.callstack, package0.collect_system_time, sqlserver.client_app_name, sqlserver.client_hostname
, sqlserver.database_name, sqlserver.nt_username, sqlserver.query_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.username)),
add event sqlserver.logout (
action (package0.callstack, package0.collect_system_time, sqlserver.client_app_name, sqlserver.client_hostname
, sqlserver.database_name, sqlserver.nt_username, sqlserver.query_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.username)),
add event sqlserver.rpc_starting (
action (package0.callstack, package0.collect_system_time, sqlserver.client_app_name, sqlserver.client_hostname
, sqlserver.database_name, sqlserver.nt_username, sqlserver.query_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.username)
-- user databases only (database_id > 4), no system activity
where ([package0].[greater_than_uint64] ([sqlserver].[database_id], (4)) and [package0].[equal_boolean] ([sqlserver].[is_system], (0)))),
add event sqlserver.sp_statement_completed (set collect_statement = (1)
action (package0.callstack, sqlserver.client_hostname, sqlserver.database_name, sqlserver.nt_username, sqlserver.plan_handle
, sqlserver.query_plan_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack)),
add event sqlserver.sql_batch_completed (
action (package0.callstack, package0.collect_system_time, sqlserver.client_app_name, sqlserver.client_hostname
, sqlserver.database_name, sqlserver.nt_username, sqlserver.query_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.username)
where ([package0].[greater_than_uint64] ([sqlserver].[database_id], (4)) and [package0].[equal_boolean] ([sqlserver].[is_system], (0)))),
add event sqlserver.sql_batch_starting (
action (package0.callstack, package0.collect_system_time, sqlserver.client_app_name, sqlserver.client_hostname
, sqlserver.database_name, sqlserver.nt_username, sqlserver.query_hash, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack, sqlserver.username)
where ([package0].[greater_than_uint64] ([sqlserver].[database_id], (4)) and [package0].[equal_boolean] ([sqlserver].[is_system], (0))))
add target package0.event_file (set filename = N'-- add destination folder and session prefix here --', max_rollover_files = (10))
with (max_memory = 4096 kb, event_retention_mode = allow_single_event_loss, max_dispatch_latency = 30 seconds, max_event_size = 0 kb, memory_partition_mode = none, track_causality = on, startup_state = off)
go
-- by default the session is created in the stopped state - if required, start it as part of the create script
-- alter event session [-- add session name here --] on server state = start -- once the session is created, issue this to start recording
-- alter event session [-- add session name here --] on server state = stop -- once we are done with sampling, stop the session
-- load the raw events from the .xel files into a temp table, casting each event to xml
if object_id('tempdb..#events2') is not null
drop table #events2
;
select cast(event_data as xml) as event_data
, object_name
into #events2
from sys.fn_xe_file_target_read_file('\\servername\traces\events_prefix*.xel', null, null, null)
;
-- shred the event xml into relational columns - uncomment the extra fields as required
if object_id('tempdb..#traces') is not null
drop table #traces
;
select *
into #traces
from (select n.value('(@name)[1]', 'varchar(50)') as event_name
, n.value('(@package)[1]', 'varchar(50)') as package_name
, n.value('(@timestamp)[1]', 'datetime2') as [utc_timestamp]
--, n.value('(data[@name="duration"]/value)[1]', 'bigint') as duration
--, n.value('(data[@name="cpu_time"]/value)[1]', 'bigint') as cpu
--, n.value('(data[@name="physical_reads"]/value)[1]', 'bigint') as physical_reads
--, n.value('(data[@name="logical_reads"]/value)[1]', 'bigint') as logical_reads
--, n.value('(data[@name="writes"]/value)[1]', 'bigint') as writes
, n.value('(data[@name="row_count"]/value)[1]', 'bigint') as row_count
, n.value('(data[@name="last_row_count"]/value)[1]', 'bigint') as last_row_count
--, n.value('(data[@name="line_number"]/value)[1]', 'bigint') as line_number
--, n.value('(data[@name="offset"]/value)[1]', 'bigint') as offset
--, n.value('(data[@name="offset_end"]/value)[1]', 'bigint') as offset_end
, n.value('(action[@name="username"]/value)[1]', 'nvarchar(max)') as username
, n.value('(data[@name="statement"]/value)[1]', 'nvarchar(max)') as statement
, n.value('(data[@name="batch_text"]/value)[1]', 'nvarchar(max)') as batch_text
, n.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)') as sql_text
, n.value('(action[@name="database_name"]/value)[1]', 'nvarchar(128)') as database_name
, n.value('(action[@name="client_app_name"]/value)[1]', 'nvarchar(128)') as client_app_name
, n.value('(action[@name="client_hostname"]/value)[1]', 'nvarchar(128)') as client_hostname
, n.value('(action[@name="attach_activity_id_xfer"]/value)[1]', 'nvarchar(128)') as attach_activity_id_xfer
, n.value('(action[@name="attach_activity_id"]/value)[1]', 'nvarchar(128)') as attach_activity_id
, n.value('(action[@name="session_id"]/value)[1]', 'bigint') as session_id
, ed.event_data
from #events2 ed
cross apply ed.event_data.nodes('event') as q (n)
) t
;
-- detail: filter to the host, time window and sessions of interest
select t.event_name
, t.session_id
, t.utc_timestamp
, t.statement
, t.sql_text
, t.batch_text
, t.event_data
, t.client_app_name
, t.database_name
, t.client_hostname
, t.attach_activity_id_xfer
, t.attach_activity_id
, t.username
from #traces t
where 1=1 -- constant predicate makes it easy to comment the filters below in and out
and t.client_hostname = 'servername'
and t.utc_timestamp between '2019-10-31 09:34:30' and '2019-10-31 09:34:54'
and t.session_id in (611, 179)
--and t.statement like '%INSERT%INTO%job_run_event%'
--and t.statement is not null
--and t.UserName = 'XXXXXX\XXXX'
--and t.client_app_name <> 'dbForge SQL Complete'
--and t.event_name like 'cursor%'
--and coalesce(sql_text, '') not like 'FETCH API%'
and coalesce(sql_text, '') not like 'exec sp_unprepare%'
and coalesce(t.statement, '') not like 'exec sp_unprepare%'
order by t.session_id
, t.utc_timestamp
;
-- summary per session: event counts, cursor open/close counts and time range
select t.session_id
, count(*) as event_count
, sum(case when event_name like 'cursor_open%' then 1 else 0 end) as has_open
, sum(case when event_name like 'cursor_close%' then 1 else 0 end) as has_close
, min(t.utc_timestamp) as first_event
, max(t.utc_timestamp) as last_event
from #traces t
where 1=1
and t.client_hostname = 'servername'
and t.utc_timestamp between '2019-10-31 09:34:30' and '2019-10-31 09:34:54'
group by t.session_id
order by min(t.utc_timestamp)
;
-- distinct statements for a single session, with the SCOPE_IDENTITY suffix stripped for de-duplication
select distinct replace(t.sql_text, 'select SCOPE_IDENTITY() AS GENERATED_KEYS', '')
from #traces t
where t.session_id = 179
and coalesce(sql_text, '') not like 'FETCH API%'
order by 1