February 13, 2023 at 3:23 pm
Hi All,
Need some guidance here.
We have multiple integrations with our centralized data warehouse database.
Recently, a new application got integrated and management has defined an SLA of 2secs for every query coming from that application.
Its kind of an API call. Now, that particular team is complaining that sometimes the api calls are taking 16 secs and sometimes 25 secs and sometimes finishes fast.
The app team is complaining it is all database issue. A lot of drama is in place as of now. I need some inputs here.
My observations
1. They are tracking overall time taken for the API call and pointing fingers at DB Team.
2. API call not just sends one sql stmts but it does a bunch of things in the call ( eg . . selects , inserts , updates ...).
3. The app team cannot repro at will. API call happens randomly in a week by the external users. There is no pattern or a particular time frame.
4. App Team didn't provide the sql stmts which the call as part of the API call. All they say is, to run continous trace and 1 day it will trace filled like 2.6 TB worth data even after keeping db filter in the server side trace.
we are collecting sql:starting,sql stmt:completed. batch starting , batch completed, RPC starting, RPC completed. events with selected columns like spid,textdata,txnid,dbid,etc..
5. We don't have any monitoring tool in place to really tell whether it is db is the issue or db has completed run queries but it is something else which is slowing down.
They just chasing db team.
6. We are seeing heavy blocking and txn's being held for long durations. A lot of stale connections opened for more than 2-4 days status=sleeping coming from JDNC client app.
This issue is ongoing for past 2 weeks and we are no where. sometimes, we haven't collected the right events and we had to redo our data collections. Sometimes, the issues doesn't happen at all because external users didn't make that call.
Initially , we thought of collecting statements taking more duration than 2 milli secs and when we show cased those sql stmts, they never belonged to that API call. The App team were not able to provide us the right info and going through back and forth.
I suggested to call server side trace before API call and stop it after API call. But they are not listening and even management is supporting the App team. Going through a rough time. We suggested few things like OPTION WITH RECOMPILE for statements sometimes taking more and sometimes taking less time. Also, we have enabled RCSI and Snapshot isolation levels at database level. However, they are still seeing blocking.
Please suggest on how can we go about troubleshooting this particular issue. we don't really know if sql is problem or their app processing has any problem or any network issues.
We are on SQL 2017 Enterprise Edition.
Regards,
Sam
February 13, 2023 at 4:46 pm
You can capture blocking session with extended events https://learn.microsoft.com/en-us/troubleshoot/sql/database-engine/performance/understand-resolve-blocking
Have you enabled query store? Whilst 2ms probably won't show up, the 16 sec and 25 sec durations would
February 13, 2023 at 4:51 pm
You can also investigate https://github.com/spaghettidba/XESmartTarget
February 14, 2023 at 1:33 pm
A bunch to unpack here. Let's start with the nerdier stuff.
You list a bunch of events for capturing query metrics, but, you're missing a few, sp_statement_completed for example, since you're capturing sql statements within batches, you'll also need individual calls within procedures if you want a more complete picture.
However, just know that running this stuff for any length of time does two things. It does put a load on the server, especially capturing statement level commands. It generates a lot of data. So, I'd be very cautious about running this long term. Instead, I'd run it in short bursts. Capture 20 minutes worth of data in the middle of the day and then worry at that for a while.
Why OPTION RECOMPILE? You know that will slow down a given query because it must create a new execution plan at each execution. This is only helpful if you're experiencing a certain kind of parameter sniffing issue. In fact, if you're not hitting that issue, you're putting additional load on the machine, slowing down your queries. I'm extremely wary of query hints, especially without evidence that the hint is going to help.
"5. We don't have any monitoring tool in place to really tell whether it is db is the issue or db has completed run queries but it is something else which is slowing down.
They just chasing db team."
Yeah. They are. This is your moment. It's about identifying the issue. So, yeah, blocked process report, the queries (always the queries), wait stats (these don't tell you the problem, but the indicate where to look for the problem). Your development team is absolutely chasing you in order to help them identify the problem and help them fix it. You should absolutely be able to point to a given batch or procedure and say "This one is causing the most pain." Then, you should be able to drill into that batch or proc and say "These three statements are the worst and here's why." Then, show them an execution plan that illustrates the problem. Then, propose a solution, that you've TESTED, along with the improved measures and superior execution plan.
This is why you're there and have superior knowledge of SQL Server and its internals. Yeah, they're chasing the DBA team, as they should.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 15, 2023 at 2:53 pm
A bunch to unpack here. Let's start with the nerdier stuff.
You list a bunch of events for capturing query metrics, but, you're missing a few, sp_statement_completed for example, since you're capturing sql statements within batches, you'll also need individual calls within procedures if you want a more complete picture.
However, just know that running this stuff for any length of time does two things. It does put a load on the server, especially capturing statement level commands. It generates a lot of data. So, I'd be very cautious about running this long term. Instead, I'd run it in short bursts. Capture 20 minutes worth of data in the middle of the day and then worry at that for a while.
Why OPTION RECOMPILE? You know that will slow down a given query because it must create a new execution plan at each execution. This is only helpful if you're experiencing a certain kind of parameter sniffing issue. In fact, if you're not hitting that issue, you're putting additional load on the machine, slowing down your queries. I'm extremely wary of query hints, especially without evidence that the hint is going to help.
"5. We don't have any monitoring tool in place to really tell whether it is db is the issue or db has completed run queries but it is something else which is slowing down.
They just chasing db team."
Yeah. They are. This is your moment. It's about identifying the issue. So, yeah, blocked process report, the queries (always the queries), wait stats (these don't tell you the problem, but the indicate where to look for the problem). Your development team is absolutely chasing you in order to help them identify the problem and help them fix it. You should absolutely be able to point to a given batch or procedure and say "This one is causing the most pain." Then, you should be able to drill into that batch or proc and say "These three statements are the worst and here's why." Then, show them an execution plan that illustrates the problem. Then, propose a solution, that you've TESTED, along with the improved measures and superior execution plan.
This is why you're there and have superior knowledge of SQL Server and its internals. Yeah, they're chasing the DBA team, as they should.
Sir, the blocked process report is giving me what process blocked which process information. However, I am not able to see what are the locks been held on what object by <blocking process> i.e. blocker process. It giving information of only requested lock by the blocked process.
For example.
<blocked-process-report monitorLoop="13896">
<blocked-process>
<process id="process159f7f39848" taskpriority="0" logused="0" waitresource="KEY: 22:72057594041270272 (d08358b1108f)" waittime="3764367" ownerId="405173" transactionname="SELECT" lasttranstarted="2023-02-15T18:32:20.257" XDES="0x15dddb13a40" lockMode="S" schedulerid="4" kpid="21076" status="suspended" spid="51" sbid="0" ecid="0" priority="0" trancount="0" lastbatchstarted="2023-02-15T18:32:20.257" lastbatchcompleted="2023-02-15T18:32:20.257" lastattention="1900-01-01T00:00:00.257" clientapp="Microsoft SQL Server Management Studio - Query" hostname="RSI1" hostpid="21492" loginname="Administrator" isolationlevel="read committed (2)" xactid="405173" currentdb="22" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
<executionStack>
<frame line="1" stmtend="34" sqlhandle="0x0200000094ddcd2b20c12b9876943b9d564f33dd7ba0239f0000000000000000000000000000000000000000" />
</executionStack>
<inputbuf>
select * from dept; </inputbuf>
</process>
</blocked-process>
<blocking-process>
<process status="sleeping" spid="53" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2023-02-15T18:32:17.537" lastbatchcompleted="2023-02-15T18:32:17.537" lastattention="1900-01-01T00:00:00.537" clientapp="Microsoft SQL Server Management Studio - Query" hostname="" hostpid="21492" loginname="Administrator" isolationlevel="read committed (2)" xactid="404835" currentdb="22" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
<executionStack />
<inputbuf>
BEGIN TRAN
UPDATE dept
SET [loc] = 'NEWZEALAND'
WHERE DEPTNO=10; --//updating 1 row and sitting idle
--- ROLLBACK;
</inputbuf>
</process>
</blocking-process>
</blocked-process-report>
February 15, 2023 at 5:15 pm
It does. It's just not obvious when looking at the XML. You can open the blocked process report as a report which will resolve the values for you. However, to resolve the values, use Kendra's guide.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 16, 2023 at 7:38 pm
It does. It's just not obvious when looking at the XML. You can open the blocked process report as a report which will resolve the values for you. However, to resolve the values, use Kendra's guide.
No Sir, for some reason it is not showing the locks held by <blocking-process>. Attaching sample report.
February 16, 2023 at 7:48 pm
repro
======
use testdb
go
CREATE TABLE [dbo].[DEPT](
[DEPTNO] [int] NOT NULL primary key,
[DNAME] [varchar](14) NULL,
[LOC] [varchar](13) NULL,
)
INSERT [dbo].[DEPT] ([DEPTNO], [DNAME], [LOC]) VALUES (10, N'ACCOUNTING',N'NEW YORK')
INSERT [dbo].[DEPT] ([DEPTNO], [DNAME], [LOC]) VALUES (20, N'RESEARCH',N'DALLAS')
INSERT [dbo].[DEPT] ([DEPTNO], [DNAME], [LOC]) VALUES (30, N'SALES',N'CHICAGO')
INSERT [dbo].[DEPT] ([DEPTNO], [DNAME], [LOC]) VALUES (40, N'OPERATIONS',N'BOSTON')
GO
spid = 55
=========
use testdb
go
BEGIN TRAN
UPDATE dept
SET [loc] = 'NEWZEALAND'
WHERE DEPTNO=10; --//updating 1 row and sitting idle
select * from dept;
--- ROLLBACK;
spid=56
======
use testdb
go
select * from dept;
February 22, 2023 at 7:55 am
The application team are saying the database is the problem. Do they have any evidence to prove this.
A well written API will have debug facilities that can produce a trace of each action the API does. Talk to your nanagement and ask for the API debug trace. It should show exactly what each action is and how long it took. If an action is a SQL call, it should show the SQL statement issued.
For now everybody knows there are problems but the API is being treated as a black box. You need to get that box opened so it can give evidence of where the problem lies.
Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.
When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply