February 23, 2017 at 3:18 pm
RVO - Thursday, February 23, 2017 1:47 PM
Jeff,
"...SQL will probably be exclusively hitting that one - RVO could check by looking at the virtual file stats DMV to see where all the IO is...."
Can you please give me a query to find virtual file stats for tempdb files?
At a basic level on that DMV you can just do: select * from sys.dm_io_virtual_file_stats(2, NULL);
The first argument is the database id; 2 is tempdb. If you want to capture for all DBs, just set it to NULL.
Usually you want to take a baseline and then capture again later. You can just insert into a table (I'd use a static table for now, just to stay out of tempdb) for the baseline, and then insert into the same table whenever you want a capture. If you want to see activity over a half hour, run the baseline insert and then insert again 30 minutes later... that kind of thing.
I just put this one together for you, which should work - find some database to create the table in, and then just execute the insert. Play around with it, see if it works for you, and let us know if it doesn't. You can just truncate the table after playing around before you start doing someone's job for them... oh sorry, I mean before you want to capture for analyzing. No doubt you'll get the gist - it's just grabbing tempdb, which you can change:
CREATE TABLE TempDB_IO
(DBName sysname,
file_id smallint,
num_of_reads bigint,
num_of_bytes_read bigint,
io_stall_read_ms bigint,
num_of_writes bigint,
num_of_bytes_written bigint,
io_stall_write_ms bigint,
io_stall bigint,
size_on_disk_bytes bigint,
CaptureDateTime datetime)
INSERT INTO TempDB_IO
SELECT db_name(database_id) as DBName,
file_id,
num_of_reads,
num_of_bytes_read,
io_stall_read_ms,
num_of_writes,
num_of_bytes_written,
io_stall_write_ms,
io_stall,
size_on_disk_bytes,
GETDATE() AS CaptureDateTime
FROM sys.dm_io_virtual_file_stats(2,NULL) ;
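Once you have a baseline row and a later capture in there, a quick delta query like this (just a sketch, assuming only two captures per file in the table - tweak it if you capture more often) shows the IO that accumulated in between:
-- Subtract the baseline capture from the latest capture, per file.
-- Assumes TempDB_IO holds exactly two captures (baseline + latest).
SELECT cur.DBName,
       cur.file_id,
       cur.num_of_reads      - base.num_of_reads      AS reads_delta,
       cur.num_of_bytes_read - base.num_of_bytes_read AS bytes_read_delta,
       cur.io_stall_read_ms  - base.io_stall_read_ms  AS read_stall_ms_delta,
       cur.num_of_writes     - base.num_of_writes     AS writes_delta,
       cur.io_stall_write_ms - base.io_stall_write_ms AS write_stall_ms_delta
FROM TempDB_IO AS cur
JOIN TempDB_IO AS base
  ON base.DBName  = cur.DBName
 AND base.file_id = cur.file_id
WHERE cur.CaptureDateTime  = (SELECT MAX(CaptureDateTime) FROM TempDB_IO)
  AND base.CaptureDateTime = (SELECT MIN(CaptureDateTime) FROM TempDB_IO);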
Sue
February 23, 2017 at 6:11 pm
cphite - Thursday, February 23, 2017 1:48 PM
RVO - Wednesday, February 22, 2017 8:01 AM
Performance has not gotten substantially worse since the last reboot on January 29th.
My guess is one of these things could have helped:
----We ran UPDATE STATISTICS 2-3 times since the last reboot
----Moved the transaction log file of one of the busy databases to a proper drive (before, it was sitting together with the data files)
Maybe I was wrong when I said performance gradually gets worse.
I'm afraid it might be pretty sudden.
Like 2-3 days of slower-than-usual performance and then a big slowdown.
We don't see big disk latency now, like we saw last time before the reboot.
Overall it stays at the same level it was right after the reboot.
Noticed the thing about performance being suddenly worse instead of gradually...
We had an issue a few months back with one of our servers, where the VM host was writing a journal file - it basically tracks all of the changes that have occurred on the disk since the last snapshot. Anyway, once the file would get to a certain size, things would very abruptly slow to a crawl on the SQL instance. Our hardware team explained that it was because on the storage it was having to scan through this huge file for basically every transaction. Sorry if that's vague but I'm not a hardware storage guy. In any event, they disabled the journal and we haven't had the problem since.
That's a good point, as well. I've seen some mighty strange things along that line. We had a tried and true job on an AS/400 that normally took only an hour to run. One morning, it changed to 6 hours. No code changes were made. Nothing was rebooted. No large amounts of data were imported. It was crazy... after a week, they gave up and called IBM for some help. To make a longer story shorter, the battery on the cache for the box had finally given up the ghost. They replaced it without even taking the machine down, and performance instantly returned to normal.
So, yeah... I agree. It could be something like you say.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 24, 2017 at 8:21 am
Thanks for the script, guys.
Here are the results.
Should I focus on columns like num_of_bytes_read?
What exactly do these columns represent?
February 24, 2017 at 9:19 am
Your eye is on the wrong ball here. Not sure how long this server has been up, but queries have waited almost ONE FULL DAY PER TEMPDB DATA FILE!!!! You simply cannot be successful when your CPUs (which do BILLIONS of things per second) are waiting DAYS to process data.
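If you want to sanity-check that math yourself, it comes straight off the same DMV (a quick sketch - io_stall is cumulative milliseconds of waiting since the instance started, and sys.dm_os_sys_info tells you when that was):
-- Cumulative IO stall per tempdb file, converted from ms to days.
SELECT file_id,
       io_stall,
       io_stall / 1000.0 / 86400.0 AS io_stall_days,
       CASE WHEN num_of_reads > 0
            THEN io_stall_read_ms / num_of_reads END AS avg_read_latency_ms,
       CASE WHEN num_of_writes > 0
            THEN io_stall_write_ms / num_of_writes END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(2, NULL);
-- How long the instance has been up (the window those stalls accumulated over).
SELECT sqlserver_start_time FROM sys.dm_os_sys_info;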
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 24, 2017 at 9:28 am
TheSQLGuru - Friday, February 24, 2017 9:19 AM
Your eye is on the wrong ball here. Not sure how long this server has been up, but queries have waited almost ONE FULL DAY PER TEMPDB DATA FILE!!!! You simply cannot be successful when your CPUs (which do BILLIONS of things per second) are waiting DAYS to process data.
@TheSQLGuru - I'm not 100% familiar with how that all works, but is it transactions that are waiting days, or queries, or both?
I am just wondering if maybe that server has open transactions that never get closed, and that could show up as running for days?
@rvo - any chance you can archive some of that data?
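If you can run DMV queries (needs VIEW SERVER STATE - this is just a rough sketch of the idea), something like this would show transactions that have been open longer than, say, 5 minutes, including sleeping sessions:
-- Transactions open more than 5 minutes, with the owning session.
SELECT s.session_id,
       s.login_name,
       at.transaction_begin_time,
       DATEDIFF(MINUTE, at.transaction_begin_time, GETDATE()) AS open_minutes
FROM sys.dm_tran_active_transactions AS at
JOIN sys.dm_tran_session_transactions AS st
  ON st.transaction_id = at.transaction_id
JOIN sys.dm_exec_sessions AS s
  ON s.session_id = st.session_id
WHERE at.transaction_begin_time < DATEADD(MINUTE, -5, GETDATE())
ORDER BY at.transaction_begin_time;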
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
February 24, 2017 at 10:49 am
Yes, bmg002.
I am storing the results in a "tempdb_usage" table in a utility database.
Kevin / bmg002,
It very well might be the case that we have a lot of open transactions.
I use this Paul Randal query to find open transactions.
I don't see any transaction where [Begin Time] is more than 1 minute ago.
/***** http://www.sqlskills.com/blogs/paul/script-open-transactions-with-text-and-plans/ ****/
USE tempdb;
GO
SELECT
    [s_tst].[session_id],
    [s_es].[login_name] AS [Login Name],
    DB_NAME ([s_tdt].[database_id]) AS [Database],
    [s_tdt].[database_transaction_begin_time] AS [Begin Time],
    [s_tdt].[database_transaction_log_bytes_used] AS [Log Bytes],
    [s_tdt].[database_transaction_log_bytes_reserved] AS [Log Rsvd],
    [s_est].[text] AS [Last T-SQL Text],
    [s_eqp].[query_plan] AS [Last Plan]
FROM sys.dm_tran_database_transactions [s_tdt]
JOIN sys.dm_tran_session_transactions [s_tst]
    ON [s_tst].[transaction_id] = [s_tdt].[transaction_id]
JOIN sys.[dm_exec_sessions] [s_es]
    ON [s_es].[session_id] = [s_tst].[session_id]
JOIN sys.dm_exec_connections [s_ec]
    ON [s_ec].[session_id] = [s_tst].[session_id]
LEFT OUTER JOIN sys.dm_exec_requests [s_er]
    ON [s_er].[session_id] = [s_tst].[session_id]
CROSS APPLY sys.dm_exec_sql_text ([s_ec].[most_recent_sql_handle]) AS [s_est]
OUTER APPLY sys.dm_exec_query_plan ([s_er].[plan_handle]) AS [s_eqp]
ORDER BY [Begin Time] ASC;
GO
Typically I get 1-2 records.
Sometimes, running it every minute, I get 10-15, like below:
February 24, 2017 at 10:55 am
What does DBCC OPENTRAN tell you? Anything open?
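For reference, the basic call is just this (the database name is a placeholder - point it at whichever busy database you care about; TABLERESULTS makes the output queryable rather than message text):
-- Reports the oldest active transaction in the named database, if any.
DBCC OPENTRAN (N'YourBusyDB') WITH TABLERESULTS, NO_INFOMSGS;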
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
February 24, 2017 at 11:02 am
I don't have permissions for most DBCC commands, including this one.
Remember, I am an ETL developer with slightly elevated access permissions.
Is there any other way to catch open transactions?
February 24, 2017 at 11:12 am
I tried Adam Machanic's sp_WhoIsActive.
It returns a column, open_tran_count.
The total count is 23.
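A call along these lines (assuming a reasonably recent build of the proc) surfaces the sessions behind that count:
-- Include sleeping sessions that still hold an open transaction,
-- plus transaction start time and log usage columns.
EXEC sp_WhoIsActive
     @show_sleeping_spids  = 1,
     @get_transaction_info = 1;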
February 24, 2017 at 11:24 am
The Paul Randal script sometimes returns transactions with [Begin Time] = NULL.
What does that mean?
February 24, 2017 at 11:38 am
Guys,
Just wanted to mention that our concern has always been disk latency.
We have the SolarWinds tool.
Here is a screenshot from today's SolarWinds "Files" performance.
As you can see, sometimes we spend 2000-2500 seconds on file access.
But honestly, I don't understand the first diagram, because it does not correlate with the disk read/write latency that SolarWinds shows on the same screen for the same time period (below).
February 24, 2017 at 3:18 pm
1) Any time I see nvarchar(4000) parameters I think ORM, mismatched data types, and a butt-ton of CONVERT_IMPLICITs with their correspondingly horrible performance and concurrency issues.
2) You clearly have repeated complete buffer flushes. That's brutal.
3) SolarWinds is aggregating all IO stalls into 15-minute buckets. Your graphs just show latency, not how many things were running concurrently and collecting file IO stalls. Just yesterday I captured a 180 SECOND run of IO stalls on a client and they totaled a mind-blowing 2.5 MILLION MILLISECONDS in duration!!! That is 28.94 DAYS of IO stalls in 180 clock seconds!! Guess what, you can't run almost 2TB of BUSY and REALLY BADLY DESIGNED AND CODED OLTP databases with 2 synchronous AGs for each database on four 6-disk RAID10 sets. :hehe:
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 24, 2017 at 3:30 pm
TheSQLGuru - Friday, February 24, 2017 3:18 PM
1) Any time I see nvarchar(4000) parameters I think ORM, mismatched data types, and a butt-ton of CONVERT_IMPLICITs with their correspondingly horrible performance and concurrency issues.
2) You clearly have repeated complete buffer flushes. That's brutal.
3) SolarWinds is aggregating all IO stalls into 15-minute buckets. Your graphs just show latency, not how many things were running concurrently and collecting file IO stalls. Just yesterday I captured a 180 SECOND run of IO stalls on a client and they totaled a mind-blowing 2.5 MILLION MILLISECONDS in duration!!! That is 28.94 DAYS of IO stalls in 180 clock seconds!! Guess what, you can't run almost 2TB of BUSY and REALLY BADLY DESIGNED AND CODED OLTP databases with 2 synchronous AGs for each database on four 6-disk RAID10 sets. :hehe:
Hey Kevin,
Just wondering where I can get my hands on the hardware you used to stretch 2.5 million milliseconds (exactly 2,500 seconds, or 41 and 2/3 minutes) into almost 29 years? That kind of improvement in life expectancy is worth $BILLION$, if not $TRILLION$ .... 😉 😉 😉
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
February 24, 2017 at 4:56 pm
sgmunson - Friday, February 24, 2017 3:30 PM
Hey Kevin,
Just wondering where I can get my hands on the hardware you used to stretch 2.5 million milliseconds (exactly 2,500 seconds, or 41 and 2/3 minutes) into almost 29 years? That kind of improvement in life expectancy is worth $BILLION$, if not $TRILLION$ .... 😉 😉 😉
HAH! Clearly the recent 2-week trip to Australia, and its drastic effect on my sleep and mental state, is still having effects! Not only did I not divide by 1000, I took the 28.9 DAYS and said YEARS. Oy vey! :blink: I need to go check if I told the client it was 29 years...
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
February 24, 2017 at 8:06 pm
TheSQLGuru - Friday, February 24, 2017 4:56 PM
sgmunson - Friday, February 24, 2017 3:30 PM
Hey Kevin,
Just wondering where I can get my hands on the hardware you used to stretch 2.5 million milliseconds (exactly 2,500 seconds, or 41 and 2/3 minutes) into almost 29 years? That kind of improvement in life expectancy is worth $BILLION$, if not $TRILLION$ .... 😉 😉 😉
HAH! Clearly the recent 2-week trip to Australia, and its drastic effect on my sleep and mental state, is still having effects! Not only did I not divide by 1000, I took the 28.9 DAYS and said YEARS. Oy vey! :blink: I need to go check if I told the client it was 29 years...
Apparently, your trip affected me as well... I somehow saw YEARS where the word DAYS appeared. Geez Louise! This gettin' old sh_ _ is gettin' really old...
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)