January 25, 2023 at 3:50 pm
Hi,
We have this error from time to time, not all the time but when it does SQL is reporting as high as thousands of occurrences. It happens on random days, but when it does happen the timestamp is always within a couple of minutes of the same time.
We never had it from go-live of this system, then it started a year later. After some optimisation of the VM config. we hadn't had it from between mid-December to mid-January, when it started showing again.
Critically, we have a Veeam (full SQL VM) backup happening within this timestamp - if we move the SQL backup job, the timestamp moves.
So it seems obvious that Veeam is the problem, but why?
Our storage appliance:
is connected at 10Gb iSCSI
shows no cause for concern,, e.g. problematic disks or high I/O
even during a Veeam backup LUN latency report for the SQL LUN is showing in 'us' notation!
in the SQL logs I can see messages about the data files being paused because of a snapshot happening (Veeam backup job)
Veeam is targeting the Hyper-V cluster for backing up VMs, rather than VMs directly
disks are enterprise-grade SSD
So I am not convinced there is an issue with the storage.
Running a T-SQL disk latency query shows ms values in the mid-high hundreds for the same data files that are logging delayed I/O requests. However on another environment with the same SQL database/application, they show very similar values so again not convinced it's the underlying infrastructure.
Any guidance greatly received
January 25, 2023 at 6:50 pm
Does the VEEAM backup that runs and produces the I/O messages include Sql Server databases, or, are they separate? As a start, I would probably review the backup setup you have for your Sql Servers and come up possibly with a different solution. Can one be done without the other, etc. Evidently, the VEEAM backup is interfering with Sql Server normal processing. Here is an article that shows some different options with different combinations that may help - https://www.veeam.com/wp-sql-server-backup-with-veeam.html
January 26, 2023 at 1:44 pm
Are you also seeing "i/o is frozen..." messages in the logs? If so, Veeam uses Volume Shadow Copy service to backup the individual database files. Even though these last a few milliseconds, I have seen this cause issues.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
January 27, 2023 at 2:25 pm
Veeam is "targeting" the Hyper-V cluster, in that the ONLY object in Veeam is the cluster name computer object. So Veeam is interacting with the Hypervisor to enumerate VMs, and then backing them up that way. In the SQL backup job, the 'type' is shown as 'Hyper-V Backup' and the 'Enable application-aware processing' option is disabled.
So it's not targeting specific VMs, and in turn not directly targeting SQL databases.
This whole config. predates my involvement (including the SQL side). I'm now left trying to troubleshoot it!
Yes I am seeing 'I/O is frozen'. As you say, it's only a few seconds before being resumed:
2023-01-19 19:47:42.03 spid5054 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.06 spid6369 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.06 spid6355 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.06 spid150 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.09 spid5327 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.11 spid5286 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.12 spid5436 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.13 spid5442 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.15 spid5820 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.17 spid5744 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.17 spid6282 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.17 spid6335 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.18 spid6420 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.18 spid6421 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.20 spid6376 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.20 spid6475 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.20 spid6397 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.20 spid6424 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.20 spid6441 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.20 spid6457 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.23 spid6427 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.24 spid6515 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.25 spid6491 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.27 spid6476 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.27 spid6408 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.27 spid6413 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.29 spid6535 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.30 spid6540 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.30 spid6543 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.30 spid6549 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.30 spid6500 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:42.31 spid6552 I/O is frozen on database [redacted]. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.
2023-01-19 19:47:43.43 spid6355 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:43.43 spid6376 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:43.43 spid6420 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:43.43 spid6535 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.78 spid6427 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.78 spid5820 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.78 spid6540 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.79 spid6552 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.79 spid6441 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.79 spid6369 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:48.79 spid5327 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:49.11 spid6421 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:49.11 spid5744 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:49.12 spid6413 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:49.30 Server Long Sync IO: Scheduler 10 had 1 Sync IOs in nonpreemptive mode longer than 1000 ms
2023-01-19 19:47:49.45 spid6408 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:49.46 spid6475 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.71 spid6515 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.71 spid5436 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.71 spid6549 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.72 spid150 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.72 spid5286 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.72 spid6500 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.72 spid5054 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.72 spid6491 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.72 spid6457 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.74 spid6476 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.74 spid6424 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.74 spid6335 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.75 spid6543 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.76 spid6397 I/O was resumed on database [redacted]. No user action is required.
2023-01-19 19:47:50.76 spid5442 I/O was resumed on database [redacted]. No user action is required.
January 27, 2023 at 2:28 pm
PS: There are also SQL-native log backups every 15 minutes.
January 27, 2023 at 4:24 pm
Both the I/O latency and the I/O frozen issues are caused by Volume Shadow Copy service, which is being called by Veeam.
I am not that familiar with Veeam, aside from sitting through sales pitches. I would suggest contacting their technical support. Also, and all of these third party vendors claim this, but there are NONE who can backup a SQL database and recover all of the data. I have proven this quite a few times. If you can handle that data loss, then ok. But they will tell you that SQL backups are not needed.
I think you need to pick a strategy, and go with it.
What is your RPO? Doing log backups every 15 minutes tells me it's 15 minutes. BUT, and I am making an educated guess, you could lose up either 24 or 48 hours worth of data in the current setup.
Here's how:
Before I made any changes, I would test the recovery to see if in fact you can recover to a point in time.
Also, what is the long term storage requirements? Let's assume it's 5 years. If you keep all 5 years worth of backups, you are going to cost the company some big $$$$.
We did something like this:
So, at any time, we had 30 days in short term storage, 11 months, and 4 years in long term storage.
Your strategy is dependent upon the rules and regulations your company has to abide by. It may be different for different systems. Financial systems may be longer, others may be far shorter.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply