May 19, 2016 at 2:07 pm
How backup_start_date and backup_finish_date are calculated? Where is that value taken from?
Wrote the following TSQL query in order to retrieve backup information:
DECLARE @dbnamesysname
DECLARE @dayspastdecimal(4,2)
DECLARE @backupTypechar(1)
SET @dbname = ''
SET @dayspast = 3.5 - .1
SET @backupType = 'D'
SELECT
@@servername [ServerName]
,master.sys.sysdatabases.name [DatabaseName]
,msdb.dbo.backupset.backup_start_date [Backup Start]
,msdb.dbo.backupset.backup_finish_date [Finish DateTime]
,msdb.dbo.backupset.user_name[Login used to Backup]
,CASE
WHEN DATEDIFF(ss, msdb.dbo.backupset.backup_start_date, msdb.dbo.backupset.backup_finish_date)/(24*3600) > 0
THEN CAST(DATEDIFF(ss, msdb.dbo.backupset.backup_start_date, msdb.dbo.backupset.backup_finish_date)/(24*3600) AS nvarchar) + '.'
ELSE ''
END + RIGHT(CONVERT(nvarchar,msdb.dbo.backupset.backup_finish_date - msdb.dbo.backupset.backup_start_date,121),12) AS Duration
,msdb.dbo.backupmediafamily.physical_device_name[File Location]
,msdb.dbo.backupmediafamily.family_sequence_number[File Sequesnce#]
,msdb.dbo.backupset.position[BackupPosition in File]
,CASE
WHEN msdb.dbo.backupset.type = 'D'
THEN 'FULL'
WHEN msdb.dbo.backupset.type = 'I'
THEN 'Diff'
WHEN msdb.dbo.backupset.type = 'L'
THEN 'Logs'
END [Backup Type]
FROM
msdb.dbo.backupmediafamily,
master.sys.sysdatabases
LEFT OUTER JOIN
msdb.dbo.backupset
ON master.sys.sysdatabases.name = msdb.dbo.backupset.database_name
WHERE msdb.dbo.backupmediafamily.media_set_id = msdb.dbo.backupset.media_set_id
AND msdb.dbo.backupset.backup_start_date > '1/1/2016'
AND msdb.dbo.backupmediafamily.physical_device_name LIKE '{%'
ORDER BY [Finish DateTime] DESC;
It is showing weird strings as file locations, compatible with some SAN's snapshots that, as per our SAN admin, are taking every hour and last 1 seconds. But this does not match msdb info. msdb shows that each backup is taking around 25 seconds to complete.
Why do I have that time discrepancy between msdb and the Nimble's dashboard???
May 19, 2016 at 2:28 pm
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 20, 2016 at 8:04 am
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.
Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
May 20, 2016 at 8:31 am
sql-lover (5/20/2016)
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
The Nimble Snapshot backup just makes a marker of all used blocks. Nothing is actually done other than that. You can probably see what is happening with SQL Server during those in the SQL Server error log. You will see entries about IO being "frozen" (or some such terminology - can't remember exact phrasing at the moment).
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 20, 2016 at 8:51 am
TheSQLGuru (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
The Nimble Snapshot backup just makes a marker of all used blocks. Nothing is actually done other than that. You can probably see what is happening with SQL Server during those in the SQL Server error log. You will see entries about IO being "frozen" (or some such terminology - can't remember exact phrasing at the moment).
Yep, I am seeing the I/O frozen stuff, but IT disagrees. He states that msdb data is wrong or somehow inaccurate.
May 20, 2016 at 9:03 am
sql-lover (5/20/2016)
TheSQLGuru (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
The Nimble Snapshot backup just makes a marker of all used blocks. Nothing is actually done other than that. You can probably see what is happening with SQL Server during those in the SQL Server error log. You will see entries about IO being "frozen" (or some such terminology - can't remember exact phrasing at the moment).
Yep, I am seeing the I/O frozen stuff, but IT disagrees. He states that msdb data is wrong or somehow inaccurate.
Just me, but I may disagree with IT. Is the Nimble snapshot just a snapshot of the database or the disk? How big is the database file? I'd be a bit surprised if the snapshot marked everything at the exact same moment based on you seeing I/O frozen messages in the SQL log.
May 20, 2016 at 12:31 pm
Lynn Pettis (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
The Nimble Snapshot backup just makes a marker of all used blocks. Nothing is actually done other than that. You can probably see what is happening with SQL Server during those in the SQL Server error log. You will see entries about IO being "frozen" (or some such terminology - can't remember exact phrasing at the moment).
Yep, I am seeing the I/O frozen stuff, but IT disagrees. He states that msdb data is wrong or somehow inaccurate.
Just me, but I may disagree with IT. Is the Nimble snapshot just a snapshot of the database or the disk? How big is the database file? I'd be a bit surprised if the snapshot marked everything at the exact same moment based on you seeing I/O frozen messages in the SQL log.
Well, yes, I do disagree with IT. But I am trying to find out the reason for the discrepancy. We started to have some collisions with our backups. The regular TSQL backups are failing, generating some retries (I do use redGate backup Pro) and this is basically because the SAN snapshots and the I/O freezes that happen every hour or so.
How many databases? We have hundreds of databases per server but the problem seems to be worse in one of them, with the DIFF backups.
The Nimble snapshots are VSS snapshots, so are considered, if I am not mistaken, database backups, hence why I can see those in msdb. So technically speaking, I should be able to restore a database using those backups. I have not tested that yet though.
May 20, 2016 at 1:02 pm
sql-lover (5/20/2016)
Lynn Pettis (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
The Nimble Snapshot backup just makes a marker of all used blocks. Nothing is actually done other than that. You can probably see what is happening with SQL Server during those in the SQL Server error log. You will see entries about IO being "frozen" (or some such terminology - can't remember exact phrasing at the moment).
Yep, I am seeing the I/O frozen stuff, but IT disagrees. He states that msdb data is wrong or somehow inaccurate.
Just me, but I may disagree with IT. Is the Nimble snapshot just a snapshot of the database or the disk? How big is the database file? I'd be a bit surprised if the snapshot marked everything at the exact same moment based on you seeing I/O frozen messages in the SQL log.
Well, yes, I do disagree with IT. But I am trying to find out the reason for the discrepancy. We started to have some collisions with our backups. The regular TSQL backups are failing, generating some retries (I do use redGate backup Pro) and this is basically because the SAN snapshots and the I/O freezes that happen every hour or so.
How many databases? We have hundreds of databases per server but the problem seems to be worse in one of them, with the DIFF backups.
The Nimble snapshots are VSS snapshots, so are considered, if I am not mistaken, database backups, hence why I can see those in msdb. So technically speaking, I should be able to restore a database using those backups. I have not tested that yet though.
It sounds as if you may be comparing apples to oranges based upon what you are seeing in the msdb.backupset table.
You will see "strange" entries in the msdb tables that correspond to backups created by Nimble protection manager.
While the Nimble may be showing once second, the entire process may take longer because Nimble needs to flush the data to a consistent state. That will account for the length of time stored in backupset.
There may well be a conflict with the native backups. Been there, done that. The momentary freeze from VSS is enough to cause issues, depending upon the nature of the system.
Someone sent me this link a while back that I refer to frequently:
http://www.krome.co.uk/wp-content/uploads/bpg_nimble_storage_sql_server.pdf
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
May 20, 2016 at 1:12 pm
sql-lover (5/20/2016)
Lynn Pettis (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/20/2016)
sql-lover (5/20/2016)
TheSQLGuru (5/19/2016)
I have no idea about your Nimble stuff, but the backup start and end times should be set by the actual backup process in SQL Server.Thanks for reply
I guess what I am saying is that our SAN's admin disagrees with those times. He (actually we) wanted to know the reason for the time's discrepancies, between what the Nimble's dashboard says vs msdb.
Why the Nimble's dashboard says that the FULL backup via snapshot took about a second but msdb says it took 30 seconds or so.
The Nimble Snapshot backup just makes a marker of all used blocks. Nothing is actually done other than that. You can probably see what is happening with SQL Server during those in the SQL Server error log. You will see entries about IO being "frozen" (or some such terminology - can't remember exact phrasing at the moment).
Yep, I am seeing the I/O frozen stuff, but IT disagrees. He states that msdb data is wrong or somehow inaccurate.
Just me, but I may disagree with IT. Is the Nimble snapshot just a snapshot of the database or the disk? How big is the database file? I'd be a bit surprised if the snapshot marked everything at the exact same moment based on you seeing I/O frozen messages in the SQL log.
Well, yes, I do disagree with IT. But I am trying to find out the reason for the discrepancy. We started to have some collisions with our backups. The regular TSQL backups are failing, generating some retries (I do use redGate backup Pro) and this is basically because the SAN snapshots and the I/O freezes that happen every hour or so.
How many databases? We have hundreds of databases per server but the problem seems to be worse in one of them, with the DIFF backups.
The Nimble snapshots are VSS snapshots, so are considered, if I am not mistaken, database backups, hence why I can see those in msdb. So technically speaking, I should be able to restore a database using those backups. I have not tested that yet though.
Are the Nimble backups full database backups? If so, you need to be sure your differential backups run between them as they are based on the most current full backup.
May 20, 2016 at 1:14 pm
Michael L John (5/20/2016)
You will see "strange" entries in the msdb tables that correspond to backups created by Nimble protection manager.
While the Nimble may be showing once second, the entire process may take longer because Nimble needs to flush the data to a consistent state. That will account for the length of time stored in backupset.
I have no doubt about that but I am looking for some white paper or technical data that can show this, hence the reason for my post.
It is normal to see discrepancies between SAN's dashboards and MSSQL DMVs, latency is another one that comes to my mind. Both sides are usually correct but the measures or delays are for different things.
On this particular case, I would like to see some article that points to that.
-- EDIT --
Michael.
Thanks for that link/PDF. It looks very interesting. Downloading right now...
May 20, 2016 at 1:17 pm
sql-lover (5/20/2016)
Michael L John (5/20/2016)
You will see "strange" entries in the msdb tables that correspond to backups created by Nimble protection manager.
While the Nimble may be showing once second, the entire process may take longer because Nimble needs to flush the data to a consistent state. That will account for the length of time stored in backupset.
I have no doubt about that but I am looking for some white paper or technical data that can show this, hence the reason for my post.
It is normal to see discrepancies between SAN's dashboards and MSSQL DMVs, latency is another one that comes to my mind. Both sides are usually correct but the measures or delays are for different things.
On this particular case, I would like to see some article that points to that.
Yes, but it is not breaking the backup chain because they use COPY_ONLY. And yes, I tested that (with my TSQL backups) using my FULLs and DIFFs, and it worked fine. These Nimble backups are for DR purposes.
May 20, 2016 at 4:01 pm
1) Spread out your schedule to minimize overlap. I can't tell you how many times I see umpteen jobs starting at 00:00 (or on the hour) at clients!!! NEXT-NEXT-NEXT and defaulting everything SUCKS people! :w00t:
2) DIFF backups can quickly approach the size of the entire database in some scenarios (GUID Clustered PKs for example) because if you modify one bit of a record you now backup 64KB for every subsequent DIFF.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 20, 2016 at 4:10 pm
TheSQLGuru (5/20/2016)
1) Spread out your schedule to minimize overlap. I can't tell you how many times I see umpteen jobs starting at 00:00 (or on the hour) at clients!!! NEXT-NEXT-NEXT and defaulting everything SUCKS people! :w00t:2) DIFF backups can quickly approach the size of the entire database in some scenarios (GUID Clustered PKs for example) because if you modify one bit of a record you now backup 64KB for every subsequent DIFF.
#1
Done. Sadly, nothing I can't do (SQL side of things I mean)
#2
We do not use GUIDs and we do have daily DIFFs and that's a business model that I can't or won't change.
It is clearly a collision, so I just need data to support that argument, which now it looks like I have, and a way to tweak backup parameters (not schedule) to see if that improves things.
-- EDIT --
Actually Kevin... I think I can move the DIFF schedule 10 min forward on each of my DIFF backup jobs. I just made the change so I will take a look tomorrow and see if that helped.
May 23, 2016 at 12:41 pm
So I have a very interesting meeting with IT. It's not only my SQL backups the ones that are failing, the VSS backups are failing too. Basically, whoever "grabs the disk subsystem" 1st, that one wins.
Scenario #1
6pm. VSS FULL backup kicks in. 1 second later, RedGate Backup Pro DIFF backup starts for database "ABC", then it fails after a few seconds or so because could not read the mdf, the I/O was disconnected. RedGate tries again and then it succeeded.
Scenario #2
7pm. RedGate Backup Pro is taking a FULL backup for database ABC, all running fine, SQL side of things. At same time or a second later, Nimble VSS backup kicks in, trying to get a VSS SQL SAN snapshot backup for same database. VSS backups fails, RedGate wins.
So the problem , again, is backups are colliding. But they should coexist.
May 23, 2016 at 1:38 pm
It looks like I may have to increase worker threads on MSSQL and/or add more vCPUs:
How It Works: SQL Server – VDI (VSS) Backup Resources[/url]
I currently have above 200 databases on one single LUN. And over of 600 on the other box, single LUN as well. The problem seems to be more severe on those two, so the information on that article makes sense.
Now, I wonder if in terms of backup workload but still the same SAN, having several LUNs may or may not improved backup performance, while keeping RAM and CPU the same I mean.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply