The client reported that the log file on their main database was growing unusually large, and when they checked the backup target the database didn't have any LOG backups!
Oh, and the database was part of an Availability Group.
|
https://imgflip.com/i/28n8j8 |
--
After some investigation we found that while the Availability Group was configured to prefer the primary replica for backups, it wasn't evaluating as the preferred backup via the system function sys.fn_hadr_backup_is_preferred_replica!
What we found in the output logs for the LOG backup (Hallengren) job looked like this:
--
Availability group: DATABASE1_AG1 Availability group role: PRIMARY Availability group backup preference: PRIMARY Is preferred backup replica: No --
Even though the preferred backup replica is the primary replica, and we are on the primary replica, the is_preferred check says No.
A test showed that a manual BACKUP LOG statement worked, but both the Hallengren DatabaseBackup *and* a "regular" maintenance plan backup ignored Database1.
To check the preferred replica status we manually ran the command in a query window:
--
SELECT sys.fn_hadr_backup_is_preferred_replica(‘Database1’)
--
...it came back as 0 (false) on both nodes – SQL1 and SQL2.
As long as the preferred backup replica is 0, no intelligent backups will run against that replica, which is why backups weren’t happening on SQL1 or SQL2.
--
Now what? Back to the old standby:
|
https://imgflip.com/i/28n92w |
As described in the solution post, one possible answer is if the internal server names don't match - we checked the servers and sure enough:
SELECT SERVERPROPERTY('ServerName') as [InstanceName] , SERVERPROPERTY('ComputerNamePhysicalNetBIOS') as[ComputerNamePhysicalNetBIOS] , @@SERVERNAME as ServerName
| ComputerNamePhysicalNetBIOS | |
| | |
| | |
| ComputerNamePhysicalNetBIOS | |
| | |
The SERVERPROPERTY names reflect the friendly SQL1 and SQL2 names while the @@SERVERNAME reflects the client's standard infrastructure team ###-### names. This often means that the Windows Server was renamed after SQL was installed but the final step of renaming SQL wasn’t taken (more on that at the end of the email). Renaming the servers (taking a downtime) was not immediately acceptable so after some consideration we tried manually editing the code of the master.dbo.DatabaseBackup procedure to hard-code the fact that the primary replica is truly the preferred backup replica:
--
Original code:
IF @Version >= 11 AND @Cluster IS NOT NULL AND @CurrentAvailabilityGroup IS NOT NULL
BEGIN
SELECT @CurrentIsPreferredBackupReplica = sys.fn_hadr_backup_is_preferred_replica(@CurrentDatabaseName)
END
Modified code:
IF @Version >= 11 AND @Cluster IS NOT NULL AND @CurrentAvailabilityGroup IS NOT NULL AND @CurrentAvailabilityGroupRole = 'PRIMARY'
BEGIN
SELECT @CurrentIsPreferredBackupReplica = 1
END
--
…and that allowed the backups to run successfully by setting the procedure to true as long as the replica is primary.
|
http://2.bp.blogspot.com/_s03XW1ZvVvE/S9E3tjNwDMI/AAAAAAAAA6w/2QFwfKb-aNo/s1600/kool-aid.jpg |
One other catch that came up after that also relates directly back to the mismatched names…after making the above change, the LOG backup of Database1 was succeeding but the overall LOG backup job was still failing. Looking in the job output we found an error: Msg 50000, Level 16, State 1, Server SQL1, Procedure DatabaseBackup, Line 1041 The author of DatabaseBackup (Ola Hallengren) is ready for the fact that mismatched names can cause issues and has a custom error built into his procedure to throw the Error 50000 if the names don’t match. The procedure proceeds to run backups if it can, but still ultimately fails overall with the 50000 message. To mitigate this we further edited the code of DatabaseBackup to comment out the code block that generates this specific error: 2018-04-18 - Commented out to prevent 50000 error since @@SERVERNAME does not match ServerName IF @@SERVERNAME <> SERVERPROPERTY('ServerName') AND SERVERPROPERTY('IsHadrEnabled') = 1 SET @ErrorMessage = 'The @@SERVERNAME does not match SERVERPROPERTY(''ServerName''). See ' + CASE WHEN SERVERPROPERTY('IsClustered') = 0 THEN 'https://docs.microsoft.com/en-us/sql/database-engine/install-windows/rename-a-computer-that-hosts-a-stand-alone-instance-of-sql-server' WHEN SERVERPROPERTY('IsClustered') = 1 THEN 'https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/install/rename-a-sql-server-failover-cluster-instance' END + '.' + CHAR(13) + CHAR(10) + ' ' RAISERROR(@ErrorMessage,16,1) WITH NOWAIT This directly bypasses the error check and prevents the 50000 error for the name mismatch.
--
As an aside I want to comment on the code editing we were doing - when modifying code of any type consider the source and their licensing/intellectual property rights. Ola's licensing is very short and sweet:
--
License
The SQL Server Maintenance Solution is licensed under the MIT license, a popular and widely used open source license.
Copyright (c) 2018 Ola Hallengren
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
https://ola.hallengren.com/license.html
--
At the end of the day as long as you don't remove his copyright notices you have the right to modify his code.
This is very common with code online - ALWAYS ATTRIBUTE YOUR SOURCES - but if you are looking at any type of commercial code be extremely careful ; not only may you be violating the letter (or spirit) of the law, you may even prevent the code's creator from supporting you in the future!
--
The document title references a “stand-alone instance of SQL Server” but that is what this is since it is two SQL Servers with an availability group rather than a failover cluster (shared) instance. It is unknown from the information here whether changing the @@SERVERNAME from 123-223 to SQL1 would impact any other business or application processes – this is something the client's teams would need to determine before making any changes to the name. Also changing the server name would require a SQL downtime (service restart, not Windows reboot) for the name change to take effect (as described in the Microsoft doc) so even if you are comfortable with the name change it needs to be scheduled for a downtime window. Hope this helps!