August 20, 2013 at 7:21 pm
We are running SQL Server 2008 R2 SP1. There is a scheduled transaction log backup, set up as a SQL Server maintenance job, that runs at a certain time of the day. The job runs fine on some days and fails on others, with no specific pattern. I changed the schedule to run every 5 minutes; the log backups were fine for the first 3 attempts and then started failing randomly.
-- operating system error 53(the network path was not found.) --
I am not sure why it fails some of the time and runs fine the rest of the time. The backups go directly to SAN storage and use a UNC path, e.g. \\vnvx\SQL_backups\
SQL Server errorlog has the following messages:
Backup Error: 3041, Severity: 16, State: 1.
Error: 18204, Severity: 16, State: 1.
BackupDiskFile::CreateMedia: Backup device
'N:\Backups\Userdatabases\Database\db _backup_200911111511.bak' failed to
create. Operating system error 53(the network path was not found.)
If it failed consistently I would have checked permissions, the SQL Server Agent startup account, etc. However, it fails some of the time and succeeds the rest of the time. The job is set to retry once after 2 minutes, and sometimes the 2nd attempt goes through.
Please help. Thanks in advance.
August 20, 2013 at 10:05 pm
Hi,
The error message "Operating system error 53(the network path was not found.)" clearly indicates that the network path could not be found when trying to access or create the backup file, so you need to check whether the drive letter is being disconnected from the SAN at times.
Regards,
August 21, 2013 at 12:11 am
The error message states that the destination path cannot be reached. This indicates problems at the network or SAN level.
There could be too little bandwidth or too much latency on the network. If the problem is storage related, it could be that the buffer cache of the SAN can't handle the load or that there is too much I/O on the LUNs.
August 21, 2013 at 1:14 am
Thanks guys. How do we troubleshoot this problem? Is there a step-by-step diagnostic process that we can set up to find the issue?
August 21, 2013 at 1:23 am
Also, if the issue were related to a SAN setting, we should see the same problem on the other SQL Servers in the mix. However, that is not the case; we are seeing this issue on just this machine. Is the problem specific to this machine?
One of the options that I am thinking of is to take the backups to a local drive and have a file copy move the files from the SQL Server host to the SAN drive. Wouldn't the file copy have the same problem?
Sorry to be vague.
August 21, 2013 at 1:36 am
N Nara (8/21/2013)
One of the options that I am thinking of is to take the backups to a local drive and have a file copy move the files from the SQL Server host to the SAN drive. Wouldn't the file copy have the same problem?
Creating the backup on a local drive is a good idea; it will most likely even speed up the SQL backup process. Whether the copy task will have the same problem is unknown. It depends on the time it takes and on other simultaneous activity on the network and the SAN. The file system is designed to handle files, so I expect the file copy to be more efficient and maybe even faster.
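If the copy step does hit the same intermittent failure, it can at least retry without affecting the backup itself. Below is a minimal sketch of such a copy task in Python, not a tested production job. The function name `copy_with_retry`, its parameters, and the 2-minute default delay (mirroring the job's retry setting mentioned earlier) are assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path


def copy_with_retry(src, dest_dir, attempts=3, delay_seconds=120.0):
    """Copy src into dest_dir, retrying when the copy raises OSError.

    Returns the destination path on success and re-raises the last
    OSError if every attempt fails. All names here are hypothetical.
    """
    src = Path(src)
    target = Path(dest_dir) / src.name
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            shutil.copy2(src, target)
            # Cheap sanity check: the copy should match the source size.
            if target.stat().st_size != src.stat().st_size:
                raise OSError("size mismatch after copy")
            return target
        except OSError as exc:
            last_error = exc
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

The local backup file stays in place until the copy succeeds, so a failed copy only delays the off-host copy rather than breaking the log backup.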
August 21, 2013 at 1:45 am
Thanks Hanshi. I probably will go with the local backup option.
However, I would like to troubleshoot the problem. Any pointers would really help.
August 21, 2013 at 2:08 am
Go for the local backup and ask the backup team to back up from that location.
I hope the local drives are mounted SAN drives?
Regards
Durai Nagarajan
August 21, 2013 at 2:15 am
N Nara (8/21/2013)
Thanks Hanshi. I probably will go with the local backup option. However, I would like to troubleshoot the problem. Any pointers would really help.
First I would try to determine whether the destination is always reachable. Set up a process (batch file) to continuously ping the destination server and capture the results. You can ask the network admin to monitor the available bandwidth and network throughput from the server to the SAN. Also have the SAN admin monitor the buffer utilisation and I/O activity. Combine all monitoring results to see if anything is out of the ordinary at the time the problem occurs.
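A ping only shows that the host answers. The failing operation in the error log is creating the backup file, so a probe that creates and deletes a small file in the backup folder may be a closer test. Here is a rough sketch in Python, assuming it is run under the same account as the SQL Server service so that it sees the share the same way. The names `probe_write` and `monitor`, the log format, and the intervals are all assumptions for illustration.

```python
import os
import time
import uuid
from datetime import datetime


def probe_write(directory):
    """Try to create, write, and delete a small file in `directory`.

    Returns (True, "") on success or (False, error_text) on failure.
    """
    name = os.path.join(directory, f"probe_{uuid.uuid4().hex}.tmp")
    try:
        with open(name, "wb") as handle:
            handle.write(b"probe")
        os.remove(name)
        return True, ""
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"


def monitor(directory, log_path, interval_seconds=5.0, count=10):
    """Probe `directory` `count` times, appending timestamped results to a log.

    Returns the number of failed probes.
    """
    failures = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for i in range(count):
            ok, error = probe_write(directory)
            failures += 0 if ok else 1
            stamp = datetime.now().isoformat(timespec="milliseconds")
            log.write(f"{stamp}\t{'OK' if ok else 'FAIL'}\t{error}\n")
            log.flush()  # keep the log current if the script is interrupted
            if i < count - 1:
                time.sleep(interval_seconds)
    return failures
```

For example, `monitor(r"\\vnvx\SQL_backups", r"C:\temp\probe.log", interval_seconds=5, count=17280)` would cover roughly a day at 5-second intervals (the local log path is a placeholder). The FAIL timestamps can then be lined up against the failed job runs and the network and SAN monitoring data.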
August 21, 2013 at 2:27 am
If the issue is specific to these backups, then check the mount point and the setup of the LUN you are writing to. It could be that a disk within the LUN (or the single disk, if the LUN has only one) is about to fail and is causing the LUN to destabilise and drop on and off the network.
August 21, 2013 at 2:00 pm
We ran hrping at millisecond intervals and it didn't show any packet loss, so I am not sure where to look now.