Cluster not failing over after adding physical disk

Question

jayoub

SSCertifiable

Points: 6991
January 25, 2014 at 8:16 pm

#281636

Problem: Cluster not failing over after adding physical disk
Specs:
Server 2003
SQL 2005
Nodes: Active/Active, two node cluster each with an instance installed
Node1 has group A
Node2 has group B
Recent Event: I added a Physical Disk resource to group B on node 2 of the cluster
Story: I recently added a physical disk resource to group B on node 2 of a two-node cluster. We are patching the servers, so we moved group A from node 1 to node 2, restarted node 1 and moved group A back from node 2 to node 1.
An issue occurred when moving group B from node 2 to node 1. The resources (SQL, IP…) start moving over to node 1, then the disk I just added fails and the whole of group B moves back to node 2 (failback).
I checked all the settings of the disk resource and they match the others exactly.
What I noticed is that in Computer Management > Disk Management on node 1 I do not see the disks for node 2.
More specifically, in Disk Management on node 2 I can see all the local disks and the disks for node 1. The disks for node 1 are marked unreachable and have a red X. On node 1 I cannot see the disks for node 2 in the same way; I only see the local disks on that node.
Another thing I noticed is that Disk Management on both nodes shows a Disk 2.
I rescanned the disks on both servers and still no luck.
Any help would be appreciated.
Jeff


jayoub SSCertifiable Points: 6991 · Answer 1

More information:

I am on node 2 in Computer Management, this time looking at SAN Disk Manager, and under Disks I do not see the disk that fails during a Move Group operation.

On node one all the disks are listed, but node two does not have the disk that I just added.

Could this be my trouble?

Any help is appreciated.

Jeff

Perry Whittle SSC Guru Points: 234013 · Answer 2

Once you add the new disks, you must add them as a dependency of the SQL Server service cluster resource. This will require a restart of the clustered service.

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

jayoub SSCertifiable Points: 6991 · Answer 3

I did add the resource as a dependency to SQL Server Service, but have not yet restarted the "Cluster Service" on that node.

I will give it a try tonight after hours and let you know what happens.

Your help is appreciated.

Thanks

Jeff

Perry Whittle SSC Guru Points: 234013 · Answer 4

jayoub (1/27/2014)
I did add the resource as a dependency to SQL Server Service, but have not yet restarted the "Cluster Service" on that node.
I will give it a try tonight after hours and let you know what happens.
Your help is appreciated.
Thanks

Please run the following from a command prompt on the server and post results

cluster node

cluster res

Take the new disks resource name from cluster res and put into the following

cluster res "disk resource name" /listowners
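
The three commands above could also be collected in one pass with a small script. This is only a sketch, assuming Python 3.7 or later is available on a node where cluster.exe is on the PATH. The function names are hypothetical; only the three commands themselves come from this thread.

```python
import subprocess


def build_commands(disk_resource_name):
    """Return the three cluster.exe invocations listed above as argument lists."""
    return [
        ["cluster", "node"],
        ["cluster", "res"],
        ["cluster", "res", disk_resource_name, "/listowners"],
    ]


def collect_output(disk_resource_name, runner=subprocess.run):
    """Run each command and return (command line, captured text) pairs.

    Assumption: this runs on a cluster node where cluster.exe is on the PATH.
    The runner parameter exists only so the function can be tried without a cluster.
    """
    results = []
    for args in build_commands(disk_resource_name):
        completed = runner(args, capture_output=True, text=True)
        # list2cmdline re-adds the quotes around names that contain spaces
        command_line = subprocess.list2cmdline(args)
        results.append((command_line, completed.stdout + completed.stderr))
    return results
```

Calling collect_output("Disk O:") on a node would return the text of all three commands, ready to paste into a reply.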


jayoub SSCertifiable Points: 6991 · Answer 5

Cluster res

Resource                          Group            Node  Status
Disk L:                           GroupsharePoint  Z010  online
Disk M:                           GroupsharePoint  Z010  online
SQL Network Name (SQLSHAREPOINT)  GroupsharePoint  Z010  online
SQL IP Address 1 (SQLSHAREPONT)   GroupsharePoint  Z010  online
SQL Server (SHAREPOINT)           GroupsharePoint  Z010  online
SQL Server Agent (SHAREPOINT)     GroupsharePoint  Z010  online
SQL Server Fulltext (SHAREPOINT)  GroupsharePoint  Z010  online
Disk J:                           GroupDisco       Z009  online
Disk S:                           GroupDisco       Z009  online
SQL Network Name (SQLVIRTUAL)     GroupDISCOMP     Z009  online
SQL IP Address 1 (SQLVIRTUAL)     GroupDISCOMP     Z009  online
SQL Server (DISCOMP)              GroupDISCOMP     Z009  online
Disk T:                           GroupDISCOMP     Z009  online
NEWDB (this is a disk)            GroupDISCOMP     Z009  online
Disk K:                           GroupDISCOMP     Z009  online
SQL Server Agent (DISCOMP)        GroupDISCOMP     Z009  online
SQL Server Fulltext (DISCOMP)     GroupDISCOMP     Z009  online
Cluster IP Address                Cluster Group    Z009  online
Cluster Name                      Cluster Group    Z009  online
Disk Q:                           Cluster Group    Z009  online
MSDTC                             Cluster Group    Z009  online
Disk O: (problem disk)            GroupsharePoint  Z010  online

Cluster Node

Node  Node ID  Status
Z010  2        UP
Z009  1        UP

Cluster res "Disk O:" /listowners

Z010

Z009
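
One way to read the /listowners result is to compare it against the node list: any node missing from the owner list could never host the resource. The sketch below shows that comparison with the values posted here; the helper name is hypothetical.

```python
def missing_owners(cluster_nodes, possible_owners):
    """Return the cluster nodes that are not listed as possible owners."""
    owners = {name.strip().upper() for name in possible_owners}
    return [node for node in cluster_nodes if node.strip().upper() not in owners]


nodes = ["Z010", "Z009"]          # from the Cluster Node output in this post
disk_o_owners = ["Z010", "Z009"]  # from Cluster res "Disk O:" /listowners

print(missing_owners(nodes, disk_o_owners))  # prints []
```

An empty result means both nodes are possible owners of Disk O:, so the owner list on its own does not explain the failed move.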

Jeff

jayoub SSCertifiable Points: 6991 · Answer 6

I just found out that the SAN admin provisioned the disk to only one node, Z010, and did not include the other node, Z009, in the storage software. The other drives have both hosts listed, so this is probably the cause of the problem.
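
The check described here, that every shared disk is presented to both hosts, can be sketched as a simple comparison. The mapping below is illustrative only: it is typed in by hand from what this thread reports, not read from the storage software, and the function name is hypothetical.

```python
def disks_not_presented_everywhere(presentations, cluster_nodes):
    """Map each disk to the cluster nodes it is NOT presented to (gaps only)."""
    required = set(cluster_nodes)
    gaps = {}
    for disk, hosts in presentations.items():
        missing = required - set(hosts)
        if missing:
            gaps[disk] = sorted(missing)
    return gaps


# Illustrative data only, hand-entered from the description in this post
presentations = {
    "Disk L:": ["Z010", "Z009"],
    "Disk M:": ["Z010", "Z009"],
    "Disk O:": ["Z010"],  # the new disk, presented to one host only
}

print(disks_not_presented_everywhere(presentations, ["Z010", "Z009"]))
# prints {'Disk O:': ['Z009']}
```

A disk that shows up in the result cannot come online on the listed node, which would match a move that starts and then fails back.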

I will restart the cluster services or even the whole box tonight and check it and let you know.

I provided the information as best I could. I had to retype the whole thing and the formatting did not come out in a nice way.

Thank you very much

Jeff

Perry Whittle SSC Guru Points: 234013 · Answer 7

jayoub (1/27/2014)
I just found out that the SAN admin provisioned the disk to only one node, Z010, and did not include the other node, Z009, in the storage software.

That's your problem right there.


jayoub SSCertifiable Points: 6991 · Answer 8

Still no luck. I rebooted the server and tried the move, and got the same result.

I still feel like the SAN admin has not provisioned the drive correctly. I have done this job before with another SAN admin and it went without a hitch.

In Computer Management there is a folder called SAN Disk Manager and the trouble drive is not listed there. All other drives are listed and working. I have a feeling that there is more to the provisioning process that must be done.

I will keep trying and let you know what happens. I may have to start digging into the SAN myself and see. Sometimes a second set of eyes can spot something.

Thanks

Jeff

Perry Whittle SSC Guru Points: 234013 · Answer 9

jayoub (1/27/2014)
Still no luck. I rebooted the server and tried the move, and got the same result.
I still feel like the SAN admin has not provisioned the drive correctly. I have done this job before with another SAN admin and it went without a hitch.
In Computer Management there is a folder called SAN Disk Manager and the trouble drive is not listed there. All other drives are listed and working. I have a feeling that there is more to the provisioning process that must be done.
I will keep trying and let you know what happens. I may have to start digging into the SAN myself and see. Sometimes a second set of eyes can spot something.
Thanks

It's easy to get the IDs wrong and leave a device masked; I have experienced this in the past.


jayoub SSCertifiable Points: 6991 · Answer 10

Update to the issue.

The failover is still not working, and here is the problem. We have an HP EVA SAN, and we also have software called FalconStor that manages the cluster drives.

The current SAN admin needs to get FalconStor out of the equation, so he wants to provision the drives using only the HP EVA software and get that to work. Once he gets this working, I am sure the physical disk will begin failing over correctly.

Again thanks for the help and I will update once it is completely figured out.

Jeff
