HPE StoreVirtual Storage / LeftHand

P4000, Manager Starting/Stopping.

Nahim_k
Frequent Visitor


Hi experts,

 

we have the following configuration:

A 2-node cluster with Network RAID-10. After a power-off, one of our nodes hangs with the status "Manager Starting/Stopping".

 

The problem is:

We have no quorum, so the data is unavailable. CMC reports: “The following systems report that they are running managers in a management group, but the management group thinks they are not running managers: Node1”.

 

The Failover Manager (FOM) was in this broken cluster, so it's unavailable too.
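
For reference, the quorum arithmetic behind this situation can be sketched as below. This is only an illustration under two assumptions: that the management group has three managers (one on each storage node plus the FOM), and that quorum means a strict majority of the configured managers is running. The function name `has_quorum` is hypothetical and not part of CMC or any HP tool.

```python
def has_quorum(configured_managers: int, running_managers: int) -> bool:
    """Return True if a strict majority of the configured managers is running."""
    needed = configured_managers // 2 + 1  # strict majority
    return running_managers >= needed


# Assumed layout: 2 storage-node managers + 1 Failover Manager = 3 managers.
configured = 3

for running in (3, 2, 1):
    state = "quorum" if has_quorum(configured, running) else "NO quorum"
    print(f"{running} of {configured} managers running: {state}")
```

Under these assumptions, losing both the FOM and one node's manager leaves a single manager running out of three, which is below the majority of two, so the volumes go offline even though one node is healthy.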

We tried to create a new Failover Manager with the same MAC address, but got: “The FOM reports that it is part of a management group, but management group does not agree…”

 

Yes, we know about jumbo frames and other network settings. Connecting our nodes to different switches with and without jumbo frame support, and enabling/disabling flow control, made no difference.

 

I can see the following scenarios to save the data:

1. Find a way to correctly recreate the FOM so that it can connect to the Management Group.

or

2. Repair the faulty node with the Recovery CD. In this case I'm afraid that the Management Group will not correctly detect the node. We can't run the Repair Storage System process properly because the manager is in the Starting/Stopping state.

 

Is there any way to manually reconfigure the Management Group, or to delete the cluster without data loss?

Any other thoughts?

 

 

Thx in advance!

5 REPLIES
oikjn
Honored Contributor

Re: P4000, Manager Starting/Stopping.

Correct me if I'm wrong, but it sounds like you have two problems. The first is that one node is stuck starting/stopping, and the other is that your management group lost quorum.

 

Sounds like you have a network communication problem related to whatever you were playing with on the switches.

 

Once you make sure the switches are all configured correctly, you should reboot the FOM and the starting/stopping node.  Hopefully they will then talk and everything will start up again.  If not, it sounds like an ideal time to call support to walk through how to force the quorum back online.

 

 

Usually a manager stuck starting/stopping is due to a network communication error; once that is fixed and the node is rebooted, the problem fixes itself.

Nahim_k
Frequent Visitor

Re: P4000, Manager Starting/Stopping.

oikjn, thank you for the reply.

But I think that our case is a little bit tricky. As I said, we have tried connecting both nodes and a laptop with CMC to a single switch (disconnected from the LAN) and playing with the network settings. No effect.

 

We also have access to the SAN/iQ filesystem (boot, root, log and configuration partitions) through a kind of rescue CD. Changes to the SAN/iQ configuration files on the faulty node have had no result. We backed up the readable SAN/iQ partitions on the faulty node and replaced them with the SAN/iQ files from the healthy node. And no effect again! The faulty node has the same status: Manager Starting/Stopping. CMC still reports that the faulty node is running a manager, but the Management Group does not agree.

 

Is there any way to mount the data partitions from the P4000 while booting the node from a LiveCD/rescue CD?

oikjn
Honored Contributor

Re: P4000, Manager Starting/Stopping.

sounds like time for a call in to support :-/

David_Tocker
Regular Advisor

Re: P4000, Manager Starting/Stopping.

Yep, sounds like you are screwed.

Would be very interested to know if HP support managed to help you out.

Good luck :)
Regards.

David Tocker
Northwoods
Frequent Advisor

Re: P4000, Manager Starting/Stopping.

I was able to reset the start/stop by shutting down every node in the cluster and then powering up all the nodes again. Support should be able to force it while everything is up though.