09-08-2012 08:45 PM - last edited on 11-20-2013 05:29 PM by Maiko-I
Manager and Failover Manager Issue
I've got a 2-node 4500G1 cluster. Both nodes are running managers and I have a FOM running on an ESX server. I recently re-IP'd the nodes. I can see all resources online and I can actively ping them (the FOM and both 4500G1 nodes), but the FOM and one of the nodes are reporting that their manager is offline.
I've tried several things, like rebooting all the resources, but I can't seem to get the managers online on the failover manager and the one 4500 node. The other 4500 node is reporting its manager as online. Again, all resources can be pinged and they all appear to be online.
Anyone seen this behavior before?
P.S. This thread has been moved from HP 3PAR StoreServ Storage to HP StoreVirtual Storage / LeftHand. - Hp Forum Moderator
09-09-2012 02:34 PM
Re: Manager and Failover Manager Issue
This sounds like a network issue.
Have you tried re-scanning the managers?
Also, can all the nodes ping each other?
09-09-2012 03:30 PM
Re: Manager and Failover Manager Issue
Yeah, it's a deep issue. I got off the phone with LHN and basically the VIP is locked on a node right now and it's down. The quorum is therefore down. Rebooting the node with the failed VIP does not result in the VIP being brought up on the second node. It's a huge mess right now. So basically the FOM can be pinged, node 1 can be pinged, and node 2 can be pinged. The VIP can't be pinged, and it's located on node 2. Node 2 is reporting that its manager is down and offline.
09-09-2012 11:32 PM
Re: Manager and Failover Manager Issue
Did you get anywhere with this?
When we got our first set of nodes, I did a lot of experimentation - tried to burn the units down in any way I could before I put them into production.
I found out about FOMs the hard way while testing: I had a loss-of-quorum issue and actually had to reload one of the units after a hard evict to repair it.
We are based in NZ and apparently if you are not in the US timetable for support, you can't have it, which is total crap and HP should hang their heads in shame over that one. What a sorry state of affairs (you are okay if you have a 3PAR). So I had to figure it all out by myself via the pretty average documentation.
However, in your case, if the CMC is not pointing out an obvious quorum issue then you may need to just wait on the experts to sort it out. Have you considered dropping the 'dodgy unit' and forcing quorum using the FOM and the other node? Keep in mind the possibility of data loss, but if you haven't written anything to the nodes since the failure you should be okay...
David Tocker
09-10-2012 12:36 PM
Re: Manager and Failover Manager Issue
Upon noticing that jumbo frames were not explicitly configured on the switch ports, I decided to turn them on. I must have configured them improperly the first time, and that's when everything went offline/sideways. I then attempted to back out the jumbo frame statements from the ports carrying the SAN traffic. Still no dice.
I got LHN support on the phone (finally, after 24 hours of contract bickering) and they managed to SSH into the P4000 and run a ping with oversized packets. Once they saw that standard ping was working but oversized ping packets were not, it pretty obviously pointed to a misconfiguration on the network. I went back into the switches and reconfigured the ports for jumbo frames, but this time set the frame size a few bytes bigger. Once I did this, the network came right online, the quorum dispute went away, and the SANs resynced.
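For anyone wanting to repeat the oversized-ping test described above, here is a minimal sketch in Python. The exact command support ran is not known; this assumes a Linux host with iputils `ping`, where `-M do` sets the don't-fragment flag and `-s` sets the ICMP payload size. The function names and the 9000-byte MTU in the usage note are illustrative assumptions, not values taken from this cluster.

```python
import subprocess
from typing import List

IP_HEADER = 20        # bytes, IPv4 header without options
ICMP_HEADER = 8       # bytes, ICMP echo header
ETH_HEADER = 14       # bytes, destination MAC + source MAC + EtherType
ETH_FCS = 4           # bytes, frame check sequence
VLAN_TAG = 4          # bytes, optional 802.1Q tag


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def ethernet_frame_size(mtu: int, vlan_tagged: bool = False) -> int:
    """On-the-wire frame size for a full-MTU packet.

    One plausible reason a switch frame limit must be somewhat
    larger than the host MTU (assumption: the switch setting
    counts these Ethernet overhead bytes).
    """
    return mtu + ETH_HEADER + ETH_FCS + (VLAN_TAG if vlan_tagged else 0)


def build_ping_command(host: str, mtu: int, count: int = 3) -> List[str]:
    """Build an iputils ping command with don't-fragment set (-M do)."""
    payload = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), host]


def path_passes_mtu(host: str, mtu: int) -> bool:
    """Return True if a full-MTU, unfragmented ping gets replies."""
    result = subprocess.run(
        build_ping_command(host, mtu), capture_output=True, text=True
    )
    return result.returncode == 0
```

If `path_passes_mtu(host, 1500)` succeeds while `path_passes_mtu(host, 9000)` fails between two nodes that are both set for jumbo frames, something in the path (typically a switch port) is dropping the larger frames, which matches the symptom here: normal pings worked while the cluster traffic did not.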
As best as I can tell, by default the HP A5800 switches let all packets pass, whether they are jumbo or standard. Once you turn on the jumboframe setting, it's a 1 or a 0 at that point: the command has to be either explicitly enabled or explicitly disabled. And if you subsequently disable the jumbo frame statement, the port configuration still shows:
undo jumboframe enable
instead of the entire statement being cleared from the port. Very peculiar. Hope this helps someone, and if you need more explicit information, I can go into more detail.