12-26-2012 07:53 AM
Redhat 6.1 Cluster Active/Active power off
Hi team,
I have a two-node Active/Active cluster running Red Hat 6.1, on HP ProLiant DL785 G6 servers with HP iLO as the fence device.
I have noticed that both of my servers are in a powered-down state. Could you please guide me on what might be causing this?
Is it a fence device issue? Very urgent.
Arun Basu
- Tags:
- iLO
12-26-2012 04:00 PM
Re: Redhat 6.1 Cluster Active/Active power off
Since you have provided only a few details of your configuration, I can only guess. My guess is that this is a textbook case of "fence death" in a two-node cluster.
Explanation:
The root cause for this might be a failure in your heartbeat network. When the cluster heartbeat stops working, both nodes in a two-node RedHat cluster will attempt to fence the other node. The node that first successfully fences the other one may continue cluster operations. This is called a "fence race". Clusters with more than two nodes (or two nodes and a quorum disk) will use voting logic instead of a fence race.
RedHat Cluster is built to assume that only one fencing operation may succeed at a time. The RedHat example configurations use a common remote-controllable PDU or a similar device for power fencing. If such a device accepts only one power switch command at a time, it will satisfy the design assumption of the RedHat Cluster software.
Using iLOs of each node as fence devices does not satisfy the design assumption, since each iLO can be accessed separately and thus both nodes can simultaneously power off each other, if the fence race ends up being a tie. The resulting situation is known as "fence death" - both nodes will simultaneously power off each other.
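To illustrate, a minimal two-node cluster.conf of this kind might look roughly like the sketch below (all names, addresses, and credentials are placeholders, not taken from your setup). Each node is fenced through the other node's iLO independently, so nothing serializes the two power-off commands:

```xml
<?xml version="1.0"?>
<cluster name="example" config_version="1">
  <!-- two_node mode: the cluster stays quorate with a single vote -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <device name="ilo-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="1">
          <device name="ilo-node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <!-- each iLO is reachable independently, so both power-offs can run at once -->
    <fencedevice agent="fence_ilo" name="ilo-node1" ipaddr="192.0.2.11" login="admin" passwd="secret"/>
    <fencedevice agent="fence_ilo" name="ilo-node2" ipaddr="192.0.2.12" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>
```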
Solutions:
You must either eliminate the fence race or ensure that the fence race will always have one clear winner (no ties).
If it's OK that a specific node will always win a fence race, you could add a delay parameter to the fencedevice definition for that node. Please see:
https://access.redhat.com/knowledge/solutions/54829
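As a sketch of that approach (again with placeholder names and addresses), the delay attribute goes on the fence device that powers off the node you want to survive; its fencing attempt is postponed, so the other node is always fenced first and the race has a clear winner:

```xml
<fencedevices>
  <!-- fencing of node1 is delayed 30 s, so node1 wins any fence race -->
  <fencedevice agent="fence_ilo" name="ilo-node1" ipaddr="192.0.2.11" login="admin" passwd="secret" delay="30"/>
  <fencedevice agent="fence_ilo" name="ilo-node2" ipaddr="192.0.2.12" login="admin" passwd="secret"/>
</fencedevices>
```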
Adding a third node or a quorum disk would allow the cluster to use the quorum voting logic: if a 3-node cluster splits into two parts of 2 + 1 nodes because of a network failure, the 2-node fragment will still have quorum and will attempt to fence the single node. The single node will recognize that it has lost quorum and will passively wait for fencing. The quorum disk serves the same purpose, providing extra quorum vote(s) to only one cluster fragment that is healthy according to any configured qdiskd heuristics tests. In such a configuration, iLOs are an appropriate fencing solution.
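A quorum-disk configuration might be sketched roughly as follows (the label, ping target, and tuning values are placeholders you would adapt): qdiskd contributes an extra vote only to a fragment that passes the heuristic, so the healthy node keeps quorum and fences the other one.

```xml
<cman expected_votes="3"/>
<quorumd interval="1" tko="10" votes="1" label="qdisk">
  <!-- heuristic: a fragment must reach the gateway to be considered healthy -->
  <heuristic program="ping -c1 -w1 192.0.2.1" score="1" interval="2" tko="3"/>
</quorumd>
```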
If you cannot add a quorum disk or a third node and adding a fence delay value is not an appropriate solution for you, you might need to re-think your fencing solutions.