Fence Hang when detach power supply, Urgent
03-17-2009 01:45 PM
I have a big problem.
I've installed RHEL 5.2 in a cluster configuration on two HP DL380 servers, and I decided to use iLO fencing to fence the servers.
This works very well if I detach the Ethernet cable, or manually relocate services from one server to the other (clusvcadm), and so on.
But if I detach both power cables from one server (which then goes down :-)), the surviving server starts trying to fence the server that is down and does not understand that it is already down!
So the fencing does not work (which is normal, since the other server has no power), and the surviving server waits and waits... and the services that were on the down server are not relocated, because the cluster is waiting for the fencing to complete.
In the log on the surviving server I get:
mhs2 fenced[4137]: fence "mhs1.local" failed
mhs2 fenced[4137]: fencing node "mhs1.local"
and other lines, which I've attached as "error.txt".
Only when I restart the down server does the cluster start working correctly again.
Please, HELP ME!!!!!!
Bye
mb from Italy
03-17-2009 02:37 PM
Re: Fence Hang when detach power supply, Urgent
I don't know much about RH cluster, but reading your post I'd suggest you open a bug report with Red Hat, as the behaviour you describe looks like a buggy feature.
Is there any "out of band" heartbeat in RH cluster that you may be lacking, which may cause the fencing to hang?
03-17-2009 03:21 PM
Re: Fence Hang when detach power supply, Urgent
If the server chassis has no power, the iLO also has no power. The remaining cluster nodes have no way of knowing that the server is powered off, since communication has been severed. The cluster only knows that it cannot reach the node via the network.
If you shut the node down cleanly, the server should leave the cluster in a clean state and not need to be fenced.
I am not as familiar with RH Cluster Suite, but Polyserve works the same way.
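To make the point above concrete, here is a toy Python model of the decision, not the real fenced code. The names `Node`, `ilo_fence` and `try_failover` are made up for illustration, and `max_attempts` exists only so the sketch terminates (in the original post the surviving node kept retrying until the other server came back).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    os_alive: bool    # is the OS still answering on the cluster network?
    has_power: bool   # does the chassis (and therefore the iLO) have power?


def ilo_fence(node: Node) -> bool:
    """Hypothetical stand-in for an iLO fence agent."""
    if not node.has_power:
        return False        # iLO unreachable: power-off cannot be confirmed
    node.os_alive = False   # iLO powers the server off: node is known to be down
    return True


def try_failover(failed: Node, max_attempts: int = 3) -> str:
    """Relocate services only after a fence attempt has been confirmed."""
    for _ in range(max_attempts):
        if ilo_fence(failed):
            return "fenced, services relocated"
    return "fence failed, services blocked"


# A crashed node with a working iLO versus a node with both power cables pulled.
crashed = Node("mhs1.local", os_alive=False, has_power=True)
unplugged = Node("mhs1.local", os_alive=False, has_power=False)
print(try_failover(crashed))
print(try_failover(unplugged))
```

From the surviving node's point of view both cases look the same on the network; only the reachable iLO lets it confirm that the other node is really off.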
03-17-2009 11:59 PM
Re: Fence Hang when detach power supply, Urgent
So if a server goes down in a two-node cluster, the other one can't take over the services of the down server?
That is no good for us....
I attach my cluster.conf.
03-19-2009 12:42 PM
Re: Fence Hang when detach power supply, Urgent
I think RHCS uses a lock LUN, which acts as a quorum in case one of the nodes fails.
03-19-2009 12:51 PM
Solution: Yes, that is what I expect. If you're removing multiple power sources, you're simulating a larger failure than it can handle. Most clusters protect against a single point of failure. You're introducing at least 2 points of failure (double power failure or supply failure).
You're powering off the server as well as the iLO, which it needs to fence.
If you want to simulate something more likely, like a kernel panic, bringing the node down, I'd suggest actually crashing the kernel, such as using the Alt-SysRq trigger. Then the node will be properly fenced and activity should continue if it's configured correctly.
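For reference, the same crash as the Alt-SysRq "c" key can be triggered from software by writing `c` to `/proc/sysrq-trigger` as root. A minimal Python sketch follows, assuming a Linux node you really do want to crash. The function name `crash_kernel` and the `dry_run` flag are made up for illustration, and the default only reports what would be done.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"  # standard Linux procfs interface


def crash_kernel(dry_run: bool = True) -> str:
    """Ask the kernel to crash immediately (SysRq 'c'). Needs root when dry_run is False."""
    if dry_run:
        return f"would write 'c' to {SYSRQ_TRIGGER}"
    with open(SYSRQ_TRIGGER, "w") as trigger:
        trigger.write("c")  # the node goes down here; nothing after this runs
    return "unreachable"


print(crash_kernel())
```

Run it with `dry_run=False` only on the node you intend to take down, then watch the log on the other node to see whether the fence succeeds and the services move.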
04-02-2009 05:50 AM
Re: Fence Hang when detach power supply, Urgent
Agree with previous posters.
Let's look at this. Polyserve fences a node if there is a network outage where the nodes in the cluster cannot communicate with one of the other nodes. Until it has fenced the node that is not talking, the rest of the cluster does not want to generate I/O to the shared cluster file system, because it does not know whether the unresponsive node is really down or still writing to the disk, which would corrupt data.
So if a node stops talking to the cluster, we fence it to make sure it is down and not accessing the disks.
When you killed the power from the cabinet, you stopped the node from communicating with the rest of the cluster, and you also prevented the rest of the cluster from fencing the node.
Server fencing is meant to shut down the node when the OS has failed but the iLO still works.
There is a command you can execute
mx server markdown
on one of the nodes to tell the cluster that the node was fenced, but this should only be done if the node really is down for another reason.
You should make sure the cabinet has multiple sources of power and does not lose all power.
04-02-2009 06:59 AM
Re: Fence Hang when detach power supply, Urgent
I think this may be a flaw in the fence software.
You may wish to test this with fence from the recently released RHEL 5.3. Other rpms may also be required.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com