01-24-2002 03:44 AM
lan failure simulation
I'm doing LAN link tests and I got an unexpected result. When I disconnected all cables from one
machine, the other machine, which had all cables connected, rebooted, and the one with no cables got the lock and stayed active. The
cmviewcl command showed that the LAN interfaces were not
down.
Has anyone run into this problem?
For now, I'm just using core-io LAN, HP-UX 11 and
MCSG 11.07.
Thanks in advance!
01-24-2002 03:57 AM
Re: lan failure simulation
Have you checked syslog.log for error messages from cmcld?
Hilary
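One way to do that check is sketched below in Python: it keeps only the syslog lines that mention cmcld. The path /var/adm/syslog/syslog.log is the usual HP-UX location and is an assumption about this system; the helper name cmcld_lines is made up for illustration, and nothing here relies on a particular cmcld message format.

```python
from pathlib import Path

# Assumed default HP-UX syslog location; adjust for your system.
SYSLOG_PATH = Path("/var/adm/syslog/syslog.log")


def cmcld_lines(lines):
    """Return the lines that mention cmcld, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if "cmcld" in line]


if __name__ == "__main__":
    if SYSLOG_PATH.exists():
        # errors="replace" keeps the scan going if the log has odd bytes.
        with SYSLOG_PATH.open(errors="replace") as log:
            for line in cmcld_lines(log):
                print(line)
    else:
        print(f"{SYSLOG_PATH} not found; pass your own lines to cmcld_lines()")
```

Running it on both nodes and comparing the timestamps around the cable pull should show what each cmcld saw at the moment of the failure.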
01-24-2002 03:59 AM
Re: lan failure simulation
I do not run service guard but it looks like a heartbeat setting or a patch related to heartbeat.
Paula
01-24-2002 04:09 AM
Re: lan failure simulation
Another thought.
If node A is running package & has cluster lock disk, and you pull lan cables from A (both data & heartbeat), both machines will think the other has died & race for the cluster lock disk. A already has it, B won't get the lock disk, so will TOC.
Hilary
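The race described above can be sketched as a toy model in Python. This is not ServiceGuard code: ClusterLock and race_for_lock are hypothetical names, and the order in which the nodes reach the lock disk is simply passed in as an assumption. The point it illustrates is that the lock only records who got there first; it has no idea which node still has working LAN links.

```python
class ClusterLock:
    """Toy stand-in for the cluster lock disk: the first claimant wins."""

    def __init__(self):
        self.owner = None

    def try_acquire(self, node):
        # The lock only knows who arrived first, not whose LAN is healthy.
        if self.owner is None:
            self.owner = node
        return self.owner == node


def race_for_lock(arrival_order):
    """Each node tries the lock in arrival order; the loser TOCs."""
    lock = ClusterLock()
    outcome = {}
    for node in arrival_order:
        outcome[node] = "stays up" if lock.try_acquire(node) else "TOC"
    return outcome


if __name__ == "__main__":
    # Assumption: node A (all LAN cables pulled) reaches the lock disk first.
    print(race_for_lock(["A", "B"]))
```

With the arrival order reversed, B would stay up and A would TOC, which is why the outcome of this test can look arbitrary from the outside.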
01-24-2002 04:13 AM
Re: lan failure simulation
David.
01-24-2002 04:20 AM
Re: lan failure simulation
A good test is to disconnect just one network cable and check whether traffic and addresses are switched over to the other one. Then disconnect that one as well; I think the package will be transferred to the alternate node in that situation.
David.
01-24-2002 04:38 AM
Re: lan failure simulation
Rather than ifconfig lan down,
use lanadmin to reset the LAN.
That should simulate it.
Or pull the cable or wet the card!
I prefer lanadmin!
Later,
Bill
01-24-2002 04:56 AM
Re: lan failure simulation
As previously said, if the node that stayed up already had access to the cluster lock disc, then, because ALL communications were lost, that node managed to grab the cluster lock disc BEFORE the other node, and hence stayed up, forcing the other node to TOC.
You should definitely check that you are patched to the latest possible level, bearing in mind that SG patches are NOT included in the Patch Bundle CDs!
01-24-2002 05:55 AM
Re: lan failure simulation
Take a look at this thread from the SG FAQ, which tries to explain the scenario you are having,
http://docs.hp.com/hpux/onlinedocs/ha/haFAQindex2.html#All%20Networks%20fail,%20which%20node%20wins?
Here is the FAQ,
http://docs.hp.com/hpux/onlinedocs/ha/haFAQindex2.html
Hope this helps.
Regds
01-26-2002 06:02 AM
Re: lan failure simulation
That is not correct behavior.
There was a GOOD node in the
cluster, and it should
stay up, get the lock and
take ownership of
all packages running on the node
that has all cables disconnected. The node that
has its cables disconnected should not get the lock, because its
networks were down. If someone
has tested this on both nodes, please let me know.
Thanks again.
01-26-2002 02:05 PM
Re: lan failure simulation
- Protect against single points of failure
- Under no circumstances corrupt your data
What you have simulated is a situation with multiple points of failure. Consider what could be going on in this scenario:
Both nodes cannot talk to each other except via the cluster lock. What is very clearly a network failure on one machine to you does not necessarily look like one from the other machine. From the other machine's point of view, it can still see the network but doesn't know what state the other node is in. The multiple points of failure could just as well be networking equipment between the two nodes, in which case both nodes still have active networks (and presumably some clients can connect to each) but they can't talk to each other. If the nodes behaved as you have suggested, they would BOTH believe they are the good node and both try to run your package, which is not good for the integrity of your data.
This is an immutable rule of ServiceGuard in a two-node cluster: when either node doesn't know what the other is doing, it's cluster lock time, and it won't always be the genuinely good node that wins.
HTH
Duncan
I am an HPE Employee
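The argument above can be made concrete with a small Python sketch. It models the rule the original poster is asking for ("a node whose own LAN looks up should keep running the package") and applies it to two scenarios. Everything here is hypothetical and for illustration only: the function names, the node names A and B, and the idea that a node judges itself purely by its local link state.

```python
def self_assessment_rule(local_lan_up):
    """Hypothetical rule: run the package if my own LAN looks up."""
    return local_lan_up


def nodes_running_package(lan_state):
    """lan_state maps node name -> whether that node sees its own LAN as up."""
    return sorted(node for node, up in lan_state.items() if self_assessment_rule(up))


# Scenario 1: all cables pulled on A. Only B considers itself good.
cables_pulled_on_a = {"A": False, "B": True}

# Scenario 2: equipment between the nodes fails. Both nodes still see
# a live LAN, yet neither can hear the other's heartbeat.
equipment_failure_between_nodes = {"A": True, "B": True}

if __name__ == "__main__":
    print(nodes_running_package(cables_pulled_on_a))
    print(nodes_running_package(equipment_failure_between_nodes))
```

In the second scenario the rule lets both A and B run the package at once, which is the data-corruption case the cluster lock exists to prevent.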

01-28-2002 04:18 AM
Re: lan failure simulation
I still believe that is not
correct. If a node has
all networks down, it should
not stay running while the
other node, which has all cables OK,
TOCs. Imagine this
situation in a production
environment: a node
with nothing wrong with it
TOCs unexpectedly. I would agree
with this behavior if BOTH nodes
had at least one network up.
I will open a case with the HP call
center. Thank you all.
01-28-2002 04:52 AM
Re: lan failure simulation
If you wish to try to prevent this, you could consider setting up a serial heartbeat link, as this is designed to do exactly what you are looking for, i.e. ensure that the node with the good network connections stays up.
A serial heartbeat also has some drawbacks of its own.
The bottom line is that you are pulling ALL network connections, and if your network redundancy design is such that this could happen, I would redesign the networking area.
Opening a call with HP should give you the same answer.