02-11-2005 01:50 AM
How to survive a brief but complete network outage in MC/SG ?
We don't want any server to panic or the pkgs to shut down during the outage. Will just disabling pkg switching and disabling the alternate node keep the pkgs running on the primary node (though not accessible for a few minutes)? Any ideas/suggestions?
Thnx
-Q
02-11-2005 01:54 AM
Re: How to survive a brief but complete network outage in MC/SG ?
David
02-11-2005 01:55 AM
Re: How to survive a brief but complete network outage in MC/SG ?
If you still wish to do this, then move all packages onto one node and do a cmhaltnode on the other node. Once the network is back, do a cmrunnode on that node and then redistribute any packages you want to run back onto that node.
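The sequence above could be sketched as an ordered command plan. This is a minimal sketch, not a tested procedure: the node and package names are hypothetical, the function only builds command strings and runs nothing, and the flags shown (`cmrunpkg -n`, `cmmodpkg -e`) should be checked against the man pages for your Serviceguard release.

```python
def outage_plan(keep_node, halt_node, pkgs_to_move):
    """Build (before, after) command lists for riding out an outage on one node.

    Assumption: every package in pkgs_to_move currently runs on halt_node.
    Nothing is executed here; the strings are meant to be reviewed first.
    """
    before = []
    for pkg in pkgs_to_move:
        before.append(f"cmhaltpkg {pkg}")                # stop the package where it runs now
        before.append(f"cmrunpkg -n {keep_node} {pkg}")  # start it on the node that stays up
    before.append(f"cmhaltnode {halt_node}")             # take the other node out of the cluster

    after = [f"cmrunnode {halt_node}"]                   # rejoin once the network is back
    for pkg in pkgs_to_move:
        after.append(f"cmhaltpkg {pkg}")
        after.append(f"cmrunpkg -n {halt_node} {pkg}")   # move the package back to its usual node
        after.append(f"cmmodpkg -e {pkg}")               # re-enable package switching
    return before, after


if __name__ == "__main__":
    before, after = outage_plan("nodeA", "nodeB", ["pkg2"])
    print("\n".join(before + ["# ... outage ..."] + after))
```

Keeping the plan as data makes it easy to review the same steps for each of several clusters before anyone types a command.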
02-11-2005 01:56 AM
Re: How to survive a brief but complete network outage in MC/SG ?
Well - two things:
1) DON'T run the heartbeats on the public net - create a private net.
2) You'd have to set the polling for the public subnets higher than the longest anticipated outage. But that's problematic because it will make the failover time that much higher.
Ideally you need to have the different public LAN NICs on different switches, so that the NIC fails over quickly on a switch/NIC failure.
BUT if the entire network is liable to fail then it makes the network ITSELF the SPOF and what good does cluster SW do then?
Seems to me that mgmnt needs to address this *fundamental* issue, eh?
My 2 cents,
Jeff
02-11-2005 02:24 AM
Re: How to survive a brief but complete network outage in MC/SG ?
Forgot to mention that we have a completely isolated private heartbeat LAN which will stay up during the network outage. I know it does not make sense to keep the pkg running, but if it is a 5 min outage on the network and everything comes back as it was, we want to avoid the pkg/cluster stop/start.
If it was a single cluster, we could do it, but we are talking about 12 such clusters for 12 business units! The pkg stop/start and verification involves a "whole new management mess"...you know what I mean :)
02-11-2005 02:30 AM
Re: How to survive a brief but complete network outage in MC/SG ?
This would pre-empt a possible TOC if something WERE to go wrong with that private heartbeat link.
02-11-2005 02:34 AM
Re: How to survive a brief but complete network outage in MC/SG ?
02-11-2005 03:00 AM
Re: How to survive a brief but complete network outage in MC/SG ?
A prerequisite is that all servers have an add-in card that talks to the main "public" network at your organization.
Reasons for this private network:
1) SG heartbeat, primary should be here, set a secondary on the public network.
2) Being able to do Ignite boots to recover systems that have had major hardware issues.
Ignite won't boot off add-in cards.
The setup above should let you run SG through planned or unplanned network failures. If you hang your private switch/hub off the same UPS as your 9000 servers, you will get a chance to gracefully shut down your systems and cluster.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
02-11-2005 04:40 AM
Re: How to survive a brief but complete network outage in MC/SG ?
The problem is that in this situation we *cannot* have a partial network failure. Only the private heartbeat LAN will remain up.
The public network (with all VLANs/subnets) is going down completely for a few minutes.
So the question is:
Will the cluster and pkgs survive it if pkg switching is disabled, and recover w/o any intervention?
02-11-2005 05:00 AM
Re: How to survive a brief but complete network outage in MC/SG ?
state. I would rather be in the position of saying "this will work" rather than "this may work" or "I think this will work" --- especially if business continuity is on the line. Besides, how do you know that all the network is going to be back up in 5 minutes?
I sure wouldn't trust anyone if they told me that.
All of this should have really been tested on your Sandbox Cluster/Network and then you could answer the question. Don't have one because it's too expensive? This is why you have one.
02-11-2005 05:05 AM
Re: How to survive a brief but complete network outage in MC/SG ?
I will repeat...
IF you set the polling interval on the public LANs *high* enough to exceed the "anticipated" outages - YES. Otherwise - NO.
But AGAIN consider these three things:
1) This will cause any "normal" (term evidently used loosely in your organization) network outages to cause *much* higher failover times
2) Ditto for any NIC failure
3) Do you seriously think that the times the network folks give you are realistic?
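The trade-off in the three points above can be put into rough numbers. This is a simplified model, not Serviceguard's actual detection algorithm: it assumes a LAN is declared down after a fixed number of consecutive missed polls, and both `missed_polls` and the example figures are hypothetical.

```python
import math


def min_polling_interval(outage_s, missed_polls=2):
    """Smallest whole-second polling interval whose detection time exceeds the outage.

    Assumed model: detection_time = polling_interval * missed_polls.
    """
    return math.floor(outage_s / missed_polls) + 1


def detection_time(polling_interval_s, missed_polls=2):
    """Time before any LAN or NIC failure is noticed, under the same assumed model."""
    return polling_interval_s * missed_polls


if __name__ == "__main__":
    interval = min_polling_interval(300)  # a "5 minute" outage
    print(interval, detection_time(interval))
```

Under these assumptions, riding through a 300-second outage means a genuine NIC failure also goes unnoticed for about 300 seconds, which is the cost described in points 1 and 2.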
My 2 cents,
Jeff
P.S. To me this is a situation that *begs* to be solved at a much deeper level than you appear to realize. Clusters are absolutely useless in light of an attitude that appears to be in play here.
P.P.S Please don't take this personally, but there *are* things that need to be stood up to in our profession & IMHO this is definitely one of them.
Best Rgds,
Jeff
02-12-2005 07:04 AM
Re: How to survive a brief but complete network outage in MC/SG ?
BTW, that ten-minute tcp_ip_abort_interval on HP-UX is the _default_ - there are applications out there, desiring "fast connection failure notification", that suggest setting those values lower - sometimes even as low as 60 seconds. That those applications should be using an _application-level_ mechanism for their detection is often lost on those developers... so _all_ applications on the system end up with the short failure time.
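As a rough check of what the abort interval means for an outage, the comparison can be sketched as below. This assumes the value is expressed in milliseconds (as `ndd` reports it on HP-UX), so the ten-minute default is 600000, and that a connection with unacknowledged data is dropped only once retransmissions have gone unanswered for the whole interval. Application-level timeouts are ignored.

```python
DEFAULT_ABORT_INTERVAL_MS = 600_000  # the ten-minute default, in milliseconds


def connection_survives(outage_s, abort_interval_ms=DEFAULT_ABORT_INTERVAL_MS):
    """True if an established TCP connection should outlast the outage.

    Assumed model: the connection is aborted only when the outage lasts
    at least as long as the abort interval.
    """
    return outage_s * 1000 < abort_interval_ms


if __name__ == "__main__":
    print(connection_survives(300))           # 5 min outage, default interval
    print(connection_survives(300, 60_000))   # same outage, interval tuned down to 60 s
```

With the default, a five-minute outage fits inside the interval; with the value tuned down to 60 seconds, every connection on the system that has data in flight would be expected to drop.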
02-12-2005 11:34 AM
Re: How to survive a brief but complete network outage in MC/SG ?
I'd expect the public heartbeat in the described case to be only a secondary means.
(Also, I'd run a SLIP line over fibre converters as a second private line, but that's my personal madness.)
02-16-2005 07:55 AM
Re: How to survive a brief but complete network outage in MC/SG ?
*All* clusters stayed up and no node panicked (as the HB was alive). The packages which had SUBNET monitoring disabled (commented out in the pkg config) stayed up; no extra steps were required.
The packages which had subnet monitoring enabled halted gracefully (even though pkg switching was disabled before the change) and required a restart.
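A check for which packages fall into which group could be sketched as follows. This is an illustration only: it assumes ASCII package configuration files in which monitored subnets appear as `SUBNET <address>` lines and a leading `#` comments a line out. The sample text and addresses are made up.

```python
def monitored_subnets(config_text):
    """Return the subnets a package config still monitors (uncommented SUBNET lines)."""
    subnets = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments, including fully commented lines
        parts = line.split()
        if len(parts) >= 2 and parts[0].upper() == "SUBNET":
            subnets.append(parts[1])
    return subnets


SAMPLE = """\
PACKAGE_NAME   pkg1
#SUBNET        10.1.1.0
SUBNET         10.1.2.0   # public LAN
"""

if __name__ == "__main__":
    print(monitored_subnets(SAMPLE))
```

A package for which this returns an empty list corresponds to the group that stayed up; any package that still lists a public subnet would be expected to halt when that subnet goes down.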
We will be making the required configuration changes so that a network outage does not put us in this situation again.
The systems which did not have MC/SG survived better and recovered upon network restore w/o any action.
Thanks for your inputs !
-Q