01-14-2014 07:08 AM
IP routing issues with ServiceGuard on HP-UX 11iv3
Hello folks,
I would like to submit an issue we have on our newest ServiceGuard cluster running on a pair of BL860 i2 blades.
This has been reported to HP but so far very little progress has been made. Since I really can't see what is so specific about our configuration that gives us this problem, I suppose that others have been bitten by this too, and I would love to hear what you've done to work around it.
Sorry, this is going to be a lengthy post because I need to provide all the relevant information. Please bear with me and many thanks in advance for those who will take the time to read it.
Any information you can provide on these issues, including "we have this too", is more than welcome.
-----------------------------------------------------------------------------
Problem 1: need to explicitly and manually set a default route for every ServiceGuard package running on the machine (except those packaging SRP containers) to maintain their network connectivity with other VLANs.
-----------------------------------------------------------------------------
A bit of background information first: these servers have 3 active LAN interfaces per server:
- lan2 in network 10.149.160.0/24: production traffic
- lan4 in network 10.149.247.0/24: administrative traffic
- lan5 in network 192.168.2.0/24: ServiceGuard heartbeat traffic
Initially, only one default gateway was defined to 10.149.160.254 in /etc/rc.config.d/netconf, hence on interface lan2.
ServiceGuard packages running on this machine have IP addresses in network 10.149.160.0/24 hence they come up as secondary lan2:X interfaces.
Such a configuration makes the administrative IP address 10.149.247.X unreachable from any of our IP networks (we have 10.149.0.0/16) except from 10.149.247.0/24 itself, unless we force ip_strong_es_model=2 in the network stack parameters, which is not the default configuration.
We used to do this on our two older clusters, but this one is supposed to host SRP containers which *require* ip_strong_es_model=1 so that's not an option.
Therefore we have added a second default gateway to 10.149.247.254 in the network boot parameters /etc/rc.config.d/netconf. This solves the network connectivity for the administrative (lan4) IP.
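For readers who want to reproduce the setup, here is a minimal sketch of how two default gateways might be declared in /etc/rc.config.d/netconf. It assumes the standard ROUTE_* array variables of that file and that indices 0 and 1 are free; the indices and comments are illustrative, not copied from the actual servers.

```shell
# /etc/rc.config.d/netconf (fragment): two default gateways, one per routed LAN.
# The array indices 0 and 1 are an assumption; use the next free indices
# in your own file.
ROUTE_DESTINATION[0]="default"
ROUTE_MASK[0]=""
ROUTE_GATEWAY[0]="10.149.160.254"   # production network, reached via lan2
ROUTE_COUNT[0]="1"
ROUTE_ARGS[0]=""

ROUTE_DESTINATION[1]="default"
ROUTE_MASK[1]=""
ROUTE_GATEWAY[1]="10.149.247.254"   # administrative network, reached via lan4
ROUTE_COUNT[1]="1"
ROUTE_ARGS[1]=""
```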
The heartbeat interface (lan5) is obviously not concerned. It doesn't get any traffic from outside of its own private network.
However, we have noticed that:
- SG packages running a SRP container work fine (no IP connectivity issue)
- "plain" SG packages (e.g. running an Oracle DB engine) are unreachable from any other network than 10.149.160.0/24. We need to explicitly add a separate default gateway for each and every package.
For example: assume a package whose IP address is 10.149.160.123, which comes up as lan2:1. Full IP connectivity can only be achieved if the following command is issued during package startup:
/usr/sbin/route add default 10.149.160.254 1 source 10.149.160.123
The new default route appears as follows in "netstat -rn":
default 10.149.160.254 UG 0 lan2:1 1500
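To check which interfaces currently carry their own default route, the netstat output can be filtered with a small helper. This is only a sketch: the function name is hypothetical, and the column order (destination, gateway, flags, refs, interface, pmtu) is assumed from the sample line above.

```shell
# List every default route together with the interface it is attached to.
# Reads "netstat -rn" output on stdin; prints "<interface> <gateway>" per line.
# Column positions are assumed from the sample line above:
# destination, gateway, flags, refs, interface, pmtu.
list_default_routes() {
    awk '$1 == "default" { print $5, $2 }'
}

# Usage on the cluster node:
#   netstat -rn | list_default_routes
```

A package address that is up but does not show in this list would be a candidate for the connectivity problem described here.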
The "normal" SG script that takes care of bringing up the package's IP address (namely /etc/cmcluster/scripts/sg/package_ip.sh) DOES NOT do this. However, the additional script that runs when a package hosting a SRP container is started (/etc/cmcluster/package_name/srp_route_script) DOES do it. So someone at HP must have realized that this was required, but why hasn't it been backported to the main SG scripts?
We've eventually resorted to patching package_ip.sh to add the needed default gateway at package startup and remove it at package shutdown, but this really is an ugly hack I'm not proud of.
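The idea behind such a patch can be sketched as a pair of calls, one at package startup and one at shutdown. The function below is hypothetical and is not the actual change made to package_ip.sh; it only prints the command it would run. The delete form is assumed to mirror the add form shown earlier, which should be checked against route(1M) before use.

```shell
# Dry-run sketch: print the route command for a package address instead of
# running it. Function and variable names are hypothetical.
pkg_default_route_cmd() {
    action=$1      # "add" at package startup, "delete" at package shutdown
    pkg_ip=$2      # relocatable package address, e.g. 10.149.160.123
    gateway=$3     # default gateway of the package subnet

    case "$action" in
        add|delete) ;;
        *) echo "usage: pkg_default_route_cmd add|delete <pkg_ip> <gateway>" >&2
           return 1 ;;
    esac

    # Assumption: "route delete" accepts the same arguments as "route add".
    echo "/usr/sbin/route $action default $gateway 1 source $pkg_ip"
}

pkg_default_route_cmd add 10.149.160.123 10.149.160.254
pkg_default_route_cmd delete 10.149.160.123 10.149.160.254
```

Printing the command rather than executing it makes it possible to review what would change on a live cluster node before touching the routing table.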
-----------------------------------------------------------------------------
Problem 2: outgoing connections from applicative processes within SG packages (including the ones made from standard HP-UX commands such as remsh) have completely unpredictable source IP addresses
-----------------------------------------------------------------------------
This is an entirely new issue. No such behaviour has ever taken place on the two other BL860 clusters running HP-UX 11iv3. We observe that outgoing connections made from processes running within SG packages have unpredictable and changing source IP addresses. Since all packages have IP addresses within 10.149.160.0/24, we would expect the source IP address to be the one set to lan2 at boot. This CERTAINLY was the case on the older machines.
We observe that these IP addresses can be ANY of the addresses set on lan2 i.e. the address of any active SG package. It does vary over time too. Starting a SG package tends to make the source IP address for outgoing connections made by any process running on this machine "stick" to the address of the newly started package... until another one is started.
This makes tasks such as maintaining .rhosts files completely impossible on remote target servers that receive remsh or rcp commands issued from a SG package, since we have to account for every possible package address becoming the source IP of the connections they get.
The only reply we've had so far from HP: "application processes need to be bound to their socket by their IP address only and not by ANY address" is completely unacceptable for many reasons such as:
- we're not going to hardcode the IP address assigned to the relevant SG package into the source of any of our applications
- for several applications, source code is no longer available or never was, and/or they're HP-PA applications running under ARIES
- this also affects standard HP-UX commands such as remsh, rcp, ftp etc. We're not going to recode these, or are we?
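To put numbers on the behaviour, the local addresses of established TCP connections can be tallied from netstat output. This is a sketch under assumptions: the function name is hypothetical, the local address is taken to be the fourth column written as ip.port, and inbound connections are counted along with outbound ones.

```shell
# Count established TCP connections per local (source) address.
# Reads "netstat -an" output on stdin. Assumes the local address is the
# fourth column and is written as <ip>.<port>.
count_source_addresses() {
    awk '$1 ~ /^tcp/ && $NF == "ESTABLISHED" {
             addr = $4
             sub(/\.[0-9]+$/, "", addr)   # drop the trailing .port
             n[addr]++
         }
         END { for (a in n) print a, n[a] }' | sort
}

# Usage on the cluster node:
#   netstat -an | count_source_addresses
```

Running it before and after starting a package would show whether new connections move to the address of the package just started.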
Thanks for your time reading this,
Greets,
_Alain_
01-21-2014 07:35 AM
Re: IP routing issues with ServiceGuard on HP-UX 11iv3
This document discusses routing in an SRP environment:
This link may get you to it:
01-22-2014 10:23 AM
Re: IP routing issues with ServiceGuard on HP-UX 11iv3
For 1)
DB engine should run in a container.
HP-UX Containers (SRP) A.03.01 Administrator's Guide
01-28-2014 07:34 AM
Re: IP routing issues with ServiceGuard on HP-UX 11iv3
Thanks for your replies,
@Stephen: have you actually read this document? It deals with the creation of a SRP package and Oracle configuration, but says little if anything about the IP routing issues.
@Laurent: you seem to imply that you can't have a mix of "regular" SG packages and SG-packaged SRP containers on the same server. Where do you get this information from? Our HP support folks certainly haven't complained the slightest bit about us having both on the same box...
As for the problem already being present, although less likely, before SRP packages: well, in that case there must be orders of magnitude of difference. Our SG packages have been running for quite a few years on the two other clusters with ip_strong_es_model=2, making hundreds if not thousands of connections per day, and we have *never* encountered such a problem. The source IP address of these outgoing connections has always been predictable and equal to the IP address of the server itself on that interface (in my example, the IP address of lan2).
Now it tends to take the IP address of the last package started on the server, many times per day.
This problem is already known to HP because, as I've been told, the SG daemons themselves can be affected. Sometimes intra-cluster connections made by these daemons are rejected by the target node because the source IP is not recognized as belonging to the source node. This can cause host panics due to safety timer expiry. A local HP support person told me that (quote) "an upcoming release of Serviceguard will address the issue by forcing daemons to bind their sockets explicitly to the native IP address of the server instead of to 'any'" (end quote).
This document covers the routing issues quite nicely although it doesn't provide a solution.