
Network Issue

 
richard_kouadio
Advisor

Network Issue

Hi experts,
I am facing a troubling problem.
We have an Alpha server running Tru64 and a database application.
Users connect to the database application by telnet.
There are two types of users: those in the same building as the server and those in other buildings.
To increase performance, we bought a new rx6600 server running HP-UX 11i v3 and installed the database application on it.
We also have an application that monitors our network.
The rx server and the Alpha server use the same gateway.
Every morning users connect to the Alpha server and do their work, and all is fine at this point.
In the afternoon we ask users to connect to the rx server, and the problem begins.
Our network monitoring application shows that all network links have failed, marking the links in red as in the attached picture, and users lose their connections. But users in the same building as the rx server stay connected.
When we ask users to reconnect to the Alpha server, the links turn green, which means all is fine.
Now I don't know where to investigate anymore.
I am ready to give more information if you need it.
Please help me.
rick jones
Honored Contributor

Re: Network Issue

Are the users who are in the same building as the rx6600 also in the same IP subnet? That is, do they not have to go through a router to reach the HP-UX system?

HP-UX has a feature called dead gateway detection. It periodically pings the configured gateways (e.g. the default gateway), and if it does not receive a reply after a certain number of tries, it assumes the gateway is down and marks that route as unusable. If your gateways are configured not to respond to pings (i.e. ICMP Echo Requests), this could cause the HP-UX system to mistakenly believe the gateway is down.

This functionality can be controlled via an ndd setting (man ndd) called ip_ire_gw_probe:

$ ndd -h ip_ire_gw_probe

ip_ire_gw_probe:

Enable dead gateway probes. This option should only be disabled on
networks containing gateways which do not respond to ICMP echo
requests (ping).
[0-1] Default: 1 (probe for dead gateways)

You can set that by hand with ndd and then, to make the change survive a reboot, edit the /etc/rc.config.d/nddconf file. Do be certain to read the comments in that file and make sure you do the right things with the indices for the variables and such...
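
As a rough sketch of those two steps, something like the following might do. It assumes the parameter name shown in the ndd -h output above, that persistent ndd settings are read at boot from /etc/rc.config.d/nddconf, and that index 0 is still free in that file (pick the next unused index otherwise). The nddconf_entry helper is only an illustration for printing the stanza, not an HP-UX tool.

```shell
#!/bin/sh
# Sketch only: ndd is HP-UX specific, so the live commands are skipped elsewhere.
PARAM=ip_ire_gw_probe

# Hypothetical helper: print the nddconf stanza for one setting.
# Arguments: index, transport, parameter name, value.
nddconf_entry() {
    printf 'TRANSPORT_NAME[%s]=%s\n' "$1" "$2"
    printf 'NDD_NAME[%s]=%s\n' "$1" "$3"
    printf 'NDD_VALUE[%s]=%s\n' "$1" "$4"
}

if command -v ndd >/dev/null 2>&1; then
    # Show the current value (1 = probe for dead gateways).
    ndd -get /dev/ip "$PARAM"
    # Disable probing on the running system; this is lost at the next reboot.
    ndd -set /dev/ip "$PARAM" 0
fi

# Lines to add by hand to /etc/rc.config.d/nddconf (index 0 is an assumption).
nddconf_entry 0 ip "$PARAM" 0
```

Only turn the probe off if the gateway really does ignore ICMP echo requests from the rx6600; otherwise leave the default alone.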
there is no rest for the wicked yet the virtuous have no pillows
SoorajCleris
Honored Contributor

Re: Network Issue

Hi,

In addition to Rick's suggestion,

1. Are you using virtual IPs for the application?

2. "Our network monitoring application shows that all network links have failed"

Are you able to ping the server (or otherwise check connectivity manually) from the other building after the application shows the links as failed?

3. Please confirm whether your same-building users are connected via a router (i.e. a different subnet).

"UNIX is basically a simple operating system, but you have to be a genius to understand the simplicity" - Dennis Ritchie
richard_kouadio
Advisor

Re: Network Issue

The users in the same building don't use a gateway to communicate with the rx server.
They are in the same subnet.
rick jones
Honored Contributor

Re: Network Issue

Then that certainly sounds like the ip_ire_gw_probe bit.
there is no rest for the wicked yet the virtuous have no pillows
TwoProc
Honored Contributor

Re: Network Issue

Not saying "this is certainly it", by a long stretch, but one that has tripped me up when introducing new network areas and routes to existing structures with huge load changes is what I'm noting below. I'm including it because I believe it can/does look like what you're describing, and once researched is pretty easy to rule out as a possible problem.

Ask your networking guys if the spanning tree root bridge is moving around between the main switches for the buildings when the network load moves from building to building. It needs to be reviewed per VLAN, per managed switch that can/does do spanning tree, while keeping in mind that most any enterprise-class switch falls into this category.

I've seen that one, where the root bridge of a VLAN can move to a different switch as the load moves. You'll probably want to set priorities on those switches so that they can't decide for themselves who owns the root bridge. If you talk to the network guys and they don't know what you're talking about, you probably need a little outside help to explain to your team what it is, how it works, and how to set it up and maintain it properly. It's an important step that can be easy to overlook. It manifests itself as anything from temporary outages to full-on persistent ones. All occurrences I've seen are more wave-like in function, with a duration of 5 to 15 minutes. Besides looking directly inside the switch for topology change events (there is a counter in there), a big tip-off is a landslide of unexplained broadcast traffic during the connectivity event. That's the root switch event itself, trying to figure out new paths to all the assets out there.
We are the people our parents warned us about --Jimmy Buffett
richard_kouadio
Advisor

Re: Network Issue

Hi Rick,
We are able to ping our gateway.
Do you think that, despite this, the ndd ip_ire_gw_probe setting can solve my problem?
rick jones
Honored Contributor

Re: Network Issue

If the *rx6600* is able to ping the gateway then there is something else happening. BTW, "back in the day" when this was common, the symptom would be that there would be connectivity through the gateway for a little while, and then it would cease. Basically, that was the interval over which the HP-UX stack was still making up its mind about the gateway.
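
One way to see whether that "works for a while, then stops" pattern shows up is to log, from the rx6600 itself, a timestamped gateway check once a minute and compare it with the time the monitoring tool turns red. This is only a sketch: the gateway address 192.0.2.1 is a placeholder, the GATEWAY and RUN_PROBE variables and the function names are made up for illustration, and it assumes ping prints the usual "packet loss" summary line, which is parsed instead of relying on the exit status.

```shell
#!/bin/sh
GATEWAY=${GATEWAY:-192.0.2.1}   # placeholder; use the real default gateway

# Classify ping's summary text instead of trusting its exit status.
classify() {
    case "$1" in
        *"100% packet loss"*) echo UNREACHABLE ;;
        *"packet loss"*)      echo reachable ;;
        *)                    echo UNKNOWN ;;
    esac
}

# Send one echo request (HP-UX syntax: count follows the host) and log it.
probe_once() {
    out=$(ping "$1" -n 1 2>&1)
    printf '%s gateway %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$(classify "$out")"
}

# Loop only when explicitly requested, e.g. RUN_PROBE=1 sh probe.sh >> probe.log
if [ "${RUN_PROBE:-0}" = 1 ]; then
    while :; do
        probe_once "$GATEWAY"
        netstat -rn | grep '^default'   # record the default route alongside
        sleep 60
    done
fi
```

If the log shows the gateway as reachable right through the outage, that points away from dead gateway detection and toward something else in the path.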
there is no rest for the wicked yet the virtuous have no pillows