
L1000 network connectivity dies after 3 mins

 
R Manickavel
Occasional Contributor

Hi, we have an L1000 server with HP-UX 11.0 installed. After rebooting the machine, we can connect to the server from all subnets for the first 3 minutes. After 3 minutes, we can still connect from the same subnet but not from other subnets.

If we restart the network services, it works again for 3 minutes. We tried different speed/duplex/auto-negotiation settings, but that didn't help.

We tried the following:

1) Connected another HP server and an NT server to the same port with the same IP and tested. Both worked without any problem.

2) Connected the problematic server to a different switch/subnet and tested. It works fine there.

3) Applied the latest patches suggested by HP. Still the same.

4) Replaced the network card. Same problem.


Please let me know if anyone has a solution for this.

Thanks
Manick

Re: L1000 network connectivity dies after 3 mins

If you are unable to ping your gateways, then the "dead gateway detection" may be affecting your routes.

Run the following to disable dead gateway detection:

ndd -set /dev/ip ip_ire_gw_probe 0

You will need to delete the routes and add them back after disabling dead gateway detection.
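
The delete-and-re-add step could look something like the sketch below. It only prints the commands (a dry run) rather than executing them. The gateway address 192.0.2.1 and the function name are placeholders, and the argument order assumes the usual HP-UX form (destination, gateway, hop count), so check it against route(1M) on your release before running anything.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would drop and re-create the
# default route. Nothing is executed; review the output, then run it by hand.
# readd_default_route is a hypothetical helper, not an HP-UX tool.
readd_default_route() {
    gw="$1"    # gateway IP address (a placeholder value is used below)
    if [ -z "$gw" ]; then
        echo "usage: readd_default_route <gateway-ip>" >&2
        return 1
    fi
    echo "route delete default $gw"
    echo "route add default $gw 1"
}

readd_default_route 192.0.2.1
```

Printing first and pasting afterwards is deliberate: if the server is only reachable over the network, a mistyped route command can cut off the session.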

You can edit /etc/rc.config.d/nddconf to disable dead gateway detection permanently:

TRANSPORT_NAME[0]=ip
NDD_NAME[0]=ip_ire_gw_probe
NDD_VALUE[0]=0


John Palmer
Honored Contributor

Re: L1000 network connectivity dies after 3 mins

So you are losing one or more routes. One possibility is Dead Gateway Detection where the system disables a gateway which it can't 'ping'.

Have a look at the following knowledge base article; you could also try searching for 'dead gateway'.

DocId: KBAN00000750 Updated: 20010723

DOCUMENT
ip_ire_gw_probe

Turns the Dead Gateway Detection on and off.

IP periodically tests if the gateways are available. It not only probes the
active one, but also the "dead" gateways in case they came back to life in the
meantime. The default for this value is "1", so we probe the gateways.

You could see which value is set by executing:

ndd -get /dev/ip ip_ire_gw_probe

This returns "1" (probing) or "0" (not probing).

To see all gateways, you can use ip_ire_status:

ndd -get /dev/ip ip_ire_status | grep -e IRE_GATEWAY -e flag

This lists all gateways; the flags indicate whether a gateway is dead.
Another option, ip_ire_gw_probe_interval, changes how often these probes are
performed.

Why would this be used?

The gateway probes are ICMP packets which await a proper reply.

Where a firewall is in use, for example, you may want ICMP blocked so that
nobody can ping the firewall while it still passes the desired protocols.
Turning probing off does not break anything in that case, because the system
never sends an ICMP packet to test the gateway. On the other hand, it then only
learns that a gateway is down when it tries to use it, which results in long
timeouts during detection.

Usable commands:

Check the current value:

ndd -get /dev/ip ip_ire_gw_probe

Disable Dead Gateway Detection:

ndd -set /dev/ip ip_ire_gw_probe 0

Enable Dead Gateway Detection:

ndd -set /dev/ip ip_ire_gw_probe 1

nddconf entry example:

TRANSPORT_NAME[0]=ip
NDD_NAME[0]=ip_ire_gw_probe
NDD_VALUE[0]=0
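
If nddconf already holds other entries, the index [0] in the example is taken and the new stanza needs the next free one. A minimal sketch of working that out, assuming existing entries are numbered consecutively from 0 (next_ndd_stanza is a made-up helper name, and the output is only printed, not written to the file):

```shell
#!/bin/sh
# Sketch: print an nddconf stanza for ip_ire_gw_probe using the next unused
# index. Assumes existing entries are numbered consecutively from 0.
# next_ndd_stanza is a hypothetical helper name.
next_ndd_stanza() {
    conf="$1"     # path to nddconf, e.g. /etc/rc.config.d/nddconf
    value="$2"    # 0 = probing off, 1 = probing on
    idx=0
    if [ -f "$conf" ]; then
        # Count uncommented TRANSPORT_NAME[n]= lines; that count is the
        # next index when numbering starts at 0 with no gaps.
        idx=$(grep -c '^TRANSPORT_NAME\[' "$conf" || true)
    fi
    echo "TRANSPORT_NAME[$idx]=ip"
    echo "NDD_NAME[$idx]=ip_ire_gw_probe"
    echo "NDD_VALUE[$idx]=$value"
}

next_ndd_stanza /etc/rc.config.d/nddconf 0
```

Review the printed stanza and append it to nddconf by hand.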

Regards,
John
R Manickavel
Occasional Contributor

Re: L1000 network connectivity dies after 3 mins

Thanks for your response. I tried disabling dead gateway detection, but the result was the same.

The problem was resolved by changing the ACL on the router. In addition to the common entry for all other servers, we added an entry for this server alone:

permit icmp any any echo-reply
permit icmp host any


But an HP-UX 10.2 server worked without this additional entry. Please let me know if you know why the additional entry is required for the HP-UX 11 server.

Router model: Cisco 2651.

FYI, the server is in the DMZ firewall network.

thanks & regards
Manick
Ron Kinner
Honored Contributor

Re: L1000 network connectivity dies after 3 mins

It appears that you were not able to turn off dead gateway detection, so assuming you followed the directions, you are probably missing a patch. Did you run

ndd -get /dev/ip ip_ire_gw_probe

after you tried to change it, to verify that the change had taken? It should come back with 0 to show it has been turned off.
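
A minimal sketch of doing the set and the read-back in one step, so a change that silently fails shows up straight away. The function name is made up, and the NDD variable is only there so the command can be swapped for a stub when trying the script off the box.

```shell
#!/bin/sh
# Sketch: disable dead gateway detection, then read the value back to confirm
# the change was accepted. NDD can be overridden for a dry run.
# disable_and_verify_gw_probe is a hypothetical helper name.
NDD="${NDD:-ndd}"

disable_and_verify_gw_probe() {
    # Stop if the set itself fails (for example, when not run as root).
    $NDD -set /dev/ip ip_ire_gw_probe 0 || return 1
    current="$($NDD -get /dev/ip ip_ire_gw_probe)"
    if [ "$current" = "0" ]; then
        echo "dead gateway detection is off"
    else
        echo "ip_ire_gw_probe is still '$current'; the change did not take" >&2
        return 1
    fi
}
```

Call disable_and_verify_gw_probe as root on the server. A non-zero return means the value did not stick.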

Your access list:

permit icmp any any echo-reply
permit icmp host any



The first entry allows echo replies (replies to pings) to be received at the router, so the router can ping the hosts.

The second allows all ICMP from the server to reach the router. This allows the server to send an "echo" to the router, and since access lists do not stop traffic generated by the router itself, the router is able to send an "echo reply", thus completing the ping. You could rewrite the access list to:

permit icmp any any echo-reply
permit icmp host any echo

and it should still work.
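
For illustration only, here is a small sketch that prints the two tightened entries for a given server address. The address 192.0.2.10 is a placeholder from the documentation range, since the real one is not shown in this thread, and print_icmp_acl is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print the two ACL entries discussed above for one server address.
# The address passed below is a placeholder, not the poster's server.
# print_icmp_acl is a hypothetical helper name.
print_icmp_acl() {
    server_ip="$1"
    # Let echo replies from anywhere through (answers to pings).
    echo "permit icmp any any echo-reply"
    # Let this one server send echo requests (its gateway probes).
    echo "permit icmp host $server_ip any echo"
}

print_icmp_acl 192.0.2.10
```

The printed lines still have to be placed in the right access list on the router by hand.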

HP-UX 10.2 does not have dead gateway detection, which is why it works there.

Ron





R Manickavel
Occasional Contributor

Re: L1000 network connectivity dies after 3 mins

Hi Ron,

Even after disabling the dead gateway detection it didn't work. This has been escalated to HP.

thanks & regards
manick