Operating System - HP-UX

Re: VM loses network connection

 
Patrick Wallek
Honored Contributor

>>but the network connectivity was still there (I can still SSH, and HTTP to the server).

 

Where were you SSH'ing from?  Were you on the same network segment as the VM (where you DO NOT have to go through the router)?  If so, the fact that you can SSH and HTTP makes sense.

 

Things to check:

 

netstat -in

netstat -rn 

ping a server on the same subnet

ping  the router

ping something on a different network subnet
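
The three pings above narrow down where the path breaks. A minimal sketch of that reasoning in POSIX sh follows; the function name, the wording of the verdicts, and the example addresses in the comments are assumptions for illustration, not output of any HP-UX tool. Note that a silent target can also mean ICMP is filtered rather than that the path is down.

```shell
#!/bin/sh
# classify_reach LOCAL_RC GW_RC REMOTE_RC
# Each argument is the exit status of one ping: 0 = reply received,
# non-zero = no reply. The verdict strings are illustrative only.
classify_reach() {
    local_rc=$1
    gw_rc=$2
    remote_rc=$3
    if [ "$local_rc" -ne 0 ]; then
        echo "no reply on local subnet: suspect NIC, driver, or virtual switch"
    elif [ "$gw_rc" -ne 0 ]; then
        echo "local subnet ok, router silent: suspect router path or ICMP filtering"
    elif [ "$remote_rc" -ne 0 ]; then
        echo "router ok, remote subnet silent: suspect routing beyond the router"
    else
        echo "all three targets reachable"
    fi
}

# Example wiring (hypothetical addresses, HP-UX style "ping host -n count"):
#   ping 192.0.2.10 -n 3   >/dev/null 2>&1; l=$?
#   ping 192.0.2.1 -n 3    >/dev/null 2>&1; g=$?
#   ping 198.51.100.5 -n 3 >/dev/null 2>&1; r=$?
#   classify_reach "$l" "$g" "$r"
```

Keeping the decision separate from the ping calls means the same function works whatever ping syntax the host uses.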

 

 

Jeromejay
Advisor

I was SSHing from another network, and using HTTP from yet another one ... which, in short, implies that network connectivity was still there (as opposed to the original outage, where everything was down).

 

Also: I could not ping anything, since I blocked ping outbound (maybe I should have blocked the GW IP only).

 

 

Thinking back on it:

As mentioned in my original post, restarting the virtual switch on the physical host solved the issue for the guest ... would that indicate that the Dead GW detection was a consequence, and not the issue?

 

Moreover, we know for sure that the GW is fine (a reliable server, used by many other servers). If our faulty server missed one ping to the GW and activated the infamous Dead GW detection, so that it stopped using this route ... would it not come back on the next successful ping? (I assume it keeps trying, since we get error messages every 183 seconds.)

 

 

All in all: the more I think about it, the more I believe the Dead GW detection error message is a consequence of another failure ...

 

Note: I am still in the process of updating the AVIO drivers.

 

Patrick Wallek
Honored Contributor

>>restarting the virtual switch on the physical host solved the issue for the Guest ... would that indication tell us that the Dead GW detection was a consequence, and not the issue?

 

I would think so, yes.  

 

Is there a way to check statistics for the virtual switch?  Things like packets in, packets out, number of errors, etc?

 

>>would it not come back on the next successful ping?

 

I don't think it does.  I think once it is disabled, it stays that way.  I could be wrong though...

 

>>the more I think the Dead GW detection error message is a consequence...

 

I tend to agree.  A ping is a pretty low level check.  If the network is so busy that a ping is dropped, then I would think there are other issues.

Jeromejay
Advisor

>>>>would it not come back on the next successful ping?

>>I don't think it does.  I think once it is disabled, it stays that way.  I could be wrong though...

 

No offence, but I hope you're wrong :) (I can't imagine HP would have done something that stupid).

Also: since the error repeats every 183 seconds in the log file, I guess it keeps trying.

 

 

As for checking the virtual switch: I should have done it before the restart ... (like every other part of the investigation ...).

 

As usual in such cases, I'm not sure whether I hope it happens again so I can investigate, or whether I hope it never happens again.

Patrick Wallek
Honored Contributor

I have been looking into the dead gateway detection some more and found something I was not aware of.

 

Supposedly, if the dead gateway is the last remaining default gateway, it will stay enabled, but a message will still be logged.

 

To check the status of a gateway:

 

# ndd -get /dev/ip ip_ire_status | grep -e IRE_GATEWAY -e flag
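
Since this state is gone once the virtual switch has been restarted, it may be worth appending the output to a log with a timestamp before touching anything. A minimal sketch, assuming a hypothetical helper named snapshot and a log path of your choosing (neither comes from HP-UX itself):

```shell
#!/bin/sh
# snapshot OUTFILE COMMAND [ARGS...]
# Appends a timestamped header, the command's output (stdout and stderr),
# and its exit status to OUTFILE, so states can be compared later.
snapshot() {
    out=$1
    shift
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') : $*"
        "$@" 2>&1
        echo "=== exit status: $?"
    } >> "$out"
}

# Example (the log path is hypothetical):
#   snapshot /var/tmp/gw_status.log ndd -get /dev/ip ip_ire_status
#   snapshot /var/tmp/gw_status.log netstat -rn
```

Running the same two snapshots during an outage and again after recovery gives a before/after pair to compare.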

 

I cannot find anything definitive about re-enabling a gateway, but the following from 'ndd -h' indicates that a dead gateway should be re-enabled once it answers probes again:

 

# ndd -h ip_ire_gw_probe_interval

ip_ire_gw_probe_interval:

Controls the probe interval for Dead Gateway Detection.
IP periodically probes active and dead gateways.
ip_ire_gw_probe_interval controls the frequency of probing.
With retries, the maximum time to detect a dead gateway is ip_ire_gw_probe_interval + 10000 milliseconds. 
Maximum time to detect that a dead gateway has come back to life is ip_ire_gw_probe_interval.
[15000,-] Default: 180000 (3 minutes)
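
To make the timing in that help text concrete, here is a small sketch of the arithmetic in POSIX sh. The function and variable names are made up for illustration; only the 10000 ms retry allowance and the 180000 ms default come from the help text above.

```shell
#!/bin/sh
# Retry allowance quoted in the ndd help text, in milliseconds.
DGD_RETRY_MS=10000

# Worst-case time to notice that a gateway has died.
dgd_max_detect_ms() {
    echo $(( $1 + DGD_RETRY_MS ))
}

# Worst-case time to notice that a dead gateway is back: one probe interval.
dgd_max_recover_ms() {
    echo "$1"
}

interval=180000   # default ip_ire_gw_probe_interval (3 minutes)
echo "dead gateway detected within:  $(dgd_max_detect_ms "$interval") ms"
echo "recovery detected within:      $(dgd_max_recover_ms "$interval") ms"
```

With the default interval this gives 190000 ms to detect a failure and 180000 ms to detect recovery. A probe every 180 seconds would be roughly consistent with the message repeating every 183 seconds in the log, although the thread does not confirm that this is the cause.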

 

 

Jeromejay
Advisor

Thank you so much for the additional information!

I'll keep the command to check the Dead Gateway status ... although, since the error message is logged, I can already guess the results.

 

Note: too bad, the server is still up and running, so there is nothing to investigate for now.

 

 

PS: I forgot to add that the other VM on the same host has the same IP settings as the failing one ... so if one detects the GW as down, the other "should" arguably do the same.

 

Thanks again!