VM loses network connection
01-30-2013 02:53 AM
Hi all,
So, we built 2 VMs on a BL860C blade, and everything is working fine: both are fully configured and running, no issues there.
However, after some time (ranging from 2 hours to a few weeks), one of the 2 VMs loses its network connection completely...
Things we have seen:
- from the VM console, running /sbin/init.d/net stop followed by /sbin/init.d/net start does not solve the issue
- rebooting the VM solves the issue
- restarting the virtual switch solves the issue
I don't think this issue can be solved on the spot with this information (plus the details below), so my question is:
=> what more can we check?
We've checked the logs (see below), NIC status, IP status... and can't find anything relevant.
i.e.: do you have specific network troubleshooting commands we could use?
Thanks for your help !
Some more info:
There is almost nothing in the logs on either side, except this entry in the VM syslog, which seems to be a result of the issue:
Jan 30 09:54:21 soem2 vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xa63xxc01 .See ndd -h ip_ire_gw_probe for more info
bash-4.0# nwmgr -c lan0 -v
lan0:
  Interface State = UP
  MAC Address = 0xCA5xxx75B9A
  Subsystem = igssn
  Interface Type = 1000Base-T
  Hardware Path = 0/0/1/0
  NMID = 1
  Feature Capabilities = Physical Interface
                         IPV4 Recv CKO
                         IPV4 Send CKO
                         VLAN Tag Offload
                         64Bit MIB Support
                         IPV4 TCP Segmentation Offload
                         UDP Multifrag CKO
  Feature Settings = Physical Interface
                     IPV4 Recv CKO
                     IPV4 Send CKO
                     VLAN Tag Offload
                     64Bit MIB Support
                     IPV4 TCP Segmentation Offload
                     UDP Multifrag CKO
  MTU = 1500
  Speed = 1 Gbps Full Duplex (Autonegotiation : On)
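Side note on the syslog line above: the gateway is printed as a hex word rather than a dotted address. A minimal sketch for decoding it, assuming the word is the IPv4 address with the most significant octet first (that byte order is my assumption, not something the message states). The function name hex_to_ip and the sample value are made up for illustration; the masked value from the log cannot be decoded because of the "xx" placeholders.

```shell
# Convert an 8-digit hex word (with or without a leading 0x) to dotted-quad form.
# Assumes the most significant octet comes first.
hex_to_ip() {
    hex=${1#0x}                          # drop the optional 0x prefix
    if [ "${#hex}" -ne 8 ]; then
        echo "expected 8 hex digits, got: $1" >&2
        return 1
    fi
    # Slice the word into four 2-digit octets.
    o1=$(echo "$hex" | cut -c1-2)
    o2=$(echo "$hex" | cut -c3-4)
    o3=$(echo "$hex" | cut -c5-6)
    o4=$(echo "$hex" | cut -c7-8)
    # printf converts each hex octet to decimal.
    printf '%d.%d.%d.%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4"
}
```

For example, hex_to_ip 0xc0a80101 prints 192.168.1.1, which you can then compare against the default route shown by netstat -rn.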
01-30-2013 03:28 AM
Re: VM loses network connection
You did not provide the HPVM version, the AVIO driver version, or any details about the interface the vswitch is connected to.
So we can only speak at a generic level: make sure you have the latest AVIO driver on both host and guest, as well as an up-to-date driver for the underlying physical NIC on the host.
01-30-2013 04:20 AM
Re: VM loses network connection
@Jeromejay wrote: Jan 30 09:54:21 soem2 vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xa63xxc01 .See ndd -h ip_ire_gw_probe for more info
Looks like you need the following in your /etc/rc.config.d/nddconf :
TRANSPORT_NAME[3]=ip
NDD_NAME[3]=ip_ire_gw_probe
NDD_VALUE[3]=0
Adjust the index "[3]" so that it follows the entries you already have in that file.
Cheers,
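To pick the index, here is a rough sketch that scans an nddconf-style file for the highest NDD_NAME[N] already in use and prints the three lines with the next number. The function names are made up for this example, and it only prints the entry; it does not edit the file, so review the output before pasting it in.

```shell
# Print the next unused index in an nddconf-style file (0 if none are defined).
next_ndd_index() {
    conf_file=$1
    # Pull N out of every uncommented NDD_NAME[N]= line and keep the highest.
    last=$(sed -n 's/^[[:space:]]*NDD_NAME\[\([0-9][0-9]*\)]=.*/\1/p' "$conf_file" \
        | sort -n | tail -n 1)
    if [ -z "$last" ]; then
        echo 0
    else
        echo $((last + 1))
    fi
}

# Print (not write) the entry that disables dead gateway detection.
print_gw_probe_entry() {
    idx=$(next_ndd_index "$1")
    printf 'TRANSPORT_NAME[%s]=ip\n' "$idx"
    printf 'NDD_NAME[%s]=ip_ire_gw_probe\n' "$idx"
    printf 'NDD_VALUE[%s]=0\n' "$idx"
}
```

Usage would be something like print_gw_probe_entry /etc/rc.config.d/nddconf. The sketch assumes the existing indices are the only ones that matter; it does not check for gaps or for an ip_ire_gw_probe entry that is already present.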
01-30-2013 04:49 AM
Re: VM loses network connection
"... as up to date driver for the underlying physical NIC on the host"
Thanks, I'm checking now...
That's exactly the kind of advice I needed ;)
01-30-2013 04:53 AM
Re: VM loses network connection
NDD_NAME[3]=ip_ire_gw_probe
NDD_VALUE[3]=0
From what I understand, the log message is a consequence rather than a cause...
Changing this setting would only turn off the detection.
01-30-2013 06:02 AM - edited 01-30-2013 06:04 AM
Solution
Dead gateway detection...
I have found this on dozens of 'hung' systems, causing hours of downtime and unnecessary reboots.
Turn it OFF!
What happens is that HP-UX pings each gateway about every 3-4 minutes, and if the gateway fails to respond (or, more likely, the ICMP packet gets lost), the network is immediately disabled: a very bad thing for any production system. Some network administrators also turn off ping responses from gateways as a security measure, which means that every HP-UX system with dead gateway detection enabled will disappear from the network. That usually results in mass panic among the end users, and the desperate system administrator resets (crashes) the system to force a reboot.
This is yet another reason to verify that 100% of your systems have GSP/MP network access: a known-working LAN connection that is *NOT* affected by the dead gateway mess. By logging in over the console, you can determine that the system is NOT hung, but simply off the network.
Bill Hassell, sysadmin
01-30-2013 06:43 AM
Re: VM loses network connection
Thanks for the full info!
That's appreciated ;)
Also: I really thought the error message was a consequence, whereas it's actually the cause...
So now I'm on to reconfiguring all our servers :/
01-30-2013 07:03 AM
Re: VM loses network connection
The dead gateway detection turning off the network will definitely cause you problems.
I have seen cases where the network was REALLY REALLY busy (in one case, a backup over the network of an NFS-mounted filesystem through a single interface), which likely caused the dead gateway detection ping to fail and, in turn, the network to go down.
Basically, this is a heads-up: once you turn off dead gateway detection, you may start seeing other symptoms on this VM that were masked because the network was disabled.
01-30-2013 07:07 AM
Re: VM loses network connection
Additionally, you can manually set the ip_ire_gw_probe value from the command line:
# ndd -set /dev/ip ip_ire_gw_probe 0
The above will set the value to '0' (disabled). To check the value:
# ndd -get /dev/ip ip_ire_gw_probe
0
The nddconf instructions given above only take effect when the system is rebooted; you still want that entry so the setting persists across reboots. But if you can't reboot the system, use the ndd command to set the value now.
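If you have to do this on many servers, the two ndd commands above can be wrapped in a small check-then-set function. This is only a sketch: the function name is invented, it assumes ndd is on the PATH and that you run it as root, and it does not touch nddconf, so the change is lost at the next reboot unless you add the entry there as well.

```shell
# Read ip_ire_gw_probe and set it to 0 only if it is not 0 already.
disable_gw_probe() {
    # Same query as: ndd -get /dev/ip ip_ire_gw_probe
    current=$(ndd -get /dev/ip ip_ire_gw_probe) || return 1
    if [ "$current" = "0" ]; then
        echo "ip_ire_gw_probe is already 0"
    else
        # Same change as: ndd -set /dev/ip ip_ire_gw_probe 0
        ndd -set /dev/ip ip_ire_gw_probe 0 || return 1
        echo "ip_ire_gw_probe changed from $current to 0"
    fi
}
```

Calling disable_gw_probe on each server tells you which ones actually had the probe enabled, which is handy when you audit a whole fleet.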
01-30-2013 07:11 AM
Re: VM loses network connection
Hi again,
So before making any changes across all servers, and because I have some time, I thought I'd run a quick test first:
- I blocked outgoing ICMP on the server (using the firewall).
- After the expected ~3 min, I started getting the dead gateway error messages...
- but network connectivity was still there (I can still reach the server over SSH and HTTP).
So:
- either my quick test is flawed,
- or the error message is a consequence of the server dropping its network connectivity (i.e. something else fails, then the server can't ping the GW and logs the message).
If it's the second option, could you give me an exhaustive list of checks I can do for network investigation? (My knowledge stops at ping, basic nwmgr commands, netstat, lsof, log investigation, and lanscan.)
thanks again for all the tips and explanations !
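Not an exhaustive list, but one thing that may help in the meantime: capture the output of the commands already mentioned in this thread at the moment the connection drops (from the VM console), so it can be compared with a snapshot taken while everything works. A minimal sketch, with an invented function name; the netstat -in and netstat -rn calls are my additions to the list, and lan0 is taken from the nwmgr output above, so adjust to your interface.

```shell
# Append a timestamped snapshot of basic network state to a log file.
# Commands that are not installed are skipped rather than treated as errors.
snapshot_net_state() {
    out_file=$1
    {
        echo "=== snapshot at $(date) ==="
        for cmd in "lanscan" \
                   "nwmgr -c lan0 -v" \
                   "netstat -in" \
                   "netstat -rn" \
                   "ndd -get /dev/ip ip_ire_gw_probe"; do
            echo "--- $cmd ---"
            tool=${cmd%% *}              # first word is the program name
            if command -v "$tool" >/dev/null 2>&1; then
                $cmd 2>&1                # unquoted on purpose: split into arguments
            else
                echo "(skipped: $tool not found)"
            fi
        done
    } >> "$out_file"
    return 0
}
```

Run snapshot_net_state /tmp/net_state.log once while the VM is healthy and again during an outage, then diff the two sections, paying attention to the route table and the interface counters.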