1834086 Members
2421 Online
110063 Solutions
New Discussion

Re: lan5 failed

 
SOLVED
Go to solution
Davor Bira?
Frequent Advisor

lan5 failed

I have ServiceGuard OPS Edition Bundle ver. A.11.14. My problem is standby network interface: lan5.
It repeats this message in syslog.log:
Mar 29 08:15:43 XXX cmcld: lan5 failed
Mar 29 08:15:49 XXX cmcld: lan5 recovered
Mar 29 08:43:38 XXX cmcld: lan5 failed
Mar 29 08:43:44 XXX cmcld: lan5 recovered
Mar 29 08:48:42 XXX cmcld: lan5 failed
Mar 29 08:48:48 XXX cmcld: lan5 recovered

Is this something to worry about?
7 REPLIES 7

Re: lan5 failed

Absoluteley - you have an intermittent network issue - have your network staff take a look at the switch that lan5 is attached to...

HTH

Duncan

I am an HPE Employee
Accept or Kudo
Isralyn Manalac_1
Regular Advisor
Solution

Re: lan5 failed

It is indeed something that needs your attention. It is best to liaise with the network team to check the health of the switch with which this cable is connected to. Another log file worth checking is the nettl.LOG000. Do a ps -fea|grep netfmt and you'll find that file.

Regards,

Isralyn
melvyn burnard
Honored Contributor

Re: lan5 failed

Well yes, as this is telling you there appears to be an intermittent lan problem that could render a service, a package, or worse a node as failed.

get your networking folks to sort out this issue as soon as possible.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
doug mielke
Respected Contributor

Re: lan5 failed

There is a clue here in that each of the 3 events lasts 6 seconds.

This looks like some sort of 'discovery' like network flood. Too short to be a legitamate network overload, consistant enough to indicate an intentional network task.

If you get no diagnosis help from your network folks, you could try asking them to block some of the more common broadcasts from these switch ports and allow only what ServiceGuard requires.
Stephen Doud
Honored Contributor

Re: lan5 failed

The lan failure/recovery period is determined by the NETWORK_POLLING_INTERVAL designated in the cluster configuration ASCII file. The default value is 2 seconds.
You can verify what the setting is in your cluster by inspecting the cluster's binary file:
# cmviewconf | grep -i polling
network polling interval: 2.00 (seconds)

Since the recovery is 6 seconds after the failure, 3 check periods occur (on the default setting). The check is performed at the link-level, so transmission capability is being compromised. Best to get a network sniffer on the line and see what's happening.
Davor Bira?
Frequent Advisor

Re: lan5 failed

At the end it was problem with switch, so thanks to all for replays.
Davor Bira?
Frequent Advisor

Re: lan5 failed

Thank again