Serviceguard

Package Start fails w/ IP configured

Bernhard Mueller
Honored Contributor

Package Start fails w/ IP configured


The cluster is running, with the lock LUN initialized.
Adding a package works.
Mounting the FS works.
Adding an IP to the package reboots the node.
The package log says:

WARNING: IP 10.10.5.5 is already configured on the subnet 10.10.5.0

cmrunpkg says:
cmcld[6818]: Aborting! select failed (file: rcomm/comm_ip.c, line: 443)

/var/log/messages says:
Mar 20 07:37:07 sgnode-2 CM-pkg1[6876]: cmmodnet -a -i 10.10.4.201 10.10.4.0
Mar 20 07:37:07 sgnode-2 cmcld[6663]: select for port 46356 failed with Interrupted system call
Mar 20 16:37:07 sgnode-2 kernel: NET: Registered protocol family 17
Mar 20 07:37:07 sgnode-2 cmcld[6663]: select for port 46100 failed with Interrupted system call
Mar 20 07:37:07 sgnode-2 cmcld[6663]: Aborting! select failed (file: lcomm/local_server.c, line: 1165)
Mar 20 07:37:07 sgnode-2 cmcld[6663]: 18, 95774e60, 1a2d
Mar 20 07:37:07 sgnode-2 cmcld[6663]: 14 (read)
Mar 20 07:37:07 sgnode-2 cmcld[6663]: 18 (read)
Mar 20 07:37:07 sgnode-2 cmcld[6663]: Aborting! select failed (file: rcomm/comm_ip.c, line: 443)

What's wrong here?
(Of course I checked that there is no duplicate IP that would respond to ping.)

Regards
Bernhard
6 REPLIES
Bernhard Mueller
Honored Contributor

Re: Package Start fails w/ IP configured

note:

WARNING: IP 10.10.5.5 is already configured on the subnet 10.10.5.0

should be
WARNING: IP 10.10.4.201 is already configured on the subnet 10.10.4.0

(I did a test with a package IP on the other heartbeat NIC and copied this line from that test, that is why it does not match above).

Regards,
Bernhard
Bernhard Mueller
Honored Contributor

Re: Package Start fails w/ IP configured

The issue is definitely with cmmodnet: running "ifconfig bond0:1 bla bla ..."
works like a treat, however running "cmmodnet -a -i 10.10.4.201 10.10.4.0" gives exactly the same result as the package start.

Anybody listening out there?

Regards,
Bernhard
Stephen Doud
Honored Contributor

Re: Package Start fails w/ IP configured

Possibly corrupt SG code... does the problem only occur on one node, or both?

How new is the cluster? If network reconfiguration could have taken place since the cluster creation, it may be necessary to rebuild the cluster (incorrect LAN / NIC info in the binary maybe).

Also, what version of Serviceguard is installed, and which version of Linux?
Bernhard Mueller
Honored Contributor

Re: Package Start fails w/ IP configured

Stephen,

Both nodes behave exactly the same way.
They are BL20p blades running SLES9 SP2, kernel 2.6.5-7.244-smp x86_64; MC/SG is 11.16.01.

Is this an unsupported config? If yes, is it known *not* to work?

(I bonded the network interfaces before the cluster setup. I then reverted to non-bonded NICs and re-created the cluster without NIC bonding; the result is the same.)

Regards,
Bernhard
John Bigg
Esteemed Contributor

Re: Package Start fails w/ IP configured

I suggest that you log a call with HP support here since cmcld has aborted. Any instances where cmcld aborts will need to be investigated by support.
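When logging such a call it can help to have just the abort lines at hand. The following is a minimal sketch, not an HP-provided tool: `cmcld_aborts` is a hypothetical helper name, and the pattern is only assumed from the log excerpts quoted earlier in this thread.

```shell
# Hypothetical helper: print only the cmcld "Aborting!" lines from
# syslog-style text on stdin. The pattern is assumed from the
# excerpts quoted in this thread, not from Serviceguard documentation.
cmcld_aborts() {
    grep -E 'cmcld\[[0-9]+\]: Aborting!'
}

LOG=/var/log/messages   # log file named in the original post
if [ -r "$LOG" ]; then
    cmcld_aborts < "$LOG" || echo "no cmcld abort lines found in $LOG"
fi
```

The surrounding lines (the failed select calls just before each abort) are probably worth sending along too, so this is only a quick way to find the timestamps.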

I presume that you have checked to make sure that the IP 10.10.4.201 is not already configured prior to the cmmodnet? Maybe add an ifconfig statement to display the information immediately before the call to cmmodnet to check.

However, even if there is some other problem, cmcld should not abort but should handle the situation gracefully.
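The pre-check suggested above could look roughly like the sketch below. It uses `ip -o -4 addr show` rather than ifconfig because its one-line-per-address output is easier to parse; that substitution, and the helper name `ip_in_listing`, are assumptions for illustration. The address and subnet are the ones from this thread.

```shell
# Hypothetical helper: succeeds if the address in $1 appears as an
# "inet" entry in `ip -o -4 addr show`-style text read from stdin.
ip_in_listing() {
    awk -v ip="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), a, "/")   # strip the /prefix length
                    if (a[1] == ip) found = 1
                }
        }
        END { exit(found ? 0 : 1) }'
}

PKG_IP=10.10.4.201      # package address from this thread
PKG_SUBNET=10.10.4.0    # package subnet from this thread

# Run this immediately before the cmmodnet call in the package script.
if command -v ip >/dev/null 2>&1 &&
   ip -o -4 addr show 2>/dev/null | ip_in_listing "$PKG_IP"; then
    echo "$PKG_IP is already configured on this node"
else
    echo "$PKG_IP is not configured; next step: cmmodnet -a -i $PKG_IP $PKG_SUBNET"
fi
```

The sketch only reports the state and deliberately does not call cmmodnet itself, since that is the command that brings the node down here.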
Bernhard Mueller
Honored Contributor

Re: Package Start fails w/ IP configured

John,

Unfortunately this is only a test case; we resorted to installing 32-bit Linux on the blades, which resolves the MC/SG issue.

However, it would be nice to know whether the issue is resolved on x86_64 with SG 11.16.02 (which I cannot test).

Regards,
Bernhard