Re: Error message starting cluster

G. Vrijhoeven · ‎04-15-2006

Hi all,

We started our MC/SG environment and received the following message:

Apr 16 00:03:07 prc60b01 CM-CMD[24604]: cmruncl
Apr 16 00:03:08 prc60b01 CM-CMD[24604]: Error: lan1 is a slave device which you are not allowed to reference directly from a Serviceguard configuration.
Apr 16 00:03:08 prc60b01 CM-CMD[24604]: cmruncl: Failed to validate the network configuration as reported above but will try to start the cluster anyway.
Apr 16 00:03:08 prc60b01 cmclconfd[24605]: Request from root on node prc60b01 to start the cluster on this node
Apr 16 00:03:08 prc60b01 cmcld: Logging level changed to level 0.
Apr 16 00:03:09 prc60b01 cmcld: Daemon Initialization - Maximum number of packages supported for this incarnation is 40.
Apr 16 00:03:09 prc60b01 cmcld: Global Cluster Information:
Apr 16 00:03:09 prc60b01 cmcld: Heartbeat Interval is 2.00 seconds.
Apr 16 00:03:09 prc60b01 cmcld: Node Timeout is 10.00 seconds.
Apr 16 00:03:09 prc60b01 cmcld: Network Polling Interval is 2.00 seconds.
Apr 16 00:03:09 prc60b01 cmcld: Auto Start Timeout is 600.00 seconds.
Apr 16 00:03:09 prc60b01 cmcld: Failover Optimization is disabled.
Apr 16 00:03:09 prc60b01 cmcld: Information Specific to node prc60b01:
Apr 16 00:03:09 prc60b01 cmcld: lan1 0x00306e468d9c x.x.x.x bridged net:2
Apr 16 00:03:09 prc60b01 cmcld: lan5 0x00306e468dc8 standby bridged net:2
Apr 16 00:03:09 prc60b01 cmcld: lan0 0x00306ea76bb6 x.x.x.x bridged net:1
Apr 16 00:03:09 prc60b01 cmcld: lan6 0x00306e468dc9 standby bridged net:1

The cluster started fine, but I cannot find an explanation for the error message.

PS: the x.x.x.x is a real IP address.

Hope someone can help me.

Regards,

Gideon

Sandman! · ‎04-15-2006

Could you post the output of "netstat -i"?

thanks!

melvyn burnard · ‎04-15-2006

Run the cmscancl utility and post the output file as an attachment.
It looks like something has been set to do something on the lan1 NIC.

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!

Sยภเl Kย๓คг · ‎04-15-2006

Hi,

Please check ifconfig lan1 and post the output of netstat -i, netstat -rn and cmviewconf. Also check the LAN switch port for lan1: is it connected to the same VLAN?

Regards,
Sunil

Your imagination is the preview of your life's coming attractions

Sยภเl Kย๓คг · ‎04-15-2006

Hi,

Try removing the IP of lan1 and run cmcheckconf

ifconfig lan1 unplumb

Regards,
Sunil

Your imagination is the preview of your life's coming attractions

Sandman! · ‎04-16-2006

Hi Gideon,

Besides posting the output of "netstat -in" it would be helpful to include the output of the following:

# cat /etc/rc.config.d/netconf
# ifconfig lan1

IMHO, if lan1 is a failover LAN card, it should not be configured, i.e. no IP and no netmask. It should still be plumbed, however, so that if the primary LAN card fails, the standby LAN card can take over its IP/netmask value.

cheers!

Stephen Doud · ‎04-17-2006

If lan1 is supposed to be a standby LAN NIC, the most probable cause is that lan1 has been assigned an IP of 0.0.0.0 and a subnet mask that is not 0 (use ifconfig lan1 to determine its configuration).
To correct this, unplumb and re-plumb it and then check /etc/rc.config.d/netconf to see if lan1 has been configured.
# ifconfig lan1 unplumb
# ifconfig lan1 plumb

If such is the case, remove the references to lan1 from the netconf file. Some older versions of Ignite-UX have been known to add such an invalid entry to the netconf file for standby NICs.
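
The netconf check described above could be scripted roughly as follows. This is only a sketch: `netconf_refs` is a hypothetical helper name (not an HP-UX or Serviceguard command), and it assumes netconf entries have the `INTERFACE_NAME[n]="lanX"` form.

```shell
# Hypothetical helper (not part of Serviceguard or HP-UX).
# Usage: netconf_refs <netconf-file> <nic>
# Prints, with line numbers, each INTERFACE_NAME[n] entry that names <nic>.
# Exit status is 0 if at least one entry was found, 1 otherwise.
netconf_refs() {
    # Anchored match, so "lan1" does not also match "lan10".
    grep -n -E "^[[:space:]]*INTERFACE_NAME\[[0-9]+]=\"?$2\"?[[:space:]]*\$" "$1"
}
```

For example, `netconf_refs /etc/rc.config.d/netconf lan5` prints nothing and returns 1 when lan5 is not listed. If an entry is found, the IP_ADDRESS, SUBNET_MASK and other lines with the same index belong to that interface as well.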

G. Vrijhoeven · ‎04-17-2006

Hi All,

The answers:

prc60b01:/root # netstat -in
Name    Mtu   Network         Address         Ipkts     Ierrs Opkts     Oerrs Coll
lan1:7  1500  145.78.160.128  145.78.160.154  2610045   0     2566284   0     0
lan1:6  1500  145.78.160.128  145.78.160.153  1754706   0     1139615   0     0
lan1:1  1500  145.78.160.128  145.78.160.141  50180     0     998       0     0
lan1    1500  145.78.160.128  145.78.160.132  32289971  0     46969912  0     0
lan0    1500  145.78.190.0    145.78.190.12   47632345  0     86377179  0     0
lo0     4136  127.0.0.0       127.0.0.1       49529303  0     49529335  0     0
lan1:3  1500  145.78.160.128  145.78.160.144  185074    0     13        0     0
lan1:2  1500  145.78.160.128  145.78.160.142  6544373   0     2489542   0     0
lan6*   1500  none            none            0         0     0         0     0
lan1:5  1500  145.78.160.128  145.78.160.147  978814    0     780422    0     0
lan5*   1500  none            none            0         0     0         0     0
lan1:4  1500  145.78.160.128  145.78.160.146  276331    0     233103    0     0

The netconf part:

INTERFACE_NAME[0]="lan1"
IP_ADDRESS[0]="145.78.160.132"
SUBNET_MASK[0]="0xffffff80"
BROADCAST_ADDRESS[0]="145.78.160.255"
INTERFACE_STATE[0]=""
DHCP_ENABLE[0]=0

INTERFACE_NAME[1]="lan0"
IP_ADDRESS[1]="145.78.190.12"
SUBNET_MASK[1]="0xffffff00"
BROADCAST_ADDRESS[1]="145.78.190.0"
INTERFACE_STATE[1]=""
DHCP_ENABLE[1]=0

prc60b01:/root # ifconfig lan1
lan1: flags=843
inet 145.78.160.132 netmask ffffff80 broadcast 145.78.160.255

As you can see in the syslog, lan1 is the primary interface and lan5 is the standby. These messages are logged while the cluster is coming up; the error message appears before the cluster starts.
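
As a rough cross-check, the interfaces that carry no address in the netstat -in listing (lan5 and lan6 here) can be pulled out and compared with the ones syslog reports as standby. A minimal sketch, assuming the nine-column layout shown above; `unaddressed_nics` is a hypothetical helper name:

```shell
# Hypothetical helper: read `netstat -in` output on stdin and print the
# names of interfaces whose Address column (field 4) is "none".
unaddressed_nics() {
    awk '$1 != "Name" && $4 == "none" {
        sub(/\*$/, "", $1)   # drop the trailing "*" marker from the name
        print $1
    }'
}
```

Usage would be `netstat -in | unaddressed_nics`.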

scancl.out is attached.

The cluster is up and running and fully in production, so I am not able to change settings that cause downtime.

Regards,

Gideon

Stephen Doud · ‎04-18-2006

Please attach scancl.out!
I suspect that the lan1 reference in the cluster binary file no longer matches the current configuration/connectivity of lan1.

Sandman! · ‎04-18-2006

The output of cmviewconf would be helpful too.

thanks

G. Vrijhoeven · ‎04-18-2006

scancl.out.

Stephen Doud · ‎04-18-2006

Use ifconfig on each of the standby NICs on each node in the cluster and verify that they have an IP of 0.0.0.0 and, in particular, a subnet mask of 0 and not some other value.
If misconfigured, use:
# ifconfig lanX unplumb
# ifconfig lanX plumb

Also check /etc/rc.config.d/netconf to see if the standby NIC is listed. Remove any references to the standby NIC from netconf (in the netconf file on ANY of the 4 cluster nodes).
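
The ifconfig check above could be automated along these lines. This is only a sketch: `standby_nic_ok` is a hypothetical helper name, and it assumes ifconfig prints an "inet <address> netmask <mask>" line in the same form as the lan1 output earlier in the thread. How an unconfigured NIC's netmask is actually printed may differ by release, so treat the zero test as an assumption.

```shell
# Hypothetical helper: read `ifconfig lanX` output on stdin and report
# whether the inet address is 0.0.0.0 and the netmask is zero.
# Prints "ok" and returns 0 if so; otherwise prints what it found and returns 1.
standby_nic_ok() {
    awk '
        {
            # Pick up the word that follows "inet" and "netmask" on any line.
            for (i = 1; i < NF; i++) {
                if ($i == "inet")    ip = $(i + 1)
                if ($i == "netmask") mask = $(i + 1)
            }
        }
        END {
            # Assumption: a zero netmask is printed as one or more "0" digits.
            if (ip == "0.0.0.0" && mask ~ /^0+$/) { print "ok"; exit 0 }
            print "misconfigured: inet=" ip " netmask=" mask
            exit 1
        }'
}
```

Usage would be `ifconfig lan5 | standby_nic_ok`, repeated for each standby NIC on each node.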
