
Mixed O/S Cluster - cmcheckconf Failing

 
Craig Johnson_1
Regular Advisor

Primary node is 11.23, second node is 11.31; both are running SG A.11.19.0.

# cmcheckconf -v -C /tmp/junk.out
Begin cluster verification...
Checking cluster file: /tmp/junk.out
Defaulting MAX_CONFIGURED_PACKAGES to 300.
Checking nodes ... Done
Checking existing configuration ... Done
Defaulting MAX_CONFIGURED_PACKAGES to 300.
Gathering storage information
Found 21 devices on node a300sua4
Found 11 devices on node a300sua8
Analysis of 32 devices should take approximately 5 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Found 8 volume groups on node a300sua4
Found 7 volume groups on node a300sua8
Analysis of 15 volume groups should take approximately 1 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Volume group /dev/vg21 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg21 is configured differently on node a300sua8 than on node a300sua4
Volume group /dev/vg31 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg31 is configured differently on node a300sua8 than on node a300sua4
Volume group /dev/vg32 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg32 is configured differently on node a300sua8 than on node a300sua4
Volume group /dev/vg33 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg33 is configured differently on node a300sua8 than on node a300sua4
Gathering network information
Beginning network probing (this may take a while)
Completed network probing
Failed to evaluate network
Gathering polling target information
cmcheckconf: Unable to reconcile configuration file /tmp/junk.out
with discovered configuration information.

Here are the entire contents of junk.out (comments removed):

CLUSTER_NAME a300cu38
HOSTNAME_ADDRESS_FAMILY IPV4
QS_HOST a0300qrmp6
QS_POLLING_INTERVAL 300000000
NODE_NAME a300sua4
NETWORK_INTERFACE lan4
STATIONARY_IP 10.20.209.224
NETWORK_INTERFACE lan1
HEARTBEAT_IP 169.254.2.224
NETWORK_INTERFACE lan3
HEARTBEAT_IP 169.254.1.224
NETWORK_INTERFACE lan0
NODE_NAME a300sua8
NETWORK_INTERFACE lan4
STATIONARY_IP 10.20.209.228
NETWORK_INTERFACE lan1
HEARTBEAT_IP 169.254.2.228
NETWORK_INTERFACE lan3
HEARTBEAT_IP 169.254.1.228
NETWORK_INTERFACE lan0
MEMBER_TIMEOUT 14000000
AUTO_START_TIMEOUT 600000000
NETWORK_POLLING_INTERVAL 2000000
NETWORK_FAILURE_DETECTION INOUT
NETWORK_AUTO_FAILBACK YES
SUBNET 169.254.2.0
IP_MONITOR OFF
SUBNET 169.254.1.0
IP_MONITOR OFF
SUBNET 10.20.209.0
IP_MONITOR OFF
MAX_CONFIGURED_PACKAGES 300
VOLUME_GROUP /dev/vg21
VOLUME_GROUP /dev/vg31
VOLUME_GROUP /dev/vg32
VOLUME_GROUP /dev/vg33


Why would it be failing to evaluate the network setup? I have triple-checked those IPs, made sure the entries are also in the hosts file, and confirmed that all are pingable from each of the nodes. The junk.out file was created using cmquerycl, then edited slightly (cluster name, added quorum server, commented out disk lock, corrected heartbeat/stationary errors from cmquerycl, turned off IP-level monitoring).
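For reference, the file was generated with a command along these lines (options approximate; -C names the ASCII template to write):

# cmquerycl -v -n a300sua4 -n a300sua8 -C /tmp/junk.out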

Am I missing something?
12 REPLIES
Steven E. Protter
Exalted Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

Shalom,

I don't think a mixed OS environment for SG is supported.

For example, if you want to run Oracle server on both nodes, the binaries would be different, and there is no guarantee the database would fail over without corrupting your data.

Where did you get the idea this was supported?

You can, with the same OS, run mixed versions of SG.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
melvyn burnard
Honored Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

We do support mixed OS versions with Serviceguard, as follows:
* Beginning with Serviceguard A.11.18 (and A.11.19), a Serviceguard cluster may contain a mix of nodes running HP-UX 11i v2 and 11i v3, with the following restrictions, requirements and recommendations:
o It is strongly recommended that all nodes are running equivalent Serviceguard patch levels. For example, for Serviceguard A.11.18, PHSS_38423 or later for 11i v2 and PHSS_38424 or later for 11i v3. If nodes in the cluster have different Serviceguard patch levels, then any new functionality introduced in the later patches may not be available in the cluster.
o Some 11i v3 features cannot be used in a mixed OS cluster, such as LVM 2.0 volume groups and Agile I/O addressing. 11i v3 Native Multipathing is supported (a quick version check follows this list).
o SGeRAC is not supported in a mixed OS cluster (as Oracle does not support that).
o All the nodes on a given HP-UX version should be running the same Fusion release, at the same patch level, that is, the 11i v2 nodes should all be running the same 11i v2 Fusion release at the same patch level, and the 11i v3 nodes should all be running the same 11i v3 Fusion release at the same patch level.
o It is your responsibility to ensure that your other applications work properly in a mixed OS cluster.
o Refer to the September 2008 (or later) revision of the Serviceguard A.11.18 Release Notes for additional information on mixed OS clusters. For A.11.19, refer to the Serviceguard A.11.19 Release Notes.
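Given the LVM 2.0 restriction, a quick check is to confirm each shared volume group is a version 1.0 VG on the 11i v3 node; on 11i v3, vgdisplay reports a VG Version field (exact wording may vary by patch level, and 11i v2 vgdisplay has no such field):

# vgdisplay /dev/vg21 | grep -i version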

My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
melvyn burnard
Honored Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

You also appear to have configuration issues with the shared VGs; you should sort those out.
As for the networking, do they all have the same network mask?
What happens if you change IP_MONITOR to ON?
Was anything logged in either node's syslog.log?

You may need to enable some enhanced logging for this.
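For the IP_MONITOR test, the stanza in the ASCII file would look something like this, assuming a reachable gateway on the public subnet as the polling target:

SUBNET 10.20.209.0
IP_MONITOR ON
POLLING_TARGET 10.20.209.1
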
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
melvyn burnard
Honored Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

Another suggestion: if you create the ASCII file again and ONLY change the cluster name, test that configuration with cmcheckconf. It may be that there is an inadvertent typo in your config changes.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Craig Johnson_1
Regular Advisor

Re: Mixed O/S Cluster - cmcheckconf Failing

I could try that; however, the networks are discovered incorrectly (heartbeat and stationary are reversed), and there is that pesky FIRST_CLUSTER_LOCK_VG entry and a missing QS_HOST. I have no choice but to edit the file, at least a little.
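For what it's worth, the edits amount to something like this (the lock VG name below is just a placeholder):

QS_HOST a0300qrmp6
QS_POLLING_INTERVAL 300000000
# FIRST_CLUSTER_LOCK_VG /dev/vgXX
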
Craig Johnson_1
Regular Advisor

Re: Mixed O/S Cluster - cmcheckconf Failing

OK, I did try what you suggested, but I added "-c a300cu38 -q a0300qrmp6" to the query. It failed unless I removed the "-c a300cu38". So I had to edit the cluster name (only) and then ran cmcheckconf. Same error as before.

Gathering network information
Beginning network probing (this may take a while)
Completed network probing
Failed to evaluate network
Gathering polling target information
cmcheckconf: Unable to reconcile configuration file /tmp/asciinew.out
with discovered configuration information.
Craig Johnson_1
Regular Advisor

Re: Mixed O/S Cluster - cmcheckconf Failing

Now I also tried correcting the heartbeat/stationary stuff and it still fails.

This was our TEST cluster. I have since managed to get this working on two other clusters, one DEV and one QA. There is something fishy about this one.
melvyn burnard
Honored Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

Sounds weird. Verify that the network config is the same, especially things like subnet masks.
Run ifconfig on each LAN interface and compare the two servers' outputs.
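Something along these lines on each node, assuming lan0 through lan4 as in your config:

# netstat -in
# for i in 0 1 2 3 4; do ifconfig lan$i; done
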
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Emil Velez
Honored Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

I remember something in the documentation saying that you should not make any configuration changes to your cluster while it is a mixed cluster (either different OS versions or Serviceguard versions), and that the commands to start and stop packages must be run on the node which has the newest version of the OS and Serviceguard.
John Bigg
Esteemed Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

Emil, this is true when you have different versions of Serviceguard during a rolling upgrade, but the commands will stop you from making configuration changes in this situation. This is NOT true when there are mixed versions of HP-UX.

Back to the original problem. First, the LVM errors are fatal and should be fixed. I do not think they have an impact on the network probing, but there is a small possibility they do. Therefore, fix them before moving on to the network errors.
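To see where the VG definitions diverge, comparing output like this from each node may help (device file names will naturally differ between the 11i v2 and 11i v3 nodes):

# vgdisplay -v /dev/vg21
# strings /etc/lvmtab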

Do the cmclconfd daemons log any errors in the syslog files on any of the nodes? This is what I would look at first.
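For example, assuming the default syslog location:

# grep -E 'cmclconfd|cmcld' /var/adm/syslog/syslog.log
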
Stephen Doud
Honored Contributor

Re: Mixed O/S Cluster - cmcheckconf Failing

During testing:
mv /etc/nsswitch.conf /etc/nsswitch.conf.ORIG
cp /etc/nsswitch.files /etc/nsswitch.conf
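With the files-only template in place, the hosts entry in nsswitch.conf should read simply (assuming the stock HP-UX template):

hosts: files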

Ensure -all- the fixed IPs that are assigned to NICs on both nodes are listed in /etc/hosts and aliased to the simple hostname of the sponsoring host. This is crucial (and documented in the Managing Serviceguard manual).
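Using the addresses from your ASCII file, that means /etc/hosts on both nodes should carry entries along these lines (assuming the simple hostnames match the node names):

10.20.209.224   a300sua4
169.254.2.224   a300sua4
169.254.1.224   a300sua4
10.20.209.228   a300sua8
169.254.2.228   a300sua8
169.254.1.228   a300sua8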


The -c option to cmquerycl causes SG to include existing cluster binary configuration information, which may not be accurate.
What does the following produce?
cmquerycl -v -w full -n a300sua4 -n a300sua8
Craig Johnson_1
Regular Advisor

Re: Mixed O/S Cluster - cmcheckconf Failing

$ cmquerycl -v -w full -n a300sua4 -n a300sua8
Gathering storage information
Found 94 devices on node a300sua4
Found 99 devices on node a300sua8
Analysis of 193 devices should take approximately 11 seconds
0%----10%----20%----30%----40%----50%----60%----70%----80%----90%----100%
Note: Disks were discovered which are not in use by either LVM or VxVM.
Use pvcreate(1M) to initialize a disk for LVM or,
use vxdiskadm(1M) to initialize a disk for VxVM.
Volume group /dev/vg21 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg21 is configured differently on node a300sua8 than on node a300sua4
Volume group /dev/vg31 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg31 is configured differently on node a300sua8 than on node a300sua4
Volume group /dev/vg32 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg32 is configured differently on node a300sua8 than on node a300sua4
Volume group /dev/vg33 is configured differently on node a300sua4 than on node a300sua8
Volume group /dev/vg33 is configured differently on node a300sua8 than on node a300sua4
Gathering network information
Beginning network probing (this may take a while)
Completed network probing
Gathering polling target information

Node Names: a300sua4
a300sua8

Bridged networks (full probing performed):

1 lan1 (a300sua4)
lan1 (a300sua8)

2 lan0 (a300sua4)
lan3 (a300sua4)
lan3 (a300sua8)

3 lan2 (a300sua4)
lan2 (a300sua8)

4 lan4 (a300sua4)
lan4 (a300sua8)
lan0 (a300sua8)

IP subnets:

IPv4:

169.254.2.0 lan1 (a300sua4)
lan1 (a300sua8)

10.20.37.0 lan2 (a300sua4)
lan2 (a300sua8)

169.254.1.0 lan3 (a300sua4)
lan3 (a300sua8)

10.20.209.0 lan4 (a300sua4)
lan4 (a300sua8)

IPv6:

Possible Heartbeat IPs:

IPv4:

169.254.2.0 169.254.2.224 (a300sua4)
169.254.2.228 (a300sua8)

10.20.37.0 10.20.37.124 (a300sua4)
10.20.37.228 (a300sua8)

169.254.1.0 169.254.1.224 (a300sua4)
169.254.1.228 (a300sua8)

10.20.209.0 10.20.209.224 (a300sua4)
10.20.209.228 (a300sua8)

IPv6:

Route Connectivity (full probing performed):

IPv4:

1 169.254.2.0

2 10.20.37.0

3 169.254.1.0

4 10.20.209.0

Possible IP Monitor Subnets:

IPv4:

10.20.209.0 Polling Target 10.20.209.1

IPv6:

Possible Cluster Lock Devices:

NO CLUSTER LOCK: 28 seconds

LVM volume groups:

/dev/vg00 a300sua4

/dev/vg01 a300sua4

/dev/vg21 a300sua4
a300sua8

/dev/vg09 a300sua4

/dev/vg40 a300sua4
a300sua8

/dev/vg31 a300sua4
a300sua8

/dev/vg32 a300sua4
a300sua8

/dev/vg33 a300sua4
a300sua8

/dev/vg00 a300sua8

/dev/vg01 a300sua8

LVM physical volumes:

/dev/vg00
/dev/dsk/c1t2d0s2 0/4/1/0.0.0.2.0 a300sua4

/dev/vg01
/dev/dsk/c1t0d0 0/4/1/0.0.0.0.0 a300sua4
/dev/dsk/c1t1d0 0/4/1/0.0.0.1.0 a300sua4

/dev/vg21
/dev/dsk/c29t11d5 0/7/1/0.98.78.19.2.11.5 a300sua4
/dev/dsk/c28t11d5 0/3/1/0.97.125.19.2.11.5 a300sua4

/dev/disk/disk90 64000/0xfa00/0x2e a300sua8

/dev/vg09
/dev/dsk/c34t0d1 0/3/1/0.99.80.19.0.0.1 a300sua4
/dev/dsk/c36t0d1 0/7/1/0.100.80.19.0.0.1 a300sua4
/dev/dsk/c40t0d1 0/3/1/0.99.5.19.0.0.1 a300sua4
/dev/dsk/c41t0d1 0/7/1/0.100.87.19.0.0.1 a300sua4

/dev/vg40
/dev/dsk/c41t0d0 0/7/1/0.100.87.19.0.0.0 a300sua4
/dev/dsk/c34t0d0 0/3/1/0.99.80.19.0.0.0 a300sua4
/dev/dsk/c36t0d0 0/7/1/0.100.80.19.0.0.0 a300sua4
/dev/dsk/c40t0d0 0/3/1/0.99.5.19.0.0.0 a300sua4

/dev/dsk/c3t0d0 0/3/1/0.99.80.19.0.0.0 a300sua8
/dev/dsk/c1t0d0 0/7/1/0.100.80.19.0.0.0 a300sua8
/dev/dsk/c5t0d0 0/3/1/0.99.5.19.0.0.0 a300sua8
/dev/dsk/c7t0d0 0/7/1/0.100.87.19.0.0.0 a300sua8

/dev/vg31
/dev/dsk/c28t13d4 0/3/1/0.97.125.19.2.13.4 a300sua4
/dev/dsk/c29t13d4 0/7/1/0.98.78.19.2.13.4 a300sua4

/dev/disk/disk91 64000/0xfa00/0x2f a300sua8

/dev/vg32
/dev/dsk/c16t2d2 0/3/1/0.97.125.19.8.2.2 a300sua4
/dev/dsk/c11t8d0 0/7/1/0.98.78.19.8.8.0 a300sua4

/dev/disk/disk98 64000/0xfa00/0x36 a300sua8

/dev/vg33
/dev/dsk/c28t13d5 0/3/1/0.97.125.19.2.13.5 a300sua4
/dev/dsk/c15t6d5 0/3/1/0.97.125.19.6.6.5 a300sua4
/dev/dsk/c29t13d5 0/7/1/0.98.78.19.2.13.5 a300sua4
/dev/dsk/c10t2d0 0/7/1/0.98.78.19.5.2.0 a300sua4

/dev/disk/disk94 64000/0xfa00/0x32 a300sua8
/dev/disk/disk92 64000/0xfa00/0x30 a300sua8

/dev/vg00
/dev/disk/disk52_p2 64000/0xfa00/0x6 a300sua8

/dev/vg01
/dev/disk/disk53 64000/0xfa00/0x7 a300sua8

LVM logical volumes:

Volume groups on a300sua4:

Volume groups on a300sua8: