*** cmviewcl output when node hostA is up and hostB is still powered off ***

[root@hostA]# cmviewcl
cmviewcl : Cannot talk to all the nodes. Cluster does not appear to be up

CLUSTER        STATUS
foo_cluster    unknown

  NODE         STATUS       STATE
  hostA        down         unknown
  hostB        unknown      unknown

UNOWNED_PACKAGES

  PACKAGE      STATUS       STATE        AUTO_RUN     NODE
  foopkg       unknown      unknown

*** cmviewcl output when both nodes are up and node hostB is trying to form a cluster ***

[root@hostA]# cmviewcl

CLUSTER        STATUS
foo_cluster    starting
cmviewcl : Unable to query the package information: cluster may be reforming, try again: Text file busy.

*** entries in /var/log/messages through that 10 minute period ***

Jan 27 17:13:30 hostB kernel: Deadman: 1.0 minor: 63
Jan 27 17:13:30 hostB cmcluster: Created /dev/deadman c 10 63
Jan 27 17:13:30 hostB xinetd[1625]: xinetd Version 2.3.12 started with libwrap loadavg options compiled in.
Jan 27 17:13:30 hostB xinetd[1625]: Started working: 5 available services
Jan 27 17:13:30 hostB CM-CMD[1729]: /usr/local/cmcluster/bin/cmrunnode -v
Jan 27 17:13:30 hostB cmclconfd[1736]: Request from node hostB to start the cluster on this node.
Jan 27 17:13:30 hostB cmclconfd[1736]: Executing "/usr/local/cmcluster/bin/cmcld" for node hostB
Jan 27 17:13:30 hostB cmcld: Logging level changed to level 0.
Jan 27 17:13:30 hostB cmcld: Logging level changed to level 0.
Jan 27 17:13:30 hostB cmcld: Daemon Initialization - Maximum number of packages supported for this incarnation is 5.
Jan 27 17:13:30 hostB cmcld: Global Cluster Information:
Jan 27 17:13:30 hostB cmcld: Heartbeat Interval is 1 seconds.
Jan 27 17:13:30 hostB cmcld: Node Timeout is 5 seconds.
Jan 27 17:13:30 hostB cmcld: Network Polling Interval is 2 seconds.
Jan 27 17:13:30 hostB cmcld: Auto Start Timeout is 600 seconds.
Jan 27 17:13:30 hostB cmcld: Information Specific to node hostB:
Jan 27 17:13:30 hostB cmcld: Cluster lock disk: /dev/sda1.
Jan 27 17:13:30 hostB cmcld: Quorum Server: localhost.
Jan 27 17:13:30 hostB cmcld: bond0 0x00:02:a5:4f:66:d5 192.168.6.102 bridged net:1
Jan 27 17:13:30 hostB cmcld: bond1 0x00:02:a5:4f:66:d4 192.168.5.104 bridged net:2
Jan 27 17:13:30 hostB cmcld: bond2 0x00:02:a5:4f:6d:54 192.168.7.102 bridged net:3
Jan 27 17:13:30 hostB cmcld: Heartbeat Subnet: 192.168.6.0
Jan 27 17:13:30 hostB cmcld: The maximum # of concurrent local connections to the daemon that will be supported is 1003.
Jan 27 17:13:30 hostB cmcld: CLUSTER_RUNTIME_ID is set to 0
Jan 27 17:13:30 hostB cmcld: Quorum server port number is 1238
Jan 27 17:13:30 hostB cmcld: qm_cluster_lock_config:my_appl_id = hostB old_appl_id = 2
Jan 27 17:13:30 hostB cmcld: Quorum server probe interval is 1800000000
Jan 27 17:13:30 hostB cmcld: Quorum server probe timeout interval is 17000000
Jan 27 17:13:30 hostB cmcld: Quorum server request timeout interval is 17000000
Jan 27 17:13:30 hostB cmcld: Lock LUN Device is /dev/sda1
Jan 27 17:13:31 hostB cmcld: The quorum device localhost is being initialized.
Jan 27 17:13:31 hostB cmcld: rcomm health: Initializing timeout to 120000000 microseconds
Jan 27 17:13:31 hostB cmlocklund[1766]: Total allocated: 540672 bytes, used: 0 bytes, unused 540672 bytes
Jan 27 17:13:31 hostB cmlocklund[1766]: Warning: Unable to determine local domain name for hostB
Jan 27 17:13:31 hostB cmlocklund[1766]: Port number returned by locklund_setup: 32790
Jan 27 17:13:31 hostB cmcld: Lock LUN initialized (port = 32790).
Jan 27 17:13:31 hostB cmlocklund[1766]: Disk device /dev/sda1 has been associated (bound) to /dev/raw/raw1.
Jan 27 17:13:31 hostB cmlocklund[1766]: Server is up and waiting for connections at port 32790
Jan 27 17:13:31 hostB cmcld: Total allocated: 2469888 bytes, used: 87208 bytes, unused 2382680 bytes
Jan 27 17:13:31 hostB cmcld: Starting cluster management protocols.
Jan 27 17:13:31 hostB cmcld: Attempting to form a new cluster
Jan 27 17:13:37 hostB cmcld: Attempting to form a new cluster
Jan 27 17:13:41 hostB cmcld: The quorum device localhost is up.
Jan 27 17:13:44 hostB cmcld: Attempting to form a new cluster
Jan 27 17:14:03 hostB last message repeated 3 times
Jan 27 17:14:06 hostB sshd(pam_unix)[1793]: session opened for user root by (uid=0)
Jan 27 17:14:10 hostB cmcld: Attempting to form a new cluster
Jan 27 17:14:42 hostB last message repeated 5 times
Jan 27 17:14:49 hostB cmcld: Attempting to form a new cluster
Jan 27 17:14:49 hostB cmcld: Attempting to form a new cluster
Jan 27 17:14:52 hostB sshd(pam_unix)[1880]: session opened for user root by (uid=0)
Jan 27 17:14:55 hostB cmcld: Attempting to form a new cluster
Jan 27 17:15:28 hostB last message repeated 5 times
Jan 27 17:16:33 hostB last message repeated 10 times
Jan 27 17:17:38 hostB last message repeated 10 times
Jan 27 17:18:43 hostB last message repeated 10 times
Jan 27 17:19:48 hostB last message repeated 10 times
Jan 27 17:20:53 hostB last message repeated 10 times
Jan 27 17:21:58 hostB last message repeated 10 times
Jan 27 17:23:03 hostB last message repeated 10 times
Jan 27 17:23:29 hostB last message repeated 4 times
Jan 27 17:23:31 hostB cmcld: Cluster formation failed
Jan 27 17:23:31 hostB cmcld: Reason: Ran out of time for automatically joining a cluster
Jan 27 17:23:31 hostB cmcld: Unable to contact all nodes in the cluster, thus it is not
Jan 27 17:23:31 hostB cmcld: possible to join the cluster at this time.
Jan 27 17:23:31 hostB cmcld: If the cluster is not running, use the cmruncl command to
Jan 27 17:23:31 hostB cmcld: start it. If the cluster is running on other nodes, verify
Jan 27 17:23:31 hostB cmcld: this node's ability to send messages to the other nodes,
Jan 27 17:23:31 hostB cmcld: then re-issue the cmrunnode command.
Jan 27 17:23:32 hostB cmsrvassistd[1763]: Service assistant daemon halted.
Jan 27 17:23:33 hostB cmcld: This node (hostB) has ceased cluster activities.
Jan 27 17:23:33 hostB cmcld: Daemon exiting
Jan 27 17:23:37 hostB cmcluster: ERROR: Unable to join cluster
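
The 10 minute period in the log appears to line up with the "Auto Start Timeout is 600 seconds" setting: the first "Attempting to form a new cluster" entry is at 17:13:31 and "Cluster formation failed" is at 17:23:31. A minimal Python sketch for checking that against a log, assuming the standard syslog line layout shown above and English month abbreviations; the names summarize_join_attempt and parse_timestamp are made up for this example and are not part of Serviceguard:

```python
import re
from datetime import datetime

# A few entries copied from the log above. To check a full log, pass the
# contents of /var/log/messages instead.
SAMPLE_LOG = """\
Jan 27 17:13:30 hostB cmcld: Auto Start Timeout is 600 seconds.
Jan 27 17:13:31 hostB cmcld: Attempting to form a new cluster
Jan 27 17:13:37 hostB cmcld: Attempting to form a new cluster
Jan 27 17:23:29 hostB last message repeated 4 times
Jan 27 17:23:31 hostB cmcld: Cluster formation failed
Jan 27 17:23:31 hostB cmcld: Reason: Ran out of time for automatically joining a cluster
"""

# timestamp, host, rest of the message
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+(.*)$")
TIMEOUT_RE = re.compile(r"Auto Start Timeout is (\d+) seconds")


def parse_timestamp(text, year=2000):
    # syslog omits the year; a fixed placeholder year is enough for
    # differences within the same day.
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def summarize_join_attempt(log_text):
    """Return the configured timeout and the seconds spent trying to join."""
    timeout = first_attempt = failed_at = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, _host, message = match.groups()
        found = TIMEOUT_RE.search(message)
        if found:
            timeout = int(found.group(1))
        elif "Attempting to form a new cluster" in message:
            if first_attempt is None:
                first_attempt = parse_timestamp(stamp)
        elif "Cluster formation failed" in message:
            failed_at = parse_timestamp(stamp)
    elapsed = None
    if first_attempt is not None and failed_at is not None:
        elapsed = int((failed_at - first_attempt).total_seconds())
    return {"timeout": timeout, "elapsed": elapsed}


if __name__ == "__main__":
    print(summarize_join_attempt(SAMPLE_LOG))
```

If the elapsed time equals the timeout, as it does here, hostB most likely gave up because the auto start window expired without all nodes being reachable, which is what the "Ran out of time for automatically joining a cluster" entry says.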