cmruncl error!!

 
SOLVED
Lorenzo Facello
Valued Contributor

cmruncl error!!

Hi,
when I try to start my cluster with
cmruncl -v
I receive this error
(syslog errors are attached).
I checked:
.rhosts ok
/etc/hosts ok
LAN configuration ok
....

Thanks in advance
L.
Lorenzo Facello
Valued Contributor

Re: cmruncl error!!

....this is the cmruncl error:
k360:/etc/cmcluster>cmruncl -v
Successfully started $SGLBIN/cmcld on k360.
Successfully started $SGLBIN/cmcld on k260.
cmruncl : Waiting for cluster to form............
cmruncl : Cluster did not form. Check the syslog file for information.
Kent Ostby
Honored Contributor

Re: cmruncl error!!

Lorenzo --

I have seen this problem when the cluster was set up to autostart and the autostart failed for some reason.

While it was still trying to autostart, cmruncl was run, so two copies of cmcld were trying to run at the same time.

What I would suggest is to wait 10 minutes (your timeout is 600 seconds) and then manually check to see if cmcld is running on each node with:

ps -ef | grep cmcld

If, after 10 to 15 minutes, it is running on only one node, reboot that node.

If it's running on both nodes, then run cmviewcl and see if the cluster finally started.

If it's not running on either node, then run:

cmruncl
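
The check above can be sketched as a small shell helper. This is only an illustration: the name `cmcld_running` is made up here and is not a Serviceguard command. It reads a `ps -ef` listing from standard input, so it can be tried without a cluster. The `[c]mcld` pattern is a common trick that keeps the `grep` process itself from showing up as a match.

```shell
# Hypothetical helper (not part of Serviceguard): returns success if the
# ps listing on standard input contains a cmcld process.
cmcld_running() {
    # The bracketed first letter stops grep from matching its own entry.
    grep -q '[c]mcld'
}

# Usage on each node:
#   if ps -ef | cmcld_running; then
#       echo "cmcld is running"
#   else
#       echo "cmcld is not running"
#   fi
```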

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Lorenzo Facello
Valued Contributor

Re: cmruncl error!!

Hi Kent,
if I type
ps -ef | grep cmcld
...nothing!! on both nodes,
even after 10 min.
Regards
L.
Carsten Krege
Honored Contributor

Re: cmruncl error!!

To resolve the problem, you simply need to follow the instructions given in the syslog that you attached:

Jan 26 10:44:16 k360 cmcld: To recover from this situation
Jan 26 10:44:17 k360 cmcld: reboot this system:
Jan 26 10:44:17 k360 cmcld: shutdown -r (stops package components)
Jan 26 10:44:18 k360 cmclconfd[2897]: Unable to lookup cluster information in CDB: No such file or directory
Jan 26 10:44:18 k360 cmcld: After ensuring that no package applications
Jan 26 10:44:18 k360 cmcld: or resources are active, you can override this data
Jan 26 10:44:19 k360 cmcld: integrity protection by issuing the following commands
Jan 26 10:44:19 k360 cmcld: (which allow the daemon to start without rebooting):
Jan 26 10:44:20 k360 cmcld: rm /var/adm/cmcluster/.cm_start_time
Jan 26 10:44:20 k360 cmcld: touch /var/adm/cmcluster/.cm_start_time

The reason for this problem is that you probably ran a 1-node cluster before and then issued a reboot. The important point is that Serviceguard (SG) was not halted cleanly (cmhaltnode, cmhaltcl) when it was a 1-node cluster.

Because SG's safety timer mechanism is disabled for 1-node clusters (no TOC, i.e. Transfer of Control system reset, is initiated when cmcld stops running), there is no way for SG to determine whether packages were halted cleanly before it is started the next time. Consider the following situation:

SG is running on node A, and the other node is halted. All packages and volume groups are active on node A. Now cmcld aborts with an error, and the volume groups are still active. Because this is a 1-node cluster, node A doesn't perform a TOC, i.e. the applications are still alive. If you could restart the cluster on node A, this could cause data corruption, because you would restart a package that is already running. Therefore SG asks you to double-check that all packages and resources are down.

You shouldn't see the problem if you always halt your cluster nodes cleanly with cmhaltnode and cmhaltcl.

Carsten
-------------------------------------------------------------------------------------------------
In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move. -- HhGttG
Lorenzo Facello
Valued Contributor

Re: cmruncl error!!

OK Kent,
I understand... but
can you explain which processes, and what kind of
processes, still remain active?
I ask also because I already tried this morning
to shutdown -r the nodes... and nothing changed.
Thanks a lot
Lorenzo
Kent Ostby
Honored Contributor

Re: cmruncl error!!

Lorenzo --

Given Carsten's explanation, it's not a matter of processes, it's a matter of a setting in the SG configuration.

You need to run the commands that were listed in your syslog.log file and then run cmruncl:

rm /var/adm/cmcluster/.cm_start_time

touch /var/adm/cmcluster/.cm_start_time

then:

cmruncl

Best regards,

Kent M. Ostby
Carsten Krege
Honored Contributor
Solution

Re: cmruncl error!!

OK, to be very clear about what needs to be done:

1) reboot
2) rm /var/adm/cmcluster/.cm_start_time
3) touch /var/adm/cmcluster/.cm_start_time
4) cmruncl

Step 1) can be omitted if you're 100% sure that your applications are not running and the VGs are not active.
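
Steps 2) and 3) could be wrapped in a small function, sketched below. The name `reset_start_time` is hypothetical, and the path is passed as an argument so the function can be tried on a scratch file first. It uses `rm -f` instead of plain `rm` so that a missing file is not treated as an error, which is a small deviation from the commands in the syslog.

```shell
# Sketch only: replace the start-time file with an empty one.
# On a real node the argument would be /var/adm/cmcluster/.cm_start_time.
reset_start_time() {
    file=$1
    # Remove the old file, then create it again with no content.
    rm -f "$file" && touch "$file"
}

# On the affected node, only after confirming that no package
# applications are running and no cluster volume groups are active:
#   reset_start_time /var/adm/cmcluster/.cm_start_time
#   cmruncl -v
```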

SG does not check for any running processes. The file .cm_start_time only contains the start time of the cluster, e.g.

# xd .cm_start_time
0000000 4091 fe9c 000d 4232
0000008
# adb
adb> PA-32 adb ($h help $q quit)
adb>0x4091fe9c=Y
adb> 2004 Apr 30 09:22:04

If the file is empty, the last thing the cmcld daemon did was to shut down cleanly. If there is a timestamp in it, SG checks whether the time in the file is later than the system boot time. If it is, SG complains, because it means that cmcld aborted and the system has not been rebooted since (your situation).
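
The rule just described can be sketched in shell. This is an illustration, not SG's actual code: the function names are made up, and the file layout (a 32-bit big-endian seconds value in the first four bytes) is an assumption read off the xd dump above. How to obtain the boot time is left open, so it is passed in as seconds since the epoch.

```shell
# Hypothetical helpers, assuming the first 4 bytes of the file hold a
# big-endian count of seconds since the epoch (as the xd dump suggests).

# Print the stored start time in seconds, or nothing if the file is empty.
start_time_secs() {
    # Dump the first 4 bytes as hex and strip spaces and newlines.
    hex=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ -n "$hex" ]; then
        printf '%d\n' "0x$hex"
    fi
}

# Return success if the start would be refused under the rule above:
# a timestamp is present and it is later than the boot time.
# Arguments: start time (may be empty) and boot time, both in seconds.
start_blocked() {
    start=$1
    boot=$2
    [ -n "$start" ] && [ "$start" -gt "$boot" ]
}

# Usage (boot_secs must be supplied by the caller):
#   start_blocked "$(start_time_secs /var/adm/cmcluster/.cm_start_time)" "$boot_secs"
```

For the dump shown above, the bytes 40 91 fe 9c decode to 1083309724 seconds, which is 2004 Apr 30 07:22:04 UTC and is consistent with the adb output if the system's local time was two hours ahead of UTC.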

I hope the mechanism is clearer now.

Carsten
Lorenzo Facello
Valued Contributor

Re: cmruncl error!!

Thanks a lot...
Sorry Carsten, I mixed you up with Kent...