Serviceguard

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

 
Whitehorse_1
Frequent Advisor

Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

Hi all,

I'm facing an issue on my HP-UX 11.31 / MCSG 11.18 system running Continental Clusters. I'm not able to start ccmonpkg. I found the errors below in /var/adm/cmconcl/sentryd.log, and I can see that cmclsentryd dies immediately, even when I start it manually. Could anyone shed some light on this?

As a shot in the dark, I tried configuring this daemon (/usr/lbin/cmclsentryd) to respawn, but that didn't help either.

};
instance of LogEntryInstTable {
timestamp = "Jul 01 02:20:45";
source = "[main,5,main]";
category = "INFO";
path = "dr.sentryd";
message = "state directory not configured";
};
instance of LogEntryInstTable {
timestamp = "Jul 01 02:20:45";
source = "8156";
category = "ERROR";
path = "provider.drprovider";
message = "Monitor request failed. Error(3035).
";
};
instance of LogEntryInstTable {
timestamp = "Jul 01 02:20:45";
source = "8156";
category = "ERROR";
path = "provider.drprovider";
message = "Failed receiving reply to monitor request.
";
};
instance of LogEntryInstTable {
timestamp = "Jul 01 02:20:45";
source = "8156";
category = "ERROR";
path = "provider.drprovider";
message = "Failed setting up EMS monitor requests.
";
Reading is a good course medicine for deep sleep !!
7 REPLIES
Whitehorse_1
Frequent Advisor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

I also tried reinstalling the new EMS version.
Reading is a good course medicine for deep sleep !!
melvyn burnard
Honored Contributor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

Did this ever work?
If so, what has changed?

There was an issue some years ago with an older version of CC and EMS that caused a similar symptom.
If it has never worked, you should consider raising a call with the HP Response Centre.
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Whitehorse_1
Frequent Advisor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

Hi Melvyn. Thanks for the response.

I have nearly 15 pairs of servers running in CC successfully. The same component versions (MCSG/CC/EMS) are used here as well. I opened a case with HP Response, and they could not go any further than reinstalling the latest EMS. I have now done that as well. :(
Reading is a good course medicine for deep sleep !!
Rita C Workman
Honored Contributor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

It's been a number of years since I've run a Continental Cluster, so obviously I'm not familiar with it in ver 11.18.

But in the past I had a problem that sounded very similar to this. Here's a snippet of my old notes on what was done:
---------------------------
It appeared that the sentryd.all and sentryd.log files were not being read. The logs showed that everything was going to the .all file and not the .log. Looks like it is realy the whole .all file for keywords and when it gets a match it takes the corresponding action...what it does from there is a total..???..

First, move /var/adm/cmconcl/sentryd.all to something else and remove the sentryd.(all and log) files that have no records.
mv sentryd.all oldsentryd.all
rm sentryd.all
rm sentryd.log

[Collected some tusc traces and sent them to the engineer...]

Then try starting the ccmonpkg this way (while on the primary monitoring node):
cmrunpkg -v -n <node_name> ccmonpkg

Note, I was already on that node, so the -n option should not have been necessary, but it was recommended to start the ccmonpkg up this way.

It started up the ccmonpkg, and I could then see that it was finally updating the /var/adm/cmconcl/sentryd.log file correctly.
----------------------
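
If you would rather keep the old files than delete them, the move step in the notes above could be sketched roughly like this. The function name set_aside_sentryd_logs and the timestamp suffix are made up for illustration and are not part of Serviceguard or Continental Clusters; the directory defaults to the path mentioned in the notes.

```shell
# Hypothetical helper: move sentryd.all and sentryd.log aside with a
# timestamp suffix instead of deleting them.
# $1 is the log directory; it defaults to the path from the notes.
set_aside_sentryd_logs() {
    dir=${1:-/var/adm/cmconcl}
    stamp=$(date +%Y%m%d%H%M%S)
    for f in sentryd.all sentryd.log; do
        # Only touch files that actually exist.
        if [ -f "$dir/$f" ]; then
            mv "$dir/$f" "$dir/$f.$stamp" && echo "moved $f to $f.$stamp"
        fi
    done
}
```

Run set_aside_sentryd_logs with no argument on the monitoring node, then start ccmonpkg with cmrunpkg as described in the notes.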

Like I said, it's from old notes on an earlier version of CC, but at least it's something to try.

Regards,
Rita
Rita C Workman
Honored Contributor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

Sorry typo:

Looks like it is realy the whole .all file

Should be:

Looks like it is reading the whole .all file

/rcw
Whitehorse_1
Frequent Advisor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

hi.

Tried moving sentryd.log; there is no *.all file. That did not work either.
Reading is a good course medicine for deep sleep !!
Carsten Krege
Honored Contributor

Re: Continental cluster issue - /usr/lbin/cmclsentryd daemon dies

Yesterday I resolved the same problem at a customer site. Not sure it was even the identical site. :) The error "Monitor request failed. Error(3035)" indicates that the EMS monitor cmclrmond complained with RM_TARGET_NOT_AVAILABLE. This means that there is a problem with the requested notification mechanism.
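
To pull only the ERROR entries (such as the Error(3035) one) out of sentryd.log, something like the following might help. It is just a sketch: the function name sentryd_errors is made up here, and it assumes the log keeps the one-field-per-line layout seen in the excerpt at the top of this thread.

```shell
# Hypothetical helper: print each ERROR entry of a sentryd-style log as
# "timestamp<TAB>first line of message".
# Assumes fields look like:  name = "value";  one per line.
sentryd_errors() {
    awk '
        # Strip the leading  name = "  and a trailing  ";  from a field line.
        function value(line) {
            sub(/^[ \t]*[a-z]+ = "/, "", line)
            sub(/";[ \t]*$/, "", line)
            return line
        }
        /^[ \t]*timestamp = "/ { ts = value($0) }
        /^[ \t]*category = "/  { category = value($0) }
        /^[ \t]*message = "/   { if (category == "ERROR") print ts "\t" value($0) }
    ' "$1"
}
```

Usage would be along the lines of: sentryd_errors /var/adm/cmconcl/sentryd.log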

There were two problems (actually three, but the third is not related to the logs above):

- EMS did not boot up correctly,
i.e. /etc/opt/resmon/persistence/reboot_flag was still there
and /etc/opt/resmon/persistence/runlevel4_flag was still missing. As a
result, some textlog notification messages could not be registered.

The reboot_flag is created when the system boots and is removed when the system reaches run level 4 (or higher). The runlevel4_flag is also created at that point. If the runlevel4_flag file is missing, this usually means that the ems entries in /etc/inittab are missing or wrong OR the system boot scripts have not finished (i.e. something prevents the system from reaching run level 4). The customer needed to verify that this does not occur again after the next reboot.

- The "opcmsg" binary was missing. Because the ContCl config file specified OpC
notification, some EMS requests could not be configured while the opcmsg
binary was not there. This contributed to sentryd being unable to register EMS
requests. This was resolved after the required OpC bits were installed.


Carsten
-------------------------------------------------------------------------------------------------
In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move. -- HhGttG