Operating System - HP-UX

Alex Tsekhansky_1
Occasional Advisor

Serviceguard package went down on its own?

I have two clients (two completely different companies) on HP-UX 11.23, one on PA-RISC and the other on Itanium. The one on PA-RISC runs Serviceguard 11.18.
Both of them had their primary packages shut down exactly at 21:58.
On one of them we rebooted the primary box, and the cluster did not come up on it. Syslog showed this message:

syslog: Request from root on node to start the cluster on this node failed: not authorized.
(I replaced the name of the node to protect the client's privacy.)

Manually restarting the cluster software as root produced the same message. I killed ALL cm.. daemons and started the cluster software again on that node via the /sbin/init.d script, but got the same result.

We will open a case with HP shortly, as we have 24x7 support. I also started the package on the secondary box (which we did not reboot), and it came up fine.

I would appreciate any suggestions or ideas. It looks like a "time bomb" (or a bug) in the cluster software, as two clients having the exact same issue at the exact same time is too much of a coincidence. In both cases the actual cluster setup was done by HP. I only configured the packages.
5 REPLIES
Fraga
Advisor

Re: Serviceguard package went down on its own?

Hi,
Have you looked at the package log?
What is the node state now (cmviewcl)?
DeafFrog
Valued Contributor

Re: Serviceguard package went down on its own?

Hi Alex ,
Please post the output of cmrunnode -v from the failed node, along with the corresponding messages from syslog.

reg,
FrogIsDeaf
SoorajCleris
Honored Contributor

Re: Serviceguard package went down on its own?

Hi Alex,

1. Check that name resolution is working.
2. Check that your cmclnodelist/rhost is configured and working properly. Make sure no one has changed it.
3. Check the permissions on the cmcluster contents.
4. Check that the Serviceguard entries in /etc/inetd.conf are present and correct.

If it is still not working, post the syslog here.
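
For item 2, a rough way to confirm that a node/user pair is listed could look like the sketch below. It assumes the file lives at /etc/cmcluster/cmclnodelist and holds one "hostname user" pair per line with optional # comments; any other syntax the file may allow is not handled. The function name in_nodelist is made up for this example.

```shell
# Hypothetical helper: succeed if "node user" appears in a cmclnodelist-style file.
# Assumes one "hostname user" pair per line; text after "#" is ignored.
in_nodelist() {
    node="$1"
    user="$2"
    list="${3:-/etc/cmcluster/cmclnodelist}"   # assumed default location
    awk -v n="$node" -v u="$user" '
        { sub(/#.*/, "") }                     # strip trailing comments
        $1 == n && $2 == u { found = 1 }       # exact match on both fields
        END { exit (found ? 0 : 1) }' "$list"
}
```

For example, in_nodelist "$(hostname)" root || echo "root on this node is not listed" would flag a missing entry. The comparison is exact, so a short name will not match a fully qualified one.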

Regards,
Sooraj
"UNIX is basically a simple operating system, but you have to be a genius to understand the simplicity" - Dennis Ritchie
John Bigg
Esteemed Contributor

Re: Serviceguard package went down on its own?

Packages failing are generally a result of a failure of something configured within Serviceguard and not Serviceguard itself.

You have not provided any data giving the details of the package failure so there is nothing to comment on here.

What runs in the packages? Maybe there could be something else in common between your two companies? However, I would suspect coincidence or something else you have overlooked.

If there was some bug in Serviceguard timing causing these two failures then I would expect to see far more than just 2 packages failing in the world!

People buy lottery tickets all the time but I think there is more chance of 2 packages failing at the same time than winning the lottery!

If you give details of the package failure, i.e. messages from syslog, package logs etc, then maybe we can provide more accurate responses to this.

I suspect you also have cluster configuration issues, probably hostname resolution. This is the cause of more than 95% of authorisation issues like the one you report. Ensure that ALL IP addresses on ALL nodes either resolve to the hostname or have the hostname as an alias. I bet this is the restart problem.
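
A minimal sketch of that check, assuming the nodes resolve names through /etc/hosts (if DNS or another service comes first in your lookup order, this will not tell the whole story). You supply the node name and the list of IP addresses configured on it; check_ips is a made-up name for this example.

```shell
# Hypothetical helper: for each IP given, verify that the hosts file maps it
# to the node name, either as the canonical name or as an alias.
# Usage: check_ips <node> <hosts-file> <ip> [<ip> ...]
check_ips() {
    node="$1"
    hosts="$2"
    shift 2
    rc=0
    for ip in "$@"; do
        if awk -v ip="$ip" -v node="$node" '
            { sub(/#.*/, "") }                 # strip trailing comments
            $1 == ip {                         # line for this address
                for (i = 2; i <= NF; i++) if ($i == node) ok = 1
            }
            END { exit (ok ? 0 : 1) }' "$hosts"; then
            echo "OK: $ip -> $node"
        else
            echo "CHECK: $ip does not map to $node in $hosts"
            rc=1
        fi
    done
    return $rc
}
```

Run it on every node with every address that node owns, for example check_ips mynode /etc/hosts 10.0.0.1 10.0.1.1 (the name and addresses here are placeholders). Any CHECK line is an address worth fixing before retrying the cluster start.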
Stephen Doud
Honored Contributor

Re: Serviceguard package went down on its own?

You are describing two different issues - package failure and cmruncl/cmrunnode trouble.

Package shutdown/failover is automatically initiated when Serviceguard receives a triggering event, such as a package service failure. The syslog.log will show whether this is the case.
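
To pull the relevant lines out of syslog.log, something like the sketch below may help. It assumes the usual "Mon DD HH:MM:SS host daemon[pid]:" line layout and simply keeps lines from daemons whose names start with "cm" (cmcld and friends) in a given time window. It does not match on message wording, since the exact text is not shown in this thread. sg_events is a made-up name.

```shell
# Hypothetical helper: print syslog lines from Serviceguard daemons (cm*)
# whose time of day starts with the given prefix, e.g. "21:5" for 21:50-21:59.
sg_events() {
    log="${1:-/var/adm/syslog/syslog.log}"
    when="${2:-21:5}"
    # First filter on the timestamp, then on the daemon tag "cmXXX[pid]:".
    grep -E "^[A-Z][a-z][a-z] +[0-9]+ $when" "$log" |
        grep -E ' cm[a-z]+\[[0-9]+\]:'
}
```

Running sg_events with no arguments looks at the ten minutes around 21:58 in the default log. Whatever shows up just before the package halt is the triggering event to post here.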

The "Not Authorized" issue has been seen when the hacl-cfg lines in /etc/inetd.conf have been commented out.
They should look like this:
hacl-cfg dgram udp wait root /usr/lbin/cmclconfd cmclconfd -p
hacl-cfg stream tcp nowait root /usr/lbin/cmclconfd cmclconfd -c

Repair if needed, but regardless, restart inetd:
inetd -k ; inetd
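
A quick way to test for that before editing anything, as a sketch: the function below only looks for an uncommented hacl-cfg line per protocol and does not validate the rest of the entry. check_hacl_cfg is a made-up name for this example.

```shell
# Hypothetical helper: report whether active (uncommented) hacl-cfg entries
# exist for both udp and tcp in an inetd.conf-style file.
check_hacl_cfg() {
    conf="${1:-/etc/inetd.conf}"
    rc=0
    for proto in udp tcp; do
        # An active entry starts in column 1; a disabled one starts with "#".
        if grep -Eq "^hacl-cfg[[:space:]]+(dgram|stream)[[:space:]]+$proto[[:space:]]" "$conf"; then
            echo "OK: active hacl-cfg $proto entry"
        else
            echo "MISSING: no active hacl-cfg $proto entry in $conf"
            rc=1
        fi
    done
    return $rc
}
```

If check_hacl_cfg prints a MISSING line, restore the entry shown above and then restart inetd.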