Anyone ever seen this?

 
Craig Johnson_1
Regular Advisor

Mixed 11.23/11.31 cluster, 14 nodes (seven of each). Migration to new 11.31 servers. cmapplyconf threw this:

/etc/cmcluster/apply.gen[3]: 10932 Abort(coredump)

I was trying to add new node names to about 10 packages and the apply blew up. I cut it back to updating just two packages and it worked (whew!).

I think I ran into some sort of bug/limitation?
26 REPLIES
Craig Johnson_1
Regular Advisor

Re: Anyone ever seen this?

I should have noted that cmcheckconf worked fine.
John Bigg
Esteemed Contributor

Re: Anyone ever seen this?

Normally when cmapplyconf aborts there is an abort message. Was it redirected somewhere? Without it there is not much to go on.

There is a known problem with cmapplyconf aborting, which I have seen several times (although I would expect it to affect cmcheckconf too). It is fixed in PHSS_41902 (SG 11.19) and PHSS_41523 (SG 11.20). The abort message for that problem is:

"Assertion failed:
(char *)tmp_vgd + copy_size <= (char*)lim + msg_length ,
file: config/config_lvm.c, line: 733"

but you would need the abort message to check. Looking in the patch catalog there are a few cmapplyconf abort issues so you should probably check for these too.
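
As a rough aid for that check, here is a minimal Python sketch that maps a Serviceguard version string to the patch named above. It assumes cmversion prints a string of the form A.11.19.00, and the names KNOWN_FIXES and fix_patch_for are made up for illustration. It only covers the config_lvm.c assertion quoted above, not the other cmapplyconf abort issues in the patch catalog.

```python
import re
from typing import Optional

# Patch IDs quoted in this reply, keyed by Serviceguard release.
# Only the config_lvm.c assertion is covered; this is not a full catalog.
KNOWN_FIXES = {
    "A.11.19": "PHSS_41902",
    "A.11.20": "PHSS_41523",
}


def fix_patch_for(cmversion_output: str) -> Optional[str]:
    """Return the patch ID for a version string, or None if none is listed.

    Assumes the string starts with something like "A.11.19.00".
    """
    match = re.match(r"\s*(A\.\d+\.\d+)", cmversion_output)
    if match is None:
        return None
    return KNOWN_FIXES.get(match.group(1))


if __name__ == "__main__":
    print(fix_patch_for("A.11.19.00"))
```

Whether the patch is actually installed still has to be checked on each node; the sketch only tells you which patch ID to look for.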
Viktor Balogh
Honored Contributor

Re: Anyone ever seen this?

Craig,

what version of SG do you have?

# cmversion
****
Unix operates with beer.
Emil Velez
Honored Contributor

Re: Anyone ever seen this?

You should not make changes to the cluster when you are in a mixed mode.

Craig Johnson_1
Regular Advisor

Re: Anyone ever seen this?

A.11.19.00 on both 11.23 and 11.31 nodes.

We have definitely seen that other "assertion failed" error also. Our workaround was to not use the "-k" option to check/apply.
Stephen Doud
Honored Contributor

Re: Anyone ever seen this?

Page 13 of the Release Notes for A.11.19 at http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c02032073/c02032073.pdf states:
"Support for Mixed-OS Clusters (HP-UX 11i v2 and 11i v3)
With some limitations, HP now supports Serviceguard clusters in which some nodes
are running HP-UX 11i v2 and some 11i v3."

Page 53 states:
"As of Serviceguard A.11.18 a cluster can contain a mix of nodes running HP-UX 11i v2 and 11i v3, with certain restrictions."

Page 55 documents the 'Rules and Restrictions for Heterogeneous Clusters'

None of the restrictions include Serviceguard configuration commands, so cmcheckconf and cmapplyconf are supported in mixed O/S clusters.

If you feel you are seeing a bug, you should open a call with the HP Customer Support Center to investigate this further.
Craig Johnson_1
Regular Advisor

Re: Anyone ever seen this?

This project preceded me. The plan was hatched and verified with HP over a year ago. A couple of months back the engineer assigned to the project was given a higher-priority project, and guess who got this one?

Anyway, as far as this core dump is concerned, two things happened that may have caused it. First was that I tried to update too many packages at once. Secondly I got distracted and it say there waiting for me to answer "y" to the "Modify the cluster configuration?" question for about 10 minutes.

All I know is that reducing the number of package updates allowed me to continue.
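
For anyone who wants to script that workaround, here is a minimal Python sketch that splits a list of package configuration files into small batches and builds one cmcheckconf and one cmapplyconf command line per batch. The file names and function names are hypothetical, and the sketch only builds the command lines; it does not run anything. It assumes -P can be repeated on one command line and leaves out -k, so check both against the man page for your release. It says nothing about why the original apply dumped core.

```python
from typing import List, Sequence


def batch_packages(config_files: Sequence[str], batch_size: int = 2) -> List[List[str]]:
    """Split package config files into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(config_files[i:i + batch_size])
        for i in range(0, len(config_files), batch_size)
    ]


def build_commands(config_files: Sequence[str], batch_size: int = 2) -> List[List[str]]:
    """Return a check command followed by an apply command for each batch."""
    commands = []
    for batch in batch_packages(config_files, batch_size):
        pkg_args: List[str] = []
        for path in batch:
            # Assumption: one -P option per package configuration file.
            pkg_args += ["-P", path]
        commands.append(["cmcheckconf", "-v"] + pkg_args)
        commands.append(["cmapplyconf", "-v"] + pkg_args)
    return commands


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    files = [f"/etc/cmcluster/pkg{n}/pkg{n}.conf" for n in range(1, 11)]
    for cmd in build_commands(files, batch_size=2):
        print(" ".join(cmd))
```

Running the check before each apply keeps a bad package file from being discovered halfway through the migration.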
Craig Johnson_1
Regular Advisor

Re: Anyone ever seen this?

"sat there" not "say there"
John Bigg
Esteemed Contributor

Re: Anyone ever seen this?

I have to say that I am not aware of any problems associated with a large number of packages, or with waiting a long time before completing the command. It would be interesting to see any command output (I would expect some) or a stack trace from the core file, which would allow us to work out the cause of the abort. The abort is almost certainly an assertion; otherwise I'd expect a SIGSEGV rather than an abort. I think we only ever abort on an assertion.