TruCluster

Crash kernel cluster quorum disk

SOLVED
Vijver, van de W.C.L.
Occasional Visitor

After a power failure, member 2 of our TruCluster (Tru64 5.1A) doesn't start.

While loading the kernel, the system crashes when it tries to join the cluster.

For a brief moment it reports that it can write to disk (it doesn't report the disk details, but I suspect it's the quorum disk).

It also reports a message concerning an item 'rm_audit_ack_block'. After this message the server returns to the boot prompt with halt code = 5.

I tried to restore the cnx partition of the boot disk of member 2, but this didn't resolve the problem.

When I changed the sysconfigtab of member 2, I could tell the member was running until it reached the cluster configuration at these lines:

cluster_qdisk_major = 19
cluster_qdisk_minor = 95
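
A minimal sketch of how the two values above could be pulled out of a copy of member 2's sysconfigtab (for example, one read from the boot partition mounted on member 1) so they can be compared with member 1's values. The stanza name `clubase`, the sample text, and the helper names `parse_stanzas` and `quorum_disk_numbers` are assumptions for illustration; only the two attribute lines come from the post.

```python
def parse_stanzas(text):
    """Parse sysconfigtab-style text into {stanza: {attribute: value}}."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith(":") and "=" not in line:
            # A line such as "clubase:" opens a new stanza.
            current = line[:-1].strip()
            stanzas.setdefault(current, {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            stanzas[current][key.strip()] = value.strip()
    return stanzas


# Hypothetical sample; the stanza name is an assumption, the two
# attribute lines are the ones quoted in the post.
SAMPLE = """\
clubase:
    cluster_qdisk_major = 19
    cluster_qdisk_minor = 95
"""


def quorum_disk_numbers(text, stanza="clubase"):
    """Return (major, minor) of the quorum disk from the given stanza."""
    attrs = parse_stanzas(text).get(stanza, {})
    return int(attrs["cluster_qdisk_major"]), int(attrs["cluster_qdisk_minor"])


if __name__ == "__main__":
    major, minor = quorum_disk_numbers(SAMPLE)
    print(f"quorum disk device numbers: major={major} minor={minor}")
```

Running the same function over the text of each member's sysconfigtab would show whether both members point at the same quorum disk device numbers; a mismatch would be worth investigating before looking elsewhere.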

How could I resolve this issue?
5 REPLIES
Vladimir Fabecic
Honored Contributor

Re: Crash kernel cluster quorum disk

It would be helpful to have more details about the hardware configuration.
Did your member 1 boot OK?
If so, the cluster quorum disk should be OK.
Did you check your cluster interconnect?
Can you mount member 2's boot disk from member 1?
In vino veritas, in VMS cluster
Vijver, van de W.C.L.
Occasional Visitor

Re: Crash kernel cluster quorum disk

The cluster consists of two D20e servers interconnected via Memory Channel.

Member 1 is currently running and providing the service normally delivered by member 2. Mounting the member 2 boot_partition on member 1 is no problem, and changes to the cluster info in sysconfigtab take effect when member 2 is booted.

I also restored the cluster data on the boot disk of member 2.

mc_cable and mc_diag at the console report no problems.

Thanks
Vladimir Fabecic
Honored Contributor
Solution

Re: Crash kernel cluster quorum disk

If I remember correctly, there was a bug in Memory Channel clusters where rebooting a node without performing a hardware reset can crash other members with an RM_AUDIT_ACK_BLOCK panic.
So Memory Channel may be the key.
Did you try the following:
- Shut down the cluster
- Set auto_action to HALT at the console on both nodes
- Power off both machines
- Power on the machine whose MC card is set as master first
- Power on the other machine
- Boot the member 1 machine first, and then the other machine
This Memory Channel problem was fixed in V5.1A PK6 and V5.1B PK2.
Rob Leadbeater
Honored Contributor

Re: Crash kernel cluster quorum disk

Hi,

You'll find that advisory here:

http://h30097.www3.hp.com/unix/erp/BU040106_EW01.html

However, if you've run mc_diag and mc_cable without problems, that implies that both nodes have been reset...

Cheers,

Rob
Vijver, van de W.C.L.
Occasional Visitor

Re: Crash kernel cluster quorum disk

Problem solved after the proper cluster shutdown and hardware reset proposed by Vladimir.

Thanks again.