Operating System - HP-UX

one node in MCSG rebooted

 
Sanjiv Sharma_1
Honored Contributor

Hi All,

I have two N-Class servers (ia01 and ia02) running HP-UX 11.00 with an XP512 disk array.
The MCSG software installed on both servers is:
B3935DA A.11.12 MC / Service Guard.

The cluster lock VG is /dev/vg_lock, which consists of one XP512 LUN and is activated on ia01. No logical volumes have been created on /dev/vg_lock.

Last Thursday evening at about 5:30 pm, ia01 rebooted with the following messages in syslog.log:
Oct 24 17:34:40 ia01 cmcld: Communication to node ia02 has been interrupted
Oct 24 17:34:40 ia01 cmcld: Node ia02 may have died
Oct 24 17:34:40 ia01 cmcld: Attempting to form a new cluster
Oct 24 17:35:02 ia01 cmcld: Obtaining Cluster Lock
Oct 24 17:35:03 ia01 cmcld: Cluster lock was denied. Lock was obtained by another node.
Oct 24 17:35:43 ia01 cmcld: Heartbeat connection attempt to node ijmsia02 timed out
Oct 24 17:36:04 ia01 cmcld: Cluster lock has been denied

The cluster lock VG was available to ia01, so why did ia01 reboot instead of ia02? And what could have caused the reboot?

Thanks,
Raje.
Everything is possible
Balaji N
Honored Contributor

Re: one node in MCSG rebooted

Hi,
Is it possible that the heartbeat connection between ia01 and ia02 failed and ia02 started the cluster without ia01?

Was the cluster running on ia02 when ia01 rebooted?
-balaji
Its Always Important To Know, What People Think Of You. Then, Of Course, You Surprise Them By Giving More.
melvyn burnard
Honored Contributor

Re: one node in MCSG rebooted

Well, you have all the information in the supplied data.
The log points to a networking issue that left the two nodes unable to talk to each other.
In this scenario, each node tries to grab the cluster lock disc, and only one can win.
The winning node stays up; the node that fails to get the cluster lock TOCs.
That is the way it is supposed to work.
Have you checked the syslog of the other node? It should have similar messages, except it should say it obtained the cluster lock.
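
To make the tie-break concrete, here is a toy Python model of the race for the lock disc. It is only a sketch and not Serviceguard code: the names ClusterLock and resolve_split are made up for illustration, and the order in which the nodes reach the lock is an assumption.

```python
# Toy model of the cluster-lock tie-break; not Serviceguard code.
class ClusterLock:
    """Stands in for the lock disc: the first claimant wins, later ones are denied."""

    def __init__(self):
        self.owner = None

    def try_acquire(self, node):
        # Only the first node to ask becomes the owner.
        if self.owner is None:
            self.owner = node
        return self.owner == node


def resolve_split(claim_order):
    """Return each node's fate after heartbeat loss, given the order of lock attempts."""
    lock = ClusterLock()
    return {
        node: "re-forms cluster" if lock.try_acquire(node) else "TOC"
        for node in claim_order
    }


if __name__ == "__main__":
    # Assumed order: ia02 reaches the lock disc before ia01.
    print(resolve_split(["ia02", "ia01"]))
```

Note that which node has the lock VG activated plays no part in this model; only the order of the lock attempts decides the winner.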
My house is the bank's, my money the wife's, But my opinions belong to me, not HP!
Ronald Cogen
Frequent Advisor

Re: one node in MCSG rebooted

Hello Raje,

Take a look at the cluster timing parameters in the ASCII config file (in /etc/cmcluster).
Have you defined the cluster as a two-node cluster? Have you defined a single-lock or a dual-lock cluster? HP recommends a single-lock cluster where possible.

Regards,
Ron
I've been down so long it looks like up to me
BFA6
Respected Contributor

Re: one node in MCSG rebooted

Hi,

It looks as though there was a network issue and the heartbeat was lost.

Do you have a redundant heartbeat going out across the LAN?
Which of the nodes was running the package at the time of the reboot?

Hilary
Fabrice Meynard
Frequent Advisor

Re: one node in MCSG rebooted

Hi,

As you have seen, ia01 lost its connection to ia02. Because of this connection problem, ia01 tried to re-form the cluster. To do that, it tried to obtain the cluster lock in order to satisfy quorum. Since it could not get the cluster lock and was alone in the cluster, quorum was not met, so ia01 did a TOC.
If you look at ia02's syslog, you should find similar messages. ia02 should have obtained the cluster lock, so quorum was met and it did not TOC.
There must have been a network connection problem.
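
The quorum rule can be written as a small Python check. This is a simplified sketch of the rule as I understand it, not anything taken from the product: the function name may_reform and its parameters are hypothetical. A group with more than half of the previously running nodes re-forms the cluster, a group with exactly half needs the cluster lock, and a group with fewer than half cannot re-form.

```python
def may_reform(surviving, previous, got_lock):
    """Simplified quorum check for one group of nodes after a failure.

    surviving: nodes in this group that can still talk to each other
    previous:  nodes that were in the running cluster before the failure
    got_lock:  whether this group obtained the cluster lock
    """
    if 2 * surviving > previous:
        return True       # clear majority, no tie-breaker needed
    if 2 * surviving == previous:
        return got_lock   # exactly half: the cluster lock breaks the tie
    return False          # minority: the group halts (TOC)


if __name__ == "__main__":
    # Two-node cluster split into two groups of one node each.
    print("ia02 may re-form:", may_reform(1, 2, got_lock=True))
    print("ia01 may re-form:", may_reform(1, 2, got_lock=False))
```

In a two-node cluster a lost heartbeat always leaves each node with exactly half, which is why the lock decides everything here.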

Fabrice