Server Reboot

kunjuttan · 06-06-2011

Hi,

I have a 2-node active-passive cluster. Last Sunday my primary node rebooted. How can I find the reason for the reboot?

Server Model - Superdome2
OS - HP-UX 11.31

Torsten. · 06-07-2011

I would first take a look at the shutdownlog, the OLDsyslog and the cluster log.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.
__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!

kunjuttan · 06-07-2011

Thanks for the update.

But may I know where these logs are located and how to find the exact details in them?

Dennis Handly · 06-07-2011

>may I know the location

/var/adm/syslog/
/var/adm/shutdownlog
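
As a rough sketch of how to look at these files in one go, the snippet below prints the last lines of each log. It assumes Python 3 is available (on the node, or on a machine you copied the logs to) and that the files under /var/adm/syslog/ are named syslog.log and OLDsyslog.log; the helper `tail_lines` is a hypothetical name, not an HP-UX tool.

```python
from collections import deque

# /var/adm/shutdownlog and /var/adm/syslog/ come from this thread.
# The two file names under /var/adm/syslog/ are an assumption.
LOG_FILES = [
    "/var/adm/shutdownlog",
    "/var/adm/syslog/OLDsyslog.log",
    "/var/adm/syslog/syslog.log",
]


def tail_lines(path, count=20):
    """Return the last `count` lines of `path`, or [] if it cannot be read."""
    try:
        with open(path, errors="replace") as handle:
            # deque with maxlen keeps only the final `count` lines in memory
            return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
    except OSError:
        return []


if __name__ == "__main__":
    for log_file in LOG_FILES:
        print(f"==== {log_file} ====")
        for line in tail_lines(log_file):
            print(line)
```

The shutdownlog normally shows when and how the system went down, and OLDsyslog.log holds the messages written before the reboot, so the entries just before the reboot time are the ones to read.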

Torsten. · 06-07-2011

and /etc/cmcluster

Hope this helps!
Regards
Torsten.

kunjuttan · 06-07-2011

I have a question. The syslog shows something like "heartbeat connection lost". Would the server reboot because of this?

Torsten. · 06-07-2011

This could be a reason.

Hope this helps!
Regards
Torsten.

Duncan Edmonstone · 06-07-2011

>> In syslog its showing like heartbeat connection lost.

Why not actually post the message(s) from syslog rather than something "like" what it says - then we can give you a better answer...

HTH

Duncan

I am an HPE Employee

kunjuttan · 06-07-2011

Hi,

Please find the attached syslog output. I also have a question: in the normal case, if a heartbeat LAN fails, the package moves to the other node. But what happens to the node the package moved away from? Will that node be rebooted?

g3jza · 06-07-2011

Jun 5 14:53:52 bilprdci cmnetd[3812]: 172.16.8.165 failed.
Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 is down at the IP layer.
Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 failed.
Jun 5 14:53:52 bilprdci cmnetd[3812]: Subnet 172.16.8.0 down
Jun 5 14:54:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 14:56:46 bilprdci cmnetd[3812]: 172.16.8.165 recovered.
Jun 5 14:56:46 bilprdci cmnetd[3812]: Subnet 172.16.8.0 up
Jun 5 14:56:46 bilprdci cmnetd[3812]: lan900 is up at the IP layer.
Jun 5 14:56:46 bilprdci cmnetd[3812]: lan900 recovered.
Jun 5 15:02:26 bilprdci cmnetd[3812]: 172.16.8.165 failed.
Jun 5 15:02:26 bilprdci cmnetd[3812]: lan900 is down at the IP layer.
Jun 5 15:02:26 bilprdci cmnetd[3812]: lan900 failed.
Jun 5 15:02:26 bilprdci cmnetd[3812]: Subnet 172.16.8.0 down
Jun 5 15:03:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:06:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:09:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:12:52 bilprdci cmnetd[3812]: 172.16.8.165 recovered.
Jun 5 15:12:52 bilprdci cmnetd[3812]: Subnet 172.16.8.0 up
Jun 5 15:12:52 bilprdci cmnetd[3812]: lan900 is up at the IP layer.
Jun 5 15:12:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:12:52 bilprdci cmnetd[3812]: lan900 recovered.
Jun 5 15:18:38 bilprdci cmnetd[3812]: Link level address on network interface lan901 has been changed from 0xf4ce46f488fa to 0xf4ce46f48808.
Jun 5 15:18:38 bilprdci cmnetd[3812]: lan901 is down at the data link layer.
Jun 5 15:18:38 bilprdci cmnetd[3812]: lan901 failed.
Jun 5 15:18:38 bilprdci cmnetd[3812]: Subnet 10.10.12.0 down
Jun 5 15:18:39 bilprdci cmcld[3803]: Member bilprddb seems unhealthy, not receiving heartbeats from it.

.......

This looks like a network problem: both of your aggregates (lan900 and lan901) went down and the node lost the heartbeat with the other nodes, which is a possible cause of the reboot. You should check with your network team what exactly was going on.

How many nodes are in the cluster? The logs say that the cluster later re-formed with only one node and the other 2 were excluded. Is this a 3-node cluster, then?
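
To make the sequence of interface failures easier to see, here is a minimal sketch that pulls the cmnetd "failed" / "recovered" lines for each lanNNN interface out of a syslog excerpt. The regular expression is only fitted to the message format shown in this thread, and the names `lan_events` and `still_down` are hypothetical helpers, not part of Serviceguard.

```python
import re

# Matches lines such as:
#   Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 failed.
EVENT = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+cmnetd\[\d+\]:\s+"
    r"(?P<lan>lan\d+) (?P<state>failed|recovered)\.\s*$"
)


def lan_events(lines):
    """Return (timestamp, interface, state) for each cmnetd failed/recovered line."""
    events = []
    for line in lines:
        match = EVENT.match(line.strip())
        if match:
            events.append((match["stamp"], match["lan"], match["state"]))
    return events


def still_down(events):
    """Return the interfaces whose last reported state is 'failed'."""
    last = {}
    for _stamp, lan, state in events:
        last[lan] = state
    return sorted(lan for lan, state in last.items() if state == "failed")


# A few lines copied from the syslog excerpt posted above.
SAMPLE = """\
Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 failed.
Jun 5 14:56:46 bilprdci cmnetd[3812]: lan900 recovered.
Jun 5 15:02:26 bilprdci cmnetd[3812]: lan900 failed.
Jun 5 15:12:52 bilprdci cmnetd[3812]: lan900 recovered.
Jun 5 15:18:38 bilprdci cmnetd[3812]: lan901 failed.
Jun 5 15:18:39 bilprdci cmcld[3803]: Member bilprddb seems unhealthy, not receiving heartbeats from it.
""".splitlines()

if __name__ == "__main__":
    events = lan_events(SAMPLE)
    for stamp, lan, state in events:
        print(stamp, lan, state)
    print("still down at end of excerpt:", still_down(events))
```

On the excerpt above, lan900 fails and recovers twice, while lan901 fails at 15:18:38 and has not recovered by the time the excerpt ends, one second before cmcld reports the missing heartbeats.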
