Server Reboot
06-06-2011 11:58 PM
I have a two-node active-passive cluster. Last Sunday the primary node rebooted. How can I find the reason for the reboot?
Server model: Superdome 2
OS: HP-UX 11.31
06-07-2011 12:07 AM
Re: Server Reboot
Hope this helps!
Regards
Torsten.
__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.
__________________________________________________
No support by private messages. Please ask the forum!
If you feel this was helpful please click the KUDOS! thumb below!

06-07-2011 12:42 AM
Re: Server Reboot
But may I know where the logs are located and how to find the exact details in them?
06-07-2011 12:58 AM
Re: Server Reboot
/var/adm/syslog/
/var/adm/shutdownlog
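For anyone following along, here is a minimal Python sketch of how those two locations might be checked. It assumes the usual HP-UX layout, where /var/adm/syslog/ holds syslog.log for the current boot and OLDsyslog.log for the boot before it. The keyword list and the names `interesting_lines` and `scan` are illustrative choices, not part of HP-UX or Serviceguard.

```python
import os

# Daemons and words worth a first look after an unexpected reboot.
# This list is an assumption for illustration; extend it for your site.
KEYWORDS = ("cmcld", "cmnetd", "vmunix", "panic", "reboot")

def interesting_lines(lines, keywords=KEYWORDS):
    """Return the lines that mention any keyword (case-insensitive).

    Passing keywords=None returns every line unchanged apart from the
    trailing newline, which suits short files such as shutdownlog.
    """
    if keywords is None:
        return [ln.rstrip("\n") for ln in lines]
    wanted = [k.lower() for k in keywords]
    return [ln.rstrip("\n") for ln in lines
            if any(k in ln.lower() for k in wanted)]

def scan(path, keywords=KEYWORDS, tail=20):
    """Print the last `tail` matching lines of one log file, if it exists."""
    if not os.path.exists(path):
        print(f"{path}: not found")
        return
    with open(path, errors="replace") as fh:
        hits = interesting_lines(fh, keywords)
    print(f"== {path} ({len(hits)} matching lines)")
    for ln in hits[-tail:]:
        print(ln)
```

A typical run would call `scan("/var/adm/syslog/OLDsyslog.log")` first, since that file holds the messages written before the most recent boot, then `scan("/var/adm/syslog/syslog.log")` and `scan("/var/adm/shutdownlog", keywords=None)`.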
06-07-2011 01:00 AM
Re: Server Reboot
Hope this helps!
Regards
Torsten.

06-07-2011 09:38 AM
Re: Server Reboot
06-07-2011 10:10 AM
Re: Server Reboot
Hope this helps!
Regards
Torsten.

06-07-2011 10:16 AM
Re: Server Reboot
Why not post the actual message(s) from syslog rather than something "like" what they say? Then we can give you a better answer.
HTH
Duncan
I am an HPE Employee

06-07-2011 07:24 PM
Re: Server Reboot
Please find the attached syslog output. I also want to know: in the normal case, if a heartbeat LAN fails, the package moves to the other node. But what happens to the node that the package moved away from? Will that node be rebooted?
06-07-2011 09:44 PM
Re: Server Reboot
Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 is down at the IP layer.
Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 failed.
Jun 5 14:53:52 bilprdci cmnetd[3812]: Subnet 172.16.8.0 down
Jun 5 14:54:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 14:56:46 bilprdci cmnetd[3812]: 172.16.8.165 recovered.
Jun 5 14:56:46 bilprdci cmnetd[3812]: Subnet 172.16.8.0 up
Jun 5 14:56:46 bilprdci cmnetd[3812]: lan900 is up at the IP layer.
Jun 5 14:56:46 bilprdci cmnetd[3812]: lan900 recovered.
Jun 5 15:02:26 bilprdci cmnetd[3812]: 172.16.8.165 failed.
Jun 5 15:02:26 bilprdci cmnetd[3812]: lan900 is down at the IP layer.
Jun 5 15:02:26 bilprdci cmnetd[3812]: lan900 failed.
Jun 5 15:02:26 bilprdci cmnetd[3812]: Subnet 172.16.8.0 down
Jun 5 15:03:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:06:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:09:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:12:52 bilprdci cmnetd[3812]: 172.16.8.165 recovered.
Jun 5 15:12:52 bilprdci cmnetd[3812]: Subnet 172.16.8.0 up
Jun 5 15:12:52 bilprdci cmnetd[3812]: lan900 is up at the IP layer.
Jun 5 15:12:43 bilprdci vmunix: Dead gateway detection can't ping the last remaining default gateway at 0xac1008c8 .See ndd -h ip_ire_gw_probe for more info
Jun 5 15:12:52 bilprdci cmnetd[3812]: lan900 recovered.
Jun 5 15:18:38 bilprdci cmnetd[3812]: Link level address on network interface lan901 has been changed from 0xf4ce46f488fa to 0xf4ce46f48808.
Jun 5 15:18:38 bilprdci cmnetd[3812]: lan901 is down at the data link layer.
Jun 5 15:18:38 bilprdci cmnetd[3812]: lan901 failed.
Jun 5 15:18:38 bilprdci cmnetd[3812]: Subnet 10.10.12.0 down
Jun 5 15:18:39 bilprdci cmcld[3803]: Member bilprddb seems unhealthy, not receiving heartbeats from it.
.......
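A note on the vmunix message in the excerpt above: the default gateway is printed as a hex value. A short Python sketch, using only the standard library, decodes it. The helper name `hex_to_ipv4` is made up for this example.

```python
import ipaddress

def hex_to_ipv4(text):
    """Convert a hex string such as '0xac1008c8' to dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(text, 16)))

print(hex_to_ipv4("0xac1008c8"))  # 172.16.8.200
```

That address appears to sit on the same 172.16.8.0 subnet that cmnetd reports as down, which would fit the dead gateway warnings showing up while lan900 is failed.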
It looks like a network problem: both of your aggregates went down and the node lost the heartbeat with the other nodes, which is a possible cause of the reboot. You should check with your network team to find out exactly what was going on.
How many nodes are in the cluster? The logs say that the cluster later re-formed with only one node and the other 2 were excluded. Is this a 3-node cluster, then?
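To see how long each aggregate was down, the cmnetd "failed" and "recovered" lines can be paired up. The following is a rough Python sketch under a few assumptions: the line layout matches the excerpt in this thread, the year is 2011 (syslog does not record it), and the names `EVENT`, `SAMPLE` and `outages` are invented for the example.

```python
import re
from datetime import datetime

# Matches e.g. "Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 failed."
EVENT = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+cmnetd\[\d+\]:\s+"
    r"(lan\d+) (failed|recovered)\.$"
)

# A few lines copied from the syslog excerpt in this thread.
SAMPLE = [
    "Jun 5 14:53:52 bilprdci cmnetd[3812]: lan900 failed.",
    "Jun 5 14:53:52 bilprdci cmnetd[3812]: Subnet 172.16.8.0 down",
    "Jun 5 14:56:46 bilprdci cmnetd[3812]: lan900 recovered.",
    "Jun 5 15:02:26 bilprdci cmnetd[3812]: lan900 failed.",
    "Jun 5 15:12:52 bilprdci cmnetd[3812]: lan900 recovered.",
    "Jun 5 15:18:38 bilprdci cmnetd[3812]: lan901 failed.",
]

def outages(lines, year=2011):
    """Pair each 'failed' event with the next 'recovered' for that interface.

    Returns (interface, down_time, up_time) tuples. up_time is None when
    the input ends before the interface recovers.
    """
    down = {}
    result = []
    for line in lines:
        m = EVENT.match(line.strip())
        if not m:
            continue
        stamp, lan, state = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if state == "failed":
            down.setdefault(lan, when)  # keep the first failure time
        elif lan in down:
            result.append((lan, down.pop(lan), when))
    # Interfaces still down at the end of the input.
    result.extend((lan, when, None) for lan, when in down.items())
    return result

for lan, went_down, came_up in outages(SAMPLE):
    if came_up is None:
        print(f"{lan}: down at {went_down:%H:%M:%S}, no recovery in this input")
    else:
        secs = int((came_up - went_down).total_seconds())
        print(f"{lan}: down at {went_down:%H:%M:%S} for {secs} s")
```

On the lines above this reports two lan900 outages followed by a lan901 failure with no recovery, the last one coming just before cmcld stops receiving heartbeats from bilprddb.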
06-07-2011 10:25 PM
Re: Server Reboot
After the network failed, the remote node (bilprddb) was ejected from the cluster following a race for the cluster lock disk. This is normal cluster behaviour when 2 nodes in a cluster cannot communicate over any LAN interfaces.
bilprdci formed a one-node cluster and attempted to start the dbPRD package, which failed (reason unknown; you would need to look at the package log for this, but it was most likely due to the complete network failure).
Later, bilprddb rejoined the cluster and someone manually stopped and started ciPRD on bilprdci.
So my advice here is:
1. Review your cluster package logs as well, as they may throw more light on the nature of the failure(s) here.
2. You need a ground up review of the network design within this cluster - a good cluster design should never be able to lose all network links at the same time.
3. There are lots of nasty NFS issues in here too, no doubt caused by the network outage. However, you should check that you are following NFS best practice when it is used in a cluster.
4. You need to check your name resolution standards in /etc/nsswitch.conf. In a cluster you really need to have name resolution handled first by files and only then by DNS, and you need to make sure all the interfaces are consistently named in /etc/hosts on both cluster nodes
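As an illustration of point 4, this small Python sketch checks whether the hosts line in nsswitch.conf lists files before dns. It is only a sketch: the function names are invented here, and the parsing assumes the usual "hosts: source [criteria] source" layout with bracketed action criteria between sources.

```python
import re

def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order.

    Bracketed action criteria such as [NOTFOUND=continue] are dropped.
    Returns an empty list when there is no hosts line.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        if line.startswith("hosts:"):
            rest = re.sub(r"\[[^\]]*\]", " ", line[len("hosts:"):])
            return rest.split()
    return []

def files_before_dns(sources):
    """True when 'files' is present and comes before 'dns' (if dns is used)."""
    if "files" not in sources:
        return False
    return "dns" not in sources or sources.index("files") < sources.index("dns")

with_files_first = "hosts: files [NOTFOUND=continue UNAVAIL=continue] dns\n"
with_dns_first = "hosts: dns files\n"
print(files_before_dns(hosts_sources(with_files_first)))  # True
print(files_before_dns(hosts_sources(with_dns_first)))    # False
```

Run it against the contents of /etc/nsswitch.conf on each node. The /etc/hosts entries still need to be compared between nodes by hand or with diff.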
HTH
Duncan

06-08-2011 03:08 AM
Re: Server Reboot
06-08-2011 03:13 AM
Solution
If neither node can talk to the other, how does each one know whether the other node is running one of the packages in the cluster? It can't. So both nodes try to obtain the cluster lock, and the node that "loses" the race for the cluster lock reboots itself. It could just as easily have been the other node that lost the race for the cluster lock.
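The tie-break described above can be pictured with a toy model in Python. This is not Serviceguard's implementation, only a sketch of the idea that a single lock admits one winner. The function name `resolve_split` and the `first_to_lock` parameter are hypothetical.

```python
def resolve_split(nodes, first_to_lock):
    """Toy model of a tie-break on a single cluster lock.

    `nodes` are the nodes that can no longer hear each other and
    `first_to_lock` is whichever of them reaches the lock first.
    Returns (survivor, rebooted_nodes).
    """
    if first_to_lock not in nodes:
        raise ValueError("first_to_lock must be one of the nodes")
    lock_owner = None
    rebooted = []
    # Every node attempts the lock; only the first attempt can succeed.
    attempts = [first_to_lock] + [n for n in nodes if n != first_to_lock]
    for node in attempts:
        if lock_owner is None:
            lock_owner = node       # wins the race and keeps running
        else:
            rebooted.append(node)   # loses the race and reboots itself
    return lock_owner, rebooted

print(resolve_split(["bilprdci", "bilprddb"], first_to_lock="bilprdci"))
print(resolve_split(["bilprdci", "bilprddb"], first_to_lock="bilprddb"))
```

The two calls differ only in which node gets to the lock first, which is the point made above: either node could have been the one to reboot.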
HTH
Duncan

06-08-2011 03:39 AM