- Re: Regarding trigger for node reset after uptime ...
11-14-2006 10:30 PM
We found the following description in SGLX_00005.text of the patch kit:
Serviceguard causes a node to reset for
no apparent reason after a system uptime
of around 248.5 days with a 2.4 Linux
kernel or 24.8 days with a 2.6 kernel.
The symptom is that the deadman driver
expires without apparent cause.
Please note that this was also fixed in
SG A.11.16.02 (the RedHat4 support release).
[QUESTION]
- Could you tell me the root cause and what the trigger is?
Is it the uptime of the system, or the running time of the cluster daemon?
Is it reset by executing cmhaltcl?
In fact, one node has been running for 260 days.
Thank you for your advice.
Best Regards.
/Minoru.Asano
11-15-2006 01:41 AM
As the patch text you quoted above says, the trigger is "after a system uptime...", so it is the uptime of the system, not the running time of the cluster daemon.
The reason one node managed to reach 260 days is that the first node reset at 248.5 days, so the other node was running as a single-node cluster when it reached the same 248.5 days of uptime shortly afterwards (both were booted at about the same time). The node reset only occurs when the deadman timer is enabled, and it is only enabled when there is more than one node in the cluster.
11-15-2006 06:40 AM
Re: Regarding trigger for node reset after uptime around 248.5 days
(2**31)/100/60/60/24 = 248.55
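The arithmetic above can be reproduced for both kernel series with a short Python sketch. It assumes the counter in question is a signed 32-bit tick count, and that the tick rate (HZ) is 100 on a 2.4 kernel and 1000 on a 2.6 kernel; those HZ values are assumptions chosen to match the 248.5-day and 24.8-day figures in the patch text, and the function name is only illustrative.

```python
# Days of uptime before a signed tick counter overflows, for a given tick rate.
# Assumption: HZ=100 for a 2.4 kernel, HZ=1000 for a 2.6 kernel.
SECONDS_PER_DAY = 60 * 60 * 24


def days_until_signed_overflow(hz: int, bits: int = 32) -> float:
    """Return the uptime in days at which a signed `bits`-wide tick counter overflows."""
    max_ticks = 2 ** (bits - 1)  # largest positive range of a signed counter
    return max_ticks / hz / SECONDS_PER_DAY


for label, hz in (("2.4 kernel (HZ=100)", 100), ("2.6 kernel (HZ=1000)", 1000)):
    print(f"{label}: {days_until_signed_overflow(hz):.2f} days")
```

With these assumed tick rates the results come out at roughly 248.55 and 24.86 days, in line with the figures quoted from the patch text.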
11-15-2006 09:35 PM
It's actually further complicated by the particular version of Linux, since the default 2.6 kernels initialise jiffies to 4294667296 rather than zero (even for 64-bit kernels!), although SUSE has modified that back to zero.
Therefore, with a Red Hat 32-bit 2.6 kernel, the reset will occur after 5 minutes.
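The 5-minute figure follows from how close the initial value is to the 32-bit limit. A minimal sketch of that calculation, assuming a tick rate of 1000 per second for the 2.6 kernel (the HZ value and the function name are assumptions, the initial value is the one quoted above):

```python
# Time until a 32-bit jiffies counter wraps, given its starting value.
HZ = 1000                      # assumption: ticks per second on a 2.6 kernel
INITIAL_JIFFIES = 4294667296   # starting value quoted in the post


def seconds_until_wrap(initial: int, hz: int, bits: int = 32) -> float:
    """Return the seconds of uptime before an unsigned `bits`-wide counter wraps to zero."""
    ticks_left = 2 ** bits - initial
    return ticks_left / hz


minutes = seconds_until_wrap(INITIAL_JIFFIES, HZ) / 60
print(f"Wrap after {minutes:.1f} minutes of uptime")
```

The starting value is 300000 ticks short of 2**32, which at the assumed tick rate is 300 seconds.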
11-16-2006 12:42 PM
Thank you for your reply and suggestions.
I now have enough information.
One system started as a single-node cluster because "AUTOSTART_CMCLD" is "0".
So the deadman timer was not active on it.
I can now explain this behaviour to the customer.
Thank you.
Best Regards.
/Minoru.Asano