03-07-2013 11:04 PM
Weird restart system after restarting HP utilities
Hi
After restarting some services, such as hpilo, hp-health and hpsmd, our HP ProLiant DL380 G6 restarted abruptly; that is, the system was power reset with no clean shutdown. The iLO log shows:
Severity       Class  Last Update       Initial Update    Count  Description
Informational  iLO 2  03/06/2013 17:41  03/06/2013 17:41  1      Server power restored.
Informational  iLO 2  03/06/2013 17:41  03/06/2013 17:41  1      Server power removed.
Caution        iLO 2  03/06/2013 17:41  03/06/2013 17:41  1      Server reset.
Informational  iLO 2  03/06/2013 17:41  03/06/2013 17:41  1      BMC IPMI Watchdog Timer Timeout: Action=System Power Reset.
Do you think the system power reset was caused by the restart of those services?
The people who restarted the services swear that restarting them does not cause a system power reset.
Thanx in advance
03-07-2013 11:49 PM
Re: Weird restart system after restarting HP utilities
I would say that the reset was caused by not restarting the appropriate services, or at least not restarting them quickly enough.
This log event is probably the key here:
BMC IPMI Watchdog Timer Timeout: Action=System Power Reset.
A "watchdog timer" that shows up in the iLO log is a hardware element that forces the server to reset if the OS seems to be hung. It is a very simple hardware timer that counts down from a set value and resets the server when it reaches zero. An accompanying software component (probably in the hp-health driver) periodically resets the timer back to its start value, so it should never reach zero as long as the OS and the software component remain functional.
If the software component is prevented from running for any reason for a long enough time, the hardware timer will reach zero and will trigger a system power reset.
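To make the mechanism concrete, here is a toy simulation of a countdown watchdog in Python. It is only a sketch of the general idea, not how iLO or hp-health actually implement it: the class and function names, the 600-second timeout and the 30-second refresh interval are all made up for illustration.

```python
class WatchdogTimer:
    """Toy model of a hardware watchdog: counts down, fires a reset at zero."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s      # start value (hypothetical)
        self.remaining_s = timeout_s    # current countdown value
        self.reset_fired = False

    def pet(self):
        """What the OS-side agent does periodically: reload the counter."""
        if not self.reset_fired:
            self.remaining_s = self.timeout_s

    def tick(self, elapsed_s=1):
        """Advance simulated time; fire the reset when the counter hits zero."""
        if self.reset_fired:
            return
        self.remaining_s = max(0, self.remaining_s - elapsed_s)
        if self.remaining_s == 0:
            self.reset_fired = True


def simulate(timeout_s, pet_interval_s, down_from_s, down_until_s, duration_s):
    """Run second by second with the agent stopped during
    [down_from_s, down_until_s). Return the second at which the reset
    fires, or None if it never fires within duration_s."""
    wd = WatchdogTimer(timeout_s)
    for now in range(1, duration_s + 1):
        agent_running = not (down_from_s <= now < down_until_s)
        if agent_running and now % pet_interval_s == 0:
            wd.pet()
        wd.tick()
        if wd.reset_fired:
            return now
    return None


if __name__ == "__main__":
    # Agent stopped for 5 minutes, then restarted: the timer is reloaded in time.
    print(simulate(600, 30, 100, 400, 2000))
    # Agent stopped and never restarted: the timer eventually runs out.
    print(simulate(600, 30, 100, 2000, 2000))
```

In this model a short service restart is harmless, because the counter is reloaded before it runs out. A reset only happens when the agent stays down for longer than the timeout without the timer being disarmed.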
In older ProLiant models, this was known as ASR (Automatic Server Recovery) and the default time was about 10 minutes if I recall correctly.
There probably is a way for the software component to "disarm" the timer when the software component is intentionally unloaded, but for some reason that did not happen. Anyway, even if the timer could not be disarmed, there should have been several minutes in which to restart the services before the watchdog timer triggered a reset. Maybe the person who actually did the work was unaware of the watchdog timer and, for example, took a coffee break while the service was down?
I understand that you're working with second-hand information ("The people who restarted the services swear..."), but knowing the exact actions taken and the sequence of events would be important here.
The operating system and hp-health version information might be good to know too, in case your particular version has known bugs related to the watchdog timer.