04-26-2019 01:36 AM
An Unrecoverable System Error (NMI) has occurred (iLO application watchdog timeout NMI, Service Information: 0x0000002B, 0x00000000)
Hi all,
We have a DL380p Gen8 server running Red Hat Enterprise Linux Server release 6.2. This is the second time the server has crashed and rebooted, with the following IML entries:
ASR 04/24/2019 22:00 04/24/2019 22:00 1 ASR Detected by System ROM
System Error 04/24/2019 21:58 04/24/2019 21:58 1 An Unrecoverable System Error (NMI) has occurred (iLO application watchdog timeout NMI, Service Information: 0x0000002B, 0x00000000)
Syslog:
========================================
Apr 24 22:01:54 prs3-tir hp-ams[2834]: hpHelper Started . .
Apr 24 22:02:08 prs3-tir hpasmlited[2895]: hpDeferSPDThread: Starting thread to collect DIMM SPD Data.
Apr 24 22:02:08 prs3-tir hpasmlited[2895]: Initialize data structures succesful
Apr 24 22:02:13 prs3-tir hp-ams[2834]: CRITICAL: An Unrecoverable System Error (NMI) has occurred (iLO application watchdog timeout NMI, Service Information: 0x0000002B, 0x00000000)
Apr 24 22:02:14 prs3-tir hp-ams[2834]: CRITICAL: ASR Detected by System ROM
Apr 24 22:02:15 prs3-tir hpasrd[2922]: Starting with poll 1 and timeout 600
Apr 24 22:02:15 prs3-tir hpasrd[2922]: Setting the watchdog timer.
Apr 24 22:02:15 prs3-tir hpasrd[2922]: Found iLO memory at 0xf7df0000.
Apr 24 22:02:15 prs3-tir hpasrd[2922]: Successfully mapped device.
Apr 24 22:02:15 prs3-tir cmanicd: Entering iml_log_link_up(slot: 0, port: 1)
Apr 24 22:02:15 prs3-tir cmanicd: Entering get_event_id(slot: 0, port: 1
Apr 24 22:02:47 prs3-tir hpasmlited[2895]: hpDeferSPDThread: End of Collecting DIMM SPD data.
Apr 24 22:02:48 prs3-tir cmanicd: Existing event id(4) found for the slot and port.
Apr 24 22:02:48 prs3-tir cmanicd: Entering repair_iml_event(slot: 0, port: 1, event: 4)
Apr 24 22:02:48 prs3-tir cmanicd: Entering read_iml_event(slot: 0, port: 1, eventid: 4)
Apr 24 22:02:48 prs3-tir cmanicd: Calling ioctl() to read event id: 4)
Apr 24 22:02:48 prs3-tir cmanicd: Successfully read the event id: 4)
Apr 24 22:02:48 prs3-tir cmanicd: Trying to repair the existing IML Event.
Apr 24 22:02:48 prs3-tir cmanicd: Successfully repaired the IML Event.
Apr 24 22:02:48 prs3-tir cmanicd: Returning from repair_iml_event().
=======================================================
Has anyone faced this issue, and does anyone know how to resolve it?
Thanks in advance,
Florant
05-02-2019 06:41 AM
Solution
Hi Florant,
This NMI appears to be a known issue with RHEL:
An Unrecoverable System Error (NMI) has occurred (iLO application watchdog timeout NMI, Service Information: 0x0000002B, 0x00000000)
Please see the advisory below from Red Hat:
https://access.redhat.com/solutions/1309033
IML log has the following entry:
An Unrecoverable System Error (NMI) has occurred (System error code 0x0000002B, 0x00000000)
Resolution
By default, systemd starts a watchdog timer on shutdown. Disable ShutdownWatchdogSec to resolve this issue. To disable it, open the /etc/systemd/system.conf file and find the following line:
#ShutdownWatchdogSec=10min
Change it to:
ShutdownWatchdogSec=0
Save the file, then run:
# systemctl daemon-reexec
so that systemd picks up the updated configuration, or reboot the system.
NOTE: You may also wish to look at RuntimeWatchdogSec in the same file. It is disabled by default; please do not enable it without a specific reason for doing so.
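For anyone who wants to script the edit described above, here is a minimal Python sketch, not an official Red Hat or HPE tool. It assumes a systemd-based host (RHEL 7 or later; RHEL 6 uses Upstart and has no /etc/systemd/system.conf), and the function name disable_shutdown_watchdog is hypothetical. It only transforms the text of the file; reading and writing the file and running systemctl daemon-reexec are left to you.

```python
import re

# Matches the ShutdownWatchdogSec line whether or not it is commented out.
_SHUTDOWN_WATCHDOG = re.compile(
    r"^[ \t]*#?[ \t]*ShutdownWatchdogSec[ \t]*=.*$", re.MULTILINE
)


def disable_shutdown_watchdog(conf_text: str) -> str:
    """Return the system.conf text with ShutdownWatchdogSec set to 0.

    Hypothetical helper for illustration only.
    """
    new_line = "ShutdownWatchdogSec=0"
    if _SHUTDOWN_WATCHDOG.search(conf_text):
        # Rewrite every existing occurrence (commented or active).
        return _SHUTDOWN_WATCHDOG.sub(new_line, conf_text)
    # Assumption: the file ends inside the [Manager] section, which is the
    # only section in a stock system.conf, so appending is safe.
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"
```

A typical use would be to read /etc/systemd/system.conf as root, pass its contents through the function, write the result back (keeping a backup copy), and then run systemctl daemon-reexec as described in the advisory.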
--------------------------------------------------------------------------------------------------------------------------------------
If the issue still persists, we recommend logging a case with Red Hat.
If you need further troubleshooting from the hardware side, kindly log a case with HPE and share all the logs (AHS and sosreport).
Regards,
Sangam.