ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

 
Advisor


Hi,

This issue involves several DL380 and DL360 G5 servers.

I recently updated the ILO2 firmware to version 1.79 and the HP System Management Homepage to v3.0.2.77 on several DL380 and DL360 servers.

Since doing so I have had sporadic power-cycle issues, and the ILO2 log shows this message each time before the power cycle: "BMC IPMI Watchdog Timer Timeout: Action=System Power Reset".

I have dug around online and found various threads that lead nowhere in particular. They all indicate that HP is working on this and end with disabling ASR as an interim fix, which I have done, and it appears to have stopped the reboots.

My questions are:

1. What is the issue I am experiencing, and how do I fix it?

2. What exactly are ASR and BMC IPMI Watchdog Timer? And what are they used for?

Your help as ever is greatly appreciated!

Best Regards
Iain
21 REPLIES
Honored Contributor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

I'm no expert in this, but I think the recommendation (for Linux systems) is to disable ASR.

The DL585 G5 User Guide on ASR:
ASR is a feature that causes the system to restart when a catastrophic operating system error occurs, such as a blue screen, ABEND, or panic. A system fail-safe timer, the ASR timer, starts when the System Management driver, also known as the Health Driver, is loaded. When the operating system is functioning properly, the system periodically resets the timer. However, when the operating system fails, the timer expires and restarts the server.
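
To make the timer behaviour in that description concrete, here is a minimal sketch in Python. It is only a toy model, not HP's implementation: the class name `WatchdogTimer`, the helper `simulate`, the 10-second timeout, and the 5-second reset interval are all assumptions chosen for illustration.

```python
class WatchdogTimer:
    """Toy model of an ASR-style fail-safe timer (illustrative only)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s      # hypothetical timeout value
        self.remaining_s = timeout_s
        self.expired = False

    def pet(self):
        """A healthy OS/driver calls this periodically to restart the countdown."""
        if not self.expired:
            self.remaining_s = self.timeout_s

    def tick(self, elapsed_s):
        """Advance time; return the action taken on expiry, else None."""
        if self.expired:
            return None
        self.remaining_s -= elapsed_s
        if self.remaining_s <= 0:
            self.expired = True
            return "System Power Reset"
        return None


def simulate(timeout_s, pet_times, end_time):
    """Step one second at a time, resetting the timer at the given seconds.

    Returns the second at which the watchdog fired, or None if it never did.
    """
    wd = WatchdogTimer(timeout_s)
    pets = set(pet_times)
    for t in range(1, end_time + 1):
        if wd.tick(1) is not None:
            return t
        if t in pets:
            wd.pet()
    return None


# Healthy OS: the timer is reset every 5 s, so a 10 s watchdog never fires.
healthy = simulate(10, range(5, 61, 5), 60)

# Hung OS (or hung driver communication): resets stop after second 10.
hung = simulate(10, [5, 10], 60)
```

In this model the watchdog cannot tell a crashed operating system from a driver that has merely stopped resetting the timer; either way the countdown runs out and the reset action is taken.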
Honored Contributor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

By the way, someone else has answered your second question much better than I ever could :-)

http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1374922
Honored Contributor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

The problem solution requires both:
- update to iLO 2 v1.78 or later
- update the Windows Management Controller Driver to 1.11.2.0 or later

Updating one or the other is not a complete fix.

Discussed in this Customer Advisory:
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?locale=en_US&objectID=c01802766
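
As a rough sketch of the "both, not one or the other" rule above, the check below compares installed versions against the two minimums quoted in this post. The function names `parse_version` and `fix_is_complete` are hypothetical, and how you obtain the installed version strings is left out; this only illustrates the comparison.

```python
def parse_version(text):
    """Turn a dotted version such as '1.11.2.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimum versions as quoted in the post above.
MIN_ILO2_FIRMWARE = parse_version("1.78")
MIN_DRIVER = parse_version("1.11.2.0")


def fix_is_complete(ilo_firmware, driver):
    """True only if BOTH components meet their minimum version."""
    return (parse_version(ilo_firmware) >= MIN_ILO2_FIRMWARE
            and parse_version(driver) >= MIN_DRIVER)


both_updated = fix_is_complete("1.79", "1.11.2.0")
firmware_only = fix_is_complete("1.79", "1.9.0.0")
driver_only = fix_is_complete("1.70", "1.11.2.0")
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9.0.0" would sort after "1.11.2.0", which would wrongly treat an older driver as new enough.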
Occasional Visitor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

Have you tried upgrading the iLO 2 Management Controller Driver version?
New Member

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

acartes, the Advisory you cite talks about this issue occurring when the iLO2 firmware is at or below 1.70. Iain's issue (and mine, coincidentally) did not manifest until the iLO firmware was upgraded to 1.79.
I had a colleague reference the same Advisory article, but I'm not convinced the problem is solved.
New Member

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

I'm having the same issue. Mine didn't start until upgrading to 1.79!! I have updated the Instance driver, the controller driver and the system ROM. I am still getting random reboots and reports of BMC IPMI Watchdog Timer Timeout.
Occasional Contributor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

I have the same issue here on a number of DL360 G5s. They reboot at a random time after upgrading to iLO 1.79.

Has anyone tried reflashing back to 1.78?

It is suggested elsewhere that the reboot will only happen once - can anyone confirm this?

John
Occasional Advisor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

Hi,

My experience is that iLO2 Management Driver 1.11.2.0 AND iLO2 firmware 1.78 or later will fix the issue.

BR / Jimmy
Honored Contributor

Re: ILO2 firmware update causes BMC IPMI Watchdog Timer Timeout: Action=System Power Reset

>> acartes, the Advisory you cite talks about this issue occurring when the iLO2 firmware is at or below 1.70. Iain's issue (and mine, coincidentally) did not manifest until the iLO firmware was upgraded to 1.79.
>> I had a colleague reference the same Advisory article, but I'm not convinced the problem is solved.

A complete solution requires that both iLO and the OS driver are updated.
Details for each are in the CA.
The iLO change addresses a "duplicate records in the SEL" bug (generated by the OS driver); the OS driver change prevents a communication hang-up that can result in an ASR.

The iLO change was introduced in iLO 2 v1.78 and released at the same time as the OS driver update. It is a difficult fix because two pieces of software must be updated to address the issue.