DL 585 G5 NMI watchdog on RHEL5.4 64bits

Pawel Kisiel
Occasional Visitor

Hi All,

I'm having problems with a DL585 G5 running a standard Red Hat 5.4 installation. The server has 2 CPUs and 64 GB of RAM. The installation completes fine, but right after the first restart I get NMI messages:
testing NMI watchdog ... <4>WARNING: CPU#4: NMI appears to be stuck (44->49)!
time.c: Using 25.000000 MHz WALL HPET GTOD HPET/TSC timer
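
For anyone checking several machines for the same warning, here is a minimal sketch that pulls the CPU number and the two NMI counts out of a line like the one above. The regex is modelled only on the message quoted in this post, so other kernel versions may word it differently, and the function name is just illustrative.

```python
import re

# Pattern based on the warning quoted in this post; wording may differ on other kernels.
NMI_STUCK = re.compile(r"CPU#(\d+): NMI appears to be stuck \((\d+)->(\d+)\)")


def find_stuck_nmi(log_text):
    """Return a list of (cpu, count_before, count_after) for each stuck-NMI warning."""
    return [tuple(int(g) for g in m.groups()) for m in NMI_STUCK.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "testing NMI watchdog ... <4>WARNING: CPU#4: NMI appears to be stuck (44->49)!\n"
        "time.c: Using 25.000000 MHz WALL HPET GTOD HPET/TSC timer\n"
    )
    for cpu, before, after in find_stuck_nmi(sample):
        print(f"CPU {cpu}: NMI count moved from {before} to {after} during the test")
```

Feeding it the saved output of dmesg from each server gives a quick list of which CPUs failed the watchdog test.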

The server is up to date on BIOS, storage controller firmware, and iLO. I've tried toggling the Linux x86_64 HPET option in the BIOS, but I get the same error.

I've found an HP/Red Hat article describing an identical issue, but on a DL580 G5 with RHEL 4.x!?
Ref HP url: http://www13.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c01670591
Ref Redhat url: http://kbase.redhat.com/faq/docs/DOC-15444

The suggested solution, replacing nmi_watchdog with the iLO2 hp-wdt module, seems to work OK, but I'm concerned because I didn't have any problems with a DL585 G5 six months ago when installing RHEL 5.2.
I've tried installing 5.2 on this machine, but the same error occurs.

I've got 2 other DL585 G5s with identical specs, and they all show NMI errors on a default RHEL 5.4 installation.

Can anyone give me an idea of what to do next, or should I just ignore these and rely on the hp-wdt timer?

Regards,
Pawel
4 REPLIES
Steven E. Protter
Exalted Contributor

Re: DL 585 G5 NMI watchdog on RHEL5.4 64bits

Shalom Pawel,

I would not completely ignore the issue. I would set the messages aside for now and keep an eye on Red Hat's Bugzilla site. There may be a fix coming for this issue in the near future.

This is an example of Red Hat's poor quality control. This should have been caught.

But ask yourself: does it affect reliability in a meaningful way?

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Pawel Kisiel
Occasional Visitor

Re: DL 585 G5 NMI watchdog on RHEL5.4 64bits

Hi Steven,

It's a bit bizarre that all 3 servers suffer from the same NMI watchdog messages. This particular server gave me more trouble than the other 2. At first I couldn't even build it and had engineers reseat all the components, which allowed me to install the OS in the end, but it ended up with NMI errors and hard lockups, while the other 2 only show NMI messages during startup. Replacing nmi_watchdog with hp-wdt also caused a hard lockup. Today, after removing all PCIe cards (2 NICs and 2 FC HBAs), the server is stable. I put the cards back in and there are no more hard lockups, but still NMIs.

I'm setting up kdump to capture vmcores if the server becomes unstable again.
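
Since kdump only captures a vmcore if memory was reserved at boot with a crashkernel= parameter, a small check of the kernel command line can save a wasted crash. The sketch below is only an illustration: the sample command line is made up, and on a live system the string would come from /proc/cmdline instead.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags without '=' map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def kdump_ready(cmdline):
    """Return True if a non-empty crashkernel= reservation is present."""
    return bool(parse_cmdline(cmdline).get("crashkernel"))


if __name__ == "__main__":
    # Hypothetical command line for illustration; read /proc/cmdline on a real server.
    sample = "ro root=/dev/VolGroup00/LogVol00 crashkernel=128M@16M nmi_watchdog=1"
    print("crashkernel reserved:", kdump_ready(sample))
    print("nmi_watchdog setting:", parse_cmdline(sample).get("nmi_watchdog"))
```

The same parser also shows whether nmi_watchdog was set on the command line, which is handy when comparing the three servers.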

After all of this I'm hesitant to say that this is purely a hardware or software issue. If I could upgrade the other 2 servers, which I installed 6 months ago, to RHEL 5.4, that could possibly give me some clues.

Regards,
Pawel
Pawel Kisiel
Occasional Visitor

Re: DL 585 G5 NMI watchdog on RHEL5.4 64bits


I'd like to share good news! I've managed to fix the problem on my servers thanks to a clue from Red Hat support. They advised me to change the power regulator setting to OS Control; the default is HP Dynamic Power Savings Mode. Initially it was only a partial success: the "Uhuh, Dazed and confused.." message disappeared, but "NMI appears to be stuck..." was still there! I spent all of yesterday trying to get rid of that message. I managed to fix it on a server that was built with the OS Control power mode from the start, but not on already-installed servers where I only changed the power mode afterwards.

The missing and crucial step was re-installing HP-PSP after changing the power mode to OS Control!

When installing RHEL 5.2 with the default power mode settings, you will get this message:
"powernow-k8: Your BIOS does not provide _PSS objects. PowerNow! does not work on SMP systems without _PSS objects. Complain to your BIOS vendor."

Changing the power mode to OS Control gets rid of these messages.
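
To confirm the fix after changing the power mode and re-installing HP-PSP, a saved boot log can be checked for all three messages mentioned in this thread. This is a rough sketch: the substrings are copied from the messages as quoted here and may not match every kernel's exact wording, and the symptom names are my own labels.

```python
# Substrings taken from the messages quoted in this thread; exact wording can vary by kernel.
SYMPTOMS = {
    "nmi_stuck": "NMI appears to be stuck",
    "nmi_unknown": "Dazed and confused",
    "powernow_no_pss": "does not provide _PSS objects",
}


def check_boot_log(log_text):
    """Map each symptom label to True if its message appears in the log text."""
    return {name: needle in log_text for name, needle in SYMPTOMS.items()}


def is_clean(log_text):
    """Return True if none of the known symptom messages are present."""
    return not any(check_boot_log(log_text).values())


if __name__ == "__main__":
    before_fix = (
        "powernow-k8: Your BIOS does not provide _PSS objects.\n"
        "testing NMI watchdog ... <4>WARNING: CPU#4: NMI appears to be stuck (44->49)!\n"
    )
    for name, present in check_boot_log(before_fix).items():
        print(f"{name}: {'present' if present else 'absent'}")
    print("clean boot:", is_clean(before_fix))
```

Running it against the log from before and after the change makes it easy to see which of the messages the power mode change actually removed.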

Regards,
Pawel
Viktor Balogh
Honored Contributor

Re: DL 585 G5 NMI watchdog on RHEL5.4 64bits