03-21-2018 06:09 AM
DL380 Gen8 servers dying like flies
At the moment we have the feeling that the DL380 Gen8 servers are dying like flies on a hot summer day. Over the past six months or so we have had several problems that point to the onboard flash memory where the Intelligent Provisioning software is stored.
The symptoms are as follows: iLO reports initialization errors, iLO runs slow, "F10" for Intelligent Provisioning does not work, I-P does not start.
But all servers boot normally into the OS, and performance is normal as well. Only iLO and I-P are affected.
We tried the format utility for the internal NAND memory (see advisory here: https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04996097) with no success. BTW, success is sometimes relative... we had servers that worked well after that treatment but failed two days later, despite successful reboots in between.
Other problems were corrupted iLO firmware chips, which left iLO not working at all. We started replacing the tiny 8-pin EEPROM chip with a freshly programmed one. Some of these boards have kept working since the repair.
I wonder what's happening... Are the NAND Flash chips bad? Do they fail after some time? Or is it a firmware bug?
I would be glad to get any information on this... Do you have similar problems?
"If it seems illogical... you just don't have enough information"
03-21-2018 07:24 AM
Re: DL380 Gen8 servers dying like flies
The latest version of the iLO 4 firmware is 2.55.
I think its release notes answer your question about iLO 4.
The following issues are resolved in iLO 4 version 2.55:
- Masked out errant failures due to charging of storage battery.
- Implemented cell voltage separation pre-failure warning for storage battery.
- iLO RESTful API output might display incorrect power supply information.
- iLO Federation group authentication errors might occur if you repeatedly add and remove groups during a query.
- An iLO RESTful API event subscription might be lost when DELETE and CREATE subscriptions occur at the same time that the transmitter waits to retry an action.
- During CAC smartcard authentication, the iLO RESTful API returns a session URI that incorrectly contains uppercase letters.
- In certain conditions, the Rest server becomes unavailable during a GET of the IELs.
- The iLO web interface language pack redirects to English.
- The iLO RESTful API output text represents upper threshold values as lower thresholds.
- The Linux openipmi driver does not poll the receive message queue if KCS host irq not enabled.
- The iLO RESTful API EthernetInterfaces link should be under the system/1 root resource, and not in the OEM section.
- SNMPv3 Engine Boot is not getting incremented on iLO reset.
- IPMI FRU read returns incorrect completion code for response too long.
- IPMI Get PEF Capabilities returns the number of valid table entries instead of the total number of table entries.
- IPMI Set Boot Options for one time change for boot mode UEFI/Legacy fixed.
- iLO restserver suspends when patching bad payload to external provider array.
- iLO REST API returned 500 internal error for a GET of systems/1/ leading to failed One View Profile Apply.
- iLO RESTful API events are sending incorrect "Host" header when using IPv6.
- iLO time becomes Unset after update from 2.50 or prior to 2.54.
This version adds support for the following features and enhancements:
- The Self-signed SSL certificate can now be regenerated.
- New iLO RESTful API command to allow an auxiliary power cycle of the server on the next host power down.
- Added THERM_TRIP events, OS_STOP_SHUTDOWN, OS_NMI, ACPI, PCI-E Bus Error and CPU error logs to the SEL.
- Added OEM type SEL event with IML info on critical events.
- Improved reliability of Embedded Media attach and diagnostics.
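If you have many servers and want to see which ones are still below 2.55, a small script can help. This is only a rough sketch, not an HPE tool: it assumes the iLO answers on the Redfish path /redfish/v1/Managers/1/ with basic authentication and reports a "FirmwareVersion" string that looks like "iLO 4 v2.50". The function names are made up for this example.

```python
import base64
import json
import re
import urllib.request

TARGET_VERSION = (2, 55)  # the iLO 4 release discussed above


def parse_ilo_version(text):
    """Return (major, minor) from a string such as 'iLO 4 v2.55' (assumed format)."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def needs_update(firmware_version, target=TARGET_VERSION):
    """True if the reported firmware is older than the target version."""
    return parse_ilo_version(firmware_version) < target


def fetch_firmware_version(host, user, password, context=None):
    """Read FirmwareVersion from the iLO manager resource (path is an assumption).

    Pass an ssl.SSLContext as 'context' if the iLO uses a self-signed certificate.
    """
    url = f"https://{host}/redfish/v1/Managers/1/"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request, timeout=10, context=context) as response:
        return json.load(response)["FirmwareVersion"]
```

For a quick check without touching the network, call needs_update("iLO 4 v2.50") with a version string copied from the iLO web interface. The comparison assumes two-digit minor versions such as 2.50 or 2.55.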
As for Intelligent Provisioning, I don't have any information about errors in the flash memory.
What usually happens is that the boot initialization file gets corrupted and I.P. can no longer start.
Reinstall I.P. with the latest version and try to open it again.
Download link for the latest version (1.70): https://support.hpe.com/hpsc/swd/public/detail?swItemId=MTX_67851618d0f1444993f262f0f6