BladeSystem - General

Blade POST issue/Health alert (after firmware update?)

 
Ryan22
Occasional Contributor

Hi,

I've got a problem with a blade and I'm looking for any ideas.  At the moment it won't POST at all and the System Health light is flashing red.  I'll run through the details as best I can recall:

 

The system was originally:

  • HP BL685 G1 w/ ROM 05/04/07 (405661-B21)
  • 4 x AMD Opteron
  • 20GB RAM
  • 2 x 76GB SAS
  • QLogic Fibre Mezzanine module
  • iLO Firmware v1.50? - unsure what it was exactly
  • within a c7000 Enclosure running OA v1.30


The machine had been running fine as an ESX host, for how long I'm not sure.  As part of repurposing the unit for a new project, I decided to update the firmware (big mistake?).  I connected the ISO of the HP Firmware Maintenance DVD v9.30 using iLO's Virtual Media facility and booted from it.  I ran a custom firmware update so as to see what was being updated.  I left everything to be updated, except that I removed the QLogic card, not wanting to cause any SAN issues.  The update went through mostly without a hitch, although the 2 SAS drives didn't update; I figured I would get those on a later boot.  The machine rebooted fine and got back into the VMware farm.

Part of the reason for updating the firmware was to ensure maximum compatibility with VMware features, so we rebooted into RBSU to look for any CPU virtualisation features that weren't enabled.  We found AMD Virtualisation wasn't enabled, so we turned it on and rebooted.

This is where the problem starts.  The system never reappeared in vCenter.  iLO was still working at this point, but wasn't showing any video through its consoles, so we reset the iLO.  Same issue.  OA shows the system as all green, and so does iLO.  On physically examining the machine, the red health light is flashing and the enclosure's fans are all spinning at high speed.  Removing the blade from the chassis makes the fans return to normal speed, but on putting the blade back in, the following happens:

 

  • Amber power light while enclosure detects blade
  • System Powers on
  • Insight display shows blade as green
  • All green on Blade lights
  • Eventually health light starts flashing red and fans spin up to near full speed
  • No video/POST when directly connecting to unit, nor via iLO
  • Both iLO and OA report the system is running fine.  No logs to show issues whatsoever.

 

Things I've tried:

 

  • Reseat RAM - no joy
  • Removed 2 x CPU and their RAM - no joy
  • Also removed QLogic - no joy
  • Reset Configuration using DIP switch - no joy
  • Tried to enable Redundant ROM, but it doesn't appear to switch to it (going by the iLO readout) - no joy
  • Reverted from iLO v2.05 firmware to iLO v1.50.  After doing this, iLO was showing red for health, but after another blade reseat iLO reports everything as OK again - no joy

 

Another weird thing I've noticed is that the iLO System Information screen still shows all 4 CPUs and the original RAM, even though I've removed 2 of the CPUs and their RAM.
Does anyone have any ideas of what to try with this box?  I know replacing the system board is one option, but as the machine is out of warranty, that is probably going to be prohibitively expensive for what is really only a test machine.  I find it odd that the server was healthy enough after the firmware update to boot into VMware and then back into RBSU, but after that one setting change, bang, broken.  Is it possibly just bad luck or timing, and really a general system board failure?