HPE BladeSystem Management Software

Problems with OA 4.11?

Occasional Advisor

Problems with OA 4.11?

I recently upgraded all 6 of my C7000s from OA 4.01 to OA 4.11. 

 

A week later, two of them, at different times, seemed to lose access to their blades (I have 16 in each) and then regained it within 10 seconds.

The alert mail shows this for each blade:

EVENT (30 Mar 13:31): Blade in bay 8 status has changed to: Failed.

 

Blade, "CUTTER", has changed from OK to Failed.

Only to get it right back

 

EVENT (30 Mar 13:32): Blade in bay 8 status has changed to: OK.

 

Blade, "CUTTER", has changed from Failed to OK.

 

This happened again this weekend, once per blade server, a couple of hours apart. It's always the same bladecenters.

The individual blades stay up. This seems to be an iLO connectivity issue, but I don't know why. Any ideas or help would be greatly appreciated.
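
For anyone collecting these alert mails, here is a minimal Python sketch that pairs each "Failed" event with the following "OK" event for the same bay, so the flaps can be listed per enclosure. The function name `find_flaps`, the 5-minute window, and the year 2014 are assumptions for illustration (the mails carry no year); only the EVENT line format is taken from the mails quoted above.

```python
import re
from datetime import datetime

# Matches alert-mail lines such as:
#   EVENT (30 Mar 13:31): Blade in bay 8 status has changed to: Failed.
EVENT_RE = re.compile(
    r"EVENT \((\d{1,2} \w{3} \d{2}:\d{2})\): "
    r"Blade in bay (\d+) status has changed to: (\w+)\."
)

def find_flaps(lines, year=2014, max_gap_minutes=5):
    """Return (bay, failed_at, ok_at) for each Failed -> OK pair
    that recovers within max_gap_minutes.

    The year is an assumption: the alert mails do not include one.
    """
    failed_at = {}  # bay number -> time of the last unmatched "Failed"
    flaps = []
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue
        stamp = datetime.strptime(f"{m.group(1)} {year}", "%d %b %H:%M %Y")
        bay, status = int(m.group(2)), m.group(3)
        if status == "Failed":
            failed_at[bay] = stamp
        elif status == "OK" and bay in failed_at:
            start = failed_at.pop(bay)
            if (stamp - start).total_seconds() <= max_gap_minutes * 60:
                flaps.append((bay, start, stamp))
    return flaps
```

Feeding it the body text of the saved mails (one line per list entry) gives one tuple per flap, which makes it easier to see whether all bays in an enclosure drop at the same minute.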

16 REPLIES
Occasional Advisor

Re: Problems with OA 4.11?

Update: It happened again this morning. All 16 blades report the same:

 


Apr  1 06:30:34  OA: Management Processor on Blade 8 appears unresponsive.
Apr  1 06:30:34  OA: Management Processor on Blade 9 appears unresponsive.
<snip>
Apr  1 06:30:59  OA: Management Process on Blade 4 appears responsive again.
Apr  1 06:30:59  OA: Management Process on Blade 2 appears responsive again.

<etc>
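
To measure how long each iLO stays unreachable, a small Python sketch like the one below can pair the "appears unresponsive" and "appears responsive again" syslog lines per blade. This is only an illustration: `outage_durations` is a made-up name, the year 2014 is assumed because the OA syslog does not print one, and the pattern accepts both "Management Processor" and "Management Process" since both spellings appear in the log above.

```python
import re
from datetime import datetime

# Matches OA syslog lines such as:
#   Apr  1 06:30:34  OA: Management Processor on Blade 8 appears unresponsive.
#   Apr  1 06:30:59  OA: Management Process on Blade 4 appears responsive again.
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+OA: "
    r"Management Process(?:or)? on Blade (\d+) appears "
    r"(unresponsive|responsive again)\."
)

def outage_durations(lines, year=2014):
    """Return (outages, still_down).

    outages    -- list of (blade, seconds) for each completed outage
    still_down -- sorted blade numbers with no recovery line yet
    The year is an assumption: the syslog timestamps do not include one.
    """
    down_since = {}  # blade number -> time it first went unresponsive
    outages = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        mon, day, clock, blade, state = m.groups()
        stamp = datetime.strptime(f"{mon} {day} {clock} {year}", "%b %d %H:%M:%S %Y")
        blade = int(blade)
        if state == "unresponsive":
            # Keep the earliest timestamp if the line repeats.
            down_since.setdefault(blade, stamp)
        elif blade in down_since:
            seconds = (stamp - down_since.pop(blade)).total_seconds()
            outages.append((blade, seconds))
    return outages, sorted(down_since)
```

Run against a full syslog export, the second return value shows at a glance which blades never came back.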

Trusted Contributor

Re: Problems with OA 4.11?

Stuart,

 

Which iLO generation (2, 3, 4), and what version of iLO firmware is on them?

 

Ken

Occasional Advisor

Re: Problems with OA 4.11?

The iLO 2 blades are at 2.23

and the iLO 3 blades are at 1.55, 1.65 or 1.70.

thanks

 

HPE Pro

Re: Problems with OA 4.11?

Send me a showall for each c7000 in a private message ;)

I work for HP
A quick resolution to technical issues for your HP Enterprise products is just a click away HP Support Center Knowledge-base
See Self Help Post for more details

Honored Contributor

Re: Problems with OA 4.11?

I can tell you that since I updated my blade enclosure to SPP 2014.02 through OneView, I have had similar things happen twice in 3 weeks, with no further messages afterwards. There are 3 Gen8 blades in the enclosure, all updated to the latest SPP. If I receive the message again I will post the exact information.

--------------------------------------------------------------------------------
If my post was useful, click on my KUDOS! "White Star"!
My blog: http://blog.bitcon.be
Respected Contributor

Re: Problems with OA 4.11?

I am seeing the same thing on the two chassis' worth of BL465c Gen8 blades that I've updated to SPP 2014.02. I left the OA at 4.01. It looks like it's something with the 1.40 firmware.

Occasional Advisor

Re: Problems with OA 4.11?

HP tells me that I have to run SPP 2014.02 on all my blades and my problems will go away... Hmm.

 

HPE Pro

Re: Problems with OA 4.11?

For those who have problems, send me a showAll report by private message.


Honored Contributor

Re: Problems with OA 4.11?

Stuart,

 

My entire enclosure is running 2014.02 (OA, VC and blades) and I see these issues as well....

 

 

Kr,

Bart

Occasional Advisor

Re: Problems with OA 4.11?

I have lost iLO access (server unresponsive) to almost all servers in 20 x c7000 enclosures.

 

Different servers (BL460c/BL490c/BL2x220c from G1 to G7).

iLO 2.23

OA 4.01/4.11/4.20

 

It happened after an unsuccessful network firmware upgrade (with SPP 2014.02) on some BL460c G1 servers.

NIC configuration data was lost (including MAC addresses, etc.), and the NIC adapter is disabled.

 

This is a known bug in SPP 2014.02.

http://h30499.www3.hp.com/t5/ProLiant-Servers-ML-DL-SL/HP-Proliant-DL380-G5-NIC-s-not-found-after-firmware-update/td-p/6256615

 

It looks like the corrupted NICs sent some incorrect packets, or there is a MAC address conflict, which has almost destroyed the entire management network!!!

 

I can recover servers only by issuing the "reset server" command on the OA modules. But I can't hard restart hundreds of servers!

 

Servers which were not restarted by "reset server" are still in the "Critical error"/"Unknown" state.

 

Occasional Advisor

Re: Problems with OA 4.11?

Fantastic!

All iLO 2 interfaces in our management VLAN are stuck!

Even non-blade servers!

 

# hponcfg
HP Lights-Out Online Configuration utility
Version 4.3.0 Date 12/10/2013 (c) Hewlett-Packard Company, 2014
ERROR: Error communicating with ILO
ERROR: Unable to communicate with the Management Processor.

 

I can't restart iLO 2 without completely removing power from the servers!

 

P.S. iLO 3 was not affected.

HPE Pro

Re: Problems with OA 4.11?

Hi All

 

OA v4.11 and v4.20 contain an OpenSSL version that is vulnerable to Heartbleed.

 

iLOs are NOT vulnerable, as they don't use SSL/TLS libraries that contain the TLS heartbeat extension. BUT we are receiving reports that scripts which test for the Heartbleed bug are causing iLO 2 to stop responding, and the blades have to be re-fused (e-fuse reset) to recover iLO 2 functionality.

 


Occasional Visitor

Re: Problems with OA 4.11?

I was being flooded with lots and lots of these messages until I turned SIM off.  Then they all stopped.  Working with HP Support now to be able to use SIM again.

 

Ben

Respected Contributor

Re: Problems with OA 4.11?

I upgraded the two chassis that I was seeing the alerts from to firmware 4.21 and it looks like that finally stopped the "Blade, "xxxx", has changed from Failed to OK." flapping.

Occasional Visitor

management processor on blade 6 appears unresponsive

Hi,

The OA syslog shows the following:

Jun 12 04:14:48 OA: Management Processor on Blade 6 appears unresponsive.

Jun 12 04:14:58 OA: Management Process on Blade 6 appears responsive again.

My OA firmware version is 4.21.

Please advise: what is the root cause?

Occasional Advisor

Re: management processor on blade 6 appears unresponsive

Hi guys,

I have had the same problem for the last 2-3 days with my Onboard Administrator (OA).

It has happened twice in the morning: when I come to work, our server room is too noisy. When I log in to the OA, I see my blade servers with critical errors (Management Processor: Error - lost communication with iLO), and their fans are spinning at 98%.

Rebooting is not a solution, but when I pull the OA module out of the chassis and reinsert it, everything is OK.

Below are the OA system logs from when I had the problem:

Aug 17 04:07:54 OA: Management Processor on Blade 1 appears unresponsive.
Aug 17 04:08:04 OA: Management Processor on Blade 2 appears unresponsive.
Aug 17 04:08:14 OA: Management Processor on Blade 3 appears unresponsive.
Aug 17 07:38:28 OA: Authentication failure for user admin from 192.168.11.203, requesting web service
Aug 17 07:38:46 OA: admin logged into the Onboard Administrator from 192.168.11.203
Aug 17 08:29:45 OA: Blade removed from bay 3
Aug 17 08:29:45 OA: Blade inserted in bay 3
Aug 17 08:29:45 OA: Blade in bay #3 status changed to OK
Aug 17 08:29:45 OA: Management Processor on Blade 3 appears unresponsive.
Aug 17 08:31:45 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 08:33:45 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 08:44:19 OA: Blade removed from bay 3
Aug 17 08:44:19 OA: Blade inserted in bay 3
Aug 17 08:44:19 OA: Blade in bay #3 status changed to OK
Aug 17 08:46:19 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 08:47:37 OA: Management Processor on Blade 3 appears unresponsive.
Aug 17 08:48:19 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 09:31:39 OA: Blade removed from bay 3
Aug 17 09:31:40 OA: Blade inserted in bay 4
Aug 17 09:31:40 OA: Blade in bay #4 status changed to OK
Aug 17 09:33:40 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 09:35:04 OA: Management Processor on Blade 4 appears unresponsive.
Aug 17 09:35:40 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 09:44:11 OA: Blade removed from bay 4
Aug 17 09:44:11 OA: Blade inserted in bay 4
Aug 17 09:44:11 OA: Blade in bay #4 status changed to OK
Aug 17 09:46:11 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 09:47:35 OA: Management Processor on Blade 4 appears unresponsive.
Aug 17 09:48:11 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 10:37:36 OA: PowerDelay server settings have been changed.
Aug 17 10:39:57 OA: Blade removed from bay 4
Aug 17 10:39:57 OA: Blade inserted in bay 4
Aug 17 10:39:57 OA: Blade in bay #4 status changed to OK
Aug 17 10:41:58 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 10:43:21 OA: Management Processor on Blade 4 appears unresponsive.
Aug 17 10:43:58 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 10:44:54 OA: PowerDelay server settings have been changed.
Aug 17 10:57:36 OA: Onboard Administrator is rebooting
Aug 17 10:58:16 Kernel: Network link is up at 100Mbps - Full Duplex
Aug 17 10:58:17 OA: Time zone changed to GMT+2
Aug 17 10:58:19 OA: LCD Status is: OK.
Aug 17 10:58:21 Enclosure-Link: Service started
Aug 17 10:58:22 OA: Onboard Administrator booted successfully
Aug 17 10:58:31 Enclosure-Link: Initial topology scan completed successfully
Aug 17 10:59:48 OA: admin logged into the Onboard Administrator from 192.168.11.203
Aug 17 11:00:32 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 11:01:24 OA: Management Processor on Blade 1 appears unresponsive.
Aug 17 11:01:30 OA: Management Processor on Blade 2 appears unresponsive.
Aug 17 11:01:42 OA: Management Processor on Blade 4 appears unresponsive.
Aug 17 11:02:56 OA: Blade removed from bay 4
Aug 17 11:03:13 OA: Blade inserted in bay 3
Aug 17 11:03:13 OA: Blade in bay #3 status changed to OK
Aug 17 11:04:56 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 11:06:31 OA: Management Processor on Blade 3 appears unresponsive.
Aug 17 11:06:56 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 11:12:51 Kernel: Network link is up at 100Mbps - Full Duplex
Aug 17 11:12:53 OA: Time zone changed to GMT+2
Aug 17 11:12:54 OA: Server Power Reduction Mode - Enabled
Aug 17 11:12:55 OA: LCD Status is: OK.
Aug 17 11:12:56 Enclosure-Link: Service started
Aug 17 11:12:58 OA: Onboard Administrator booted successfully
Aug 17 11:13:06 Enclosure-Link: Could not acquire bottom enclosure's UUID. Cannot set RUID.
Aug 17 11:13:06 Enclosure-Link: Initial topology scan completed successfully
Aug 17 11:13:08 OA: Blade 3 is reporting nominal health status.
Aug 17 11:13:08 OA: Blade in bay #3 status changed to OK
Aug 17 11:13:13 Enclosure-Link: RUID recovered: 09CZC9027XJ3
Aug 17 11:13:28 OA: Blade 1 is reporting nominal health status.
Aug 17 11:13:28 OA: Blade in bay #1 status changed to OK
Aug 17 11:13:32 OA: Blade 2 is reporting nominal health status.
Aug 17 11:13:32 OA: Blade in bay #2 status changed to OK
Aug 17 11:13:36 OA: Server Power Reduction - Deactivated
Aug 17 11:13:36 OA: Server Power Reduction Mode - Disabled
Aug 17 11:13:40 OA: Server blade in bay 3 has been powered on
Aug 17 11:13:40 OA: Blade 3 is properly cooled.
Aug 17 11:14:09 OA: Blade in bay #3 status changed to OK
Aug 17 11:14:25 OA: admin logged into the Onboard Administrator from 192.168.11.203
Aug 17 11:14:29 OA: Blade in bay #1 status changed to OK
Aug 17 11:14:33 OA: Blade in bay #2 status changed to OK
Aug 17 11:15:08 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 11:15:08 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 17 11:17:08 Alertmail: Failed to send AlertMail to kiro@elkabel.bg

Aug 18 15:35:48 Kernel: Network packet flooding detected.
Aug 18 15:35:50 Kernel: Network packet flooding detected.
Aug 18 15:35:55 Kernel: Network packet flooding detected.
Aug 18 15:36:02 Kernel: Network packet flooding detected.
Aug 18 20:28:34 OA: Management Processor on Blade 1 appears unresponsive.
Aug 18 20:28:44 OA: Management Processor on Blade 2 appears unresponsive.
Aug 18 20:28:54 OA: Management Processor on Blade 3 appears unresponsive.
Aug 19 07:43:16 OA: admin logged into the Onboard Administrator from 192.168.11.203
Aug 19 07:51:20 OA: Onboard Administrator is rebooting
Aug 19 07:52:00 Kernel: Network link is up at 100Mbps - Full Duplex
Aug 19 07:52:00 OA: Time zone changed to GMT+2
Aug 19 07:52:01 OA: LCD Status is: OK.
Aug 19 07:52:03 Enclosure-Link: Service started
Aug 19 07:52:04 OA: Onboard Administrator booted successfully
Aug 19 07:52:13 Enclosure-Link: Initial topology scan completed successfully
Aug 19 07:53:13 OA: admin logged into the Onboard Administrator from 192.168.11.203
Aug 19 07:54:14 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 19 07:55:07 OA: Management Processor on Blade 1 appears unresponsive.
Aug 19 07:55:13 OA: Management Processor on Blade 2 appears unresponsive.
Aug 19 07:55:19 OA: Management Processor on Blade 3 appears unresponsive.
Aug 19 08:15:31 Kernel: Network link is up at 100Mbps - Full Duplex
Aug 19 08:15:32 OA: Time zone changed to GMT+2
Aug 19 08:15:34 OA: Server Power Reduction Mode - Enabled
Aug 19 08:15:36 OA: LCD Status is: OK.
Aug 19 08:15:38 Enclosure-Link: Service started
Aug 19 08:15:41 OA: Onboard Administrator booted successfully
Aug 19 08:15:47 Enclosure-Link: Initial topology scan completed successfully
Aug 19 08:15:47 OA: admin logged into the Onboard Administrator from 192.168.11.203
Aug 19 08:16:10 OA: Blade 1 is reporting nominal health status.
Aug 19 08:16:10 OA: Blade in bay #1 status changed to OK
Aug 19 08:16:15 OA: Blade 2 is reporting nominal health status.
Aug 19 08:16:15 OA: Blade in bay #2 status changed to OK
Aug 19 08:16:16 OA: Server Power Reduction - Deactivated
Aug 19 08:16:16 OA: Server Power Reduction Mode - Disabled
Aug 19 08:16:20 OA: Blade 3 is reporting nominal health status.
Aug 19 08:16:20 OA: Blade in bay #3 status changed to OK
Aug 19 08:17:10 OA: Blade in bay #1 status changed to OK
Aug 19 08:17:15 OA: Blade in bay #2 status changed to OK
Aug 19 08:17:20 OA: Blade in bay #3 status changed to OK

Aug 19 08:17:48 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
Aug 19 08:18:08 Alertmail: Failed to send AlertMail to kiro@elkabel.bg
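
With a log this long it can help to tally the messages by source before reading line by line. The Python sketch below is one possible way to do that, not an HP tool: `tally` is a hypothetical helper, and it simply replaces every number in the message with `N` so that "Blade 1 ..." and "Blade 2 ..." count as the same kind of event.

```python
import re
from collections import Counter

# Splits an OA syslog line into its source (OA, Kernel, Alertmail,
# Enclosure-Link) and the message text after the timestamp.
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+([\w-]+):\s+(.*)$"
)

def tally(lines):
    """Count log messages per (source, normalised message)."""
    counts = Counter()
    for line in lines:
        m = SYSLOG_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything that is not a log entry
        source, message = m.groups()
        # Replace numbers so per-blade and per-bay messages group together.
        counts[(source, re.sub(r"\d+", "N", message))] += 1
    return counts

# Example: print the most frequent message types first.
# for (source, message), n in tally(open("oa_syslog.txt")).most_common():
#     print(n, source, message)
```

The file name `oa_syslog.txt` in the commented example is a placeholder for wherever the syslog was saved. In a log like the one above, the tally makes the "Network packet flooding detected" entries stand out next to the unresponsive-processor messages.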

If anyone has experience with similar errors, please let me know what has to be done.

Thank you in advance