BladeSystem - General

BL870c i2 iLO3 Firmware Upgrade issue.

 
CDMTEX
Regular Visitor

BL870c i2 iLO3 Firmware Upgrade issue.

I have a BL870c i2 in a lab environment on which I have been testing upgrades/downgrades of the iLO3 firmware.  I noticed something yesterday during my testing and am looking to see if anyone else has seen this before I open a case with HP.

 

Via a telnet connection into the iLO I am watching "Live Events" during the upgrade (which is done via the iLO web interface), and all looks good:

 

2     ILO  5          2  4080230D40E10003 0000000000000000 FW_UPDATE_START
                                                           27 Aug 2013 12:53:07
3     ILO  6          2  4080230D50E10005 0000000000000000 FW_UPDATE_START
                                                           27 Aug 2013 12:53:09
4     ILO  6          2  4080231250E10007 0000000000000000 FW_UPDATE_SUCCESS
                                                           27 Aug 2013 12:57:40
5     ILO  5          2  4080231240E10009 0000000000000000 FW_UPDATE_SUCCESS
                                                           27 Aug 2013 12:57:40
6     ILO  5          1  4980257D40E1000B 4554414450555746 SYS_CTL_REQ_ENABLE
                                                           27 Aug 2013 12:57:42
7     ILO  6          2  4080256950E1000D 0000000000000000 ILO_SOFT_RESET
                                                           27 Aug 2013 12:57:42
8     ILO  5          2  4080256940E1000F 0000000000000000 ILO_SOFT_RESET
                                                           27 Aug 2013 12:57:44
9     ILO  5          2  408022AE40E10011 0000000000000000 ILO_IS_BOOTING
                                                           27 Aug 2013 12:58:15
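For anyone who wants to watch the same thing, this is roughly how I get to that view (standard Integrity iLO 3 MP menus; the iLO address below is just an example and the exact option keys can vary slightly between firmware revisions):

telnet 10.0.0.10            # example iLO address - log in as an MP user
# From the MP Main Menu:
#   SL   -> Show Event Logs
#   then pick the "Live Events" view from the SL sub-menu and leave it running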

 

At the same time I am tailing "syslog.log" on the HP-UX OS instance running on the BL870c i2, and after the iLO reset I noticed the entries shown in the excerpt below.
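The HP-UX side of the monitoring is nothing fancy, just a plain tail on the standard syslog path:

# run on the HP-UX instance on the blade
tail -f /var/adm/syslog/syslog.log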

 

 

Aug 27 07:58:34 t07hvs00001 hpvmnetd[2757]: Detecting PNIC state change from UP to DOWN, delete vsw_234
Aug 27 07:58:36 t07hvs00001 vmunix: avioVswitchSnapCam: switch 0x2 not found.
Aug 27 07:58:36 t07hvs00001 vmunix: hssn_vswitch_create_handler:L1495  hiftp HSSN_DEAD.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/0.0x11 (/dev/fclp2) : detected that device id 0xffffff, PWWN 0x500507680140bb88 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: DIAGNOSTIC SYSTEM WARNING:
Aug 27 07:59:10 t07hvs00001 vmunix:    The diagnostic logging facility is no longer receiving excessive
Aug 27 07:59:10 t07hvs00001 vmunix:    errors .  1  error entries were lost.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/0.0x11 (/dev/fclp2) : detected that device id 0xffffff, PWWN 0x500507680140bb89 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/0 (/dev/fclp0) : detected that device id 0xffffff, PWWN 0x500507680140bb88 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/0 (/dev/fclp0) : detected that device id 0xffffff, PWWN 0x500507680140bb89 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/0.0x12 (/dev/fclp4) : detected that device id 0xffffff, PWWN 0x500507680140bb88 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/0.0x12 (/dev/fclp4) : detected that device id 0xffffff, PWWN 0x500507680140bb89 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: class : tgtpath, instance 2
Aug 27 07:59:10 t07hvs00001 vmunix: class : tgtpath, instance 3
Aug 27 07:59:10 t07hvs00001 vmunix: Target path (class=tgtpath, instance=2) has gone offline.  The target path h/w path is 0/0/0/5/0/0/0.0x500507680140bb88
Aug 27 07:59:10 t07hvs00001 vmunix:
Aug 27 07:59:10 t07hvs00001 vmunix: Target path (class=tgtpath, instance=3) has gone offline.  The target path h/w path is 0/0/0/5/0/0/0.0x500507680140bb89
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/1.0x11 (/dev/fclp3) : detected that device id 0xffffff, PWWN 0x500507680130bb88 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/1.0x11 (/dev/fclp3) : detected that device id 0xffffff, PWWN 0x500507680130bb89 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/1.0x12 (/dev/fclp5) : detected that device id 0xffffff, PWWN 0x500507680130bb88 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix:
Aug 27 07:59:10 t07hvs00001 vmunix: fclp driver at 0/0/0/5/0/0/1.0x12 (/dev/fclp5) : detected that device id 0xffffff, PWWN 0x500507680130bb89 is offline.
Aug 27 07:59:10 t07hvs00001 vmunix: class : tgtpath, instance 5
Aug 27 07:59:10 t07hvs00001 vmunix: Target path (class=tgtpath, instance=5) has gone offline.  The target path h/w path is 0/0/0/5/0/0/1.0x500507680130bb89
Aug 27 07:59:10 t07hvs00001 vmunix:
Aug 27 07:59:10 t07hvs00001 vmunix: class : tgtpath, instance 4
Aug 27 07:59:10 t07hvs00001 vmunix: Target path (class=tgtpath, instance=4) has gone offline.  The target path h/w path is 0/0/0/5/0/0/1.0x500507680130bb88
Aug 27 07:59:10 t07hvs00001 vmunix: WARNING:  Failed to find optimal path for 0x1000002.
Aug 27 07:59:10 t07hvs00001 vmunix: Marking the device 0x1000002 offline.
...
Aug 27 07:58:57 t07hvs00001 hpvmnetd[2757]: Detecting PNIC state change from DOWN to UP, recreate vsw_234
Aug 27 07:59:10 t07hvs00001 vmunix: LVM: NOTICE: VG 64 0x000000: LV 8: All I/O requests to this LV that were
Aug 27 07:59:10 t07hvs00001 vmunix: LVM: NOTICE: VG 64 0x000000: LV 7: All I/O requests to this LV that were
Aug 27 07:59:10 t07hvs00001 vmunix: class : tgtpath, instance 5
Aug 27 07:59:10 t07hvs00001 vmunix: Target path (class=tgtpath, instance=5) has gone online.  The target path h/w path is 0/0/0/5/0/0/1.0x500507680130bb89
Aug 27 07:59:10 t07hvs00001 vmunix:     waiting indefinitely for an unavailable PV have now completed.
Aug 27 07:59:10 t07hvs00001  above message repeats 10 times
Aug 27 07:59:10 t07hvs00001 vmunix:
Aug 27 07:58:58 t07hvs00001 hpvmnetd[2757]: Detecting PNIC state change from DOWN to UP, recreate vsw_234
Aug 27 07:59:10 t07hvs00001 vmunix: class : tgtpath, instance 4
Aug 27 07:59:10 t07hvs00001 vmunix: Target path (class=tgtpath, instance=4) has gone online.  The target path h/w path is 0/0/0/5/0/0/1.0x500507680130bb88


So it looks like for about 30 seconds during the iLO reboot the blade lost all of its I/O with the chassis.  I noticed it a few more times further back in "syslog.log", and those occurrences were around the same time I had completed other upgrades/downgrades.  I have been bouncing between 1.30.30 (currently in production) and 1.55.02 (what we plan to upgrade to at this time).
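In case it helps anyone reproduce this, these are the sort of commands I run afterwards to confirm that the FC paths, LVM and the Integrity VM vswitch all came back cleanly (the device files and vswitch name are the ones from my log excerpt, so treat them as examples for your own system):

# FC link/port state for the affected HBA ports
fcmsutil /dev/fclp0
fcmsutil /dev/fclp2

# LUN paths should show up online again
ioscan -fnC fc
ioscan -m dsf

# LVM view of the PVs that were briefly unavailable
vgdisplay -v | grep -E "PV Name|PV Status"

# Integrity VM vswitch that hpvmnetd deleted/recreated
hpvmnet -S vsw_234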

 

Has anyone else seen this before?

 

4 REPLIES
Torsten.
Acclaimed Contributor

Re: BL870c i2 iLO3 Firmware Upgrade issue.

Since you should NEVER update the iLO alone, you should ALWAYS arrange downtime for firmware upgrades.

 

I always power off the blade, then update the iLO and system firmware (they belong together!), then power on the blade, go to the EFI shell and update the LOM, Smart Array and mezzanine cards.
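For the EFI part it is roughly this from the EFI Shell (the updater file name below is only a placeholder; the real file name and options come with the firmware bundle you download, so check its release notes):

Shell> map -r                  # re-scan and map the file systems
Shell> fs0:                    # switch to the USB key/partition holding the bundle
fs0:\> dir                     # locate the vendor-supplied EFI updater
fs0:\> firmware_update.efi     # placeholder - run the updater shipped with the bundle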

 

Sometimes older iLO firmware may become unresponsive; in this case you need to reset the whole server before updating the firmware, because the "dead" iLO cannot do anything.

 

 

 

If you find "online update" in the notes, be aware they say you can install the firmware while the OS is up, but you need to restart the server shortly after to make it active. So this is not really "online"...


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

CDMTEX
Regular Visitor

Re: BL870c i2 iLO3 Firmware Upgrade issue.

Torsten, I'm with you on upgrading both at the same time; of course, getting downtime in our environment is like pulling teeth, and we do not have spare capacity (blades) to migrate the workloads between.  With these Itanium blades the downtime is even longer, since you have to power them off for the SYS FW update to take place; I tested that by itself too.  I shiver to think how long it will take to update a BL890c i2.  :-)

 

I was just testing the iLO part to see what might happen, and you pretty much answered my question.  It is all or nothing.

 

Thanks!

Chris

Torsten.
Acclaimed Contributor

Re: BL870c i2 iLO3 Firmware Upgrade issue.

This is what we do.

 

- Clone the OS using DRD (a rough command sketch follows this list)

- patch/upgrade the clone

- switch the cluster on Saturday night and power off the servers (a couple of blades: BL860, 870, 890, all i2)

- update all the blade firmware in the enclosure (takes about 20 min per server, no matter how many blades the server has)

- boot the server to EFI and update LOM, smart array and mezzanines

- update interconnects and OA

- boot everything
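The DRD part is roughly this (the target disk, depot path and bundle name below are only examples, adjust them to your own system):

# clone the running root disk onto a spare disk
drd clone -v -x overwrite=true -t /dev/disk/disk5

# install the patches/software into the inactive clone
drd runcmd swinstall -s /var/depot/patches PATCH_BUNDLE

# make the clone the boot disk (activate now, reboot later during the downtime)
drd activate -x reboot=false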

 

 

Depending on the interconnects this is done in around 1 hour (interconnects may need more time, but some can be done online).

 

Updating the blade firmware online (in place) saves you around 10 minutes, but be very careful with the reboot options!


Hope this helps!
Regards
Torsten.

CDMTEX
Regular Visitor

Re: BL870c i2 iLO3 Firmware Upgrade issue.

Cool!  Thanks for the information.


Regards,

Chris