Re: [DL 360 Gen8] ILO 4 small issue

 
nimbus
Occasional Advisor

Re: [DL 360 Gen8] ILO 4 small issue

I've not had a chance to do the server reset on the blade yet, as I'll need to wait for some downtime on it. Both are in a remote datacenter so getting power reset on the DL380 will be a challenge.

nimbus
Occasional Advisor

Re: [DL 360 Gen8] ILO 4 small issue

I managed to get some downtime on the blade, and here's what happened....

I initiated the server bay reset command, and the blade didn't come back. Not really the scenario I wanted to experience: the OA just reported that it couldn't identify the device in the bay. I stepped away, made a coffee and came back after 10 minutes, and it was still complaining. So I re-issued the command, and this time it worked. I left it another 10 minutes for everything to settle down.

The iLO Diagnostics page was now reporting that the Embedded Flash/SD-CARD was ok. So I downloaded the IP provisioning ISO, attached it via virtual media and powered up the server.

Worryingly OneView was still reporting Lights Out Self Test Error 8192, even after a refresh of the device. Additionally during POST iLO4 itself was reporting a self-test error. So I continued to let it start and boot the IP DVD. Never having used the IP media before I wasn't sure what to expect, so I let it run through on defaults. After a few minutes it threw up a complaint about the NAND device, and then proceeded to a text screen of a percentage bar of deployment. It didn't get past 1% and then rebooted the server.

Upon the 2nd reboot I noticed that during POST iLO4 was no longer complaining about a self-test error, but OneView still was. Somewhat annoyingly, iLO4 diagnostics is now reporting the following error against the Embedded Flash/SD-Card "Controller firmware revision 2.10.00 Embedded media manager failed initialization"

Is the IP ISO image a completely automated process? If so, how long should a normal run take?

With regards to this server I do think it's now a case of calling HP support to see what they say, as despite it initially looking favourable to fix the issue, I seem to be back to Square 1 with this, albeit with a slightly upgraded controller firmware on the NAND.

Torsten.
Acclaimed Contributor

Re: [DL 360 Gen8] ILO 4 small issue

The IP recovery media will by default re-install IP.

OneView needs IP on that server.

If the ILO doesn't work, the blade cannot talk to the OA and vice versa, hence no power on.

Is the iLO at least at version 2.42?

If the format did not help, the ILO might be broken.

The ILO error is related:

http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c04996097


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
nimbus
Occasional Advisor

Re: [DL 360 Gen8] ILO 4 small issue

As per the HP documentation on this issue I took iLO to version 2.50, and that upgraded without any issue.
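
For anyone scripting this check across several servers, here is a minimal sketch of the "at least 2.42" comparison. The function names are made up for illustration, and it assumes the firmware version is available as a plain string such as "2.50". Comparing integer tuples avoids the pitfalls of comparing version strings character by character.

```python
def parse_version(text: str) -> tuple:
    """Turn a version string such as "2.50" into a tuple of ints, e.g. (2, 50)."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_force_format(ilo_version: str, minimum: str = "2.42") -> bool:
    """True if the version meets the minimum mentioned in this thread (2.42)."""
    return parse_version(ilo_version) >= parse_version(minimum)


for version in ("2.10", "2.42", "2.50"):
    print(version, supports_force_format(version))
```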

iLO itself seems completely fine. After the 2nd attempt at resetting the bay everything came back as I would expect. However, that first behaviour in itself is a concern as I've only ever seen that behaviour occur on a blade that has a fault.

However, clearly things aren't right with iLO as OneView is still complaining, IP won't re-deploy and the errors are back in the iLO diagnostics.

The only reason we noticed a problem was trying to bring these enclosures under OneView control. That's when we noticed a problem with this box and the DL380p Gen 8.

Torsten.
Acclaimed Contributor

Re: [DL 360 Gen8] ILO 4 small issue

Let's try the blade again.

Power off the blade.

From OA ssh session run (copy and paste as is, but adjust the bay number "3" to your value):

 

  HPONCFG 3 << end_marker
<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="Administrator" PASSWORD="password">
<RIB_INFO MODE="write">
<FORCE_FORMAT VALUE="all"/>
</RIB_INFO>
</LOGIN>
</RIBCL>
end_marker

The password doesn't matter.

Once done, do the "server reset" command.

Power on the blade and check.

Assign the IP 1.64 recovery ISO via virtual media and boot from it.

When booting again, check if F10 - Intelligent Provisioning is displayed during boot - I found this sometimes disabled.
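
If the same block has to be pasted for several bays, a small helper can generate the text for each bay number. This is only a sketch: `build_oa_force_format` is a hypothetical name, the RIBCL body is copied from the post above, and nothing here talks to the OA; it just produces the text to paste into the OA ssh session.

```python
import xml.etree.ElementTree as ET

# RIBCL body as given in the post above; only login and password vary.
RIBCL_FORCE_FORMAT = """<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="{login}" PASSWORD="{password}">
<RIB_INFO MODE="write">
<FORCE_FORMAT VALUE="all"/>
</RIB_INFO>
</LOGIN>
</RIBCL>"""


def build_oa_force_format(bay, login="Administrator", password="password"):
    """Return the HPONCFG block for one bay, ready to paste into the OA session."""
    if not isinstance(bay, int) or bay < 1:
        raise ValueError("bay must be a positive integer")
    body = RIBCL_FORCE_FORMAT.format(login=login, password=password)
    ET.fromstring(body)  # raises ParseError if the XML ended up malformed
    return f"HPONCFG {bay} << end_marker\n{body}\nend_marker"


print(build_oa_force_format(3))
```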


Hope this helps!
Regards
Torsten.

nimbus
Occasional Advisor

Re: [DL 360 Gen8] ILO 4 small issue

I re-executed the command remotely, as I already had the remote command line ready to go. The contents of the XML were as follows:


<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="x" PASSWORD="xxxxxxxx">
<RIB_INFO MODE="write">
<FORCE_FORMAT VALUE="all" />
</RIB_INFO>
</LOGIN>
</RIBCL>

The output was as follows:

Server IP is : x.x.x.x:443
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='Forcing a format of the partition after the iLO reset.'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>

Script succeeded for IP:x.x.x.x:443
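
The output above is several XML documents back to back, so it cannot be fed to an XML parser in one go. Below is a rough sketch of how the replies could be checked automatically; the function names are invented for illustration, and the sample is a shortened copy of the output above. It splits on each XML declaration, parses every piece, and collects the STATUS and MESSAGE attributes.

```python
import re
import xml.etree.ElementTree as ET


def parse_ribcl_responses(raw: str):
    """Return a list of (status, message) tuples from concatenated RIBCL replies."""
    # Split on every XML declaration, then parse the pieces one at a time.
    chunks = re.split(r"<\?xml[^>]*\?>", raw)
    results = []
    for chunk in chunks:
        start = chunk.find("<RIBCL")
        end = chunk.rfind("</RIBCL>")
        if start == -1 or end == -1:
            continue  # skip banner text such as "Server IP is : ..."
        doc = ET.fromstring(chunk[start:end + len("</RIBCL>")])
        for resp in doc.iter("RESPONSE"):
            results.append((resp.get("STATUS"), resp.get("MESSAGE")))
    return results


def all_ok(results):
    """True if there is at least one reply and every status is 0x0000."""
    return bool(results) and all(int(status, 16) == 0 for status, _ in results)


sample = """Server IP is : x.x.x.x:443
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='No error'
/>
</RIBCL>
<?xml version="1.0"?>
<RIBCL VERSION="2.23">
<RESPONSE
STATUS="0x0000"
MESSAGE='Forcing a format of the partition after the iLO reset.'
/>
</RIBCL>

Script succeeded for IP:x.x.x.x:443
"""

results = parse_ribcl_responses(sample)
print(all_ok(results))
print([msg for _, msg in results if msg != "No error"])
```

As this thread shows, a clean 0x0000 status only means the command was accepted; the Diagnostics page still has to be checked after the iLO reset.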

I then reset bay 3 again. This time the server came back without any problem. Once connected to iLO, the Diagnostics page reported the Embedded Flash/SD-Card as healthy, with the note "Controller firmware revision 2.10.00".

During POST, the options F11 for Boot, F10 for Intelligent Provisioning and F9 for Setup were all displayed, along with iLO4 reporting a self-test error. While it was still in POST I refreshed the iLO4 Diagnostics page, and its health had changed to "Controller firmware revision 2.10.00 Failed restart."

I booted into the IP recovery media DVD. After boot a screen came up verifying system settings with a progress bar sitting at 0%. This was followed by an error saying "Error flashing the NVRAM. Please try again". After a few seconds it reverted to a text screen stating it was starting the Intelligent Provisioning flash process. This was on screen for less than 10 seconds before the server rebooted.

If you think executing the script on the OA will make a difference I'm happy to try that.

There are several iLO Log events relating to the card, these are:

  • Embedded Flash/SD-CARD: Media controller exception 01.
  • Embedded Flash/SD-CARD: Failed restart..
  • Embedded Flash/SD-CARD: The AHS file system mount failed with (No such device)

However, I've not seen anything in the iLO log regarding the formatting of the card, which according to the HP documentation I should be seeing. I didn't see this logged on the DL380 either.

Torsten.
Acclaimed Contributor

Re: [DL 360 Gen8] ILO 4 small issue

If the format doesn't work you probably need a new board. Contact HPE support. Sorry.


Hope this helps!
Regards
Torsten.

nimbus
Occasional Advisor

Re: [DL 360 Gen8] ILO 4 small issue

No need to apologise. I really appreciate your help and support.

robatworkuk
Occasional Advisor

Re: [DL 360 Gen8] ILO 4 small issue

Hope this will help someone else, as this happened to one of our ProLiants and this is the first forum answer I found. All the solutions were talking about hponcfg in Windows, but we use it from the VMware command line. Here's the magic incantation to format your NAND, which fixed the issue with

Make the Force_Format.xml as per the advisory - I put mine in /opt/hp/tools:

<!-- RIBCL Sample Script for HP Lights-Out Products -->
<!--Copyright (c) 2016 Hewlett-Packard Enterprise Development Company,L.P. -->

<!-- Description: This is a sample XML script to force format all -->
<!-- the iLO partitions. -->
<!-- iLO resets automatically for this operation to take effect -->

<!-- Warning: This command erases all data on the partition(s) -->
<!-- External providers will need to be re-configured if -->
<!-- partition is formatted -->

<!-- Input: VALUE tag: all - format all available partitions -->

<!-- NOTE:You will need to replace the USER_LOGIN and PASSWORD values -->
<!-- with values that are appropriate for your environment -->

<!-- See "HP Integrated Lights-Out Management Processor Scripting -->
<!-- and Command Line Resource Guide" for more information on -->
<!-- scripting and the syntax of the RIBCL XML -->

<!-- Firmware support information for this script: -->
<!-- iLO 4 - Version 2.42 or later. -->
<!-- iLO 3 - None. -->
<!-- iLO 2 - None. -->

 

<RIBCL VERSION="2.0">
<LOGIN USER_LOGIN="Administrator" PASSWORD="">
<RIB_INFO MODE="write"> 
<FORCE_FORMAT VALUE="all" /> 
</RIB_INFO> 
</LOGIN> 
</RIBCL>

Then run this command:

./hponcfg -v -f ./Force_Format.xml -s user=Administrator,password=password

Note the comma between username and pw. 
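
For repeat runs, the file and the command line can be generated rather than typed by hand. The sketch below is an illustration only: both function names are hypothetical, the hponcfg flags (-v, -f, -s) are taken verbatim from the command above, and nothing is executed. Building the arguments as a list keeps the comma-separated user/password pair as a single argument if it is later passed to something like subprocess.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_force_format_xml(path, user="Administrator", password=""):
    """Write a Force_Format.xml equivalent to the one shown above."""
    root = ET.Element("RIBCL", VERSION="2.0")
    login = ET.SubElement(root, "LOGIN", USER_LOGIN=user, PASSWORD=password)
    rib_info = ET.SubElement(login, "RIB_INFO", MODE="write")
    ET.SubElement(rib_info, "FORCE_FORMAT", VALUE="all")
    Path(path).write_text(ET.tostring(root, encoding="unicode"))
    return Path(path)


def hponcfg_argv(xml_path, user, password, tool="./hponcfg"):
    """Build the argument list for the command shown above (not executed here)."""
    return [tool, "-v", "-f", str(xml_path),
            "-s", f"user={user},password={password}"]


with tempfile.TemporaryDirectory() as tmp:
    xml_file = write_force_format_xml(Path(tmp) / "Force_Format.xml")
    print(xml_file.read_text())
    print(hponcfg_argv(xml_file, "Administrator", "password"))
```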

I now have a nice green

Embedded Flash/SD-CARD Controller firmware revision 2.10.00

GustavoAyala
Advisor

Re: [DL 360 Gen8] ILO 4 small issue

Good news: with revision 8 of Document ID c04996097 (emr_na-c04996097) and iLO 2.60, this issue finally seems to be resolved for good (this NAND fiasco took them a really long time).

And the option to do the NAND formatting directly from iLO, without fiddling with XML, is good.

One question though. According to the document:

An AuxPwrCycle feature was added in iLO 4 firmware version 2.55 so that the equivalent of an AC power removal can be performed remotely on a server. Refer to xxxx for more information.

It seems to be an option to simulate power removal on rack servers (DL380 Gen8 and the like), similar to the option we already have with enclosures/blades (ssh to the OA, then Reset Bay #).

But I can't find any further information about it, and I need to simulate AC power removal on a remote, inaccessible DL380 G8 rack server.

Any ideas or information about this new AuxPwrCycle feature that was added in iLO 4 firmware version 2.55?