
ILO4 bug - AHS turns back on after ILO Reset or cold restart

 
Advisor


Hi,

I just noticed that if I disable AHS in ILO4, it turns itself back on after an ILO reset or a cold restart. My ILO4 version is the latest (v2.73), and the server is a Gen8 Microserver with the latest BIOS. If I just warm restart the machine, AHS stays disabled.

Is this the normal behavior or do I need to report this via some other channel to get this fixed?
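
A minimal sketch of how the before/after comparison described above could be scripted instead of clicked through in the web GUI. The resource path `/redfish/v1/Managers/1/ActiveHealthSystem/` and the `AHSEnabled` property are assumptions based on the iLO RESTful API and are not confirmed anywhere in this thread, so check them against the documentation for your firmware. The helper names are hypothetical.

```python
import base64
import json
import ssl
import urllib.request

# ASSUMPTION: resource path and property name are not taken from this thread;
# verify both against the iLO RESTful API documentation for your firmware.
AHS_PATH = "/redfish/v1/Managers/1/ActiveHealthSystem/"
AHS_PROPERTY = "AHSEnabled"


def parse_ahs_enabled(body: str) -> bool:
    """Extract the AHS on/off flag from a JSON response body."""
    return bool(json.loads(body)[AHS_PROPERTY])


def fetch_ahs_enabled(host: str, user: str, password: str) -> bool:
    """Query one ILO over HTTPS with basic auth (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{host}{AHS_PATH}",
        headers={"Authorization": f"Basic {token}"},
    )
    # ILO commonly uses a self-signed certificate, so verification is turned
    # off here; do not do this on an untrusted network.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return parse_ahs_enabled(resp.read().decode())


def reverted(before_reset: bool, after_reset: bool) -> bool:
    """True when AHS was off before the reset and is on again afterwards."""
    return (not before_reset) and after_reset
```

The idea would be to call `fetch_ahs_enabled` once after disabling AHS, reset the ILO, call it again, and pass both values to `reverted`. Splitting the parsing from the network call keeps the comparison logic checkable without a server at hand.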

15 REPLIES
Advisor

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

Someone checked on a DL380 Gen8 server, and the issue is present there as well. So it seems that this is a general bug in ILO4 and is not limited to the Microserver.

MOD: Just to clarify why this is an issue: AHS is by far the heaviest writer to the internal flash memory shared by ILO, Intelligent Provisioning, etc. This excessive writing wears out the flash and so shortens the lifespan of the machine. Up until now mainly Microserver users have been affected; maybe the regular servers have bigger and/or better-quality flash chips, so they tolerate more writes. But this issue will eventually affect every Gen8/Gen9 server. And since AHS provides no information to regular customers (especially those without a support contract), it is important to be able to turn it off. Please keep in mind that turning AHS off does not turn off the general ILO Event Log or the Integrated Management Log, but those write orders of magnitude less to the onboard flash storage.

HPE Pro

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

Hello, 

To start with, AHS (Active Health System) is an "always on" feature. I am pretty sure what you saw is by design.
The HPE Active Health System provides the following features:

-Combined diagnostics tools/scanners.
-Always on, continuous monitoring for increased stability and shorter downtimes.
-Rich configuration history.
-Health and service alerts.
-Easy export and upload to service and support.


The active health system monitors and records changes in the server hardware and system configuration.
The active health system assists in diagnosing problems and delivering rapid resolution if server failures occur.

The active health system collects the following types of data:

Server model.
Serial number.
Processor model and speed.
Storage capacity and speed.
Memory capacity and speed.
Firmware/BIOS.


If you still want to try disabling AHS you can try disabling Intelligent Provisioning:
https://support.hpe.com/hpesc/public/docDisplay?docId=mmr_kc-0108654

NOTE: At HPE we do not recommend that customers disable AHS or Intelligent Provisioning.

 


I work for HPE


Advisor

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

First of all, thanks for your answer.

Nonetheless, I have some issues with it:

1. If it is an "always on" feature, why can you explicitly disable it in the ILO4 web GUI? And if it is possible to disable it, why does it turn back on even though the user explicitly disabled it?

2. To the best of my knowledge, the AHS logs cannot be viewed by anyone other than HPE support. So if you have no support contract, this log is completely useless: the system admin cannot extract any information from it. Therefore there is no reason to keep it, especially given that it writes to the on-board flash extensively.

3. I don't see how disabling Intelligent Provisioning will fix this issue with AHS, but I am going to give it a try and report back the result.

MOD: As expected, disabling Intelligent Provisioning has no effect: after an ILO reset (which is a restart) or a complete power cycle, AHS switches itself back on again.

@Jazz_ISS can you please let me know how (and where) to file a proper bug report about this? I went through the support section, but found no channel or form for this...

HPE Pro

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

You can log a new case, check the status or update the details of an existing case, or raise a callback request on one, using the link provided: HPE Support Case Manager

https://support.hpe.com/help/en/Content/productSupport/supportCaseManager.html

 


I work for HPE


Advisor

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

OK, what if I no longer have a support contract or warranty on the product? Where should I report a bug in that case?

Acclaimed Contributor

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

MOD: Just to clarify why this is an issue: AHS is by far the heaviest writer to the internal flash memory shared by ILO, Intelligent Provisioning, etc. This excessive writing wears out the flash and so shortens the lifespan of the machine. Up until now mainly Microserver users have been affected; maybe the regular servers have bigger and/or better-quality flash chips, so they tolerate more writes.
 
This was a problem with older versions. With a current ILO version, the system takes care of it.
 


At least one of the related advisories has the details.

Example:

https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=emr_na-a00060052en_us

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

HPE Pro

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

Out-of-warranty/contract customers can still open a case with HPE by calling the technical support hotline.

HPE does provide TRADE support (paid on a one-time, per-case basis), but again, support will remain limited.

Based on your input, if this issue needs to be addressed by an ILO developer, you need to have an active contract.

 


I work for HPE


Advisor

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

OK, at this point this starts to be a bit ridiculous.

So according to HPE, if I find a bug in their product (which was already there when we bought the product, by the way) and have no support contract or warranty, I have to pay them to recognize and fix it?

@Torsten. 

Well, the way they "addressed" it is that once AHS has already excessively overwritten the internal flash, a hidden option pops up under ILO health that lets you reformat the internal flash, which helps in some cases and not in others. Obviously, if the flash is damaged beyond repair by excessive writes, a reformat will not help at all. There was also no information on whether the new ILO has an improved wear-leveling algorithm. And AHS writes exactly the same amount to the flash as before, so the basic issue (extensive writes to the internal flash by AHS) is not solved at all. The only way to solve it is to disable AHS (or to have an option in ILO to log to the microSD card), which is not possible due to this bug.

MOD: Torsten's link contains a few important pieces of information:

The optimal block size NAND write algorithm was added to extend the longevity of the NAND. The write frequency was reduced by over 98% with the write algorithm change.

So it seems they actually did optimize the write algorithm. Nonetheless, the customer has no way to tell how much damage those excess writes (the 98% that turned out to be unnecessary) have already done to the flash chip.
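
To put that figure in perspective, here is a quick back-of-the-envelope calculation, assuming the advisory's "98%" refers to the number of write operations and that everything else stayed the same. The function name is made up for illustration.

```python
def old_to_new_write_ratio(reduction_percent: int) -> float:
    """How many times more often the old firmware wrote than the new one.

    A reduction of R percent leaves (100 - R) percent of the original
    writes, so the old rate was 100 / (100 - R) times the new rate.
    """
    remaining_percent = 100 - reduction_percent
    return 100 / remaining_percent


print(old_to_new_write_ratio(98))  # 50.0
```

Since the advisory says "over 98%", the older firmware wrote to the NAND at least 50 times as often as the fixed firmware does, under the assumption stated above.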

And the document also confirms that indeed AHS generates the most writes to the flash chip:

The most frequent writes to the NAND device in HPE servers are AHS information. AHS writes are initiated by the iLO firmware.

So it is still important to be able to permanently turn off AHS.

Advisor

Re: ILO4 bug - AHS turns back on after ILO Reset or cold restart

I also confirmed the same bug is present on DL360 Gen9 variants (with latest ILO and BIOS).