
DL380 Gen9 - Fans running at high speed & POST errors

 
Graham Barnes
Occasional Advisor

We have 2 identical DL380 Gen9 servers running as a pair of 2012 RDS servers.

Each with...
128GB RAM
2 x Xeon E5-2650v3 @ 2.30GHz
RAID 1 - 2 x SAS 15K 450GB
iLO configured

They are rack mounted, one on top of the other in an air-conditioned server room.

These were installed and commissioned July 2016.

We are experiencing an odd problem when they restart. It doesn't happen every time, but too often to ignore.

If one is restarted, the POST will put its internal fans into high speed mode. This triggers the neighbouring DL380 to run its own fans at high speed too!

Not only does this happen, the DL380 that is restarting will not complete its POST. It will report errors such as…

"329-Power Management Controller FW error" and "312-HPE Smart Storage Battery 1 Failure", followed by "Contact HPE support".

This also happens vice versa.

Yesterday evening both servers were restarted at the same time, following the Quick System diags (no errors found) via the iLO remote management interface, and both failed to start properly. Both stopped at 52% of POST. Both iLO ports replied to pings but could not be reached for management.

The only way to overcome this problem was to remove all AC power cables (Redundant PSU present) and restart the servers individually. They started, no errors reported during POST and no errors reported in the IM Log.

Both servers were updated in August using the HP SUM April 2016 DVD, but the problem still happens.

On a couple of occasions before the latest HP SUM update, the fans on both servers would run at high speed during normal operation for unknown reasons. After an hour or so, they both quietened down again.

How can one DL380 influence the other? Are they psychic, or are they communicating via the iLO somehow? And how do I resolve this?

Just to throw something else into the mix, one of them had a Red Screen of Death (RSOD) yesterday during normal operation: x64 Exception Type 0E - Page-Fault Exception.

I will raise a support ticket with HPE, but any advice would be greatly appreciated.

Thanks,

Graham

8 REPLIES
Torsten.
Acclaimed Contributor

Re: DL380 Gen9 - Fans running at high speed & POST errors

I don't remember the details now, but there was a battery-related issue fixed with BIOS 2.22 and iLO 2.44. Consider updating both.


Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

Graham Barnes
Occasional Advisor

Re: DL380 Gen9 - Fans running at high speed & POST errors

Thanks, I will schedule the updates tomorrow evening and let you know the outcome.

Graham Barnes
Occasional Advisor

Re: DL380 Gen9 - Fans running at high speed & POST errors

Both servers were running their fans at high speed again yesterday. HP System Management Homepage reported no problems, however the iLO log file reports dozens of "Power restored to iLO" caution messages.
The only way to stop the fans running at high speed was to power cycle the servers.
The fans are now running at normal speed and there have been no more "power restored to iLO" messages.

There is an HPE advisory, c04703868, but it refers to older iLO firmware.

Any suggestions/comments?

Thanks

 

Jimmy Vance
HPE Pro

Re: DL380 Gen9 - Fans running at high speed & POST errors

Not sure what is causing the error message you are seeing, but when iLO gets into an unknown or error state the fans will ramp up to high speed.




I work for HPE
Graham Barnes
Occasional Advisor

Re: DL380 Gen9 - Fans running at high speed & POST errors

Thanks, interesting to know the fan speed ramps up when there is a problem with the iLO.
Whilst investigating the iLO status on both servers, I also noticed "Embedded Flash/SD-CARD: Embedded media manager failed initialization" errors.
I have followed advisory c04996097 and reformatted the NAND device, something I didn't do when I updated the iLO firmware to 2.44.
I now see green ticks on all subsystems and devices.

Fingers crossed!

Torsten.
Acclaimed Contributor

Re: DL380 Gen9 - Fans running at high speed & POST errors

>> interesting to know the fan speed ramps up when there is a problem with the iLO.

iLO reads the temperature sensors and controls the fan speed based on those values. Without iLO control (or without temperature values from a sensor), the fans run at high speed.

You can see and hear this, for example, during the controller reset when upgrading Smart Array firmware.

The issue you have seen forces the iLO to restart or prevents it from coming up, so the iLO cannot slow the fans down.
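
To make the fail-safe idea concrete, here is a toy model in Python. It is only a sketch of the general principle (no controller or no sensor reading means full speed). It is not HPE's firmware logic, and the function name, the temperature ramp and the duty-cycle numbers are all invented for illustration.

```python
from typing import Optional

FULL_SPEED = 100  # percent duty cycle; illustrative value only


def fan_duty(controller_alive: bool, temp_c: Optional[float]) -> int:
    """Return a fan duty cycle in percent, failing safe to full speed.

    Hypothetical model: if the management controller is not running, or a
    temperature reading is missing, nothing can prove the system is cool,
    so the fans go to full speed.
    """
    if not controller_alive or temp_c is None:
        return FULL_SPEED
    # Assumed linear ramp: 20% at 25 C rising to 100% at 45 C.
    duty = 20 + (temp_c - 25.0) * 4.0
    return int(max(20, min(FULL_SPEED, duty)))


if __name__ == "__main__":
    print(fan_duty(True, 30.0))   # controller up, sensor readable
    print(fan_duty(False, 30.0))  # controller restarting: full speed
    print(fan_duty(True, None))   # sensor value missing: full speed
```

In this model a controller that keeps restarting spends much of its time in the first branch, which would look like fans stuck at high speed.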


Hope this helps!
Regards
Torsten.

Graham Barnes
Occasional Advisor

Re: DL380 Gen9 - Fans running at high speed & POST errors

Hi again,

We still have the issue on both servers. Since they are running TS2012 RDS, they are scheduled to restart at 23:00 every night (at the IT Manager's request).
This morning both servers were running their fans at high speed. Below is an extract of this morning's iLO event log from one of the servers.
Both servers quietened down following the "Rest API memory cleared" message.

2541 Informational  09/20/2016 08:25 1 Browser login: Administrator
2540 Informational 09/20/2016 07:02 1 Rest API Warning: Rest API memory cleared (iLO master storage directory was rebuilt. Reboot system.).
2539 Critical  09/20/2016 07:01 1 Embedded Flash/SD-CARD: Embedded media manager failed initialization.
2538 Caution  09/20/2016 06:58 1 Power restored to iLO.
2537 Informational  09/20/2016 06:42 2 iLO network link up at 1000 Mbps.
2536 Caution 09/20/2016 06:42 09/20/2016 06:42 1 Power restored to iLO.
2535 Caution  09/20/2016 05:43 1 Power restored to iLO.
2534 Caution  09/20/2016 05:37 1 Power restored to iLO.
2533 Informational  09/20/2016 05:21 3 iLO network link up at 1000 Mbps.
2532 Caution  09/20/2016 05:21 1 Power restored to iLO.
2531 Caution  09/20/2016 04:44 1 Power restored to iLO.
2530 Caution  09/20/2016 04:38 1 Power restored to iLO.
2529 Caution  09/20/2016 04:33 1 Power restored to iLO.
2528 Informational  09/20/2016 04:01 4 iLO network link up at 1000 Mbps.
2527 Caution  09/20/2016 04:01 1 Power restored to iLO.
2526 Caution  09/20/2016 03:55 1 Power restored to iLO.
2525 Caution  09/20/2016 03:50 1 Power restored to iLO.
2524 Caution  09/20/2016 03:23 1 Power restored to iLO.
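
To quantify how often the iLO is losing power, an extract like the one above can be tallied with a short script. This is a minimal Python sketch. It assumes the log has been copied as plain text in the column order shown (ID, severity, one or two timestamps, count, description) with dates in MM/DD/YYYY form. The function names are illustrative and not part of any HPE tool.

```python
import re
from collections import Counter
from datetime import datetime

# One event per line: ID, severity, timestamp, optional second timestamp,
# repeat count, description. Layout assumed from the pasted extract.
LINE_RE = re.compile(
    r"^(?P<id>\d+)\s+(?P<severity>\w+)\s+"
    r"(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2})"
    r"(?:\s+\d{2}/\d{2}/\d{4} \d{2}:\d{2})?"
    r"\s+(?P<count>\d+)\s+(?P<description>.+)$"
)

SAMPLE = """\
2540 Informational 09/20/2016 07:02 1 Rest API Warning: Rest API memory cleared (iLO master storage directory was rebuilt. Reboot system.).
2538 Caution  09/20/2016 06:58 1 Power restored to iLO.
2536 Caution 09/20/2016 06:42 09/20/2016 06:42 1 Power restored to iLO.
2535 Caution  09/20/2016 05:43 1 Power restored to iLO.
"""


def parse_events(text):
    """Parse pasted event-log text into a list of dicts, skipping other lines."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        events.append({
            "id": int(m["id"]),
            "severity": m["severity"],
            "when": datetime.strptime(m["stamp"], "%m/%d/%Y %H:%M"),
            "count": int(m["count"]),
            "description": m["description"].strip(),
        })
    return events


def power_restores_per_hour(events):
    """Sum 'Power restored to iLO' occurrences for each clock hour."""
    hours = Counter()
    for event in events:
        if event["description"].startswith("Power restored to iLO"):
            hours[event["when"].strftime("%Y-%m-%d %H:00")] += event["count"]
    return dict(sorted(hours.items()))


if __name__ == "__main__":
    for hour, total in power_restores_per_hour(parse_events(SAMPLE)).items():
        print(hour, total)
```

Running the same tally on each server's log and comparing the hours would show whether the iLO power events on the machines line up in time.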


Any other suggestions on how to resolve this problem? Both servers are identical in age and specification.

Thanks.

 

Graham Barnes
Occasional Advisor

Re: DL380 Gen9 - Fans running at high speed & POST errors

So, last week, HP changed the system board on one of the servers and things seemed to have settled down, until yesterday!
The fans inside the server that didn't have its system board changed were running at full speed, which at first I thought was a positive step.
However, yesterday I also connected the iLO NICs of 2 other DL380 Gen9 servers to the same switch, on the same subnet, and now all 4 DL380 Gen9 servers are running their fans at high speed!!
What's going on??????