ProLiant Servers (ML,DL,SL)


 
BunnyPon
Valued Contributor

ML30 Gen10 Fun with Fans...

ML30 Gen10, Broadcom LAN version, a bit old and out of warranty but it doesn't matter.

Latest Debian, amsd, extra 4-port LAN removed, P408i (7.11)

HP S700 series SSDs on the RAID controller, BUT being used as JBOD.

Here you can see it just ticking over doing nothing, with the fans at 11%, 48% and 9%. This time the deafening racket is caused by FAN2.

The assumption here is that the SSDs, not having temperature SMART entries, are causing the P408i to throw a wobbly and run the fans much faster. Would someone from HPE please confirm that?

Also, as there are several temperature options in the SMART specifications, *WHICH* ones does the RAID controller need and WHY?

  • 190, 0xBE "Airflow Temperature", supposedly reported as 100 minus the temperature in 'C
  • 194, 0xC2 "Temperature", the actual device temperature in 'C

Why there is no fallback to 0xC2 if the device has no 0xBE is also a good question for whoever made this firmware and I would really like an answer to it.
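For readers following along, here is a minimal Python sketch of the fallback being asked for: prefer attribute 190 (0xBE) and fall back to 194 (0xC2) when it is missing. This illustrates the requested behaviour only. It is NOT a description of what the P408i or iLO firmware actually does, and the function name and sample values are hypothetical.

```python
from typing import Dict, Optional

# SMART attribute IDs discussed above, decimal ID -> conventional name.
TEMPERATURE_ATTRIBUTES = {
    190: "Airflow Temperature",  # 0xBE
    194: "Temperature",          # 0xC2
}


def pick_temperature(raw_values: Dict[int, int]) -> Optional[int]:
    """Return a temperature in Celsius, preferring 190 and falling back to 194.

    raw_values maps a SMART attribute ID to its raw value in Celsius.
    Hypothetical helper: this is the fallback the post asks for, not
    documented controller behaviour.
    """
    for attr_id in (190, 194):
        if attr_id in raw_values:
            return raw_values[attr_id]
    return None


# Show the decimal/hex pairing of the two attribute IDs.
for attr_id, name in TEMPERATURE_ATTRIBUTES.items():
    print(f"{attr_id} = 0x{attr_id:02X}  {name}")

# Illustrative inputs only.
print(pick_temperature({194: 27}))           # drive with 0xC2 only
print(pick_temperature({190: 25, 194: 32}))  # drive with both
print(pick_temperature({}))                  # drive with neither
```

With a fallback like this, a drive that only reports 0xC2 would still yield a usable reading instead of no reading at all.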

 

 

Now, what _is_ interesting is what happens when you add other drives. In this case, any old spinning rust will do; you don't even need to use them. Then FAN2 decides to go on holiday and the fans drop to 11%, 6%, 6%, which is more like it. HOWEVER, none of the drives I stuffed in have 0xBE either, just 0xC2.

 

(Update 16:20) In this case, the Redfish temperature sensor "04-HD Max", wherever that comes from, turns out to be the culprit. As long as you have that, quiet fans (e.g. plug in a random WD Blue SSD, HD Max reports 35'c, blissful silence; remove it, and the fan goes to 48%).

(Update 16:50) Adding a WD Blue drive to PORT1 (think drives 1..4 of the SFF unit) of the RAID controller has NO EFFECT. Putting that same drive on PORT2 (5..8) brings back "04-HD Max" and silence resumes!
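Anyone wanting to repeat this test can watch for the sensor over Redfish rather than by ear. Below is a rough Python sketch, assuming the Thermal resource lives at /redfish/v1/Chassis/1/Thermal and exposes the standard "Temperatures" and "Fans" arrays with "Name", "ReadingCelsius" and "Reading" fields. The path, the fan names, the "Absent" state check and the sample payload are assumptions; the payload is hand-written to mirror the numbers above and is not captured output.

```python
import base64
import json
import ssl
import urllib.request
from typing import Dict, Optional, Tuple

# Assumption: standard Redfish Thermal resource path; adjust for your system.
THERMAL_PATH = "/redfish/v1/Chassis/1/Thermal"


def fetch_thermal(host: str, user: str, password: str) -> dict:
    """Fetch the Thermal resource with HTTP basic auth (not called below)."""
    request = urllib.request.Request(f"https://{host}{THERMAL_PATH}")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    # Management controllers often use self-signed certificates, so
    # verification is disabled here. Do not do this on untrusted networks.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return json.load(resp)


def summarize_thermal(
    thermal: dict, sensor_name: str = "04-HD Max"
) -> Tuple[Optional[int], Dict[str, int]]:
    """Return (sensor reading or None if missing, {fan name: reading})."""
    reading = None
    for sensor in thermal.get("Temperatures", []):
        # Assumption: a sensor may be listed but flagged as absent.
        if sensor.get("Status", {}).get("State") == "Absent":
            continue
        if sensor.get("Name") == sensor_name:
            reading = sensor.get("ReadingCelsius")
    fans = {f.get("Name"): f.get("Reading") for f in thermal.get("Fans", [])}
    return reading, fans


# Hand-written sample mirroring the "quiet" state described above.
sample = {
    "Temperatures": [
        {"Name": "04-HD Max", "ReadingCelsius": 35,
         "Status": {"State": "Enabled"}},
    ],
    "Fans": [
        {"Name": "Fan 1", "Reading": 11, "ReadingUnits": "Percent"},
        {"Name": "Fan 2", "Reading": 6, "ReadingUnits": "Percent"},
        {"Name": "Fan 3", "Reading": 6, "ReadingUnits": "Percent"},
    ],
}

hd_max, fans = summarize_thermal(sample)
print("04-HD Max:", hd_max, "fans:", fans)
```

Running summarize_thermal(fetch_thermal(...)) in a loop while plugging and unplugging a drive on each port would show when the sensor appears and disappears alongside the fan percentages.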

 

In short, what a mess and there really needs to be some sort of document on what makes the fans go zoom.

 

 

 

I can't Cat Today.
TVVJ
HPE Pro

Re: ML30 Gen10 Fun with Fans...

Hello,

Swap a known-good fan into slot 2 and see if the issue is resolved. If so, then the issue may be with the fan.

Regards,



I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[All opinions expressed here are mine, and not official statements on behalf of Hewlett Packard Enterprise]
BunnyPon
Valued Contributor

Re: ML30 Gen10 Fun with Fans...

Plus point for not being an AI getting it completely wrong.

As of now, I have verified that it is not the fan, using a fan from another HP (the one that thinks it is a Hitachi, which NO ONE KNOWS HOW TO FIX AT ALL. In the only other case I know of, even HP swapped the motherboard rather than address the problem.)

Fan 2 going ZOOM is a function of 04-HD Max being present or not.

The real mystery is where that comes from, and what hallucinates that value for no apparent reason when you plug an SSD (WD Blue) into any slot on PORT2 of the RAID controller. Verified several times with different drives too.

PR7
HPE Pro

Re: ML30 Gen10 Fun with Fans...

Greetings!
Please let us know whether you are using genuine HPE drives.
We cannot predict server behaviour when non-HPE drives are used, as they are not controlled by HPE firmware.
We have had similar issues in the past where replacing the drives with HPE drives resolved the fan noise.
We recommend that you test with HPE drives and keep us posted on the outcome.



BunnyPon
Valued Contributor

Re: ML30 Gen10 Fun with Fans...

I do not believe humans can afford genuine HP drives; the only thing more expensive is those little ink cartridges. Especially not hobbyists, and especially not at the usurious prices HPE Japan charges!

If I do use genuine HPE drives (6-year-old SATA 1TB drives from Yahoo Auctions), then the machine is happy because 04-HD Max is present.

 

The magic is that 04-HD Max, as long as that exists, the fans behave. If that goes away, it's turbine time!

Hence the request for more information on where 04-HD Max comes from.

Screenshot 2024-10-19 181145.png

What you can see here is from the second ML30 Gen10 (Broadcom LAN): the fans, now with 04-HD Max added for luck.

[    0.000000] DMI: HPE ProLiant ML30 Gen10/ProLiant ML30 Gen10, BIOS U44 08/01/2024
[    2.730545] scsi 6:0:0:0: Direct-Access     ATA      HP SSD S700 500G 9A0  PQ: 0 ANSI: 6
[    2.799724] scsi 6:0:1:0: Direct-Access     ATA      HP SSD S700 500G 9A0  PQ: 0 ANSI: 6
[    2.893937] scsi 4:0:0:0: CD-ROM            HPE      DVDROM DUD0N     UMD1 PQ: 0 ANSI: 5
[    2.969901] scsi 6:0:2:0: Enclosure         HPE      Smart Adapter    7.11 PQ: 0 ANSI: 5
[    3.001277] scsi 6:2:0:0: RAID              HPE      P408i-p SR Gen10 7.11 PQ: 0 ANSI: 5
...
[ 5829.330460] scsi 6:0:11:0: Direct-Access     ATA      WD Blue SA510 2. 6100 PQ: 0 ANSI: 6
[ 5829.478506] sd 6:0:11:0: Attached scsi generic sg6 type 0
[ 6063.312922] scsi 6:0:12:0: Direct-Access     ATA      WD Blue SA510 2. 6100 PQ: 0 ANSI: 6
[ 6063.473257] sd 6:0:12:0: Attached scsi generic sg5 type 0



As you can see here, these are genuine HP drives (not HPE).

What is interesting is that if I unplug the two WD drives, 04-HD Max goes away and it is fun with fans time! This happens within about 30 seconds of unplugging them and about a minute or so later, telegraf running on the other machine notices and the results are quite loud. Here is a sample graph showing the result of pulling out those drives for 5 minutes. Note the other temperatures all benefit from turbine mode.

55% on the rear fan is, while shy of the boot time take-off thrust, unbearable. 

Screenshot 2024-10-19 182724.png

I lack the energy to try again with the NVMe, just like the documentation / API does not tell me how to delete the thrice accursed "Hitachi Profile"

 

 

 

 

PR7
HPE Pro

Re: ML30 Gen10 Fun with Fans...

Greetings!
The "04-HD Max" sensor in HPE servers typically refers to a temperature sensor that monitors the maximum temperature of hard drives installed in the server. This sensor helps ensure that the drives operate within safe temperature limits to prevent overheating, which can lead to drive failure and data loss.

When using third-party hard drives in HPE servers, the "04-HD Max" sensor may not function as intended. HPE servers are designed to work optimally with HPE-certified drives, and using non-certified or third-party drives can lead to several issues which cannot be predicted as they are not tested.

Thank you for understanding.



BunnyPon
Valued Contributor

Re: ML30 Gen10 Fun with Fans...

04-HD Max is a completely made up value though, I want to know where it comes from.

Does it come from the raid controller (being just a JBOD in this instance.) or from the ILO itself?

Why, for example, with the P408i, does NOT having anything plugged into port 2 make it disappear? Plugging anything in though, and Hello Sailor! Be they genuine HPE drives or affordable alternatives.

------

As they say... The Porridge thickens.

Today I reinstalled everything, the same as last time courtesy of copy and paste (amsd, etc.), except that instead of the HP S700 SSDs I used WD Blue. And it was Turbo Fans again! 11/42/8%

I plugged in a spare Intel DC3500 I had, and 04-HD Max shows up at 35'c, which is unrealistically high, and the fans slow down to 11/6/6%.

# it does not appear to be amsd
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       23
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       24
194 Temperature_Celsius     0x0022   100   100   014    Old_age   Always       -       27 (Min/Max 20/35)
194 Temperature_Celsius     0x0022   100   100   014    Old_age   Always       -       28 (Min/Max 21/35)
190 Temperature_Case        0x0022   075   075   000    Old_age   Always       -       25 (Min/Max 18/25)
194 Temperature_Internal    0x0022   100   100   000    Old_age   Always       -       32

Here you can see the SMART temperature attributes for all 5 drives. The first 2 are HP, the second 2 are WD (194 only) and the last one is Intel (190 and 194).

It's not amsd, because I just killed it. But who could it be at 35'c? Well, it's recorded as the max for the two WD 1TB drives.

So anyone attempting to blame AMSD for the fans is, I am afraid, barking up completely the wrong tree.
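To make the 35'c observation easy to check, here is a small Python sketch that parses the six attribute lines pasted above and compares the highest current reading with the highest recorded lifetime maximum. It assumes the usual smartctl -A column layout, with the raw value in the tenth column and an optional "(Min/Max a/b)" suffix; the names used are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# The six attribute lines from the post, verbatim.
SMART_LINES = """\
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       23
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       24
194 Temperature_Celsius     0x0022   100   100   014    Old_age   Always       -       27 (Min/Max 20/35)
194 Temperature_Celsius     0x0022   100   100   014    Old_age   Always       -       28 (Min/Max 21/35)
190 Temperature_Case        0x0022   075   075   000    Old_age   Always       -       25 (Min/Max 18/25)
194 Temperature_Internal    0x0022   100   100   000    Old_age   Always       -       32
"""

MIN_MAX = re.compile(r"\(Min/Max (\d+)/(\d+)\)")


class TempReading(NamedTuple):
    attr_id: int
    name: str
    current: int                 # raw value, degrees Celsius
    lifetime_max: Optional[int]  # from "(Min/Max a/b)" when present


def parse_temperatures(text: str) -> List[TempReading]:
    """Parse attribute table rows; skip comments and malformed lines."""
    readings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        match = MIN_MAX.search(line)
        lifetime_max = int(match.group(2)) if match else None
        readings.append(
            TempReading(int(fields[0]), fields[1], int(fields[9]), lifetime_max)
        )
    return readings


readings = parse_temperatures(SMART_LINES)
highest_current = max(r.current for r in readings)
highest_recorded = max(
    r.lifetime_max for r in readings if r.lifetime_max is not None
)
print("highest current:", highest_current)            # 32
print("highest recorded maximum:", highest_recorded)  # 35
```

No drive is currently at 35'c. The only place that number appears is the recorded maximum on the two WD drives, which matches the observation above.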

What is interesting is that restoring the machine back to the previous drives.... Turbine mode again. With exactly the same hw. So I am quite mystified as to what changed, if anything.

 

 

PR7
HPE Pro

Re: ML30 Gen10 Fun with Fans...

Greetings!
I appreciate the extensive troubleshooting and research you have conducted to isolate this issue. I kindly request that you submit a support ticket and share the AHS logs so that we can investigate the matter further.
Thank you for understanding.



BunnyPon
Valued Contributor

Re: ML30 Gen10 Fun with Fans...

Case number: 

5385693662

I can trivially reproduce this on 2 different ML30s and the DL20. One of the ML30s has a completely unrelated issue: it thinks it is a Hitachi, which, while adorable, makes it only fit for spare parts.

The device you inquired about above is not a device supported by HPE.

Can you please ask your server purchaser to support it?

At this point, I would just like to thank the HP engineer who closed this UNREAD, because apparently a serial number that I could register AND that is recognised by your system is NOT actually an HP, despite it saying so on the motherboard.

This fun with fans problem is trivial to reproduce on GENUINE ML30 Gen 10 that actually THINK they are HP's and not HITACHI or NEC too.  Or even on a DL20. 

The only reason to use HP's for anything is the rather wonderful ILO. He just put me right off.

 

Sunitha_Mod
Honored Contributor

Re: ML30 Gen10 Fun with Fans...

Hello @BunnyPon,

We deeply regret the inconvenience caused. We have reported the issue and the concerned team will reach out to you. 

BunnyPon
Valued Contributor

Re: ML30 Gen10 Fun with Fans...

The saga continues with yet another complete and utter non-answer / blaming the fact that OTHER HARDWARE will make the fans spin.

(Case ID is 5385722700 for those HPE employees needing a laugh.)

At this precise moment in time, "DEEPLY UNIMPRESSED" would be an understatement, and about the only thing I can say is that if Dell made a cute box with full remote controllability like the ML30, they would get my money. They don't, so they haven't. YET. That's how miserable it is.

 

This is the current environment. Summer is long gone and the chances of this hardware getting to 35'c within the next 7 months are small.

Note that 04-HD MAX is coming from the WD SSDs' SMART 194, but if the two Intel drives were not there, there would be NO 04-HD MAX and the fan would be roaring. This has been demonstrated a number of times, and I'm still getting the runaround.

It isn't as if HPE can't do this sort of simple test themselves. Pretty much any old SSDs will do.

Screenshot 2024-11-28 165452.png
