
Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFUSE

 
SOLVED
Explorermf1
Occasional Advisor

Hi!

I have a DL380 Gen10 Plus with two P38997-B21 (P39384-001) 1600 W power supplies, firmware 2.00. I am trying to use an NVIDIA A100 (model p100b) with a GPU riser, and when we power on the server we receive this error:

Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFUSE (10h))

I would appreciate any ideas.
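
For anyone collecting these messages from logs, here is a minimal Python sketch that splits a fault string like the one above into its parts. The field names (phase, location, component, code) and the helper `parse_fault` are my own labels for illustration, not HPE terminology, and the thread does not document what the code 10h means, so the sketch only extracts it as a number.

```python
import re
from typing import NamedTuple, Optional


class PowerFault(NamedTuple):
    """Fields of a 'Server Critical Fault' message (names are hypothetical)."""
    phase: str            # e.g. "Power On Fault" or "Runtime Fault"
    location: str         # e.g. "System Board"
    component: str        # e.g. "AUX/Main EFUSE"
    code: Optional[int]   # e.g. 0x10, taken from "(10h)"; None if absent


# Assumed layout, based only on the message quoted in this thread:
#   Service Information: <phase>, <location>, <component> (<hex>h)
_PATTERN = re.compile(
    r"Service Information:\s*(?P<phase>[^,]+),\s*(?P<location>[^,]+),\s*"
    r"(?P<component>[^()]+?)\s*(?:\((?P<code>[0-9A-Fa-f]+)h\))?\s*\)?\s*$"
)


def parse_fault(message: str) -> Optional[PowerFault]:
    """Return the parsed fields, or None if the message has another shape."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    raw_code = match.group("code")
    return PowerFault(
        phase=match.group("phase").strip(),
        location=match.group("location").strip(),
        component=match.group("component").strip(),
        code=int(raw_code, 16) if raw_code else None,
    )


if __name__ == "__main__":
    msg = ("Server Critical Fault (Service Information: Power On Fault, "
           "System Board, AUX/Main EFUSE (10h))")
    print(parse_fault(msg))
```

Separating the phase field makes it easy to tell a Power On Fault from a Runtime Fault when comparing against other reports.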

 

8 REPLIES
Suman_1978
HPE Pro

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Hi,

Please make sure you are following the prerequisites for adding this card. Refer to pages 61 and 62 of the QuickSpecs for this server.

A similar issue is discussed here but for a different model.
A Maximum of Two HPE NVIDIA Quadro RTX 6000/RTX 8000 Graphics Accelerators Is Supported

Thank You!
I work with HPE but opinions expressed here are mine.



I work at HPE
HPE Support Center offers support for your HPE services and products when and how you need it. Get started with HPE Support Center today.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
BPSingh
HPE Pro

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFUSE

Greetings!

Is the server working fine without the GPU card installed?

I assume you are using the HPE-certified "NVIDIA A100 80GB PCIe Non-CEC Accelerator for HPE R9P49C".

Please refer to the notes section on page 61, which is important: https://www.hpe.com/psnow/doc/a50002553enw?jumpid=in_lit-psnow-red

If you feel everything is in place, then I believe the logs should be analyzed to investigate the issue.



Explorermf1
Occasional Advisor

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Hi Suman_1978,

Thank you for the reference. This happens even with a single A100, in both the first and the second riser, let alone with two or three more cards. Also, that link is about a Runtime Fault, but I have a Power On Fault. I guess there is a difference. Correct me if I am wrong.

Explorermf1
Occasional Advisor

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Hi BPSingh,

Thank you for the reference. The server works fine without the GPU card. It also works fine with the GPU card installed but without the riser GPU power cable. In that case the iLO device inventory shows an "unknown" product name, but the status is OK, and WS does not see the A100 at all. I suppose that is due to the absence of additional power. All riser slots work fine (checked using network and HBA adapters). I'm starting to think the problem is the GPU 8p Keyed Cable Kit. I want to replace it, and either way I will write about the result here.

BPSingh
HPE Pro

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Greetings!

Thank you for the additional information. 

If the issue continues to occur, then, as mentioned earlier, the logs need to be analyzed for further investigation. I would advise you to open a support case so that the server configuration, error signature, etc. can be checked.



Explorermf1
Occasional Advisor
Solution

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Hi @BPSingh,

Thanks again for the reference. I have finally been able to start the DL380 Gen10 Plus. The power cable was causing the problem. As it turned out, I had previously been using a power cable for the NVIDIA Tesla M60, which is not compatible with the pinout of the NVIDIA A100. I'm honestly surprised that the DL380 Gen10 Plus is smart enough to protect the accelerators and itself from burning out. Wonderful work, HPE! I obtained cable P/N 869820-001 (spare No. 875097-001) and the problem was solved. Both A100s are running stably.

At first, I just wanted to buy the HPE DL300 Gen10 Plus GPU 8p Keyed Cable Kit (P39102-B21), but this was not as easy as I had first thought. However, I'm sure that cable P/N 869820-001 comes from this cable kit. Does anyone know whether the kit contains anything else? As I understand it, the kit also contains some "SPS-BAFFLE System", but I can't find its P/N.

jsairt
Visitor

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Hello,
I ordered this cable for my ProLiant 380 Gen9: https://www.amazon.com/gp/product/B08JTPPTTM?ref=ppx_pt2_dt_b_prod_image. I am running into the same issue with my RTX 3050. Any idea where to find the proper pin layout for connecting those GPU power pins?

I hope someone can point me to this info.

thx.

Regards,
J

Sunitha_Mod
Moderator

Re: Server Critical Fault (Service Information: Power On Fault, System Board, AUX/Main EFU

Hello @jsairt,

Thank you for writing to us! 

You might want to consider creating a new topic using the "New Discussion" button. A new topic is more visible than an old one and improves your chances of receiving responses from experts.



Thanks,
Sunitha G
I'm an HPE employee.