01-01-2021 10:46 AM
DL380 G8 critical fault
Hello
A ProLiant DL380 G8 had powered off mysteriously one or two times at around 00:00 at night. It booted back up all right by just pressing the power button. Now it has happened again and it cannot start anymore. It does try for a fraction of a second but instantly cuts power, and the system LED flashes red, which means system critical fault, right?
iLO is the only thing that is working. I cannot switch to the backup BIOS because the boot process has to begin before the switch is possible. I tried removing everything from the board; it still only powers on for 1/4 sec.
I downloaded the IML log from iLO and have a code that needs decoding (see below, error 98). Also interesting: the system time apparently got reset to 1/1/1970 on the last "98" error after the final crash.
I have tried removing the CMOS battery, shorting pins, setting jumper no. 6 to on, and removing every component from the mainboard, but it still shuts down in 1/4 sec.
Currently I have the mainboard loose in my hands and am visually inspecting it; I don't see anything suspicious.
"ID","Severity","Class","Last Update","Initial Update","Count","Description",
"98","Critical","Power","01/01/1970 00:01","12/28/2020 21:54","12","System Power Fault Detected (XR: 14 00 MID: FF 0D FC 0E C0 FF FF 2F 2F 0C 0C 00 9C 20 00 01 03 15 00 00 00 00 00 00 00 00 00 00 00 00 00 00)",
"97","Repaired","Power","11/25/2020 23:54","11/25/2020 23:53","1","System Power Supplies Not Redundant",
"96","Repaired","Power","11/25/2020 23:54","11/25/2020 23:53","1","System Power Supply: Input Power Loss or Unplugged Power Cord, Verify Power Supply Input (Power Supply 2)",
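For anyone who wants to compare several of these entries side by side, here is a minimal Python sketch that reads an IML export in the CSV layout shown above and pulls the raw XR and MID bytes out of the description. It only splits the bytes apart; what each byte means is not documented in this thread, so no decoding is attempted. The function names `parse_iml` and `fault_bytes` are made up for illustration.

```python
import csv
import io
import re

# Two rows copied from the IML export above (note the trailing comma per line).
IML_EXPORT = '''"ID","Severity","Class","Last Update","Initial Update","Count","Description",
"98","Critical","Power","01/01/1970 00:01","12/28/2020 21:54","12","System Power Fault Detected (XR: 14 00 MID: FF 0D FC 0E C0 FF FF 2F 2F 0C 0C 00 9C 20 00 01 03 15 00 00 00 00 00 00 00 00 00 00 00 00 00 00)",
"97","Repaired","Power","11/25/2020 23:54","11/25/2020 23:53","1","System Power Supplies Not Redundant",
'''

def parse_iml(text):
    """Return one dict per log entry, keyed by the column names in the header."""
    rows = list(csv.reader(io.StringIO(text)))
    # The trailing comma produces an empty last column name; drop it.
    header = [name for name in rows[0] if name]
    return [dict(zip(header, row)) for row in rows[1:] if row]

def fault_bytes(description):
    """Extract the XR and MID byte strings, or None if the entry has none."""
    match = re.search(
        r"XR:\s*((?:[0-9A-F]{2}\s*)+)MID:\s*((?:[0-9A-F]{2}\s*)+)", description
    )
    if match is None:
        return None
    # bytes.fromhex ignores the spaces between the hex pairs.
    return bytes.fromhex(match.group(1)), bytes.fromhex(match.group(2))

if __name__ == "__main__":
    for entry in parse_iml(IML_EXPORT):
        raw = fault_bytes(entry["Description"])
        if raw is not None:
            xr, mid = raw
            print(entry["ID"], entry["Last Update"], xr.hex(" "), "|", mid.hex(" "))
```

Running it on later exports would show whether the XR/MID bytes stay identical between crashes, which is useful to attach to a support case.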
01-02-2021 08:01 AM
Re: DL380 G8 critical fault
Situation update:
Had the motherboard out for inspection; all looked well. The power supply still cut power immediately on startup.
I took another 12V power supply, put wires to the incoming PSU rails and force-fed the board 12V for a couple of minutes. After this it started staying on by itself. I took away the feeding wires and the board now seems to have fixed itself, at least for now. The server is up and running; will see if it holds.
Since nothing really seems to have been wrong or broken, I'm confused as to why this happened. It would still be nice to get that error code decoded, for future reference.
01-03-2021 06:01 AM
Re: DL380 G8 critical fault
The server was up for maybe 2 days but has now shut down again with the error
System Power Fault Detected (XR: 14 00 MID: FF 0D FC 0E C0 FF FF 2F 2F 0C 0C 00 9C 20 00 01 03 15 00 00 00 00 00 00 00 00 00 00 00 00 00 00)
What is this?
01-03-2021 06:22 AM
Re: DL380 G8 critical fault
Restarted the server and now system fans 5 and 6 are running at 78% and 97% speed; they were at 27% before.
The 20-VR P1 Mem sensor now says the temperature is 96 degrees C, a massive increase since the last boot and probably a faulty reading.
Mem VR sensors 19, 21 and 22 are at 23C. So this error might have something to do with the 20-VR P1 Mem sensor or voltage regulator?
Please, does someone know what the error code means?
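The reasoning above (one VR sensor far off from its three neighbours) can be written down as a quick sanity check. This is only a rough sketch: the 30 degree threshold is an arbitrary assumption, not an HPE limit, the function name `suspicious_sensors` is made up, and the labels for sensors 19, 21 and 22 are placeholders because only "20-VR P1 Mem" is named in full here.

```python
from statistics import median

def suspicious_sensors(readings_c, max_delta_c=30.0):
    """Return the sensors whose reading is far from the group median.

    readings_c maps a sensor label to its temperature in degrees C.
    max_delta_c is an assumed threshold, not a value from HPE documentation.
    """
    typical = median(readings_c.values())
    return {
        name: temp
        for name, temp in readings_c.items()
        if abs(temp - typical) > max_delta_c
    }

if __name__ == "__main__":
    # Readings quoted in this post; labels other than sensor 20 are placeholders.
    readings = {"19": 23, "20-VR P1 Mem": 96, "21": 23, "22": 23}
    print(suspicious_sensors(readings))
```

With the readings from this post, only the 20-VR P1 Mem sensor is flagged, which fits the suspicion that it is the sensor (or its regulator) rather than a real temperature rise across the board.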
01-06-2021 06:19 AM
Re: DL380 G8 critical fault
Hello Vesil400,
The error code mentioned doesn't necessarily correspond to a PSU failure; it points towards an issue with the system board of the server.
However, the suggestion is to bring the server to minimum configuration, with one processor, one memory module and no PCI cards connected, and check the server status.
If the server continues to work in minimum configuration, start adding the components back and check whether any particular component causes the failure.
Also, please ensure that the server is updated with the latest Service Pack for ProLiant and that the firmware, including BIOS, iLO, etc., is on the latest versions.
Are there any non-HPE parts being used on the server? Please confirm that as well.
Thanks
I work for HPE
01-06-2021 11:31 PM
Re: DL380 G8 critical fault
Hello
Yes, I have tried minimum configuration; it didn't work either with one CPU, one RAM stick and no risers.
Currently the server starts, then fails at random intervals, maybe instantly or after 1 day.
Any info on what the 20-VR P1 Mem temp sensor looks like and where it is located? It is obviously giving a false reading.
01-08-2021 05:09 AM
Re: DL380 G8 critical fault
Hello Vesil400,
Ideally the sensors would be part of the system board.
Based on the error message you have been receiving, and since the server is not stable, this looks like an issue on the system board.
I would suggest opening a case with technical support, including the hardware logs, for detailed analysis and part replacement under warranty if need be.
Thanks
I work for HPE
01-09-2021 06:25 AM
Re: DL380 G8 critical fault
It can also be a problem of a defective SID (System Insight Display, the display that tells you what is wrong).
Please disconnect the small flat cable to the SID. If the server then keeps running (it works without it), the SID is probably defective. It can be replaced by SPN 662515-001 (DL380 with 3.5" disks) or 662516-001 (DL380 with SFF/2.5" disks).
01-09-2021 06:38 AM
Re: DL380 G8 critical fault
01-09-2021 06:44 AM - edited 01-09-2021 06:45 AM
Re: DL380 G8 critical fault
The server has been running 2 days without a crash now, probably by chance, but I'm happy it's working at least for a while.
Lacking other obvious problems, what about the VR P1 Mem temp sensor? Is it capable of shutting down the system without logging any error for doing so? Apparently the sensor is defective, sometimes reading about 60-70 degrees and currently reading 90+.
All the heatsinks are cold to the touch, and the ambient temperature is actually under 10C.