DL380p Gen8, Freenas 11.3 / Uncorrectable PCI Express Error (Crash and Reboot)

 
Seimari
Occasional Collector

Hello,

I have a ProLiant DL380p Gen8 with:

2x E5-2637 v2 processors

256GB (16x16GB) PC3-14900R DDR3 ECC (non-HP SmartMemory)

1x Chelsio T520-CR NIC

1x LSI 9305-16i HBA

1x Samsung 970 PRO 1TB SSD on an NVMe PCIe adapter

10x Crucial MX500 2TB SATA SSD in HP SFF cages, connected to the LSI 9305-16i

OS installed: FreeNAS 11.3 (FreeBSD)

The system crashes and reboots, and iLO reports these events:

---

EVENT (05 Mar 11:08): Unrecoverable System Error (NMI) has occurred.  System Firmware will log additional details in a separate IML entry if possible

Integrated Management Log Severity: CRITICAL

iLO URL: *****

iLO IP: ******

iLO Name:  *****

iLO 4 firmware:  2.72 Oct 20 2019

Server Model:  DL380p Gen8

System ROM:    P70 05/24/2019

---

EVENT (05 Mar 11:08): Uncorrectable PCI Express Error (Slot 6, Bus 32, Device 2, Function 0, Error status 0x00004000)

Integrated Management Log Severity: CRITICAL

---

I'm quite sure that Slot 6 is the LSI 9305-16i.
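
For reference, the error status can be read as a bit mask. Assuming the iLO "Error status" field mirrors the PCIe AER Uncorrectable Error Status register (the log does not say so, so treat this as an assumption), 0x00004000 has only bit 14 set, which in that register means Completion Timeout. A minimal Python sketch of the decode, with a hypothetical helper name:

```python
# Bit positions of the PCIe AER Uncorrectable Error Status register.
# Assumption: the iLO "Error status" value uses the same layout.
AER_UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
}


def decode_uncorrectable_status(status):
    """Return the names of all set bits; bits not in the table are flagged."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            names.append(AER_UNCORRECTABLE_BITS.get(bit, f"unknown bit {bit}"))
    return names


if __name__ == "__main__":
    # Value taken from the IML entry for Slot 6.
    print(decode_uncorrectable_status(0x00004000))
```

If that reading is right, the device in Slot 6 (or something behind it) did not answer a request in time, which would fit a card, slot, firmware, or chipset setting problem rather than a specific bad disk.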

I think the crash happens when FreeNAS grows the ARC (the RAM cache) and very little free RAM is left.
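
One way to check this hypothesis would be to log the ARC size and free memory at intervals and compare the last entries before a crash with the IML timestamp. The sketch below is only an illustration: it assumes the FreeBSD sysctl names `kstat.zfs.misc.arcstats.size`, `vm.stats.vm.v_free_count` and `hw.pagesize` are available on this FreeNAS build, and the log path and 60-second interval are hypothetical choices.

```python
import subprocess
import time


def read_sysctl(name):
    """Read one integer value with the FreeBSD sysctl(8) utility."""
    result = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


def memory_snapshot(reader=read_sysctl):
    """Return the current ARC size and free memory, both in bytes."""
    arc_bytes = reader("kstat.zfs.misc.arcstats.size")
    # Free memory is reported in pages, so multiply by the page size.
    free_bytes = reader("vm.stats.vm.v_free_count") * reader("hw.pagesize")
    return {"arc_bytes": arc_bytes, "free_bytes": free_bytes}


if __name__ == "__main__":
    # Hypothetical log location; pick a path that survives a reboot.
    with open("/var/log/arc_watch.log", "a") as log:
        while True:
            snap = memory_snapshot()
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} arc={snap['arc_bytes']} free={snap['free_bytes']}\n")
            log.flush()  # keep the file current in case of a sudden reset
            time.sleep(60)
```

If crashes also occur while plenty of memory is free, the ARC theory becomes less likely and the PCIe path to Slot 6 is the better suspect.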

I have Intel QPI enabled, with QPI Bandwidth Optimization (RTID) set to "Optimized for I/O". I haven't tried disabling QPI yet.

The crash happens once or twice a day.

SanjeevGoyal
HPE Pro

Re: DL380p Gen8, Freenas 11.3 / Uncorrectable PCI Express Error (Crash and Reboot)

Hello,

As per the details you shared, the server BIOS is already updated to the latest version.

I would ask you to log a case with HPE, because one of the components is causing the issue. The HPE team will collect the AHS log and, based on it, share a further action plan or replace the suspected parts.

Regards,

 

 


I am an HPE Employee


Seimari
Occasional Collector

Re: DL380p Gen8, Freenas 11.3 / Uncorrectable PCI Express Error (Crash and Reboot)

After a couple of failed attempts I managed to upload the AHS log to the HPE website.

Now it says: "Server is not Entitled. Please check these options for renewing your license."

Is that the HP ProLiant support license, or a different one? Can I buy it online?

It's quite weird that I cannot troubleshoot my server without a license.

 

Seimari
Occasional Collector
Solution

Re: DL380p Gen8, Freenas 11.3 / Uncorrectable PCI Express Error (Crash and Reboot)

Status update:

- updated FreeNAS to 11.3-U2.1
- disabled No-Execute Memory Protection (BIOS)
- changed the Intel QPI Bandwidth Optimization (RTID) setting from "Optimized for I/O" to "Balanced" (BIOS)

Now it works and no longer crashes. I don't know which of the changes above fixed it.