ProLiant Servers (ML,DL,SL)

Random Error Out of the Blue: Uncorrectable PCI Express Error

 
jeffearls
Occasional Visitor


Woke up to a stalled server, checked the iLO log, and found:

 

System Error 09/18/2014 03:56 09/18/2014 03:56 1
 

Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible

 
PCI Bus 09/18/2014 03:56 09/18/2014 03:56 1


Uncorrectable PCI Express Error (Slot 2, Bus 0, Device 1, Function 0, Error status 0x00014000)
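
The error status value can be broken into individual bits to see which error conditions were flagged. The sketch below assumes that the iLO "Error status" field mirrors the PCI Express Advanced Error Reporting (AER) Uncorrectable Error Status register; the log itself does not confirm that, so treat the decoded names as a hint rather than a diagnosis. The function name decode_error_status is made up for this example.

```python
# Bit positions follow the PCIe AER Uncorrectable Error Status register layout.
# ASSUMPTION: the iLO "Error status" value uses this same layout.
UNCORRECTABLE_ERROR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
    21: "ACS Violation",
}


def decode_error_status(status):
    """Return the names of all bits set in a 32-bit error status value."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            # Bits missing from the table are reported by position only.
            names.append(UNCORRECTABLE_ERROR_BITS.get(bit, f"bit {bit} (not in table)"))
    return names


print(decode_error_status(0x00014000))
```

Under that assumption, 0x00014000 has bits 14 and 16 set, which would correspond to Completion Timeout and Unexpected Completion.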

 

We're running VMware on the machine. I jumped into vCenter to track down the device and found:

 

Intel Corporation Sandy Bridge IIO PCI Express Root Port 1a #1

 

Segment Number: 0
Bus Number: 0
Device Number: 1
Function Number: 0
Capabilities: Bridge Subsystem ID, MSI, PCI Express, Power management
PCI Device ID: 0x3c02
Device ID: PCI 0:0:1:0
Vendor ID: 0x8086
Subsystem ID: 0x0
Subsystem Vendor ID: 0x0
Secondary Bus Number: 7
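
To confirm that the iLO entry and the vCenter listing describe the same PCI function, the two can be parsed and compared. This is a rough sketch: it assumes the vCenter "Device ID" string is ordered segment:bus:device:function (which matches the fields listed above), and the helper names are hypothetical.

```python
import re

# Matches the parenthesised part of the IML message quoted in this post.
IML_PATTERN = re.compile(
    r"Slot (?P<slot>\d+), Bus (?P<bus>\d+), Device (?P<device>\d+), "
    r"Function (?P<function>\d+), Error status (?P<status>0x[0-9A-Fa-f]+)"
)


def parse_iml_entry(message):
    """Extract slot, bus, device, function and status from an IML message."""
    match = IML_PATTERN.search(message)
    if match is None:
        raise ValueError("not a recognised PCI Express IML entry")
    fields = {k: int(v) for k, v in match.groupdict().items() if k != "status"}
    fields["status"] = int(match.group("status"), 16)
    return fields


def parse_vcenter_device_id(device_id):
    """Parse a string such as 'PCI 0:0:1:0'.

    ASSUMPTION: the order is segment:bus:device:function.
    """
    parts = [int(p) for p in device_id.split()[-1].split(":")]
    if len(parts) != 4:
        raise ValueError("expected four colon-separated numbers")
    return dict(zip(("segment", "bus", "device", "function"), parts))


def same_function(iml, vcenter):
    """True when both records point at the same bus, device and function."""
    return all(iml[k] == vcenter[k] for k in ("bus", "device", "function"))


iml = parse_iml_entry(
    "Uncorrectable PCI Express Error (Slot 2, Bus 0, Device 1, "
    "Function 0, Error status 0x00014000)"
)
vcenter = parse_vcenter_device_id("PCI 0:0:1:0")
print(same_function(iml, vcenter))
```

Both records point at bus 0, device 1, function 0, so the root port shown in vCenter is the one the iLO entry associates with Slot 2. Whatever card sits behind that root port should appear on the secondary bus (bus 7 in the listing above).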

 

The only thing we've changed from the stock server is the addition of a FusionIO 320GB PCIe SSD, about a year ago.

 

Per this thread: http://h30499.www3.hp.com/t5/ProLiant-Servers-ML-DL-SL/DL380p-Gen8-with-uncorrectabl-PCI-express-error/td-p/5995669#.VBrE8vldVRg

 

I checked our System ROM, and we're at 02/25/2012, four days after the suggested version.

 

Thoughts?

 

How can I physically identify "PCIe Root Port 1a #1" to see what's plugged into it that might have generated the error?

 

Thanks!!

 

Jeff

 

 

 

 

1 REPLY
RbBsmn
Regular Advisor

Re: Random Error Out of the Blue: Uncorrectable PCI Express Error

Update your firmware, BIOS, and iLO, then perform an NVRAM clear through Intelligent Provisioning.

If the Fusion IO card is installed in slot 2, I would recommend creating an AHS log, a vm-support log, and an IO bug report, then contacting HP so that an L2 engineer can check for issues with the IO card.