
/tmp/.gpu.2.xml: <dram_uncorrectable>3</dram_uncorrectable> one of the GPUs shows this

 
SantoshLohar
New Member


Hello All,

We have a recent HPE DL385 Gen11 server with NVIDIA GPUs.

One of the GPUs is reporting the following error:

/tmp/.gpu.2.xml: <dram_uncorrectable>3</dram_uncorrectable>

The iLO console reports no error or alert.

What can be done here?

 

Regards,

Santosh Lohar

BPSingh
HPE Pro

Re: /tmp/.gpu.2.xml: <dram_uncorrectable>3</dram_uncorrectable> one of the GPU shows this

Did the GPU recover, request recovery, fail to recover, or not attempt recovery at all?
Please put the GPU into production and monitor whether the error count keeps increasing or remains stable.
If the error count continues to increase, please log a support case and share the NVIDIA logs so the issue can be investigated.
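
To make the "increasing or stable" check concrete, here is a minimal Python sketch. It assumes the file is XML in the style of `nvidia-smi -q -x` output and covers a single GPU; the sample layout (`ecc_errors`, `volatile`, `aggregate`) and the helper names `dram_uncorrectable_counts` and `increased` are illustrative assumptions, not something confirmed in this thread. It only looks for `<dram_uncorrectable>` elements, wherever they sit in the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real /tmp/.gpu.2.xml layout may differ.
SAMPLE = (
    "<gpu><ecc_errors>"
    "<volatile><dram_uncorrectable>0</dram_uncorrectable></volatile>"
    "<aggregate><dram_uncorrectable>3</dram_uncorrectable></aggregate>"
    "</ecc_errors></gpu>"
)

def dram_uncorrectable_counts(xml_text):
    """Return every <dram_uncorrectable> value, keyed by its element path.

    Assumes one GPU per document; identical paths would overwrite each other.
    """
    root = ET.fromstring(xml_text)
    counts = {}

    def walk(node, path):
        for child in node:
            child_path = f"{path}/{child.tag}"
            if child.tag == "dram_uncorrectable":
                text = (child.text or "").strip()
                # Non-numeric values such as "N/A" are stored as None.
                counts[child_path] = int(text) if text.isdigit() else None
            walk(child, child_path)

    walk(root, root.tag)
    return counts

def increased(before, after):
    """Return {path: (old, new)} for counters that grew between two snapshots."""
    return {
        path: (before.get(path), new)
        for path, new in after.items()
        if new is not None and new > (before.get(path) or 0)
    }

if __name__ == "__main__":
    for path, value in dram_uncorrectable_counts(SAMPLE).items():
        print(path, value)
```

Taking a snapshot now and another after the GPU has carried load for a while, then passing both to `increased`, shows which counters have grown. A non-empty result would be the cue to open the support case.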



I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]