DL320Gen11 with disconnecting NVMe devices
02-02-2024 10:33 AM - last edited on 02-08-2024 05:29 PM by support_s
We have a new vSAN cluster with 4 x DL320 Gen11 servers, each with 4 x 3.84 TB NVMe drives. One of the hosts shows device errors for the NVMe drives in vSphere/ESXi, but there is no error in iLO and, according to HPE, nothing in the AHS logs. The server has had the problem since the beginning of this week; before that it ran fine for a couple of weeks, including some HCIBench runs.
When the issues start, it's not always the same NVMe drive. Most of the time, after the first device disconnects in the OS (it is no longer shown on the PCI bus), at least a second one follows, immediately or later. The frustrating part is that HPE support is pointing at VMware: VMware support should identify the broken device/part. VMware support checked the logs and the server in a remote session; the outcome is that some hardware is broken and responsible for the NVMe disconnects. But from the OS side it's not possible to see whether it's a broken NVMe, the backplane, or...
So there is no real progress in resolving this. Does anyone have an idea how to narrow down the issue? I'm already powering off single NVMes from within iLO to see if the error reoccurs (funny that HPE support did not suggest that; I'm not on-site), but I don't have a result yet. Are there any tests for the NVMes that can be triggered somewhere in RBSU? And how/where can I disable NVMes in RBSU?
Update: with the drive in bay 3 powered off, the one in bay 2 still failed a few hours later. Now I've disabled both. But I'm not happy at all with how HPE support is handling this. Somehow I'm supposed to prove which device has failed. If it's the backplane, that's nearly impossible. We are not paying _a_lot_ of money for HPE support contracts and in the end nobody moves or tries to fix this on-site.
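The symptom described above (a device that is no longer shown on the PCI bus) could be timestamped by comparing periodic snapshots of the visible device list, which may help correlate disconnects with other events. The following is only a minimal Python sketch: it assumes the device identifiers have already been collected by some other means, and the identifiers `bay1` to `bay4` and the function names are made up for illustration.

```python
from datetime import datetime


def diff_snapshots(previous, current):
    """Return (vanished, appeared) identifiers between two device snapshots."""
    previous, current = set(previous), set(current)
    return sorted(previous - current), sorted(current - previous)


def log_changes(previous, current, now=None):
    """Build timestamped log lines for every device that vanished or appeared."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    vanished, appeared = diff_snapshots(previous, current)
    lines = [f"{stamp} VANISHED {dev}" for dev in vanished]
    lines += [f"{stamp} APPEARED {dev}" for dev in appeared]
    return lines


# Hypothetical snapshots: identifiers are placeholders, not real device names.
before = ["bay1", "bay2", "bay3", "bay4"]
after = ["bay1", "bay4"]

for line in log_changes(before, after):
    print(line)
```

Running the comparison on a schedule and keeping the log would give a record of which device dropped first and how long it took for the next one to follow.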
02-04-2024 08:03 AM
Re: DL320Gen11 with disconnecting NVMe devices
With 2 NVMes powered down via iLO, there has been no error in 36 hours. So what does this mean: 2 faulty NVMes? A broken backplane?
Embedded:Port=3A:Box=1:Bay=4 Enabled 3.84 TB NVMe SSD
Embedded:Port=3A:Box=1:Bay=3 Disabled 3.84 TB NVMe SSD
Embedded:Port=4A:Box=1:Bay=1 Enabled 3.84 TB NVMe SSD
Embedded:Port=4A:Box=1:Bay=2 Disabled 3.84 TB NVMe SSD
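One way to read the list above is to group the bays by port, to check whether the disabled drives sit behind the same port or are spread across both. The following is a minimal Python sketch that assumes the exact line format shown above; the function name `bays_by_port` is hypothetical and not part of any HPE tool.

```python
import re
from collections import defaultdict

# Matches lines such as "Embedded:Port=3A:Box=1:Bay=4 Enabled 3.84 TB NVMe SSD".
LINE_RE = re.compile(
    r"Port=(?P<port>\w+):Box=(?P<box>\d+):Bay=(?P<bay>\d+)\s+"
    r"(?P<state>Enabled|Disabled)"
)


def bays_by_port(lines):
    """Return {port: {"Enabled": [bays], "Disabled": [bays]}} for status lines."""
    grouped = defaultdict(lambda: {"Enabled": [], "Disabled": []})
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip anything that is not a drive status line
        grouped[match["port"]][match["state"]].append(int(match["bay"]))
    return dict(grouped)


# The four status lines from the post above.
STATUS = """\
Embedded:Port=3A:Box=1:Bay=4 Enabled 3.84 TB NVMe SSD
Embedded:Port=3A:Box=1:Bay=3 Disabled 3.84 TB NVMe SSD
Embedded:Port=4A:Box=1:Bay=1 Enabled 3.84 TB NVMe SSD
Embedded:Port=4A:Box=1:Bay=2 Disabled 3.84 TB NVMe SSD
"""

for port, states in sorted(bays_by_port(STATUS.splitlines()).items()):
    print(port, "enabled:", states["Enabled"], "disabled:", states["Disabled"])
```

For the listing above, each port ends up with one enabled and one disabled bay, so the two disabled drives do not share a port. That alone does not decide between faulty drives and a backplane or cabling problem.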
02-07-2024 01:00 AM - last edited on 02-07-2024 09:03 PM by Sunitha_Mod
Re: DL320Gen11 with disconnecting NVMe devices
Hi there,
Thank you for reaching out.
Could you send us, via private message, the case number or the serial number under which the issue is being handled, so that we can check on its progress?
I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
02-22-2024 12:42 AM
Solution
After a week of issues, we removed all NVMes and reconnected the cables. The issue has been fixed since then (2 weeks now). A bit surprising, as the server had already been running fine for 4 weeks, and yet a connection problem seems to be the root cause.
02-28-2024 07:34 AM
Re: DL320Gen11 with disconnecting NVMe devices
Hello @pirx,
Perfect!
We are glad to know the issue has been resolved, and we appreciate you keeping us posted.