05-31-2019 01:54 AM
SimpliVity 380 3.7.8 with GPU instability
Hello,
I have installed two different environments, each with 2x SimpliVity 380 (one in direct connect), without any particular problems.
Recently I had to install another environment: 2x SimpliVity 380 with NVIDIA GPU, in 10Gb direct connect.
The installation went fine. After a few days, the first node crashed completely for an unknown reason. Because the environment was not in production, I completely reinstalled both nodes (factory reset).
Everything went fine for 2-3 weeks, until some VMs showed the "Lost Sync" warning sign and the first ESXi host appeared as "disconnected". I could still ping the ESXi server, the OmniStack, and the VMs running on this host, but it was impossible to connect to the ESXi console (web, SSH or via iLO) or to the OmniStack. There was no option other than rebooting the server.
After the reboot, the OmniStack couldn't be synchronised. Errors started to appear on the VMs running on the second node/OmniStack (errors on start, errors accessing some files, ...).
I opened a ticket with HPE and the marathon started:
The engineer analyzed some logs and took some support bundles, but no reason was found for the crash.
He tried to reinitialize the failing OmniStack over two days (launching scripts, editing the VC MOB, editing the disks seen by the OmniStack with a hexadecimal editor, ...) without any success. He said afterwards that there was no option other than to completely redeploy the node from the deployment manager.
The node was redeployed and the OmniStack resynced everything. HPE support said that everything was fine and closed the case.
Unfortunately, a few hours later 7 VMs were lost: they disappeared from the storage. We restored them from backup.
After that, we tried to re-create some VDI pools, but they couldn't access the datastores, so we deleted and re-created the datastores from the SimpliVity interface. It worked, but now, after a recompose, we get related errors on all VMs: File system specific implementation of Ioctl[file] failed ... The customer has lost confidence in SimpliVity.
I have the feeling that the storage is corrupted and that we will have storage issues all the time ...
I will open a new ticket with HPE to see what they say, but for the moment I'm disappointed by their support.
Has anybody experienced something similar? Or does anyone have an idea?
Best regards,
FredZone
05-31-2019 06:31 AM
Re: SimpliVity 380 3.7.8 with GPU instability
Hi @FredZone
Thanks for using the forum.
I am curious about the exact case here. Don't get me wrong: you obviously had an experience which wasn't to your liking, and for that I apologise. But as a senior support engineer on the SimpliVity platform for a number of years now, I find some of your statements a bit odd, i.e., the sequence of events doesn't make sense.
For example, we don't just "lose" VMs. If you did indeed lose VMs, then you must have had a serious hardware issue or something like that, and also a single-node environment. Any cluster of two or more nodes provides automatic protection from such potential issues through SimpliVity storage HA and subsequent failover to secondary data copies.
I don't think it will be possible to provide a solution or ideas to your query without understanding the exact issue correctly, with all the facts.
If you have not already created a new support ticket, please do so. If you post the case ID here, I may be able to follow up as well and check what's happening with it.
Thanks,
DeclanOR
I am an HPE Employee
05-31-2019 06:55 AM
Re: SimpliVity 380 3.7.8 with GPU instability
Hi DeclanOR,
Thanks for your quick answer.
I also considered hardware issues, but no errors have been seen.
Support said that "something" changed in the environment. We inspected all the layers and everything seems normal.
Here are the two ticket numbers:
5338444463 ==> should be closed
5339021888 ==> just opened now
Thanks for anything you can do,
FredZone
05-31-2019 08:19 AM
Re: SimpliVity 380 3.7.8 with GPU instability
Hi @FredZone
Thanks for providing the info.
I have looked at both cases, and both are open and pending an update from you.
I recommend that you respond to the requests on both cases and allow our engineers to complete any work needed.
The original case is very well documented and I can see the sequence of events. I will reach out to the owner of the original case to confirm a couple of things, but please continue to work on and respond to the open cases and allow our engineers to complete all necessary work.
Thanks again,
DeclanOR