HPE SimpliVity

Re: OVC faulty in two host federation

330_IT
Occasional Contributor

OVC faulty in two host federation

Our environment:

vSphere 6.5 4564106

OmniStack 3.7.0 - QTY: 2

Servers: OmniCube CN-3400 - QTY: 2

One of the OVCs in the federation is in a Faulty/Disconnected state and the other is in an Alive/Disconnected state. There is a nostart file in /var/svtfs/0, but I get a permission denied error when I try to remove it via svtcli.

Here's the output from:

tail -f $SVTLOG

2021-09-02T04:03:07.407Z WARN  0x7fe4de27f700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4ddc73700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4de687700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4ddef8700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4ddd75700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4df728700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4dfa2e700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.407Z WARN  0x7fe4de504700 [:] [storage.stgmgr.aiocommon] aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout

2021-09-02T04:03:07.831Z INFO  0x7fe511047700 [:] [storage.iocontext] ioContextList.cpp:86 Wrote nostart file to '/var/svtfs/0/nostart'.

2021-09-02T04:03:07.831Z FATAL 0x7fe511047700 [:] [abort] assert.cpp:26 /home/svtbuild/jenkins/workspace/svt-datapath/projects/storage/src/storage/cpp/stgmgr/ioContextList.cpp:96:failFast: Call to io_queue_release timed out after 60 seconds.

Can anyone help me with fixing this please?

US_SimpliVity
Valued Contributor

Re: OVC faulty in two host federation

330_IT,

A drive failure is very likely what is causing the WARN messages from storage.stgmgr.aiocommon.

Please examine the Storage health using iDRAC (IPMI).

If a logical or physical drive is in a degraded or failed state, hardware maintenance is required before restarting the storage services.

Replacing failed hardware, or getting help restarting the storage services after hardware maintenance is complete, requires opening a case with HPE Support.
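
As a rough way to see how much of a log excerpt is made up of the storage.stgmgr.aiocommon timeouts, and to pull out the FATAL message, something like the following sketch may help. It assumes every line has the same shape as the excerpt in the first post (timestamp, level, thread id, "[:]", component in brackets, source file, message). The names LINE_RE, summarize and SAMPLE are made up for this illustration and are not part of any SimpliVity tooling.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt in this thread:
# <timestamp> <LEVEL> <thread id> [:] [<component>] <file>:<line> <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<thread>0x[0-9a-f]+)\s+"
    r"\[[^\]]*\]\s+\[(?P<component>[^\]]+)\]\s+(?P<source>\S+)\s+(?P<message>.*)$"
)


def summarize(lines):
    """Count (level, component) pairs and collect the FATAL messages."""
    counts = Counter()
    fatals = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines or lines with a different shape
        counts[(match["level"], match["component"])] += 1
        if match["level"] == "FATAL":
            fatals.append(match["message"])
    return counts, fatals


# A few lines copied from the log posted above.
SAMPLE = [
    "2021-09-02T04:03:07.407Z WARN  0x7fe4de27f700 [:] [storage.stgmgr.aiocommon] "
    "aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout",
    "2021-09-02T04:03:07.407Z WARN  0x7fe4ddc73700 [:] [storage.stgmgr.aiocommon] "
    "aioCommon.cpp:296 io_getevents returned 0 with 1 submitted with 10 sec 0 ns timeout",
    "2021-09-02T04:03:07.831Z INFO  0x7fe511047700 [:] [storage.iocontext] "
    "ioContextList.cpp:86 Wrote nostart file to '/var/svtfs/0/nostart'.",
    "2021-09-02T04:03:07.831Z FATAL 0x7fe511047700 [:] [abort] assert.cpp:26 "
    "/home/svtbuild/jenkins/workspace/svt-datapath/projects/storage/src/storage/cpp/"
    "stgmgr/ioContextList.cpp:96:failFast: Call to io_queue_release timed out after 60 seconds.",
]

if __name__ == "__main__":
    counts, fatals = summarize(SAMPLE)
    for (level, component), n in counts.most_common():
        print(f"{n:4d}  {level:5s}  {component}")
    for message in fatals:
        print("FATAL:", message)
```

To run it against a saved copy of the log, pass an open file object to summarize() instead of SAMPLE. A tally like this only shows which component is complaining; the drive state itself still has to be checked through iDRAC.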

I am an HPE Employee.

tonymcmillan
Frequent Advisor

Re: OVC faulty in two host federation

Are you saying that the OVC will fault if there is a drive failure in the storage array and will not run again until the array is healthy?

gustenar
HPE Pro

Re: OVC faulty in two host federation

@tonymcmillan 

The suggestion is to make sure you are not experiencing any hardware issues, as that is what may have caused the failure. Make sure all your hard disks are healthy before trying to recover the OVC. If you did experience a hardware failure, replace the component before recovering the OVC.

To delete the nostart file you must be logged in as root, but on 3.7.0 you will need the support team to assist with that task.
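
To illustrate why the removal fails without root, here is a small read-only sketch that reports whether the nostart file exists, who owns it, and whether the current user could modify its directory. The path comes from the log in the first post; the function name describe_nostart is made up for this example, and the script deliberately never deletes or changes anything. It assumes a Linux shell with Python 3 available, which may not be the case on the OVC itself.

```python
import os
import stat


def describe_nostart(path="/var/svtfs/0/nostart"):
    """Return read-only facts about the nostart file; nothing is modified."""
    info = {
        "path": path,
        "exists": os.path.lexists(path),
        "running_as_root": os.geteuid() == 0,
        # Removing a file needs write permission on its parent directory.
        "parent_writable": os.access(os.path.dirname(path) or ".", os.W_OK),
    }
    if info["exists"]:
        st = os.lstat(path)
        info["owner_uid"] = st.st_uid
        info["mode"] = stat.filemode(st.st_mode)
    return info


if __name__ == "__main__":
    for key, value in describe_nostart().items():
        print(f"{key}: {value}")
```

If running_as_root and parent_writable are both False, a permission denied error on removal is expected. On 3.7.0 the actual removal should still go through the support team.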

I am an HPE employee
tonymcmillan
Frequent Advisor

Re: OVC faulty in two host federation

I understand this is not the expected behavior and the issue is addressed in a newer version. Do you know which version fixes this?

gustenar
HPE Pro

Re: OVC faulty in two host federation

You are running a very old SimpliVity version. The recommendation is to update to the latest supported version for your platform, 3.7.10 U1.

I am an HPE employee
tonymcmillan
Frequent Advisor

Re: OVC faulty in two host federation

Our platform is four Large HPE DL380 Gen10s from mid-2019 with the TIA card. 

Are you saying we should avoid the 4.x versions? If so, can we still use the newer versions of ESXi?

gustenar
HPE Pro

Re: OVC faulty in two host federation

@tonymcmillan 

Sorry for the confusion; that recommendation was based on the configuration posted in the first message.

If you are running HPE SimpliVity 380 Gen10 servers, you can upgrade to software version 4.1.0 U1, which is the latest one available.

I am an HPE employee