StoreVirtual Storage

Re: VMware SCSI Aborts

 
M_C
Occasional Advisor

VMware SCSI Aborts

We have been fighting a problem with SCSI aborts across multiple clusters. All clusters are two-node setups on Dell servers.

The problem seems to happen only during snapshot removals while the Dell RAID card is performing a patrol read to find bad blocks.

In some situations we get a latency warning from the StoreVirtual VSAs, and occasionally this leaves a node in an inoperable state. We then have to reboot the VSA to bring it back online.

As a test, I have disabled the automatic patrol reads on the Dell RAID card.

Will the VSAs perform their own checksums and data scrubbing in the background? We are using 12.6 and 12.7.

The system is set up to use Network RAID 10, so I am happy to let the VSA do its own checks and leave the patrol read off permanently, but only if it is running its own checks.

We have had multiple tickets open with HP and Dell, and no one can find any issues. HP reports that they can see the latency at the hardware level, but we don't see anything recorded by VMware, so I am not totally convinced.

3 REPLIES
Ravi_K
HPE Pro

Re: VMware SCSI Aborts

Will the VSAs perform their own checksums and data scrubbing in the background? We are using 12.6 and 12.7.

=> The VSA should do that, so you can keep patrol read off on the Dell side. Make sure the hardware disks are healthy.


M_C
Occasional Advisor

Re: VMware SCSI Aborts

Many thanks for the reply. Do you know if this behaviour is documented anywhere? It's just that I know many of the software-defined solutions rely on the underlying hardware to manage this data scrub.

Ravi_K
HPE Pro

Re: VMware SCSI Aborts

You are absolutely correct, most rely on the RAID controller for the scrubbing. I do not have any documentation to support this for StoreVirtual. Here too, it only does the basic checks; the rigorous checks are bypassed to avoid performance issues (even on physical StoreVirtual nodes, the RAID controllers are tuned this way). So if you are not comfortable disabling the feature permanently, you may enable it once in a while for a thorough hardware check.
