MSA P2000 SAS and VMware 5.5 - insufficient space in datastore

SOLVED
michael_d1
Occasional Contributor

Hi, we are having some strange issues with an MSA P2000 SAS and VMware and wondered whether anybody has seen similar problems before.

Our environment is as follows: 

HP MSA P2000 G3 SAS – firmware TS251P006-02

2 x HP servers used as ESX hosts (ESXi 5.5), directly attached via SAS to the MSA. Each server is connected to both controllers.

We initially had a single RAID-5 vdisk, with a single volume provisioned to ESX. This worked fine for years. The hosting team uses thin provisioning at the VMware level. Recently they ran into problems because it appeared they had run out of physical capacity due to the thin provisioning. This caused problems with the VMs, as you'd expect. It appeared to be purely a capacity management issue.
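To illustrate how thin provisioning can exhaust a datastore, here is a minimal sketch of the arithmetic. The disk names and all sizes are hypothetical and are not taken from this environment; the point is only that the sum of provisioned sizes can exceed the datastore capacity while the space actually in use creeps up towards it.

```python
from dataclasses import dataclass


@dataclass
class ThinDisk:
    name: str              # hypothetical VM disk name
    provisioned_gb: float  # size the guest believes it has
    used_gb: float         # blocks actually allocated on the datastore


def datastore_summary(capacity_gb, disks):
    """Summarise thin-provisioning exposure for one datastore."""
    provisioned = sum(d.provisioned_gb for d in disks)
    used = sum(d.used_gb for d in disks)
    return {
        "free_gb": capacity_gb - used,
        "overcommit_ratio": provisioned / capacity_gb,
        # Space that would be missing if every thin disk filled up completely.
        "worst_case_shortfall_gb": max(0.0, provisioned - capacity_gb),
    }


# Hypothetical example: a 1000 GB datastore holding three thin disks.
disks = [
    ThinDisk("vm-a", 500, 300),
    ThinDisk("vm-b", 800, 450),
    ThinDisk("vm-c", 400, 200),
]
summary = datastore_summary(1000, disks)
print(summary)
```

With these made-up numbers the datastore is overcommitted and has little free space left, so modest growth inside the guests would fill it even though no new VMs are created.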

The hosting team worked to free up some space and managed to get the VMs back online. According to VMware there is now around 400GB free in the datastore, yet they still see disk capacity errors when trying to provision VMs. We assumed this was a problem with VMware not releasing the now unused thin-provisioned space. However...

In the background, an additional 4 disks were purchased so that we could create a new vdisk and present additional capacity to VMware. We decided to create a new RAID-5 vdisk rather than extend the existing one, as it seemed the safer option. The vdisk was created and a single volume (roughly 1.3TB) was presented to the ESX hosts. The ESX hosts detected the volume and a datastore was created, all without issue. We assumed our job was done. Unfortunately, even though the datastore was created without issue, the hosting team is unable to migrate any VMs to it. Migrations start and then fail with an ‘insufficient disk space’ error.

Perhaps coincidentally, on the day the majority of the issues started, the CompactFlash card in one of the MSA controllers was replaced due to a failure. Since the replacement, the array has been up and reporting healthy.

We have tried the following:

Migrating VMs from the old datastore to the new one.

Creating new thick-provisioned VMs on the new datastore.

Creating new thin-provisioned VMs on the new datastore.

Rebooting each of the MSA controllers one at a time (rather than rebooting the whole array, to avoid an outage).

Rebooting both ESX hosts.

Disabling VAAI within ESX (the HP-specific VAAI plug-in doesn’t seem to be installed).

Provisioning a smaller volume.

Creating the datastore as VMFS3 rather than VMFS5.

Logging into the ESX hosts via SSH and trying to copy data into the new datastore.

In every test some data is copied and then we get the ‘insufficient disk space’ error or something similar. The amount of data copied varies, but it is never very much.
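The manual copy test described above can be made a little more systematic. The following is a minimal sketch, not a VMware or HPE tool: it writes random data to a file in chunks, records how many bytes were written before any error, and, if the write completes, reads the file back and compares SHA-256 digests. The function name, chunk size, and example path are assumptions made for illustration. Note that a read-back may be served from cache rather than from the array, so a matching digest does not prove the data on disk is intact.

```python
import hashlib
import os


def write_and_verify(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write random data to `path`, then read it back and compare digests.

    Returns a dict with the bytes written, any OSError raised while
    writing, and whether the read-back digest matched (None if the
    write did not complete).
    """
    written = 0
    write_hash = hashlib.sha256()
    error = None
    try:
        with open(path, "wb") as f:
            while written < total_bytes:
                chunk = os.urandom(min(chunk_bytes, total_bytes - written))
                f.write(chunk)
                f.flush()
                os.fsync(f.fileno())  # push each chunk down to the storage
                write_hash.update(chunk)
                written += len(chunk)
    except OSError as exc:
        # For example ENOSPC ("No space left on device") part-way through.
        error = exc

    verified = None
    if error is None:
        read_hash = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk_bytes), b""):
                read_hash.update(block)
        verified = read_hash.digest() == write_hash.digest()

    return {"bytes_written": written, "error": error, "verified": verified}


# Hypothetical usage (the path is a placeholder, not from the original post):
# result = write_and_verify("/vmfs/volumes/<datastore>/writetest.bin", 512 * 1024 * 1024)
# print(result)
```

Running the same test several times and against both datastores would show whether the failure point moves around, which is what the varying amount of copied data suggests.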

We are at a loss to understand what could be happening, as the system has worked fine for a number of years. We’re not even sure where the problem lies (VMware, the MSA, etc.). As some data can be copied and the datastore can be created, it feels like the volume is writeable but the communication is perhaps being interrupted.

Any help/suggestions would be much appreciated!

3 REPLIES
Torsten.
Acclaimed Contributor

Re: MSA P2000 SAS and VMware 5.5 - insufficient space in datastore

TS251P006-02 is from 2014.

Read about VAAI and other changes:

https://support.hpe.com/hpsc/doc/public/display?sp4ts.oid=3302295&docLocale=en_US&docId=emr_na-c05388074


Hope this helps!
Regards
Torsten.


Re: MSA P2000 SAS and VMware 5.5 - insufficient space in datastore

Your issue seems to be VMware-specific.

You can browse the datastore and check which files, apart from the virtual machine VMDK files, are consuming the space. There may be some large files there for some reason. You will need to involve a VMware expert to troubleshoot this.

You can also search Google for the "insufficient disk space" error message; you will find a lot of help from VMware.

You should also check from the VMware console or CLI whether the new datastore is getting locked for some reason. The article below may give you some ideas:

https://www.petenetlive.com/KB/Article/0001292

From the MSA perspective, I would suggest upgrading the controller firmware to the latest version, TS252P005:

www.hpe.com/storage/MSAFirmware

Please don't leave this thread open for a long time if you don't get any help here, as the people on this forum are MSA experts only.

Hope this helps!
Regards
Subhajit

michael_d1
Occasional Contributor
Solution

Re: MSA P2000 SAS and VMware 5.5 - insufficient space in datastore

So this was a strange one and took a fair amount of time to diagnose, but it appears there was never an issue with VMware thin provisioning; instead there was an 'invisible' fault on the array. Controller B seems to have been the culprit and caused data corruption when the volume failed over from A to B during the CompactFlash failure (in A). After much head scratching we eventually had controller B replaced and all of our problems have gone away. Unfortunately, because of the corruption we had to recover the environment (after re-creating the RAID groups and volumes). At no point did the array suggest there were any faults with the controller, and we were even able to upgrade the MSA code while the bad controller was in place. This controller may have been 'bad' for a very long time, but because at the time we only had a single volume, which was owned by A, we never saw any problems until it was failed over to B.