StoreVirtual Storage

Performance Issues

Scott Caryer
Advisor


I am having performance issues with my 2-node VSA cluster. When I connect directly to each ESX server in the cluster, I cannot identify a performance bottleneck other than the fact that each VSA is running at about 85% CPU utilization. I do NOT know if that is the norm or not.

 

With that said, if I reboot 1 of the VSA appliances, will that affect normal operations of the VMs on the HOST ESX server?

 

Thanks in advance...

Scott

6 REPLIES
oikjn
Honored Contributor

Re: Performance Issues

85% is definitely high.  Do you have 2 CPU on each VSA?  If not, you should.

 

Rebooting or shutting down a node shouldn't really be an issue as long as you set up your initiators on the host computers correctly, as MPIO should handle the loss of a node.

 

 

Are you doing lots of snapshots?  That is usually what spikes CPU usage.

Scott Caryer
Advisor

Re: Performance Issues

I do have 2 CPU on each VSA server w/ 4 Gig of RAM each.

 

The Virtual Center Event Viewer did indicate there was very high I/O latency on both ESX servers. These servers are HP DL380 Gen8 servers with 65 gig of RAM and 2 x 8-core CPUs. Not much RAM, but great horses under the hood.

 

I was able to shut down or vMotion the few VMs on one host to the other. So I shut down VSA2 and rebooted ESX2. The shared storage was still available to the ESX1 server. (A good thing, and the whole reason why we are using VSA, right?)

 

I started VSA2 and life appears to be fine. However, I can NOT vMotion a VM on ESX1 to ESX2 now, since it is complaining that ESX2 has an issue seeing the VMDKs on the shared iSCSI storage. I rebooted VSA2 and the issue still persists.

 

Not sure what exactly I should do now. I would like to reboot VSA1; however, I am leery that all VMs will crash, as ESX2 currently can't see the storage.

 

Thoughts?

 

oikjn
Honored Contributor

Re: Performance Issues

What speed are those CPUs? If they are 1.8GHz then that might be too slow, as the VSA is limited in the number of cores it uses. Typically you have to be pushing some serious IO to get CPU issues on the VSA unless you are creating or deleting snapshots.

 

MPIO should have failed over the connection from one VSA to the other, so if it hasn't switched over there was a config issue and you need to set up that connection again. I wouldn't attempt another failover while that is not connected, as you will probably lose access to those LUNs from that server and that will affect your server availability.

 

 

Overloaded CPU sounds like the cause of your problems, but WHY it's happening is the question. If ESX isn't showing a bottleneck, your CPUs are 2.2GHz+ and you aren't doing snapshots/rebuilds, then I'd check the BIOS settings for CPU C-states and/or contact support.

Scott Caryer
Advisor

Re: Performance Issues

The CPUs are 2.593 GHz.

 

I resolved my storage issue by simply doing a "Rescan of the Storage" and my shared iSCSI datastore popped up.

So I proceeded to vMotion my VMs off ESX1 and rebooted that ESX host. Again, after the reboot and a restart of VSA1, I still had to manually perform a "Rescan of Storage" for this ESX host to see the shared iSCSI storage as well.
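For anyone who wants to script the manual rescan described above, here is a minimal sketch in Python. It assumes it runs somewhere `esxcli` is available (for example the ESXi shell) and uses the `esxcli storage core adapter rescan` command. The helper names `build_rescan_command` and `rescan` are made up for illustration, and the adapter name passed in is whatever your software iSCSI adapter happens to be called.

```python
import subprocess


def build_rescan_command(adapter=None):
    """Return the esxcli argument list for a storage adapter rescan.

    adapter=None rescans every adapter; otherwise only the named one
    (the name is host-specific, e.g. the software iSCSI vmhba).
    """
    cmd = ["esxcli", "storage", "core", "adapter", "rescan"]
    if adapter is None:
        cmd.append("--all")
    else:
        cmd.append("--adapter=" + adapter)
    return cmd


def rescan(adapter=None, dry_run=True):
    """Run the rescan, or with dry_run=True just return the command text."""
    cmd = build_rescan_command(adapter)
    if not dry_run:
        # Raises CalledProcessError if esxcli exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


if __name__ == "__main__":
    # Dry run by default, so nothing is executed on the host.
    print(rescan())
```

Calling `rescan(dry_run=False)` after the VSA is back online would do the same thing as the "Rescan of Storage" click. It does not explain why the datastore was not picked up automatically; it only automates the workaround.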

 

So back to the real question: how do I determine why both ESX hosts were experiencing high I/O latency? Are there any VSA logs on the datastore that can be reviewed?

 

Thanks in advance...

Scott

oikjn
Honored Contributor

Re: Performance Issues

Did you try rebooting just the VSA first? The VSA is only as good as the underlying hardware, so maybe there is something in the ESX host logs that gives you a clue.

 

Next time (if there is one), try rebooting just the VSA to see if that "fixes" the problem. If it does, that might help clue you in as to whether the issue is with the host or with the VSA software.

Scott Caryer
Advisor

Re: Performance Issues

Great idea!!! I will just reboot the VSA next time. Ironically, the issue has occurred again. However, this time VC is not showing any I/O latency messages in the Event Viewer. I did see the event listed below:

 

Login to iSCSI target iqn.2003-10.com.lefthandnetworks:virtual-san:27:hcplc-lun1 on vmhba34 failed. The iSCSI initiator could not establish a network connection to the target.
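If these events keep showing up, it may help to tally which target and adapter are involved. The following is a rough Python sketch that pulls both out of an event line. The regular expression is based only on the wording of the single event quoted above, so other message variants may not match, and `parse_login_failure` is a made-up helper name.

```python
import re

# Pattern derived from the one event quoted in this thread (an assumption:
# other ESX versions may word the message differently).
EVENT_RE = re.compile(
    r"Login to iSCSI target (?P<target>\S+) on (?P<adapter>vmhba\d+) failed\."
)


def parse_login_failure(message):
    """Return {'target': ..., 'adapter': ...} or None if the line does not match."""
    match = EVENT_RE.search(message)
    if match is None:
        return None
    return {"target": match.group("target"), "adapter": match.group("adapter")}


if __name__ == "__main__":
    line = (
        "Login to iSCSI target "
        "iqn.2003-10.com.lefthandnetworks:virtual-san:27:hcplc-lun1 "
        "on vmhba34 failed. The iSCSI initiator could not establish "
        "a network connection to the target."
    )
    print(parse_login_failure(line))
```

Counting the parsed results per adapter over a day of exported events would show whether the login failures are tied to one host's initiator or hit both hosts.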

 

I rebooted the one ill-performing VM, and it appears to be fine...