HPE EVA Storage


 
sam bell
Regular Advisor

EVA 4400 - need help with performance issues in VMware

Hi,

we currently have the following EVA 4400 configuration, which is used for both a DWH system and our VMware vSphere environment:

- EVA 4400 (09534000), 4 enclosures, 32x 10k FC 300GB disks, 1 disk group
- attached to two independent fabrics (2x HP StorageWorks 8/8 SAN switches)

The EVA is accessed by the following hosts:

- DWH: 2x DL380 G6 (W2K8 SP2, 1x QLogic FC1242SR)

Each server accesses two Vdisks on the EVA, one RAID5 for MSSQL data, one RAID1 for MSSQL transaction logs. All four Vdisks are presented to Controller 1 and there are four paths to each LUN. However, the servers are told via HP MPIO DSM Manager to only use Controller 1.

- VMware: 3x DL380 G6 (ESXi 4.1, 2x QLogic FC1142SR)

The ESXi servers access four RAID5 Vdisks. All Vdisks are presented to Controller 2 and there are four paths to each LUN. The path policy on the ESXi hosts is set to Round Robin which, because of ALUA, uses only the two paths to Controller 2 as active paths. Each Vdisk holds 6-8 virtual machines (mainly Windows), 27 in total.
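The path selection described above can be sketched roughly as follows. This is only an illustration of the idea (Round Robin rotating over the paths to the owning controller and skipping the others), not VMware's actual implementation; the `Path` class, the function name and the path labels are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle, islice
from typing import Iterator, List


@dataclass(frozen=True)
class Path:
    name: str        # hypothetical label for one host-to-controller path
    controller: int  # controller on which the path terminates


def round_robin_paths(paths: List[Path], owning_controller: int) -> Iterator[Path]:
    """Cycle endlessly over the paths to the controller that owns the Vdisk.

    Paths to the other controller are treated as non-optimized and are
    left out of the rotation.
    """
    optimized = [p for p in paths if p.controller == owning_controller]
    if not optimized:
        raise ValueError("no path to the owning controller")
    return cycle(optimized)


if __name__ == "__main__":
    # Four paths per LUN, two to each controller (labels are made up).
    paths = [
        Path("hba1-c1", 1), Path("hba1-c2", 2),
        Path("hba2-c1", 1), Path("hba2-c2", 2),
    ]
    # The Vdisk is owned by Controller 2, so only its two paths rotate.
    for p in islice(round_robin_paths(paths, owning_controller=2), 4):
        print(p.name)
```

With four paths and the Vdisk owned by Controller 2, only the two Controller 2 paths ever carry I/O, which matches what the ESXi hosts show.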

From time to time we have trouble with slow response times of virtual machines. In fact, this happens every time the DWH servers generate what looks to me like heavy load on the EVA. As I am no expert in debugging storage performance, I am not sure whether it really is heavy load, but when looking at EVAperf during those times I can see the following:

http://www.abload.de/image.php?img=evaperf_12o78.png
http://www.abload.de/image.php?img=evaperf_2qo46.png

As you can see, compared to the VMware Vdisks there is a lot of I/O on one DWH Vdisk. On the second screen you can see that there is a lot of load on Controller 1, which serves the DWH Vdisks. Controller 2, which is used for the VMware Vdisks only, is nearly idle compared to Controller 1.

Even though Controller 2 is nearly idle and there is not much I/O on the VMware Vdisks, all VMs feel extremely sluggish during these times. For example, when working via RDP on a Windows VM, it feels like working on a computer whose hard disk is being slowed down by a running virus scanner: opening the Control Panel takes seconds and you can watch every single icon appear slowly. The impact on end-user applications running in these VMs is noticeable (though not in all cases) but not a critical issue so far.

The problem affects all VMs that are stored on the EVA. VMs running on the same hosts but stored on an MSA2312fc are not affected and continue to run fine. The problem disappears immediately when the I/O on the DWH Vdisks lowers.

While the problem occurs, the disk latency of an ESXi host fluctuates between 10 and 50 ms, higher than the usual 10 to 15 ms:

http://www.abload.de/image.php?img=esx_1zhd2.png

Is there a way to debug this deeper to find out what exactly is limiting here? As I have written above, I am no expert in measuring and analyzing storage performance. But according to my tests the problems are definitely caused by the storage system.
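One simple way to back up the impression that the DWH load and the VM latency move together would be to export both series for the same time intervals and compute their correlation. The sketch below is a plain Pearson correlation; how the samples are exported from EVAperf and the ESXi performance charts is left open, and the numbers in the usage example are placeholders invented for illustration, not measurements from this system.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values only: DWH Vdisk requests/s and ESXi latency in ms,
    # sampled over the same intervals.
    dwh_req_per_s = [200.0, 250.0, 1800.0, 2100.0, 300.0]
    esx_latency_ms = [11.0, 12.0, 38.0, 47.0, 13.0]
    print(f"correlation: {pearson(dwh_req_per_s, esx_latency_ms):.2f}")
```

A value close to 1 would support the suspicion that the two are linked, although it says nothing about where in the array the contention happens.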

Maybe 32 spindles are simply too few to serve 27 (albeit lightly utilized) VMs and one fully loaded DWH system?
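A rough back-of-envelope check of that question could look like the sketch below. The figure of about 130 random IOPS per 10k FC disk and the write penalties (4 for RAID5, 2 for RAID1) are common rules of thumb, not values measured on this array, and the 70/30 read/write mix is purely an assumption. Controller cache effects are ignored.

```python
# Rule-of-thumb estimate of host-visible random IOPS for a disk group.
# All constants are assumptions, not measurements from this EVA.

IOPS_PER_10K_DISK = 130                    # assumed random IOPS per spindle
WRITE_PENALTY = {"raid5": 4, "raid1": 2}   # back-end I/Os per host write


def host_iops(disks: int, read_fraction: float, raid: str,
              iops_per_disk: int = IOPS_PER_10K_DISK) -> float:
    """Return the host IOPS the spindles can sustain for a given I/O mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    backend = disks * iops_per_disk
    write_fraction = 1.0 - read_fraction
    # A host read costs one back-end I/O, a host write costs the penalty.
    cost_per_host_io = read_fraction + write_fraction * WRITE_PENALTY[raid]
    return backend / cost_per_host_io


if __name__ == "__main__":
    for raid in ("raid5", "raid1"):
        print(raid, round(host_iops(32, 0.7, raid)))
```

If the combined DWH and VMware request rate shown by EVAperf during the slow periods comes close to numbers in this range, the disk group itself would be a plausible limit, regardless of how the Vdisks are distributed across the controllers.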

Any suggestions are highly appreciated!
Sam

 

 

P.S. This thread has been moved from HP 3PAR StoreServ Storage to Storage Area Networks (SAN) (Enterprise). - Hp Forum Moderator

53 REPLIES
Thomas Callahan
Valued Contributor

Re: EVA 4400 - need help with performance issues in VMware

Are you running any Data Replication or Continuous Access on the array?
sam bell
Regular Advisor

Re: EVA 4400 - need help with performance issues in VMware

Hi,

thanks for your reply. No, we aren't running any replications.

Sam
Thomas Callahan
Valued Contributor

Re: EVA 4400 - need help with performance issues in VMware

I would try to rebalance all your Vdisks so they are evenly spread across the two controllers. Ideally you want the load split between them, or so I was told long ago.
Thomas Callahan
Valued Contributor

Re: EVA 4400 - need help with performance issues in VMware

Also, the EVA4400 is an Active/Active array. Make sure your hosts are set to utilize that, or you can end up overloading a single FP port.
sam bell
Regular Advisor

Re: EVA 4400 - need help with performance issues in VMware

Well, actually that was the configuration we had in the beginning. Then, after we were confronted with the problem described above, I changed the controller config so that I/O from the DWH system cannot lead to congestion on Controller 2 (VMware).

However, it didn't change anything.
sam bell
Regular Advisor

Re: EVA 4400 - need help with performance issues in VMware

Regarding the ports, both systems (DWH and ESX) have two active paths to the managing controller, so both FP ports of each controller are used.
Thomas Callahan
Valued Contributor

Re: EVA 4400 - need help with performance issues in VMware

Compared with evaperf output of my own (VMware ESXi on ProLiant BL490c G6, hosted off a 4400), your evaperf output is missing data, namely Preferred Path, Redundancy, and Number of Presentations.

What VCS code is the EVA4400 running, and Disk Code? What version of Command View?

We're running 72 VMs on our development VMware system right now, and the max latency I see against our 4400 is about 14ms. We spike to 80 during a heavy I/O point for a minute or two at night, but I've never seen it cause unresponsiveness.
Bulent ILIMAN
Trusted Contributor

Re: EVA 4400 - need help with performance issues in VMware

I have read a best-practice document about VMware and EVA, but I can't remember or find the link now.

With VMware you should set the Preferred Path for all Vdisks to one of the controllers in Failover/Failback mode.

If not, VMware keeps changing the active controller for a Vdisk, which reduces performance.
CLEB
Valued Contributor

Re: EVA 4400 - need help with performance issues in VMware

Make sure your MPIO config is set to SQST and ALB. Otherwise if one path is congested it won't utilise the others.