StoreVirtual Storage
4x P4300G2 SAS performance worries.

 
Occasional Advisor

4x P4300G2 SAS performance worries.

Hey folks,

 

I'm currently on a saga at work to see whether we're getting what we paid for from our 4-node P4300G2 cluster. Using the CMC performance monitor I've rarely seen IOPS exceed 2,000 under heavy load; IOMeter 64k sequential reads peaked at 73.86 MB/s, and anecdotal benchmarks of file transfers of known sizes from VM to VM on the cluster peaked at 34.98 MB/s.

 

I know I have some networking issues (our ProCurve 2810-24G switches apparently can't team connections across separate physical switches), but are there more fundamental things I should be looking for? On the nodes the NICs are teamed with adaptive load balancing (ALB).
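For context, here is a rough sketch of what a single gigabit iSCSI path can deliver at best. The ~10% protocol overhead figure is an assumption for illustration (Ethernet + IP + TCP + iSCSI headers), not a measurement from this cluster:

```python
# Back-of-envelope ceiling for one 1 GbE iSCSI path.
# OVERHEAD is an assumed ~10% for protocol framing (illustrative only).
LINK_BITS_PER_SEC = 1_000_000_000
OVERHEAD = 0.10

raw_mb_s = LINK_BITS_PER_SEC / 8 / 1_000_000      # 125 MB/s raw line rate
usable_mb_s = raw_mb_s * (1 - OVERHEAD)           # ~112 MB/s usable payload

print(f"raw: {raw_mb_s:.0f} MB/s, usable: {usable_mb_s:.1f} MB/s")
```

Since ALB typically keeps a single iSCSI session on one link, a 73.86 MB/s peak is below even a single link's ceiling, which is at least consistent with a network/teaming bottleneck rather than the disks.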

 

We run an iSCSI setup from 5 Proliant servers.

15 REPLIES
Honored Contributor

Re: 4x P4300G2 SAS performance worries.

I can't remember what disk config the P4300 uses, but it does sound like a switch issue.

 

Check out some more of the VSA stats: watch your individual node latency on read/write and queue depth, and keep looking at other stats to try to find where the bottleneck is. I bet it's the switches. I can get >73 MB/s from two VSAs using Network RAID 1 and hardware RAID 1 with two 7200 RPM SATA disks each: sustained IOPS around 100 and peaks around 1500!

Occasional Advisor

Re: 4x P4300G2 SAS performance worries.

The individual nodes use physical RAID 5 (leading to slower write speeds than read speeds), with Network RAID 10 across the cluster.
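The read/write asymmetry follows from the classic RAID write-penalty model. A minimal sketch, where the drive count and per-spindle IOPS are assumptions for illustration (8 drives per node, ~175 IOPS per 15K SAS spindle), not figures from this thread:

```python
# Rough effective-IOPS model using classic RAID write penalties.
# Assumed inputs: 8 drives, ~175 IOPS per 15K SAS spindle (illustrative only).
def effective_iops(n_disks, iops_per_disk, read_frac, write_penalty):
    raw = n_disks * iops_per_disk
    # Each logical write costs `write_penalty` back-end I/Os
    # (RAID 5: read data + read parity + write data + write parity = 4;
    #  RAID 10: write to both mirror halves = 2).
    return raw / (read_frac + (1 - read_frac) * write_penalty)

for name, penalty in (("RAID 5", 4), ("RAID 10", 2)):
    print(name, round(effective_iops(8, 175, 0.5, penalty)))
```

At a 50/50 mix the model gives roughly 560 IOPS per RAID 5 node versus ~933 for RAID 10, before controller cache softens the blow.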

 

Hopefully in the next couple of weeks my documentation of the current setup and planned changes will magically finish itself, so I can change how things are set up. Since we don't have layer 3 switches (we use three ProCurve 2810-24Gs), I think I can arrange the P4300G2s and the ProLiant servers so that the vmNetwork and iSCSI NICs are teamed to the same physical switch, and use Fault Tolerance (ESXi 5) on our critical production servers. Hopefully then we'll still have some redundancy while allowing adaptive load balancing to do its job, along with Round Robin multipathing in ESXi.

 

It's amazing how much of a learning experience this is. Lately I'm reading up on VLANs, which I have to get right since we use a VoIP setup. We weren't initially set up so that regular virtual machine traffic is on a separate VLAN from the iSCSI traffic: the VLANs were defined but never implemented, except for VoIP, which was set up and given higher priority than the rest.

Honored Contributor

Re: 4x P4300G2 SAS performance worries.

Write speeds shouldn't be that much slower considering the cache on the VSAs, unless your load stays so mixed for so long that the cache can't absorb it.

Frequent Advisor

Re: 4x P4300G2 SAS performance worries.

Hello,

 

We also had concerns with the performance of a 4 node p4300G2 SAS cluster...

 

We are in the same boat, RAID 5 with network RAID 10.  We spent an extensive amount of time with HP LeftHand support, HP Sales Engineers and other escalated HP staff.

 

The best conclusion we have at this point is that performance is heavily dependent on your workload.  It turns out our workload is about 50% read, 50% write.

 

With discussions from other SAN engineers and other research, we are suspicious that a heavy write workload performs significantly more poorly than HP represents when using RAID 5.

 

We have been told numerous times that the rule of thumb is RAID 10 gives about a 10% performance boost at the cost of half your disk space.

 

We just started testing a VDI workload (95% write, 5% read) on a RAID 10 P4500 SATA cluster, and performance and IOPS are a lot better, hitting upwards of 4000 IOPS with almost no queuing.

 

When our additional two P4300G2 nodes arrive we plan to do some IOMeter tests with varying read/write ratios and RAID 5 versus RAID 10. Our gut feeling is that on a heavy write workload (not sure what "heavy" means yet; maybe 50% or greater) RAID 10 will deliver upwards of 2x the IOPS of RAID 5.
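For what it's worth, the classic write-penalty model (4 back-end I/Os per RAID 5 write, 2 per RAID 10 write; an assumption, not a measurement from this cluster) predicts a somewhat smaller gap, so measured gains beyond it would point at controller or cache effects. A quick sketch:

```python
# Predicted RAID 10 / RAID 5 IOPS ratio from the write-penalty model alone.
# Assumed penalties: RAID 5 = 4 back-end I/Os per write, RAID 10 = 2.
def iops_ratio(read_frac):
    raid5 = 1 / (read_frac + (1 - read_frac) * 4)
    raid10 = 1 / (read_frac + (1 - read_frac) * 2)
    return raid10 / raid5

for rf in (0.7, 0.5, 0.3):
    print(f"{rf:.0%} read: RAID 10 = {iops_ratio(rf):.2f}x RAID 5")
```

The model tops out at 2x even for a pure-write workload, so a measured ratio much above that suggests something beyond raw spindle arithmetic is in play.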

 

There is nothing conclusive in this post, but it feels like there is some misconception about RAID 5 in the LeftHand world, and we are working to understand this more clearly so we can make better decisions on how to plan for both space and IOPS using a LeftHand SAN.

 

Kevin

Visitor

Re: 4x P4300G2 SAS performance worries.

Network RAID 5 is not recommended at all in an environment that needs more write performance, because the nodes have to compute parity while writing the data, and rebuilds take longer. SAN/iQ treats volumes as Network RAID 10 by default, so if we use Network RAID 5 the nodes have two extra jobs:

1. Converting the NR10 data that SAN/iQ reads into NR5 when writing it back to the volume, and
2. Writing the parity.

30% of the top layer of an NR5 volume is always NR10 (by design), as SAN/iQ reads and writes data in NR10.

This is a tedious job for the SAN, and compared to NR10, write performance on NR5 is poor.

Hence NR10 is recommended on SANs that need more write performance.

Sorry for my bad English.

I hope that helps.

Advisor

Re: 4x P4300G2 SAS performance worries.

Your numbers are similar to what I see on a 4-node P4300G2 cluster, and maybe even slightly better. I sometimes see reads as low as 15 MB/s; oddly, my write speed is usually higher, around 50 MB/s. My load is much more read than write. I have RAID 5 in the nodes and RAID 10 at the network level.

 

I do have about 55 VMs running on this storage, none of which are particularly high I/O.

Frequent Advisor

Re: 4x P4300G2 SAS performance worries.

We got a new two-node P4500 G2 SAS Starter SAN in, and I wanted to do some more testing. It seems there is about a 2.5x increase in IOPS using RAID 10 over RAID 5 for the hardware RAID.

 

This is a two-node cluster (8 × 450 GB 15K SAS drives per node). We ran the IOMeter workload with the nodes configured as RAID 5, then again as RAID 1+0.

 

Our test setup was a Windows 2008 HP DL385 with two Gigabit NICs configured for MPIO using the HP DSM. At the time of our testing we were running SAN/iQ 9.5. The LUN was created as a Network RAID 10 LUN, fully provisioned.

 

Our IOmeter workload was configured similar to the original tests:

  • Disks were left unformatted and the physical drive was selected
  • 1 worker thread
  • 8,000,000 sectors
  • 64 outstanding I/Os

For each test the access specifications were setup as following:

  • 100% access
  • Request size 4K
  • 100% Random

Tests were run for 45 minutes with a 30-second ramp-up time. I also right-clicked and ran as administrator when running the tests in Windows 2008; apparently you can get inaccurate results without doing this.

The results were as we suspected, but far from what HP engineers advertised to us. I am curious what the real-world implications are, but we have proven with one of our workloads that RAID 1+0 is substantially faster, and now IOMeter appears to be backing this up.

 

4K, 70% Read, RAID 5 — 2792 IOPS, 10.91 MB/s, 18.24 Avg I/O, 5.66% cpu
4K, 70% Read, RAID 10 — 6827 IOPS, 26.67 MB/s, 7.94 Avg I/O, 11.59% cpu

 

4K, 50% Read, RAID 5 — 1952 IOPS, 7.63 MB/s, 38.21 Avg I/O, 4.36% cpu
4K, 50% Read, RAID 10 — 5178 IOPS, 20.23 MB/s, 9.7 Avg I/O, 8.35% cpu

 

4K, 15% Read, RAID 5 — 1437 IOPS, 5.61 MB/s, 88.42 Avg I/O, 2.67% cpu
4K, 15% Read, RAID 10 — 3703 IOPS, 14.47 MB/s, 13.34 Avg I/O, 5.82% cpu
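Working the posted numbers, the RAID 10 gain is remarkably consistent across all three mixes; a quick sketch of the ratios:

```python
# RAID 10 / RAID 5 IOPS ratios from the runs posted above.
results = {
    "70% read": (2792, 6827),
    "50% read": (1952, 5178),
    "15% read": (1437, 3703),
}
for mix, (raid5, raid10) in results.items():
    print(f"{mix}: RAID 10 = {raid10 / raid5:.2f}x RAID 5")
```

All three land between roughly 2.4x and 2.7x, which is where the "about 2.5x" figure comes from.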

 

I am not sure if I missed anything or did anything wrong, but the tests look pretty consistent. I was expecting the heavier write mixes to produce fewer IOPS, as they did, but I was expecting more of a hit on writes in RAID 5. RAID 5 writes are supposed to be more intensive than RAID 10, so you would think a 15% read pattern would take a disproportionate hit, but the results above look very linear.

 

I suppose this may have something to do with the way Network RAID 10 works. There might be some write caching going on, and it might be that the write operation to the second node is more expensive than the local RAID 5 write, so the real bottleneck is the latency of committing the write to the second node, not the cost of RAID 5 versus RAID 10 writes.

Regular Advisor

Re: 4x P4300G2 SAS performance worries.

Something is broken there.

 

We run 2x P4300 in RAID 6.

Network Raid 1.

 

We are using 2x ProCurve 2810 switches with ALB on the P4000s.

Both switches are trunked (LACP) to a C7000 enclosure running HP GbE2c blade switches.

Running mostly dirty old BL460 G1s.

 

I got around (over?) ~100 MB/s using IOMeter to a single blade when I set it up. If I do the same thing on a second blade it scales well; I could probably get close to 200 MB/s on reads if I set it up correctly.

 

If you haven't read up on the prerequisites for setting these up, then perhaps you should consult Google.

 

Flow control *must* be on.

 

Jumbo frames make things slower or break stuff in almost all applications; you need some really fancy switches to pull this off.

 

Make sure your STP is set up correctly across your infrastructure.

 

Do not overload your switches. If they are running your iSCSI, that is probably all they should be doing, unless they are amazing. (2810s are not, but they should do a hell of a lot better than you are getting.)

 

You *must* run the iSCSI traffic in a separate VLAN at minimum.

 

You *should* set up a gateway on this VLAN to allow the P4000s to report faults via email / SNMP (both is better; the P4000 won't spam you).

 

We are about to change our setup to stacked Cisco 3750s, which should allow us to enable cross-stack LACP. I'm expecting this to raise our write performance to a level similar to our read performance (across multiple servers).

 

Good luck.

 

Remember that changing link configurations on P4000s can be disruptive, so make sure you only mess around with one node at a time and that you have a FOM. Make sure your link paths are redundant to each node, or you will be in a world of pain.

 

The P4000s are great for I/O on 1 Gb, not so great for throughput compared to direct-attached storage. Keep this in mind, avoid swapping, etc., and you can make them perform magic :)

 

Regards.

 

David Tocker.

Regular Advisor

Re: 4x P4300G2 SAS performance worries.

Just adding to this: with direct-attached RAID 5 storage, performance is generally crap. Add iSCSI on top, and the performance is expected to be good?
Regards.

David Tocker