StoreVirtual Storage

Re: High latency, low IO's, MBps

 
Thomas Halwax
Advisor

Re: High latency, low IO's, MBps

I have two HP 2910al switches with a 10 Gbit interconnect. The two P4300 nodes have 1 Gbit NICs; on each node one NIC goes to each switch, bonded with ALB to provide load balancing. Quite simple. Maybe I can create a Visio drawing to show our environment.

Thomas
Thomas Halwax
Advisor

Re: High latency, low IO's, MBps

See the attached file for my setup. One node is on the ground floor, together with the VMware servers. The other node is on the first floor, connected over fibre using copper-to-fibre converters.

All connections are 1 Gbit links; the only 10 Gbit link is the interconnect between the switches.

Thomas
Fred Blum
Valued Contributor

Re: High latency, low IO's, MBps

@AuZZZie

This is the benchmark data I received from the HP partner:

4K blocks, 66/34 read/write, random I/Os

HP P4000 G2 Products
Model BK716A P4300 G2 7.2TB SAS Starter SAN

Capacity in GB (with Network RAID 0):
RAID 0: x
RAID 5: 5,694
RAID 6: 4,546
RAID 10: 3,420

Performance in IOPS (with Network RAID 10):
RAID 0: x
RAID 5: 2,200
RAID 6: 1,700
RAID 10: 2,500

With hardware RAID 10 and Network RAID 10 I measured 2,960 IOPS with IOMeter (67/33 read/write, 4K random).




twg351
Occasional Advisor

Re: High latency, low IO's, MBps

Wow, I read this thread months back as I was beginning to work on my SAN/Hyper-V project. I hoped all of your efforts would help me ... and they did. BUT sadly it looks like we're all in the same boat.

 

I have followed all the best practices (e.g. 4 x 1 Gbit NICs with ALB, MPIO on the server, flow control, jumbo frames, 2 switches, fully isolated networks, etc.). I am also seeing 30-60 MB/s on the SAN with Network RAID 10 (and hardware RAID 5). I simply cannot drop the hardware RAID down to RAID 10 because it would cut out too much disk space, and I obviously need Network RAID 10 for redundancy.

 

I see ~40 MB/s on average for the SAN vs ~400 MB/s for my local disk. That is crazy. I am amazed that Thomas Halwax somehow got 125 MB/s ... the only difference that I can see is the 10 Gbit switch interconnect.

 

Fred's comment on adding a 3rd and/or 4th node to increase performance by 50-100% is noted, but that's not in my budget ... likely forever. So I'll give my current SAN setup a try and see how it goes. I don't think my SQL database will work at this lower speed; odds are I'll end up removing the SQL cluster and running a non-clustered SQL server on local disks for speed instead of the SAN (thank goodness I have enough local disk space for my SQL DBs).

 

This is unfortunate. Has anyone come up with a solution, anyone besides Thomas? I still have no idea why his setup gets 125 MB/s while everyone else seems stuck in the 30-60 MB/s range.

twg351
Occasional Advisor

Re: High latency, low IO's, MBps

I think I am re-thinking this now ... edit ...

 

I have been running more & more IO tests, and the more I run the more I think:

- using IOMeter's generic scans is not overly helpful; I only got useful information once I customized the scans (read vs write, random vs sequential, mixed read+write, etc.)

- READS from the SAN are pretty quick. I was even getting into the 100-110 MB/s range ... much better than 40 MB/s

- Comparing the SAN to the local RAID 10 array was not valid because I had not run a full range of IO tests. Now that I have, I can say it's as simple as this: the local disk is MUCH faster on WRITES, and they are about the same on READS. Changing the hardware RAID from 5 to 10 on the P4300 would likely help here, but it's not something I am going to redo at this point, as I think the current setup will do fine.

MarcGaethofs
Occasional Visitor

Re: High latency, low IO's, MBps

Adding nodes is not the solution if you're still using, for example, a 2910 switch. The overall connection limit is 2 x 1 Gb/s, and even with ALB the nodes still talk to the servers at 1 Gb/s. Even the HP DSM for MPIO will not connect more than one NIC to a node, so you are still limited to 1 Gb/s, which is roughly 120 MB/s of throughput.

We have installed a P4500 Multi-Site SAN (4 nodes) connected to four 2910 switches with a 10 Gb ISL and a 10 Gb uplink, per HP best practice, but we still max out at about 120 MB/s. We need 400 MB/s for our SQL workload. The only solution would be to add 10 GbE NICs to the nodes (4) and servers (8) and buy ProCurve 6000-series switches (4). Total cost: 60K. Not in the budget.

We expected better performance than a poor 100 MB/s. IOPS are not the problem; we peaked at 8,600 IOPS, but throughput is very low. With 48 spindles I would expect roughly 48 * 150 IOPS * 64 KB segment size / 1024 = 450 MB/s. For a Multi-Site configuration I would say about 250 MB/s (only half the disks are used, not exactly, but close enough). Yet we can only get 100 MB/s, which is very poor performance.
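
To make that arithmetic explicit, here is a minimal back-of-the-envelope sketch in Python (the 150 IOPS per spindle and 64 KB segment size are Marc's assumptions, not measured values):

# Back-of-the-envelope throughput estimate using the figures quoted above.
SPINDLES = 48            # disks across the cluster
IOPS_PER_SPINDLE = 150   # assumed per-disk IOPS for 15K SAS
SEGMENT_KB = 64          # assumed RAID segment (stripe element) size

disk_limit_mb_s = SPINDLES * IOPS_PER_SPINDLE * SEGMENT_KB / 1024
print(f"spindle-limited estimate: {disk_limit_mb_s:.0f} MB/s")    # ~450 MB/s

# Per-path network ceiling: one active 1 Gbit/s NIC per iSCSI connection.
gbe_mb_s = 1_000_000_000 / 8 / 1_000_000                          # ~125 MB/s line rate
print(f"1 GbE ceiling per path:   {gbe_mb_s:.0f} MB/s")
print(f"effective bottleneck:     {min(disk_limit_mb_s, gbe_mb_s):.0f} MB/s")

Which is why the cluster tops out near 100-120 MB/s no matter how many spindles sit behind it.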

Thomas Halwax
Advisor

Re: High latency, low IO's, MBps

Hi,

 

we have a VMware 4.1 cluster with 10 Windows 2003 servers as guests; one system is an Oracle 10g database. I monitored the overall cluster (2-node P4300) throughput and IOPS for 24 hours on a "normal" business day.

 

Thomas 

 

 

M.Braak
Frequent Advisor

Re: High latency, low IO's, MBps

I'm afraid it currently isn't possible to get over ~120 MB/s because the current DSM only uses one NIC at a time. To achieve higher throughput you need 10 Gb NICs. Single-threaded I/Os, such as a file copy, are also limited in throughput by SAN/iQ. Using VMware's round-robin MPIO plugin could help to actively use both NICs.
Dan Nichols
Advisor

Re: High latency, low IO's, MBps

My understanding is that the Windows P4000 MPIO DSM does support load balancing in the form of round robin (as stated in the Windows solution guide), which will make use of both Gbps NICs in the Windows hosts.

 

At the P4x00 end the NIC bonding and switch configurations will determine the maximum throughput.

 

- With switches that don't support cross-stack/cross-switch EtherChannel (Cisco) or distributed trunking (HP), such as the ProCurve 2910s, you are restricted to ALB (adaptive load balancing), which only provides transmit load balancing (2 Gbps read / 1 Gbps write throughput per node).

 

- With switches that do support these features (e.g. Cisco 3750G, HP 3500 series), you can enable LACP on the P4x00 nodes for full transmit/receive load balancing, giving 2 Gbps throughput per node for both read and write I/O.

 

Add to that the I/O paths per node and the data locality awareness (only with the HP DSM for Windows MPIO) and you should get throughput well in excess of 120 MB/s in ideal (simulated) conditions, provided you have enough hosts to stretch the storage.
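
As a rough sketch of what those bonding modes mean for the per-node network ceiling (line-rate figures only; real iSCSI throughput will be somewhat lower):

# Per-node ceilings for a 2 x 1 GbE bond, per the ALB vs. LACP description above.
GBPS_TO_MBS = 1000 / 8   # 1 Gbit/s is roughly 125 MB/s at line rate

def node_ceiling(bond):
    """Return (read_MBs, write_MBs) for a node with two 1 GbE NICs."""
    if bond == "ALB":     # transmit-only load balancing
        return 2 * GBPS_TO_MBS, 1 * GBPS_TO_MBS
    if bond == "LACP":    # transmit and receive load balancing
        return 2 * GBPS_TO_MBS, 2 * GBPS_TO_MBS
    raise ValueError(bond)

for mode in ("ALB", "LACP"):
    read_mbs, write_mbs = node_ceiling(mode)
    print(f"{mode}: ~{read_mbs:.0f} MB/s read, ~{write_mbs:.0f} MB/s write per node")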

kghammond
Frequent Advisor

Re: High latency, low IO's, MBps

I know this is kind of an old thread, but I wanted to add some research and thoughts to this.

 

We got a new two-node P4500 G2 SAS Starter SAN in and I wanted to do some IOPS testing. It seems there is about a 2.5x increase in IOPS using RAID 10 over RAID 5 for the hardware RAID.

 

This is a two-node cluster (8 drives, 15K 450 GB SAS). We ran the IOmeter workload with the nodes configured as RAID 5, then again as RAID 1+0.

 

Our test setup was a Windows 2008 HP DL385 with two Gigabit NICs configured for MPIO using the HP DSM. At the time of our testing we were running SAN/iQ 9.5. The LUN was created as a Network RAID 10 LUN, fully provisioned.

 

Our IOmeter workload was configured similarly to the original tests:

  • Disks were left unformatted and the physical drive was selected
  • 1 worker thread
  • 8,000,000 sectors
  • 64 outstanding I/Os

For each test the access specifications were set up as follows:

  • 100% access
  • Request size 4K
  • 100% random

Tests were run for 45 minutes with a 30-second ramp-up time. I also used right-click "Run as administrator" when running the tests in Windows 2008; apparently you can get inaccurate results without doing this.
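
(For what it's worth, assuming standard 512-byte sectors, the 8,000,000-sector setting works out to a test region of roughly 3.8 GiB, large enough not to fit entirely in controller cache:)

# Size of the IOmeter test region implied by the settings above.
SECTORS = 8_000_000
SECTOR_BYTES = 512                    # assumption: 512-byte sectors
size_gib = SECTORS * SECTOR_BYTES / 1024**3
print(f"test region: ~{size_gib:.2f} GiB")   # ~3.81 GiB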

The results were as we suspected, but far from what the HP engineers advertised to us. I am curious what the real-world implications are, but we have proven with one of our workloads that RAID 1+0 is substantially faster, and now IOmeter appears to be backing this up.

 

4K, 70% read, RAID 5: 2,792 IOPS, 10.91 MB/s, 18.24 ms avg I/O, 5.66% CPU
4K, 70% read, RAID 10: 6,827 IOPS, 26.67 MB/s, 7.94 ms avg I/O, 11.59% CPU

4K, 50% read, RAID 5: 1,952 IOPS, 7.63 MB/s, 38.21 ms avg I/O, 4.36% CPU
4K, 50% read, RAID 10: 5,178 IOPS, 20.23 MB/s, 9.70 ms avg I/O, 8.35% CPU

4K, 15% read, RAID 5: 1,437 IOPS, 5.61 MB/s, 88.42 ms avg I/O, 2.67% CPU
4K, 15% read, RAID 10: 3,703 IOPS, 14.47 MB/s, 13.34 ms avg I/O, 5.82% CPU
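
As a quick sanity check on the numbers above (a small sketch; the IOPS figures are the ones reported in this post):

# Verify MB/s = IOPS * 4 KB request size and compute the RAID 10 / RAID 5 ratio.
results = {                # read % : (RAID 5 IOPS, RAID 10 IOPS) as reported above
    70: (2792, 6827),
    50: (1952, 5178),
    15: (1437, 3703),
}
REQUEST_KB = 4
for read_pct, (r5, r10) in results.items():
    print(f"{read_pct}% read: RAID 5 {r5 * REQUEST_KB / 1024:.2f} MB/s, "
          f"RAID 10 {r10 * REQUEST_KB / 1024:.2f} MB/s, "
          f"RAID 10 / RAID 5 = {r10 / r5:.2f}x")

The ratios come out between roughly 2.4x and 2.7x, which matches the ~2.5x gain mentioned at the start of this post.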

 

I am not sure if I am missing anything or did anything wrong, but the tests look pretty consistent. I was expecting the heavier write mixes to yield fewer IOPS, as they did, but I was expecting more of a hit on writes with RAID 5. RAID 5 writes are supposed to be more expensive than RAID 10 writes, so you would think a 15% read pattern would take a much bigger hit, but the results above look very linear.

 

I suppose this may have something to do with the way Network RAID 10 works. There might be some funky write caching going on, and it might be that the write operation to the second node is more expensive than the extra work RAID 5 requires; in that case the real bottleneck is the latency of committing the write to the second node, not committing the writes to RAID 5 versus RAID 10.

 

Another thought for those seeing a substantial drop in performance when switching from Network RAID 0 to Network RAID 10. First, you have to accept that Network RAID 10 creates twice as many write operations, so depending on your hardware RAID setting those can in theory create substantial overhead. Network RAID 0 on one node versus Network RAID 10 on two nodes should be roughly the same in terms of write operations per node, but keep in mind there will probably be additional overhead in the form of checksums, confirmations, etc. As stated in the previous paragraph, I wonder whether the latency involved in committing the second copy to the second node is where most of the overhead comes from.
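
A rough way to see that write amplification (a sketch using the textbook RAID write penalties; the front-end figure is purely illustrative):

# Illustrative back-end write load for Network RAID 10 on top of hardware RAID.
# Textbook write penalties: RAID 10 = 2 disk writes per host write,
# RAID 5 = 4 disk operations (read data, read parity, write data, write parity).
HW_WRITE_PENALTY = {"RAID 5": 4, "RAID 10": 2}
NETWORK_COPIES = 2        # Network RAID 10 mirrors every write to a second node

host_write_iops = 1000    # hypothetical front-end write rate
for level, penalty in HW_WRITE_PENALTY.items():
    backend_ops = host_write_iops * NETWORK_COPIES * penalty
    print(f"hardware {level}: {host_write_iops} host writes -> "
          f"~{backend_ops} back-end disk operations across both nodes")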

 

I guess we shouldn't be too picky about the overhead of Network RAID 10.  Who else does active - active synchronization, especially at this price point???  Anyone?

 

Something else the HP engineers mentioned to us probably plays a role as well. They said that in a two-node cluster running Network RAID 10, SAN/iQ can assume that every piece of data is on each node of the cluster, which it is. If you go beyond two nodes, the data is actually spread across all the nodes in Network RAID 10, so there is additional overhead on every read and write in working out which node to fetch from or write to.

 

A lot of food for thought. We are up to sixteen nodes of HP LeftHand gear. It works great for the price, but the scalability is not quite as advertised, in our opinion. We are looking to supplement our LeftHand environment with some form of tiered SAN in the near future.

 

Kevin