StoreVirtual Storage

HP P4300 SAN cluster seems slow. Am I expecting too much?

Jay68
Occasional Advisor

HP P4300 SAN cluster seems slow. Am I expecting too much?

Hi,

 

We have 4x P4300 G2 nodes running SAN/iQ 10.0 connected to a 2-node Hyper-V cluster: P4300 1 & 3 at the 'HQ' site and P4300 2 & 4 at the 'HA' site. This has 2 volumes set up: a fully provisioned 1GB 'Witness' volume (RAID5, Network RAID10) for the Hyper-V cluster it's attached to, and a thinly provisioned 5.47TB volume (RAID5, Network RAID10) for the Hyper-V VHDs.

 

The cluster nodes each have 4x 1Gb NICs dedicated to the SAN and the 10.0 DSM installed.

 

I have seen this thread, in which the last post shows that it's possible to saturate a 1Gb connection at 125MB/s:

 

http://h30499.www3.hp.com/t5/HP-StoreVirtual-Storage-LeftHand/Overall-Performance-on-P4300-SAS-Starter/m-p/4645704/highlight/true#M1088

 

In the StoreVirtual console I'm seeing the Queue Depth Total peaking at about 85, IOPS maxing out at around 2,000, and the Throughput Total showing a maximum of around 150MB/s.

 

The DSM is obviously doing its job as I'm seeing more than 125MB/s, but theoretically I should be able to reach 500MB/s. Performance Monitor on the cluster server is showing a high Average Disk Queue Length, so it appears to be waiting on the disk.

 

Am I setting my expectations too high in expecting to get 250MB/s from four nodes?
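
For reference, here's the back-of-the-envelope calc behind those numbers (the 25% overhead allowance is just my own guess):

```python
# What should four 1Gb iSCSI NICs deliver in theory vs. in practice?
GB_LINK_MBPS = 125          # 1Gb/s is roughly 125MB/s of payload

nics = 4
theoretical = nics * GB_LINK_MBPS    # 500MB/s upper bound with the DSM
realistic = theoretical * 0.75       # assumed ~25% iSCSI/MPIO overhead
observed = 150                       # peak Throughput Total from the CMC

print(f"theoretical: {theoretical}MB/s, realistic: {realistic:.0f}MB/s, "
      f"observed: {observed}MB/s ({observed / theoretical:.0%} of wire speed)")
```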

 

Thanks,

 

James.

15 REPLIES
oikjn
Honored Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

You don't mention some details about the way you are testing this. File size and drive cluster size play a big role in determining throughput, as you may be limited by IOPS, which could cause the issue you are seeing. Beyond that, I'm not aware of any link aggregation method (DSM, LACP, ALB, whatever...) that is 100% effective, so 1Gb + 1Gb does not equal exactly 2Gb of throughput and might be as low as 1.5Gb depending on what is going on.

 

If you are really curious, just make sure that you are getting balanced throughput on each NIC on your cluster servers and that each NIC is talking to each node. Beyond that, you can play with cluster sizes and watch IOPS and latency in the CMC to maximize whatever stat you care about most (for most people it's IOPS, while some care more about throughput).

Emilo
Trusted Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

I can answer your question on queue depth.

With P4300 G2 SAS drives, 4 nodes x 8 drives x ~3 outstanding I/Os per drive = 96, so a total queue depth of around 96 would still be healthy.
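
Just to show the working (the 3 per drive is only a rule of thumb I use, not an official figure):

```python
# Rough sanity check on the queue depth figure above.
nodes = 4
drives_per_node = 8   # P4300 G2 has 8 SAS drives per node
per_drive_depth = 3   # assumed healthy outstanding I/Os per drive

healthy_total = nodes * drives_per_node * per_drive_depth
print(healthy_total)  # 96 -> so an observed peak of ~85 looks fine
```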

How do your applications seem to be responding?

Make sure you have the DSM set up properly and are seeing traffic on those NICs.

 

Jay68
Occasional Advisor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Thanks for the help, guys.

We've been struggling with the VMs being laggy for some time and thought that it might be the server that was at fault. I downloaded the Microsoft PAL tool, and Windows is complaining that the Average Disk Queue Length on the Hyper-V volume is too high. So there's no real testing as such; I'm just monitoring real-world performance.

I'm not expecting to see the full 2Gb wire speed to a single unit because I understand that there are overheads involved, but I have 4x 1Gb server NICs and 8x 1Gb NICs across the 4 nodes. I was expecting to see the 4x server NICs operating flat out in Task Manager (1Gb to each node), but I've not seen them top about 35% (they are balanced, though). I say that because we have another vendor's SAN, and that frequently hits 90% on both server iSCSI NICs to its 2x NIC ALB configuration when it's under heavy load.

I've started to wonder if this could be down to the setup of our LUNs. We have one LUN spread over a pair of nodes, and that is Network RAIDed with another pair of nodes. Should I chop this up into smaller LUNs? Preferably ones that don't span two nodes?
oikjn
Honored Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Queue depth is about I/O... Have you set up your drives with the correct cluster size? I would watch the CMC to see what I/O your LUNs and your nodes are seeing for reads and writes, as well as the queue depths for each. Maybe there is a performance issue, or maybe the setup you have configured is causing excessive I/O requests to the SAN.

Jay68
Occasional Advisor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Hi,

 

This was initially installed by a third party, and I would have assumed they would have installed everything in line with best practices (you know what they say about assumptions, though). What would be the best way of checking?

 

I've made sure that the LUNs presented to the server were formatted with 512-byte sectors, if that's what you mean?

Jitun
HPE Pro

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

As each node in a cluster acts as a controller/gateway for a volume, you should create at least as many volumes as there are nodes in the cluster.

So if we have a 4-node cluster, we should create at least 4 volumes, so each node can act as a gateway for a volume.

Ideally, create 2 or 3 times the number of volumes, so each node acts as a gateway for at least 2-3 volumes.
This helps distribute the load from one or a few volumes across many volumes handled by many nodes.

I'm not 100% sure, but isn't 512 bytes quite small?
oikjn
Honored Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Jitun, he is using Hyper-V, not ESX, so I don't think the need for one LUN per node applies, as all servers talk to all nodes, assuming the HP DSM is set up.

 

Jay, you can check that the DSM was set up and is working indirectly through the CMC by opening any LUN and going to the iSCSI Sessions tab. That tab shows you all connections to the LUN. You should see every server and every NIC on the servers represented there: one connection per NIC that just shows the gateway connection, and then one connection per NIC per SAN node that shows the gateway connection + DSM. If you don't see this, then it wasn't set up correctly.

 

As for the cluster size, 512 is REALLY small and incorrect for Hyper-V; it should be 64k for CSV drives. I forget the stripe size as defined by LeftHand, but I'm sure the Network RAID stripe is NOT 512b, so trying to do random I/O at 512b ends up causing significantly more I/O on the storage nodes: a single request might be spread across a whole 64k stripe's worth of 512b clusters, which means you could end up doing something like 120+ I/Os per actual I/O request. One easy way to determine this is to watch node I/O and LUN I/O in the CMC, watch drive I/O on your servers, and see how they compare... I bet your nodes are doing way more I/O than your drives, and it's because of the cluster format size... That, and if you aren't actually using the HP DSM, that can slow things down as well.
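
To put rough numbers on it (the 64k stripe is an assumption on my part, since I forget LeftHand's real figure):

```python
# How a tiny filesystem cluster size can multiply back-end I/O.
stripe = 64 * 1024   # assumed Network RAID stripe size in bytes
cluster = 512        # the 512b cluster size discussed above

# Worst case: one small random host write dirties a whole stripe's worth
# of clusters on the nodes, fanning out into many back-end I/Os.
print(f"up to {stripe // cluster}x node I/O per host I/O")  # 128x
```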

Jay68
Occasional Advisor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Hi,

 

Sorry for not getting back sooner. I thought I'd subscribed to the thread, but hadn't...

 

I'm definitely going to chop up the LUN. I've been reading that only one controller can access a LUN at a time, which would mean that all traffic has to go through a single node, drastically reducing performance (if this is true). Also, from a backup perspective, it's a lot better to restore lots of smaller LUNs, as I can then pick the order in which to restore them (or the VMs on them) in the event of a failure.

 

One of the other things I read was that the LUNs' cluster size should be 512 bytes because that is the sector size of the physical disks. But, looking at this thread:

http://h30499.www3.hp.com/t5/HP-StoreVirtual-Storage-LeftHand/Witness-and-Cluster-disk-cluster-size-best-practise/td-p/4721548#.Ufjn7XZwZaQ

it suggests that 64K easily outperforms 4K in all tests, so I'll make sure my LUNs are formatted the same.

 

Here's what I'm seeing on the iSCSI sessions tab: 

 

[Screenshot: iSCSI Connections]

 

The top server (the DR server) seems to be configured nicely: all 4 NICs have a gateway connection (no DSM) to 192.168.15.4, then a DSM connection from each NIC to .15.3 and .15.4. But should it also have a DSM connection from each NIC to .15.1 and .15.2?

 

The bottom server (the HQ server), however, seems to be missing a DSM connection from 192.168.15.102 to .15.1 and .15.2, and it also appears to be missing a gateway connection (no DSM) from .15.101 to .15.4, which strikes me as a little odd.

 

I'm wondering why only node 4 is showing as a gateway connection (no DSM). Is this because, as Jitun implies, only one node can act as the gateway for a LUN, and in my case it happens to be .15.4?

oikjn
Honored Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Can you re-sort that picture by initiator IP? My eyes are going crossed trying to match everything up.

 

You should have one non-DSM connection to one node per NIC, as you see there, and then one DSM connection for every NIC to every node in the cluster. In this case I see four NICs (ending .201, .202, .203, .204) and four node connections (ending .1, .2, .3, .4), so you should have 4 DSM connections per NIC. If you see some NICs missing DSM connections, it generally means one of three things: 1) that iSCSI session was not configured for MPIO, and/or the load balancing setting was left at Vendor Specific instead of being changed to Round Robin; 2) it's temporary and the DSM is still discovering its connections; 3) you are using the HP DSM with multi-site, which will only connect to the local site's nodes unless there is a local path failure.

 

I've attached a fully correct set of connections for two VSA nodes with two servers (with two NICs each) connected to them. As you can see, each NIC has two DSM connections and one primary connection. In your case you should see 1 primary and 4 DSM connections per NIC.
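
If it helps to sanity-check the session counts, the arithmetic is just this (ignoring multi-site preference, which limits DSM sessions to the local site's nodes):

```python
# Expected iSCSI sessions per host per volume when the HP DSM is healthy.
def expected_sessions(nics, nodes):
    gateway = nics       # one plain (non-DSM) connection per NIC
    dsm = nics * nodes   # one DSM connection per NIC per storage node
    return gateway, dsm

gw, dsm = expected_sessions(nics=4, nodes=4)  # your 4 NICs, 4 nodes
print(f"{gw} gateway + {dsm} DSM = {gw + dsm} sessions per volume")
```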

 

As for cluster size, it all depends. If you want maximum IOPS, 512 may work for you, but it will be very limiting for throughput. 64k seems to be getting more and more popular by way of its requirement for CSV or ReFS drives, so I've moved most of my LUNs to 64k. But there are so many combinations that you will go mad trying to figure out the best option; you just need to understand what raising one option over another means for your performance.

Jay68
Occasional Advisor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

Well, we had a power outage last night and migrated everything over to our DR site.

I took the opportunity to rip MPIO/DSM off the server and remove the iSCSI configuration. After plumbing it all back in and changing the MPIO load balance policy from Vendor Specific to Round Robin, I now get this:

 

[Screenshot: Sites-iSCSI]

 

It looks like this has fixed the issue where NIC .15.103 was not connecting to the SAN. Also, the sites information explains why the HQ server will only talk to the local nodes (.15.1 & .15.2) and the DR server to the DR nodes (.15.3 & .15.4). But both servers' primary connections are all to HQ node .15.2. I guess this means that this node is acting as the gateway for the volumes, and that chopping up the LUNs may increase the number of primary connections.

 

Is there any benefit to having these 4 nodes split into two different sites, given they're connected by 10Gb fibre? Both sites have separate power sources, so I'd like to make sure that none of the LUNs are lost if nodes .15.1 and .15.2 are lost, or if it's .15.3 and .15.4 that go down. If I don't specify the sites, could a LUN end up on node .15.1 and be Network RAIDed to .15.2?

 

I think I may have confused things slightly on the filesystem sizes. The sector size in the CMC is reported as 512 bytes, but the NTFS cluster size reported by diskpart is 4K. I've read another forum thread where a poster said their VMs seemed slow and that reformatting the volumes with a 64K cluster size did make things snappier.

Emilo
Trusted Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

It looks like this has fixed the issue where NIC .15.103 was not connecting to the SAN.

Enabling Round Robin is what fixed that issue. Here is the formula to make sure you have the correct number of connections: volumes x hosts x host NICs x (storage nodes + 1). So in your case you should have 6 iSCSI connections per volume, which it appears you do.
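
The same formula as a quick calculator, if that's easier to check against (the example inputs are just an illustration; with site preference only the local-site nodes count):

```python
# Rule of thumb: volumes * hosts * host NICs * (storage nodes + 1)
def expected_connections(volumes, hosts, host_nics, storage_nodes):
    return volumes * hosts * host_nics * (storage_nodes + 1)

# e.g. one volume, one host with two NICs, two local-site nodes -> 6
print(expected_connections(volumes=1, hosts=1, host_nics=2, storage_nodes=2))
```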

 

Also, the sites information explains why the HQ server will only talk to the local nodes (.15.1 & .15.2) and the DR server to the DR nodes (.15.3 & .15.4). But both servers' primary connections are all to HQ node .15.2. I guess this means that this node is acting as the gateway for the volumes, and that chopping up the LUNs may increase the number of primary connections.

 

I see that you have multi-site enabled and you have the servers (software iSCSI) set up at each site. That is how you can control which host acts as the gateway. Make sure that the servers connect to the volumes that are closest to the users, and that 'site preference' is enabled. With only two nodes at each site you are not going to pick up much benefit, but don't forget that by enabling multi-path, the gateway node is only used for administrative purposes (management stuff); you are getting direct connections from the DSM driver. With the DSM for MPIO, the driver has intimate knowledge of the layout of the storage cluster (normally held by the gateway): it can calculate the location of any block, which allows the iSCSI driver to contact the storage system that owns the block directly, without using the standard gateway approach. Look at the algorithm below and see what you want the behaviour to be. I got this from the manual, by the way.

 

Is there any benefit to having these 4 nodes split into two different sites, given they're connected by 10Gb fibre? Both sites have separate power sources, so I'd like to make sure that none of the LUNs are lost if nodes .15.1 and .15.2 are lost, or if it's .15.3 and .15.4 that go down. If I don't specify the sites, could a LUN end up on node .15.1 and be Network RAIDed to .15.2?

 

There is a specific algorithm built in to handle just how the failover will occur. This all depends on whether site preference is enabled or not.

 

If no site preference:

DSM sessions connect to all storage nodes.

 

If site preference is used in multi-site configuration:

DSM sessions only connect to preferred site, say Site A.

 

If site preference is used, but the preferred site is unavailable:

No DSM sessions connect.

Single iSCSI sessions connect to the non-preferred site.

 

If site preference is used, but the preferred site is not in the cluster:

DSM sessions for volumes in that cluster will be established with the other site(s) that hold the cluster's storage nodes. (It will act as if you had no site preference.)
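
The same decision tree in pseudo-code form (my paraphrase of the manual, not HP's actual logic):

```python
# Sketch of the site-preference connection behaviour described above.
def dsm_behaviour(site_pref, preferred_site_up, preferred_site_in_cluster):
    if site_pref is None:
        return "DSM sessions connect to all storage nodes"
    if not preferred_site_in_cluster:
        return "DSM sessions go to the other site(s) holding the cluster's nodes"
    if preferred_site_up:
        return "DSM sessions connect to the preferred site only"
    return "no DSM sessions; single iSCSI sessions to the non-preferred site"

# e.g. preference set to HQ, but HQ is down:
print(dsm_behaviour("HQ", preferred_site_up=False, preferred_site_in_cluster=True))
```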

 

I hope this helps. If so, don't forget to give kudos and mark the thread as solved.

 

 

oikjn
Honored Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

I must say I am jealous of that 10Gb fibre inter-site connection :)

 

I would definitely keep your nodes assigned to sites to ensure you keep site availability for the data in case of a site failure, but you could try simply logically moving one or two servers from their current site to the FOM site (not physically, just logically), or making the servers "unassigned" to any site. That would have them connect to all nodes, but at the cost of potential latency. Generally you want to keep the traffic on the local site only, to minimize bandwidth and latency issues: writing via the second site effectively requires four hops, as the request passes to the DR site, is Network RAID 10'd back to the primary, confirmed back to the DR site, and then confirmed back to the server. That added latency generally outweighs any potential bandwidth gains from connecting to the additional nodes. If you have the link speed and latency (as you should with 10Gb fibre), you can certainly try keeping some servers assigned to the primary site and others unassigned, and see which works better in your situation.
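
To put some made-up numbers on those hops (both figures below are pure assumptions, just to show the shape of the problem):

```python
# Why remote-site DSM connections can hurt write latency: the four hops
# described above amount to two extra inter-site round trips per write.
inter_site_rtt_ms = 0.5   # assumed round trip on the 10Gb fibre link
local_write_ms = 2.0      # assumed service time writing to a local node

remote_write_ms = local_write_ms + 2 * inter_site_rtt_ms
print(f"local ~{local_write_ms}ms vs remote ~{remote_write_ms}ms per write")
```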

 

 

Looks like you've fixed the problem with the iSCSI sessions, and you now have it configured correctly. Generally I find it easy to forget to move the DSM to Round Robin for one or two NICs, and it sounds like that's what happened with whoever set it up the first time.

 

Don't get distracted by the non-DSM session... Think of that session as equivalent to what the FOM is to the management group. It doesn't actually move real data across that connection; it's only there at the start to help the DSM figure out and make its connections. Once one DSM connection is up, it can handle everything the non-DSM connection does, should the non-DSM connection go down... Not that you would want to test that on your production system, but if you set up a VSA lab, you could test this failover by building a configuration similar to what you have now and then simply rebooting or shutting off the VSA that is currently the "gateway". As long as you have the 2nd DSM connected to each NIC, you will not see any interruption in LUN access... This is exactly why the system can do a non-disruptive software update even though updates generally require nodes to reboot.

 

 

Jay68
Occasional Advisor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

So would it be better to remove the multi-site configuration? The HQ and DR sites are two server rooms on opposite sides of a road. Originally the connection was a 1Gb fibre link, and site preference was set up to prevent this link being saturated.

 

The link has since been upgraded to 10Gb, so I'm wondering if there is any need for a multi-site setup? There should be no issue with the HQ and DR servers connecting to all four nodes from a network perspective. The only requirement we have, due to these server rooms being on separate power grids, is that the LUNs must remain available if the HQ site has a power outage (nodes .15.1 and .15.2 are lost), or if the DR site has a power outage (nodes .15.3 and .15.4 are lost). I need to be able to make sure that .3 and .4 are replicas of .1 and .2.

 

Am I correct in thinking that it is the installation order that determines this? Just to make things extra complex, we installed these in the order .1, .3, .2, .4, but this should mean the HQ nodes are replicated to DR.

 

I feel that there is a fair bit more performance to be had out of this thing; it's just a case of knowing where to tweak it.

Jay68
Occasional Advisor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

It sounds better than it is; the HQ building is literally a 1-minute walk from the DR building. We're just incredibly lucky that both buildings are on different power grids.

 

I get the feeling that this slowness is probably down to a latency issue rather than throughput, so I'll look at chopping up my LUNs and formatting them with a 64k cluster size first of all. If I still have issues, I'll try moving the servers to an unassigned group and see how things go.

 

Big thanks to everyone for all your help! 

oikjn
Honored Contributor

Re: HP P4300 SAN cluster seems slow. Am I expecting too much?

I don't know if I would define anything you've said as "slow". Remember that throughput is a function of I/O rate and I/O size, and your disks can only provide so much I/O. After taking into account any disk RAID I/O penalty, is your I/O really lower than expected? Watch your LUN I/O stats, node I/O stats, and cluster I/O stats, as well as latency, in the CMC, and then watch I/O on your hosts. It could be that every requested I/O on the host generates 5+ I/Os to the SAN because of the cluster format size. Otherwise, it could simply be that the two figures are close together: your nodes are only capable of about 2,000 I/O, and in order to increase throughput you would have to increase your average I/O size. I think we can say now that your connections to the SAN nodes are all correct and the issue isn't your host-to-SAN connections, so you should probably focus on your node I/O and host I/O to see if they are what you would expect.
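
For example, with your 2,000 IOPS ceiling and a few assumed average request sizes:

```python
# Throughput = I/O rate x I/O size, so the same IOPS ceiling gives very
# different MB/s depending on the average request size.
iops = 2000
for io_kib in (4, 8, 64):
    print(f"{iops} IOPS @ {io_kib}KiB = {iops * io_kib / 1024:.0f} MB/s")
# 2000 x 64KiB ~= 125MB/s, close to the peak seen in the original post
```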