StoreVirtual Storage


 
Stephane-OTG
Occasional Advisor

P4300 10Gbps network configuration

Hi,

 

I'm about to upgrade our LeftHand P4300 nodes from 1Gbps to 10Gbps, so I thought it would be a good opportunity to reconfigure the network design, but I have some questions.

 

In a nutshell, our environment looks (will look) like this:

4x P4300 nodes with 2x 10Gbps network adapters.

8x HP DL360 (Hyper-V cluster) with 4-port 1Gbps network adapters.

2x HP 6600-24G-4XG switches (20 ports 1Gbps, 4 ports 10Gbps).

 

I realise that most people recommend using ALB, but if I configure ALB I will face a few performance disadvantages:

  •  Although ALB provides bandwidth aggregation for sending packets (up to 20Gbps per node), each node will only be able to receive at 10Gbps maximum.
  •  To provide redundancy, I will need to connect each node's network cards to separate switches (and do the same with the hosts), but this means I will have to create a trunk between the two switches so traffic can flow to either card. This trunk will limit bandwidth to a maximum of 4Gbps whenever traffic has to cross from one switch to the other. (A rough arithmetic sketch of both limits follows this list.)
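Here is a rough back-of-the-envelope sketch of those two ceilings in Python. The figures are simply the ones from this post (2x 10Gbps NICs per node, a 4x 1Gbps inter-switch trunk); it is only meant to make the limits explicit, not to model real throughput.

```python
# Rough bandwidth arithmetic for the ALB concerns above.
# All figures are assumptions taken from this post, not measured values.

NIC_GBPS = 10          # each P4300 node has 2x 10Gbps NICs
NICS_PER_NODE = 2
TRUNK_LINKS = 4        # assumed 4x 1Gbps inter-switch trunk
TRUNK_LINK_GBPS = 1

# ALB aggregates transmit across both NICs, but inbound traffic arrives on a
# single NIC, so receive is capped at one NIC's speed.
alb_tx_gbps = NIC_GBPS * NICS_PER_NODE      # ~20 Gbps outbound, best case
alb_rx_gbps = NIC_GBPS                      # ~10 Gbps inbound

# Any traffic that has to cross the inter-switch trunk is capped by the trunk.
trunk_gbps = TRUNK_LINKS * TRUNK_LINK_GBPS  # ~4 Gbps aggregate

print(f"ALB transmit ceiling per node : {alb_tx_gbps} Gbps")
print(f"ALB receive ceiling per node  : {alb_rx_gbps} Gbps")
print(f"Cross-switch ceiling (trunk)  : {trunk_gbps} Gbps")
```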

To avoid the above disadvantages, I would like to do the following:

  •  Keep the two 10Gbps cards on each node separate (not bonded with ALB) and configure them with IP addresses on separate subnets.
  •  Connect each node's cards to separate switches and keep the switches separate (no trunk).
  •  On each host, connect two 1Gbps cards to one switch and two to the other switch (each pair of cards on a separate subnet).
  •  On the hosts, configure MPIO with round robin (apparently, according to HP's recommendation, this is the only supported method of sharing the load). A rough sketch of the resulting path layout follows this list.
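To make the intent of the split-subnet design concrete, here is a minimal Python sketch of the path fan-out it would produce. All subnet numbers and addresses are made up for illustration, and it only counts host-port/node-port pairs on matching subnets; it says nothing about whether SAN/iQ will actually accept such a layout, which is exactly my question.

```python
# Hypothetical addressing for the proposed split-subnet design.
# All subnets and addresses below are illustrative placeholders only.
SUBNET_A = "10.1.1."   # switch 1
SUBNET_B = "10.1.2."   # switch 2

# 4x P4300 nodes, one 10Gbps port per subnet (no ALB bond).
nodes = {f"node{i}": {"A": f"{SUBNET_A}{10 + i}", "B": f"{SUBNET_B}{10 + i}"}
         for i in range(1, 5)}

# 8x DL360 hosts, two 1Gbps ports per subnet.
hosts = {f"host{i}": {"A": [f"{SUBNET_A}{100 + 2*i}", f"{SUBNET_A}{101 + 2*i}"],
                      "B": [f"{SUBNET_B}{100 + 2*i}", f"{SUBNET_B}{101 + 2*i}"]}
         for i in range(1, 9)}

def sessions_for(host):
    """One candidate iSCSI path per (host port, node port) pair on the same subnet."""
    return [(nic, node, addrs[side])
            for side in ("A", "B")
            for nic in hosts[host][side]
            for node, addrs in nodes.items()]

paths = sessions_for("host1")
print(f"host1 would have {len(paths)} candidate paths")  # 2 subnets x 2 NICs x 4 nodes = 16
for nic, node, target in paths[:4]:
    print(f"  {nic} -> {node} ({target})")
```

MPIO round robin would then spread I/O across those paths per host.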

I've attached a quick diagram...

 

Will the P4300 support this?

 

Thank you,

Stephane

 

 

P.S. This thread has been moved from HP 3PAR StoreServ Storage to HP StoreVirtual Storage / LeftHand. - HP Forum Moderator

4 REPLIES
HPstorageTom
HPE Pro

Re: P4300 10Gbps network configuration

The P4300 systems do not support multiple subnets for the iSCSI SAN. The ports for the iSCSI SAN must be on the same subnet (using either ALB, LACP, or Active/Passive bonding if you have multiple NICs).

 

The only reason you might have a P4300 system connected to two subnets is if you want to separate the iSCSI SAN from a management LAN.

Sbrown
Valued Contributor

Re: P4300 10Gbps network configuration

What disk configuration do you have to support 10Gbps? 20Gbps (one direction) or 40Gbps (two NICs)?

 

If the switches support meshing/IRF/stacking, use that. It gets rid of having to deal with trunking, since the switches merge into one logical switch. The 2920 has 2x 20Gbps stacking ports and the 3800 has four of them. I think the rest have IRF, which does the same job over the regular ports (connect 4x 10Gbps and you get one big switch). That gets rid of all the overhead and security issues with MSTP/double VLANs. I'm pretty sure all the higher-end HP switches support this now.

 

It's easiest to start with one switch to get a performance baseline, then move up to multiple switches to see where you lose out (latency). You may find that running the switches in active/passive is faster, since all traffic will sit on one switch with no extra hops. Latency will kill performance at 10GbE fast!

 

Baseline your performance with one switch, all 10GbE, all SFP+ or DAC, then move to two switches. I think you'll find the older LeftHand units are just too slow.
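For a very crude latency comparison between the one-switch and two-switch layouts, something like the following Python sketch can time TCP connects to the iSCSI port from a host. The addresses are placeholders, and a real baseline should come from a proper disk benchmark (fio, IOMeter, etc.) against a test LUN rather than from connect times; this only shows relative differences between topologies.

```python
import socket
import statistics
import time

# Placeholder addresses -- replace with your storage node/cluster IPs.
TARGETS = ["10.1.1.11", "10.1.1.12"]
ISCSI_PORT = 3260
SAMPLES = 50

def connect_latency_ms(ip, port, samples):
    """Time TCP connect/close round trips as a crude network latency probe."""
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((ip, port), timeout=2):
            pass
        times.append((time.perf_counter() - start) * 1000.0)
    return times

for ip in TARGETS:
    t = connect_latency_ms(ip, ISCSI_PORT, SAMPLES)
    p95 = sorted(t)[int(len(t) * 0.95)]
    print(f"{ip}: median {statistics.median(t):.3f} ms, p95 {p95:.3f} ms")
```

Run it from the same host against the same targets before and after you change the switch topology and compare the numbers.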

 

 

Emilo
Trusted Contributor

Re: P4300 10Gbps network configuration

I am not sure where you got your information about the different bonding methods.

No matter what you do, you will not achieve 20Gbps unless you have a 20Gbps network card.

The bonding method you choose is a matter of preference.

 

However, you will need to choose an IP address that will be your SAN/iQ connection.

So your plan to break up the network will not work.

 

Looking at the logs on the P4000, the way bonding is set up with link aggregation, it will use only one card unless that card becomes saturated. With ALB it does appear to balance the load a little better. There will always be preferences and a debate about which is faster, but really, what you choose is up to you. Link aggregation is more difficult to set up; ALB will work with no additional configuration. It is impossible to get 20Gbps from either technology on the receive or the send side. You have to remember that once a conversation is established, it cannot span across two different physical cards. ALB is able to receive on both NICs, just as link aggregation is; this was implemented with SAN/iQ 9.0. Don't get too fancy with your configuration; keep it simple.
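The "a conversation cannot span two physical cards" point is easiest to see with a toy hash sketch in Python. The hash and the fields it uses are purely illustrative; this is not the actual algorithm SAN/iQ or the switch uses.

```python
# Toy illustration of why one iSCSI conversation cannot exceed one NIC's speed
# under link aggregation or ALB: each flow is pinned to a single physical port
# by a hash of its addresses. The hash below is illustrative only.
import hashlib

NICS = ["10GbE-port-1", "10GbE-port-2"]

def nic_for_flow(src_ip, dst_ip, src_port, dst_port):
    """Map one flow (conversation) to exactly one physical NIC."""
    key = f"{src_ip}-{dst_ip}-{src_port}-{dst_port}".encode()
    return NICS[hashlib.sha256(key).digest()[0] % len(NICS)]

# One host talking to one node over a single iSCSI session: one flow,
# therefore one NIC, therefore at most ~10Gbps for that conversation.
print(nic_for_flow("10.1.1.102", "10.1.1.11", 51000, 3260))

# Multiple sessions (MPIO) or multiple hosts hash onto different ports,
# which is where the aggregate bandwidth comes from.
for sport in range(51000, 51004):
    print(sport, nic_for_flow("10.1.1.102", "10.1.1.11", sport, 3260))
```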

 

 

Sbrown
Valued Contributor

Re: P4300 10Gbps network configuration

ESXi will take each LeftHand LUN and each ESXi NIC and log in.

 

Then you can set KB- and/or IOPS-based round robin to balance for your needs.

 

2x 10GbE NICs to 4 LeftHand nodes is 8 connections. Enabling round robin will then switch NICs every 1,000 IOPS (the default).

 

Google how to alter this. A value of 1 IOPS does not work well, since it will send a single I/O and then switch interfaces.
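For anyone wondering what that IOPS setting actually changes, here is a minimal Python sketch of an IOPS-based round-robin selector in the spirit of what the ESXi round-robin policy does; the path names and the 1,000-I/O default are just the values mentioned in this thread.

```python
# Minimal sketch of an IOPS-based round-robin path selector: stay on one path
# for N I/Os, then rotate to the next. Path names and the default of 1000 are
# taken from this thread / made up for illustration.
from collections import Counter

class RoundRobinSelector:
    def __init__(self, paths, iops_limit=1000):
        self.paths = list(paths)
        self.iops_limit = iops_limit
        self.current = 0
        self.count = 0

    def next_path(self):
        """Return the path for the next I/O, rotating every iops_limit I/Os."""
        if self.count >= self.iops_limit:
            self.current = (self.current + 1) % len(self.paths)
            self.count = 0
        self.count += 1
        return self.paths[self.current]

selector = RoundRobinSelector(["vmhba33:C0:T0:L1", "vmhba34:C0:T0:L1"],
                              iops_limit=1000)

# Issue 2500 pretend I/Os and count how they spread across the two paths.
usage = Counter(selector.next_path() for _ in range(2500))
print(usage)
```

With iops_limit=1 the selector flips paths on every single I/O, which is the behaviour warned about above.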

 

Important things: use thick provisioning (eager zeroed) on ESXi and enable VAAI. Storage vMotion is terribly slow with thin provisioning on VMFS, since it sometimes falls back to the very slow datamover (for many reasons).

 

Also, never mix BASE-T flow control and fiber. Fiber requires PFC flow control, and BASE-T has too much latency for PFC flow control to work (it goes out of range). BASE-T flow control's range is too large for fiber and will cause serious latency issues, which is why the faster PFC flow control was introduced for fiber. (If you use VLANs, you must use PFC flow control or really bad things will happen.)

 

Again: it is best not to mix BASE-T and 10GBASE fiber at all.

 

Maybe better: disable flow control if you cannot ensure it works properly everywhere. "Properly" means asymmetric, not both RX and TX; I'm not sure if all HP switches can handle this configuration.

 

The main problem with congestion is the communication between the LeftHand nodes. If you are using 100% of the LAN's TX/RX to your LeftHand nodes (maybe all SSD?), then you will leave no bandwidth for the nodes to communicate with each other.

 

If you ask node 1 for data in Network RAID-1, it is possible for the data to be on nodes 3 and 4; node 1 will need to ask the other nodes for the data and then send it to you. Node 1 must ask node 3/4 to send the data, but where to? Back to node 1? Or will it forge the MAC to send it directly to the iSCSI client? (Anyone know?)
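Here is a toy Python model of that proxy-read question: blocks are mirrored across two of the four nodes, and a read that lands on a node which does not hold the block needs an extra hop to a peer. The placement rule is invented purely for illustration; it is not SAN/iQ's real layout.

```python
# Toy model: 4-node cluster with 2-way network mirroring. Each block lives on
# two nodes. If the iSCSI client asks a node that does not hold the block,
# that node has to fetch it from a peer first (a "proxy read"). The placement
# rule below is invented for illustration only.
NODES = ["node1", "node2", "node3", "node4"]

def owners(block):
    """Pretend placement: block n lives on node (n mod 4) and its neighbour."""
    first = block % len(NODES)
    second = (first + 1) % len(NODES)
    return {NODES[first], NODES[second]}

def read(block, asked_node):
    if asked_node in owners(block):
        return f"block {block}: served locally by {asked_node}"
    peer = sorted(owners(block))[0]
    return f"block {block}: {asked_node} proxies from {peer} (extra hop on the SAN)"

print(read(7, "node4"))   # block 7 lives on node4/node1 -> served locally
print(read(2, "node1"))   # block 2 lives on node3/node4 -> extra hop needed
```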

 

What about the node you need to keep quorum across those 4 nodes? It was my understanding at one point that it was a data-less node containing no real data. Does the iSCSI client ever ask it for data? If so, how does that data traverse the network?

 

If you use Network RAID-1 with only two nodes, half of the data would be present on each node. Does node 1 still need to ask node 2 for data? It would be wise for the client driver to keep a cache of known data locations so it never asks the wrong node to read/write data. This would be very easy, since the coherency penalty would merely be "sorry, I don't have the data; let me ask the other node to send it to you."

 

Now, there are severe issues with running the VSA, at least with ESXi 5.1: the hypervisor is designed to run many VMs concurrently, fast. It is very much not designed to run a single VM as fast as a physical box.

 

BUT the VSA gives you options! Want to throw two P420/2GB FBWC controllers into that SE1220-based P4300 G2? You can figure out how to stuff some SSD into the mix and enable up to 1.5TB of SSD caching (per controller, with 2GB FBWC and a SAAP 2.0 key)? Monstrous gains in read speed and reduced latency.

 

More VSA: what if you have an advanced NIC (BE3) that can loop traffic back on the vSwitch, so you can run VMs on the same host as the LeftHand VSA? Then a local VM could talk to the LeftHand VSA node without going out to the switch at all (not all NICs support this feature!). In a 2-node configuration with Network RAID-1, you could hit 50% of your data without any network traffic, which could have massive speed implications.

 

More VSA 2: LSI and PMC advanced RAID controllers include write caching to SSD. All writes are sent to the SSD first, then flushed to the drives. Think of it as 256GB of FBWC instead of 2GB. 256GB has to be better, yes?
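As a toy illustration of the write-back idea (not how any particular controller actually implements it), the sketch below just counts how often a small cache versus a large cache is forced to destage under the same burst of writes.

```python
# Toy model of write-back caching: writes are acknowledged once they land in
# the cache tier and destaged to spinning disk later. Sizes and behaviour are
# illustrative only, not any particular controller's implementation.
class WriteBackCache:
    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.dirty_gb = 0.0
        self.destages = 0

    def write(self, size_gb):
        if self.dirty_gb + size_gb > self.capacity_gb:
            self.dirty_gb = 0.0        # pretend the disks absorbed the backlog
            self.destages += 1
        self.dirty_gb += size_gb       # acknowledged as soon as it is cached

# The same 512GB of burst writes against a 2GB FBWC and a 256GB SSD cache.
for cache in (WriteBackCache("2GB FBWC", 2), WriteBackCache("256GB SSD cache", 256)):
    for _ in range(1024):              # 1024 x 0.5GB write bursts
        cache.write(0.5)
    print(f"{cache.name}: forced to destage {cache.destages} times")
```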

 

More VSA 3: what if you skip RAID on each node? Perhaps five single drives of the SSD variety, then Network RAID-1? (Hint: this is exactly what Windows 2012 supports.) The answer, of course, is that LeftHand says this is crazy. (Back in the day, people would create software RAID over iSCSI in Windows!)

 

So where are the engineers who can shed some light on this? I welcome them!

 

Should we save our pennies for the 3PAR VSA? Just kidding, of course.