StoreVirtual Storage

Re: P4500 - 120TB Solution - 10Gb

 
YonU
New Member

P4500 - 120TB Solution - 10Gb

Hi guys,

 

I have a 120TB solution (5x P4500 24TB nodes) in place that is currently connected to two stacked Cisco 3750s with 1Gb connections. We have ESX 4.1 as well as Red Hat servers presented to the storage.

 

I understand that each volume is managed by at most a single node, so the network throughput for each volume is capped at the maximum throughput of one node. Also, in most cases (LACP, ALB, etc.) only one NIC is actively used from a host to a node, because the hash maps each host/node pair to a single link (whether you are using IP hash or MAC hash).
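
To illustrate the one-to-one mapping described above, here is a minimal sketch of an IP-hash style link selection. The hash used here (XOR of the two addresses, modulo the number of links) is a simplified assumption for illustration; real switches and hypervisors each use their own variant. The addresses and the `pick_uplink` function are hypothetical.

```python
import ipaddress

def pick_uplink(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Illustrative IP hash: XOR the two addresses, then take the result
    modulo the number of links. Not any vendor's exact algorithm."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links

host = "10.0.0.11"  # hypothetical iSCSI host address
node = "10.0.0.21"  # hypothetical storage node address

# The same host/node pair always hashes to the same link, so a second
# NIC in the team adds no bandwidth for that single pair.
links_used = {pick_uplink(host, node, 2) for _ in range(1000)}

# Several different destination addresses can spread across both links.
spread = {pick_uplink(host, f"10.0.0.{i}", 2) for i in range(21, 26)}

print(links_used, spread)
```

Under this assumed hash, extra links only help when a host talks to several distinct addresses, which is why the question of how hosts are directed to nodes matters.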

 

With that said, is one NIC sufficient to push a node to its limit (making the node the bottleneck), or would the environment benefit from the 10Gb upgrade pack?

 

I would be pretty surprised if the 10Gb upgrade pack didn't help, but I'd like to know if anyone has had recent experience with the upgrade and what performance improvements were noted on P4500 midline SAS (7.2K) nodes.

 

Thanks,

YonU
New Member

Re: P4500 - 120TB Solution - 10Gb

Replying to myself: per the QuickSpecs, a similar solution (60TB with 5x P4500 12TB nodes) is listed at 300 MB/s for sequential writes if everything else is optimally configured.
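
As a rough sanity check on that figure, the arithmetic below compares it with raw Ethernet line rates. It assumes the quoted number is in megabytes per second and ignores Ethernet, TCP and iSCSI overhead, so real usable throughput per link will be lower. The function and variable names are hypothetical.

```python
def line_rate_mb_per_s(gbit_per_s: float) -> float:
    """Raw line rate in MB/s (decimal units), ignoring protocol overhead."""
    return gbit_per_s * 1000 / 8

quoted_cluster_write = 300.0  # figure quoted in the post, assumed to be MB/s

one_gbe = line_rate_mb_per_s(1)   # raw rate of a single 1Gb link
ten_gbe = line_rate_mb_per_s(10)  # raw rate of a single 10Gb link

# How many fully used 1Gb links the quoted figure corresponds to.
links_needed = quoted_cluster_write / one_gbe

print(f"1GbE: {one_gbe} MB/s, 10GbE: {ten_gbe} MB/s, "
      f"1Gb links needed for the quoted figure: {links_needed:.1f}")
```

On these assumptions the quoted cluster figure is more than one 1Gb link can carry but well under a single 10Gb link, so whether 10Gb helps depends on how the traffic is spread across nodes and NICs.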

 

Given that my nodes are larger and have the same number of spindles, is it safe to say that I should expect somewhat worse performance?

 

Thanks,

YonU
New Member

Re: P4500 - 120TB Solution - 10Gb

One last reply for now, I promise.

 

I understand that each volume has one node that is responsible for it and handles all front-end communication with hosts. With that said, does anyone know the underlying mechanics for this? Are hosts redirected to specific nodes via IP address? If that is the case, hosts that are configured using IP hash LACP would be able to load balance their connections to different volumes across their NICs, because each volume would have a unique IP address.

 

Thanks

Bart_Heungens
Honored Contributor

Re: P4500 - 120TB Solution - 10Gb

Hi,

 

Which Data Protection Level did you assign to the volume(s)? This determines how many copies of each block of data are written across the nodes...

Furthermore, let it be clear that all storage nodes in the cluster handle traffic... The more nodes, the higher the throughput...

 

ALB is the most common configuration, which gives a (theoretical) 2Gb for data coming out of the nodes and 1Gb towards the nodes...

 

Speed will also depend on the configuration and/or use of the DSM on Windows servers, which creates multiple active I/O sessions... For ESX, follow the Best Practices guide on how to configure the VMkernel iSCSI initiators to obtain the highest throughput and load balancing...

 

 

Kr,

Bart

--------------------------------------------------------------------------------
If my post was useful, click on my KUDOS! "White Star" !
YonU
New Member

Re: P4500 - 120TB Solution - 10Gb

Hi Bart,

 

Below is more information on the environment:

 

3 ESXi 4.1 hosts with 4 dedicated 1Gb NICs for iSCSI. The configuration uses VMware documented best practices (1 IP per VMkernel, VMware round robin NMP). NICs on each host are split across the two stacked 3750 switches. We also have two Red Hat hosts with 4 iSCSI NICs each. Although they are not currently configured correctly, there is an HP P4000 approved round robin configuration that is native to Red Hat. We are currently using LACP instead, which, depending on the mechanism that LeftHand uses, might or might not be effective.

 

Each node is configured with ALB, and all volumes are RAID5. One outstanding item is flow control, which is not enabled. As many other posts note, there are flow control issues on the HP nodes, primarily a bug where node reboots can change flow control from an enabled to a disabled state and vice versa. I did have flow control enabled at one point and did not note a performance improvement.

 

Regarding your comments: I'm fairly certain that although blocks are indeed written to all nodes, each volume has one "Gateway Connection" to which all host traffic is directed. That node is then in charge of coordinating the distribution of all blocks to the other nodes, depending on the Network RAID selection used. Therefore performance does not scale linearly when you talk specifically about one volume. One volume -> one gateway connection, regardless of the number of nodes. However, you can easily overcome this by creating more volumes as you add more nodes.
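
The scaling argument above can be sketched as a simple model. It follows the description in this post only (one gateway node per volume) and is not verified against HP documentation. It also assumes each node's front-end link is the only limit and ignores the back-end traffic the gateway generates when distributing blocks. All names and the 125 MB/s figure (the raw 1Gb line rate) are assumptions.

```python
def aggregate_ceiling(volume_gateways: dict[str, str], node_limit: float) -> float:
    """Upper bound on host-facing throughput when every volume is served
    through exactly one gateway node and each node is capped at node_limit."""
    busy_nodes = set(volume_gateways.values())
    return len(busy_nodes) * node_limit

NODE_LIMIT = 125.0  # MB/s, raw 1Gb line rate; an assumed per-node cap

# One volume: a single gateway node, whatever the cluster size.
one_volume = {"vol1": "node1"}

# Five volumes whose gateways happen to land on five different nodes.
five_volumes = {f"vol{i}": f"node{i}" for i in range(1, 6)}

print(aggregate_ceiling(one_volume, NODE_LIMIT))
print(aggregate_ceiling(five_volumes, NODE_LIMIT))
```

In this model a single volume never exceeds one node's cap, while spreading volumes across gateways raises the aggregate ceiling, which matches the suggestion to create more volumes as nodes are added.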

 

I am interested to find out how hosts are "directed" to the gateway connection nodes, whether it's by MAC address or IP, as that has an implication for whether LACP is indeed an effective way of load balancing traffic for hosts that access multiple volumes and have multiple NICs.

Bart_Heungens
Honored Contributor

Re: P4500 - 120TB Solution - 10Gb

Hi,

 

The managers running on the nodes hold a virtual map that says which blocks are written on which node... That is why the best practice is 3 of 5 managers; more than that would be too much overhead on communication between the nodes...

 

If you implement the MPIO DSM on Windows servers, that map is sent to the Windows host so that the host knows which blocks are on which node... This means that you get optimal traffic, since the Windows host has active connections to all storage nodes...

 

Now, the rumour has been around for a long time, and I assume (this is my opinion, not from HP) that one day there will also be some kind of ESXi MPIO DSM... But it isn't there yet...

So for now your ESX server will send the request to the VIP (the VIP is randomly chosen based on load of the hosts where the managers are running). The VIP sends the request for that specific block to the node concerned, and that node will send the block directly to the ESXi host, so not through the VIP... This means that only the request passes through the VIP node, not the data blocks themselves... So normally that VIP activity shouldn't put much additional load on the storage nodes...

 

I have some pictures of all this from the training material but unfortunately I cannot share this on the forums...

And like they say, a picture says more than 1000 words... I hope it is a little bit clear to you...

 

Kr,

Bart
