02-28-2011 11:50 AM
Collapsing a multi-site SAN down to one vlan
What they're suggesting, since I already have the storage nodes split between two sites (they show up in A/B/A/B/A/B/A/B order when I edit the cluster in the CMC), is that I can shut down all the VMs, log out the iSCSI sessions from the cluster, delete the second VIP, and reassign all the storage nodes in site B to the subnet for site A. It won't change the storage order in the cluster; things will just start communicating, and the VIP failover is supposedly fast enough to avoid problems. We are running a virtualized FOM in a third site.
We're using vSphere 4 Update 2 plus patches, and we're running into the iSCSI software initiator issue: no routing between subnets from the software initiator. And since each iSCSI target is assigned a single storage node as a gateway (and thus lives in a single subnet), losing a single connection from any VMware host will cause it to lose its path to any storage on the corresponding subnet. (I guess we could have two paths per subnet, but I haven't tested that configuration yet, and it would double my copper SFP and switch port count.) Without the ability to route across subnets with iSCSI (or doubling my connection count, if that would even work), any storage switch reboot or failure (there are two in each site) is a single point of failure. That's no good. A single switch failure is more likely than a site failure, honestly.
What I was told is that for environments like ours (multiple buildings, single campus, VMware hosts in clusters distributed between the sites), they're moving away from the multi-subnet model in 9.0 anyway, and we'd be better off just switching now.
It makes sense to me. It certainly sounds better than my current situation, or doubling the number of vmnics assigned to iSCSI from 2 to 4, or doing the unsupported VMware "manually add a route for the iSCSI interfaces" trick outlined under "Third Topic" in the Multivendor Post on using iSCSI with VMware vSphere document. http://virtualgeek.typepad.com/virtual_geek/2009/09/a-multivendor-post-on-using-iscsi-with-vmware-vsphere.html
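For reference, the unsupported workaround mentioned above boils down to adding a static vmkernel route on each host so iSCSI traffic can cross subnets. A minimal sketch, with made-up subnet addresses and gateway (your addressing will differ):

```shell
# Assumed addressing: iSCSI subnet A = 10.0.1.0/24, subnet B = 10.0.2.0/24,
# router reachable from the host's subnet-A vmkernel port at 10.0.1.254.

# List the current vmkernel routing table first.
esxcfg-route -l

# Add a static route so vmkernel traffic destined for subnet B leaves
# via the gateway on subnet A. This is the unsupported part: the
# software iSCSI initiator otherwise won't route between subnets.
esxcfg-route -a 10.0.2.0/24 10.0.1.254
```

This has to be repeated on every host, and it's exactly the configuration VMware doesn't support, which is part of why collapsing to one subnet looks attractive.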
Thanks in advance for any opinions or suggestions.
02-28-2011 05:06 PM
Re: Collapsing a multi-site SAN down to one vlan
Do you have both storage subnets "stretched"?
When I set up a multi-site SAN, I configure two stretched VLANs, so BOTH storage subnets are accessible at both "sites".
Then you add both VIPs to the vSphere hosts.
This way, if you have a local path problem, it simply fails over to the WAN site...
Maybe I'm not seeing your issue properly. Do you have a network diagram of your iSCSI environment that you could post?
02-28-2011 05:58 PM
Re: Collapsing a multi-site SAN down to one vlan
However, while there are two VIPs - one at each site - SAN/iQ assigns only one gateway for any iSCSI target/volume, on a single storage node at any given point.
If the storage node holding the gateway is in site A on subnet A, and any vSphere host in any location loses its connection to subnet A due to a cable failure or a switch failure, there's nothing to fail over to for that target. The vSphere host's connection to subnet B cannot talk to an iSCSI target gateway assigned in subnet A, because vSphere's iSCSI software initiator doesn't support routing, and no gateway is assigned for that particular volume in subnet B on any of the storage nodes there. And nothing in that situation would make SAN/iQ reassign the gateway for that target to another storage node: it's only a problem with a single vSphere host or a single switch, neither of which will disrupt array communications.
The only way to work around that - aside from the unsupported routing trick, or doubling the number of connections (which I haven't tested yet) so I have two connections to each subnet, one per switch per VLAN per vSphere server, four total per server - is to drop down to one subnet.
The way it's set up currently, it can work in the event of a complete site failure. But a single switch failure/reboot is very disruptive.
Does that make sense?
03-01-2011 06:34 AM
Re: Collapsing a multi-site SAN down to one vlan
What if you set up one vSwitch for iSCSI traffic with two pNICs:
NIC 1 goes to switch A; its port is a trunk, and both storage VLANs are allowed.
NIC 2 goes to switch B; its port is a trunk, and both storage VLANs are allowed.
You set up two iSCSI vmkernel ports, one in each subnet, and tag each portgroup for the appropriate VLAN.
NIC 1 is the primary NIC for (iSCSI vmkernel 1) storage subnet "A"; NIC 2 is the standby for storage subnet "A".
NIC 2 is the primary NIC for (iSCSI vmkernel 2) storage subnet "B"; NIC 1 is the standby for storage subnet "B".
This way you would have both VIPs on each vSphere host, 2 physical NICs into redundant switches. Each host has redundant paths to BOTH storage subnets.
You have BOTH VIPs entered into ALL vSphere hosts. The Storage stack will use the 2 storage paths for failover, the network stack will provide failover from link, nic, or switch failure.
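From the vSphere 4 CLI, that layout would look something like the sketch below. All the names and IDs here (vSwitch1, vmnic2/vmnic3, VLANs 101/102, the vmk addresses, adapter vmhba33) are assumptions for illustration, not anything from your environment:

```shell
# One vSwitch for iSCSI, with both pNICs as uplinks (the physical
# switch ports are trunks carrying both storage VLANs).
esxcfg-vswitch -a vSwitch1
esxcfg-vswitch -L vmnic2 vSwitch1
esxcfg-vswitch -L vmnic3 vSwitch1

# Two vmkernel portgroups, one per storage subnet, each tagged
# with its own VLAN.
esxcfg-vswitch -A iSCSI-A vSwitch1
esxcfg-vswitch -v 101 -p iSCSI-A vSwitch1
esxcfg-vmknic -a -i 10.0.1.11 -n 255.255.255.0 iSCSI-A

esxcfg-vswitch -A iSCSI-B vSwitch1
esxcfg-vswitch -v 102 -p iSCSI-B vSwitch1
esxcfg-vmknic -a -i 10.0.2.11 -n 255.255.255.0 iSCSI-B

# Bind both vmkernel ports to the software iSCSI adapter.
# (The per-portgroup NIC failover order is set in the vSphere
# Client under the portgroup's NIC Teaming tab.)
esxcli swiscsi nic add -n vmk1 -d vmhba33
esxcli swiscsi nic add -n vmk2 -d vmhba33
```

Then enter both VIPs as dynamic discovery targets on the software iSCSI adapter on every host.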
I'll have to dig more into what you were saying about "moving away from the multi-subnet model in 9.0". I'm not sure how they are accomplishing this, unless they have built better/faster VIP failover algorithms. In the past, the VIP failover was too slow for database apps (SQL/Exchange) and VMs, and could cause crashes during site link failures, as they would time out before the VIP failed over. Maybe this has changed in 9.0??
Any HP/P4000 moderators care to comment on this? Sounds like I've got some reading to do! ;)
03-01-2011 07:00 AM
Re: Collapsing a multi-site SAN down to one vlan
I actually think that config would work. I was somewhat caught up in the "configure one NIC, disable the other on each portgroup" advice. You know, ours may be even simpler: it's two subnets in a single VLAN, so we don't need trunks or tagging at all.
I have to think through what to do on the Linux side. Our only non-VMs on this are going to be a pair of Novell SLES 10 SP3 / OES2 file servers in a cluster, and I need to figure out whether I can get the correct standby behavior out of the software iSCSI initiator, DM-Multipath, and the two NICs.
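For what it's worth, on the SLES side I'd expect it to come down to open-iscsi discovery against the VIP plus an active/passive DM-Multipath policy. A rough sketch, with an assumed VIP address and a deliberately minimal multipath.conf (the exact settings for P4000 volumes would need testing):

```shell
# Assumed VIP: 10.0.1.100. Discover the cluster's targets and log in
# with open-iscsi.
iscsiadm -m discovery -t sendtargets -p 10.0.1.100:3260
iscsiadm -m node --login

# Minimal /etc/multipath.conf sketch: group paths for failover
# (active/passive) rather than spreading I/O across both, and queue
# briefly while a path comes back.
cat > /etc/multipath.conf <<'EOF'
defaults {
    path_grouping_policy  failover
    no_path_retry         12
}
EOF

/etc/init.d/multipathd restart
multipath -ll    # check for one active and one standby path per LUN
```

That should give the same "one primary path, one standby" behavior as the vSphere portgroup setup, but I'd want to pull a cable and watch `multipath -ll` before trusting it in the cluster.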
In terms of VIP failover - how long are we talking? It looked to me like a VIP move only takes a few seconds if the VIP-holding storage node goes down but the site stays up - about the same amount of time it would take the VRRP gateway on our Brocade 10GbE switches to fail from one to the other during a switch outage. When I was reassigning gateways on storage nodes during a planned shutdown, I did the VIP-holder last, but it moved pretty much instantly.
In the case of a site failure with a FOM configured, I'd expect the FOM to intervene and the VIP to come back more quickly as well. They did say to retain our FOM in the new configuration as well.
03-03-2011 06:22 AM
Re: Collapsing a multi-site SAN down to one vlan
Did you get anything from HP indicating that they had recently improved the VIP failover?
I helped a customer (about a year ago) migrate TO a multi-site SAN because the VIP failover was too slow, and the only way we maintained high availability (particularly for Exchange) was the multi-VIP configuration.
Just wondering if the improvements they made in 9.0 should change my thinking on this going forward.
03-03-2011 06:31 AM
Re: Collapsing a multi-site SAN down to one vlan
Single link/port failures or storage switch failures are more likely. And there's only a 1-in-8 chance that a storage node that fails is the VIP holder.
So if I'm going to have to compromise anyway, I may as well cover the most common outages smoothly. In a major disaster we still have replicated data; it may require a little intervention to bring some services back. Right now, we're guaranteed it'll require intervention.
I do wish there was a feature where you could specify that you don't want a few critical volumes to be gatewayed through the same storage node holding the VIP. That way, if the VIP-holder fails, those critical volumes are still running since there's an established path, and if one of those fails, you're more likely to have a running VIP to ask for a new gateway assignment. There's probably a downside to that, but it would be handy.