StoreVirtual Storage

Multi-site SAN with two nodes - has our engineer misconfigured?

 
SOLVED
PCrid
Advisor


Dear all,

I have just stumbled onto something, and I suspect the engineers we had in to install our P4300 G2 systems may have made a bit of an error in configuring our LeftHand nodes.

The situation is this: our site has a separate server room and comms room about 30 m apart, linked by multiple Cat5e runs. This allows us to split our SAN VLAN across two switches, one in each room, with a 4 x 1Gb trunk between them, and to connect our nodes to either switch over the interconnects.

We have 4 P4300 MDL SAS nodes; however, to ensure that losing 3 nodes would cost us only half our volumes, we have them configured as two separate MGs, each with one cluster of two nodes in it. It's best to take one MG at a time as an example, and I have attached a PDF of a single management group setup for clarification.

Each MG has its two nodes set up as a multi-site cluster in a single subnet with a single VIP, and has a FOM installed on an ESX box in the server room. The FOM is on independent shared storage and could equally be brought up in the comms room in the event of a server-room-wrecking disaster.

I was quite comfortable with this until I read in a recent forum post that, as a best practice, a multi-site cluster should have two subnets and two separate VIPs configured, which is obviously not the case here.

Am I right in assuming that this is not best practice, and that our failover may be somewhat less than instantaneous if one of the nodes goes down? Are there any other drawbacks?

If this is the case, will moving to a standard 2-node cluster fix this?

If we move to a standard 2-node cluster, am I right in thinking that with the FOM in the server room, someone hacking through our server/comms room links would leave us with quorum in the server room, and that if we lost the server room entirely, we could bring the FOM up on the ESX box in the comms room to restore quorum?
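For anyone following the quorum reasoning here, it can be sketched in a few lines. This is purely illustrative, on my assumption of three managers per MG (two storage nodes plus the FOM), not anything from the SAN/iQ software itself:

```python
# Illustrative sketch only: quorum in a management group with three managers,
# two storage nodes plus a FOM (an assumed layout, not SAN/iQ's actual code).

def has_quorum(reachable_managers, total_managers):
    """A management group stays online while a strict majority of managers is reachable."""
    return reachable_managers > total_managers / 2

TOTAL = 3  # node A (server room), node B (comms room), FOM (server room)

# Inter-room links cut: the server room still sees node A + FOM.
print(has_quorum(2, TOTAL))  # True: server-room side keeps quorum

# Comms room is isolated with only node B.
print(has_quorum(1, TOTAL))  # False: no quorum on its own

# Server room destroyed: restart the FOM in the comms room, node B + FOM.
print(has_quorum(2, TOTAL))  # True: quorum restored
```

The same arithmetic shows why an even manager count is avoided: two reachable managers out of four is not a strict majority.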

Finally, if we can achieve our goals of protecting against failure of a single room or intra-room links with a standard 2-node cluster, is it possible to convert from our current 2-node multi-site cluster to a standard cluster?

Thanks in advance for your assistance,

Regards,
Pete
5 REPLIES
teledata
Respected Contributor

Re: Multi-site SAN with two nodes - has our engineer misconfigured?

Why not just add a second subnet (and VIP) and run two true multi-site clusters, each with 2 VIPs? In your case I would have 2 VLANs (spanning both switches) for the 2 storage subnets.

Then you add both VIPs from both clusters to your ESXi storage paths. Now you have 2 distinct storage paths to each cluster. (4 unique storage paths per ESX host)
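To make that path math concrete, here is a purely illustrative sketch; the cluster names and VIP addresses below are invented for the example, not taken from any real setup:

```python
# Hypothetical labels: two clusters, each with a VIP on each of two storage subnets.
# (Names and addresses are invented for illustration only.)
vips_per_cluster = {
    "MG1-cluster": ["10.1.1.10", "10.1.2.10"],  # one VIP per subnet
    "MG2-cluster": ["10.1.1.20", "10.1.2.20"],
}

# Each ESX host is given every VIP of every cluster as a storage target.
paths = [(cluster, vip)
         for cluster, vips in vips_per_cluster.items()
         for vip in vips]

print(len(paths))  # 4 unique storage paths per ESX host, 2 per cluster
```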

This is what I did with one of my customers: two multi-site clusters, each with the FOM in that cluster's primary location (so each multi-site cluster has a local FOM). This way each cluster can sustain a link or node failure and maintain storage availability to its local ESX hosts.

You are correct that without the multi-site SAN there can be some delay in the VIP moving from node to node. The multi-site setup provides the fastest recovery and lowers the risk of a volume hitting a timeout due to a link/node failure.

We found that the 2-VIP multi-site clusters were able to keep the Exchange and VMFS volumes online, where the 1-VIP multi-site cluster would time out before volume availability recovered.
http://www.tdonline.com
PCrid
Advisor

Re: Multi-site SAN with two nodes - has our engineer misconfigured?

Thanks for the comment - can I just check, as each cluster (and each management group) consists of only two nodes, are you suggesting that for each cluster I put a single node in one subnet and a single node in another?

The other question: when I stumbled across posts discussing that, I thought I'd read that the ESX boxes will not support MPIO for a SAN across multiple subnets - is this still the case?

Final dumb question - given the nodes are really only in different rooms, not different sites, and are wired as though they're in the same room, do we derive any benefit from having them set up as multi-site clusters rather than simple clusters?

Sorry to follow one question with even more - just trying to get a handle on it.

Thanks again for your help,

Regards,
Pete

teledata
Respected Contributor
Solution

Re: Multi-site SAN with two nodes - has our engineer misconfigured?

It really comes down to a decision between MPIO and hyper-redundancy. The big benefit is using multiple VIPs: with multiple VIPs you can take advantage of the storage stack in VMware to perform path failover, but you are correct, today you would have to give up MPIO, unfortunately.

Multi-Site also gives you logical control over where the redundant data is stored... In a 2-node cluster that isn't really a big advantage, since you already KNOW that a Network RAID 10 volume has 2 copies, one on each module, so this isn't a benefit unless you plan to expand beyond 2 nodes per cluster.
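As a toy illustration of why a two-node cluster leaves no real placement decision to make, here is a round-robin two-way replication sketch; the placement scheme is a deliberate simplification for the example, not LeftHand's actual algorithm:

```python
# Simplified round-robin placement of two-way (Network RAID 10 style) replicas.
# Not the real SAN/iQ algorithm; only meant to show the 2-node degenerate case.

def place_replicas(num_pages, nodes, copies=2):
    """Assign each logical page `copies` replicas on distinct nodes, round-robin."""
    return {page: [nodes[(page + i) % len(nodes)] for i in range(copies)]
            for page in range(num_pages)}

# With only two nodes, every page necessarily lands on both nodes:
layout = place_replicas(num_pages=4, nodes=["node-A", "node-B"])
for page, where in layout.items():
    print(page, sorted(where))
# Each node therefore holds a complete copy of the volume; there is nothing
# for a site-placement policy to decide until a third node joins the cluster.
```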

http://www.tdonline.com
PCrid
Advisor

Re: Multi-site SAN with two nodes - has our engineer misconfigured?

Thanks for the info.

Given we like to keep logical and physical separation between storage units (healthy paranoia is my watchword), we're extremely unlikely to want to scale up our clusters; we'll simply add new clusters as and when required. With this in mind, based on what you've said about 2-node clusters, it looks like we'd be better off converting to a standard cluster.

Is this as unpleasant an experience as I fear - i.e. remove a node from the cluster, create a new single-node cluster, migrate volumes to the new cluster, disperse the old cluster, add the remaining node to the new cluster, restripe as Network RAID 10, and re-establish all our Remote Copy schedules? Or is there an easier way to do it without all of the above?

Thanks,
Pete
PCrid
Advisor

Re: Multi-site SAN with two nodes - has our engineer misconfigured?

Sorry, and on a related note: I was under the impression that a standard cluster with a shared VIP would handle path failover between nodes in the event of node failure, so the redundancy still stands? Can I just confirm that's the case, as obviously we do want redundancy between our two nodes.

Thanks again!