- Re: VSA Multisite and VMWare FT
07-05-2012 05:33 AM
Is the following scenario possible?
A two-site VSA setup, say site1 and site2.
Site1 is primary and has the FOM running.
A VM is running in Fault Tolerant (FT) mode.
If we lose site1, will the FT machine continue to run from site2? My understanding is that it will not: the FOM is lost along with site1, so site2 goes down as well and the FT secondary that was running on site2 can no longer run. Is that correct?
07-05-2012 05:54 AM
Re: VSA Multisite and VMWare FT
Correct. There would be a loss of quorum in this scenario, so the ESX(i) hosts would be disconnected from the SAN. This condition would persist until you brought site 1 back online.
07-05-2012 06:44 AM
Re: VSA Multisite and VMWare FT
Yeah, you would need manual intervention there to deal with quorum unless you have the FOM hosted at a third site.
I haven't priced it out, but it shouldn't be too expensive to have some hosting provider run a FOM for you, since the storage/CPU/network requirements are all very minimal... if you need dual live sites you can probably shell out the coin for a hosted 3rd location. Our company couldn't, so we went with two separate management groups using async replication instead.
07-05-2012 10:27 PM
Re: VSA Multisite and VMWare FT
Also, if only the connectivity between the FOM and the primary site is lost, but the connection from the FOM to the secondary site is still up, does the secondary site now become active?
07-06-2012 04:25 PM
Re: VSA Multisite and VMWare FT
The third site for the FOM is ideal. The FOM is just another vote. You need to have a majority of managers in contact in order to have quorum.
If you have two nodes at each of two sites, that is four votes, and you need three for quorum. So what happens when one site with two nodes goes down? You lose quorum. The fix is to add a FOM, but if you put that FOM at the same site as two of the nodes, only two of the five managers are left voting when the site with the FOM goes down, so you lose quorum anyway.
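The vote counting above can be sketched in a few lines of Python. This is only an illustration of strict-majority quorum as described in this post, not SAN/iQ code; the function names and the site layouts are made up for the example, and each node or FOM is assumed to carry exactly one vote.

```python
# Illustrative sketch only: strict-majority quorum as described in the post.
# Names and layouts are hypothetical; one vote per manager is assumed.

def has_quorum(votes_reachable: int, votes_total: int) -> bool:
    """Quorum needs more than half of all configured managers."""
    return votes_reachable > votes_total // 2


def survives_site_loss(layout: dict, failed_site: str) -> bool:
    """Return True if the remaining sites still hold a majority of votes."""
    total = sum(layout.values())
    remaining = sum(v for site, v in layout.items() if site != failed_site)
    return has_quorum(remaining, total)


# Votes per site: 2 storage nodes per site, plus an optional FOM (1 vote).
no_fom = {"site1": 2, "site2": 2}
fom_at_site1 = {"site1": 3, "site2": 2}
fom_at_third_site = {"site1": 2, "site2": 2, "site3": 1}

for name, layout in [
    ("no FOM", no_fom),
    ("FOM at site1", fom_at_site1),
    ("FOM at third site", fom_at_third_site),
]:
    for site in layout:
        print(f"{name}: lose {site} -> quorum kept: "
              f"{survives_site_loss(layout, site)}")
```

Only the layout with the FOM at a third site keeps quorum no matter which single site is lost.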
Trying to do automated failover with TWO sites is very tricky. It seems like it should be simple, but dealing with split-brain is not really possible without human intervention.
To answer your final question, I don't know 100% and would have to test. I would imagine that as long as your primary and backup sites are still in communication with each other, it would not be a problem if your FOM loses contact with one of your sites: the two sites can provide quorum on their own without the FOM, and I would bet the SAN/iQ software has some logic for dealing with a situation like the one you asked about. On the flip side... what would happen if you don't have a FOM at a 3rd site and the two sites lose communication?
07-08-2012 10:52 PM
Re: VSA Multisite and VMWare FT
Yes, split brain becomes a problem.
So we need to test what happens if the FOM link goes down to the primary, the secondary, or both. I guess HP should have tested these scenarios somewhere... is there any document? Or do I have to test it myself?
07-08-2012 10:58 PM
Re: VSA Multisite and VMWare FT
07-09-2012 07:38 AM
Re: VSA Multisite and VMWare FT
The manual probably talks all about this.
How can VPLEX really deal with split brain? It is impossible for a system to know if it has gone split-brain or if one complete site just failed.
If you have site 1 with two nodes and the FOM, it has three votes. If your second site has the other two nodes (and two votes), what is the difference between losing communication between the two sites and one site suddenly going up in smoke? The answer is NOTHING: the site with three votes will stay live, while the site with two nodes will go offline thinking it has no quorum majority. This is exactly what you want to happen, because it ensures that you only have one live copy of data. If the site with two nodes realizes that it is split and still serves data, you now have two sites with two different copies of your information.
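To make the "NOTHING" point concrete, here is a small Python sketch of what each site can see in the two cases. It is a toy model with hypothetical names, assuming only two sites with a single direct link between them; it is not how SAN/iQ is implemented.

```python
# Toy model (hypothetical names): what one site can observe when either the
# inter-site link is cut or the other site is destroyed. Two sites only,
# with one direct link, so no routing through a third party is modelled.

def reachable_votes(observer, votes, dead_sites=frozenset(), cut_links=frozenset()):
    """Sum the votes the observer site can still contact, itself included."""
    total = 0
    for site, n in votes.items():
        if site in dead_sites:
            continue  # the site is gone entirely
        if site != observer and frozenset((observer, site)) in cut_links:
            continue  # the site is alive but unreachable from here
        total += n
    return total


def stays_online(observer, votes, **failure):
    """A site keeps serving data only if it sees a strict majority."""
    return reachable_votes(observer, votes, **failure) > sum(votes.values()) // 2


votes = {"site1": 3, "site2": 2}  # site1 = two nodes + FOM, site2 = two nodes
link_cut = {"cut_links": frozenset({frozenset(("site1", "site2"))})}
site1_dead = {"dead_sites": frozenset({"site1"})}

# From site2's side the two failures look the same: it sees only its own votes.
print(reachable_votes("site2", votes, **link_cut))
print(reachable_votes("site2", votes, **site1_dead))

# With the link cut, site1 stays live and site2 takes itself offline.
print(stays_online("site1", votes, **link_cut))
print(stays_online("site2", votes, **link_cut))
```

Because site2 computes the same vote count in both cases, it cannot tell them apart, and going offline is the only choice that guarantees a single live copy of the data.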
There is just no way to have two sites live running as a single cluster and avoid every possible downtime situation.
As you are figuring out, there are tons of complications when trying to deal with a multi-site sync cluster. 99% of the time, it makes more sense to just use snapshot replication... just because the software has the multi-site cluster capability doesn't mean it makes the most sense for you to use that feature. Oftentimes the costs associated with doing multi-site sync are just way too high. For example, you need an inter-site connection whose uptime, bandwidth and latency can keep up with your local bandwidth usage, and that is typically $$$$$$$$. If you don't have that connection, your local data is going to slow down, because each write has to be sent to the remote site and confirmed there before your servers get confirmation of their write. At that point, what is ~$50/month in hosting on Windows Azure or Amazon cloud or whatever other cloud service you choose for your third virtual site, so you can actually set up the multi-site cluster according to HP best practices?
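A back-of-the-envelope way to see the latency cost, under a simplified model that is my assumption rather than documented SAN/iQ behaviour: the local and remote copies are written in parallel, and the server gets its acknowledgement only once both are committed. All function names and numbers below are hypothetical.

```python
# Simplified model (an assumption, not documented SAN/iQ behaviour):
# local and remote commits run in parallel, and the write is acknowledged
# only when both have completed. All figures are made-up examples.

def sync_write_latency_ms(local_commit_ms, remote_commit_ms, inter_site_rtt_ms):
    """Latency seen by the server for one synchronously mirrored write."""
    remote_path_ms = inter_site_rtt_ms + remote_commit_ms
    return max(local_commit_ms, remote_path_ms)


def max_serial_writes_per_sec(latency_ms):
    """Upper bound for a single thread issuing one write at a time."""
    return 1000.0 / latency_ms


# Same storage at both ends; only the inter-site round trip changes.
campus_link = sync_write_latency_ms(1.0, 1.0, 0.5)
wan_link = sync_write_latency_ms(1.0, 1.0, 20.0)

print(campus_link, max_serial_writes_per_sec(campus_link))
print(wan_link, max_serial_writes_per_sec(wan_link))
```

In this model the inter-site round trip is added to every write, so a slow link caps write throughput regardless of how fast the local storage is.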
07-10-2012 11:34 PM
Right. I understand the split brain issue.
But here is how VPLEX handles it: Once the sites are split, they are independent and do not connect back without manual intervention.
Now this may not be acceptable for certain situations, but in our case, I guess this feature is extremely useful.
Assume that we have a database running on the primary site. The secondary site has been configured for no client access from outside.
Once the split occurs, the primary continues without any problems. The secondary site brings up the database in its own right, thinking it is primary. But since clients do not have access to it and it is not talking to the main site, everything is fine.
Now, I agree that this scenario may not be that simplistic in practice but I think this is one acceptable and workable solution in a split brain scenario -- both sites becoming primary and application configuration will decide which one continues to operate.
Anyway, we will be testing the third site configuration for FOM and see how it goes.
Thanks,
Amjad.
07-11-2012 07:24 AM
Re: VSA Multisite and VMWare FT
I'm not sure exactly what the practical difference is between a LUN that is down and a LUN that an initiator has no access to... either way the server connected to the LUN is gonna have a heart attack. If that downtime is acceptable, then you should be able to get away with a virtual manager instead of a FOM and just follow the procedures to force quorum manually any time there is a failover event. Doesn't sound ideal to me, but if you are happy with the required manual intervention on VPLEX, then manual intervention on SAN/iQ should be just as easy. Or you could keep a standby clone of the FOM at the 2nd site and simply turn it on when the first site goes down. The important thing with that is to make sure you turn off the original FOM, or keep it from starting up, when the primary site comes back online. Then once you get everything back online you could shut down the cloned FOM and turn the original back on.
Nice thing about the VSAs is that you can test out everything you are curious about for free in a VM lab. Just spool up four VMs and a FOM, set up a new management group, assign the IPs and sites as you want, and then play around with different failover situations. The only tricky one is FOM communication with one but not both sites, which would probably require an L3 switch or a firewall to force that situation.