SAN Replication for Fault tolerance using EVA4400

APT_SG
Occasional Contributor

SAN Replication for Fault tolerance using EVA4400

Hi Everyone,

I hope someone can point me in the right direction. I don't have enough knowledge of the subject, and the timeframe is too tight for me to explore different scenarios in depth.

We have two datacenters a few miles apart, connected by a 100 Mbps link. Each datacenter will have 5 BL490 blades running ESX Standard and hosting about 50 VMs. Each site has an HP EVA4400 SAN with SAN replication set up. VC is going to be in the first datacenter, and both datacenters are networked.

SAN replication is block level, so it seems I cannot replicate just the changes; all writes would have to be replicated. This should not be a problem, as the link can sustain about 1.8 TB a day and data can be buffered.
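A quick way to sanity-check the link sizing is to compute what the link can carry in 24 hours. The sketch below assumes decimal units (1 TB = 10^12 bytes) and ignores protocol overhead and compression; the function names and the `utilization` parameter are illustrative only. Without compression, a 100 Mbps link works out to about 1.08 TB per day at full utilization, so the daily figure is worth double-checking against the actual write volume.

```python
SECONDS_PER_DAY = 86_400


def daily_capacity_tb(link_mbps: float, utilization: float = 1.0) -> float:
    """Return the data volume (decimal TB) a link can carry in 24 hours.

    link_mbps   -- nominal link speed in megabits per second
    utilization -- assumed usable fraction of the link (1.0 = all of it)
    """
    bits_per_day = link_mbps * 1_000_000 * utilization * SECONDS_PER_DAY
    return bits_per_day / 8 / 1e12


def hours_to_replicate(daily_writes_tb: float, link_mbps: float,
                       utilization: float = 1.0) -> float:
    """Return the hours of link time needed to ship one day's writes."""
    return 24 * daily_writes_tb / daily_capacity_tb(link_mbps, utilization)


if __name__ == "__main__":
    print(f"100 Mbps, full link: {daily_capacity_tb(100):.2f} TB/day")
    print(f"100 Mbps, 70% usable: {daily_capacity_tb(100, 0.7):.2f} TB/day")
```

If `hours_to_replicate` comes out above 24 for the expected daily write volume, the replication buffer never drains and the secondary site falls further behind each day.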

However, I am having trouble envisioning how recovery would work in this case. We don't need instant recovery; I would say a 4-hour recovery time is acceptable, so a fancy automated SRM-style DR setup would be hard to justify for financial reasons. Any comments are welcome, though.

The current idea is as follows: replicate LUNs from the primary site to the secondary. When disaster strikes, IT personnel power on the ESX hosts at the remote site and connect the replicated LUNs to them, then register the VMs and change their IP addresses.
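One way to test a manual plan like this against the 4-hour target is to write it down as a runbook with a time estimate per step. The sketch below is only an illustration: the step list mirrors the idea described above, but every duration is a made-up placeholder that would need to be replaced with timings from a real rehearsal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    description: str
    estimated_minutes: int  # placeholder value, not a measured timing


def total_minutes(steps: List[Step]) -> int:
    """Sum the estimated duration of all steps, assuming they run in sequence."""
    return sum(step.estimated_minutes for step in steps)


def fits_rto(steps: List[Step], rto_hours: float) -> bool:
    """Return True if the sequential runbook fits inside the recovery time objective."""
    return total_minutes(steps) <= rto_hours * 60


# Steps taken from the plan above; all durations are hypothetical.
RUNBOOK = [
    Step("Power on ESX hosts at the secondary site", 20),
    Step("Connect replicated LUNs to the ESX hosts", 30),
    Step("Register the VMs", 45),
    Step("Change VM IP addresses", 60),
]

if __name__ == "__main__":
    for number, step in enumerate(RUNBOOK, start=1):
        print(f"{number}. {step.description} (~{step.estimated_minutes} min)")
    print(f"Total: {total_minutes(RUNBOOK)} min, fits 4 h RTO: {fits_rto(RUNBOOK, 4)}")
```

Keeping the steps in one list also makes it easier to hand individual steps over to scripts later without changing the overall process.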

I understand that this is a horribly manual process, and I am almost sure I have missed some obvious pitfalls here.

Could someone let me know which direction I should go in? Are there any articles on the subject?

This is a brand new setup, and we would rather build a basic recovery process and scale it later. I just need the right direction to allow for that scalability.

Thank you very much in advance!
3 REPLIES
Patrick Terlisten
Honored Contributor

Re: SAN Replication for Fault tolerance using EVA4400

Hello,

First of all: forget the IP address changes. It will work, but after a site failure you have to change IP addresses, routing, etc., and you have to hope that no application still uses the old addresses. From some real-world scenarios I have worked with: changing IP addresses will not work flawlessly.

The next point is vCenter. You will need a vCenter at the DR site. So you should have a valid backup of your vCenter and the vCenter database, plus a running domain controller and a Command View EVA server at the DR site.

In case of a disaster, you should fail over the DR groups (if that has not already happened). The presentations to the ESX servers at the DR site should already be in place. Take care that you use the same LUN IDs. Once this is done, your biggest "problem" should be the networking part.
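The "same LUN IDs" point can be checked ahead of time by comparing the presentation tables of both sites. The sketch below assumes the vdisk-to-LUN-ID mappings have already been exported by hand into plain dictionaries; it does not call any Command View EVA or ESX interface, and the vdisk names are invented for illustration.

```python
from typing import Dict, Optional, Tuple


def lun_id_mismatches(
    primary: Dict[str, int], dr: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return vdisks whose LUN ID differs between sites.

    Each value is (primary_lun_id, dr_lun_id); None means the vdisk
    is not presented at that site.
    """
    mismatches = {}
    for vdisk in sorted(set(primary) | set(dr)):
        primary_id = primary.get(vdisk)
        dr_id = dr.get(vdisk)
        if primary_id != dr_id:
            mismatches[vdisk] = (primary_id, dr_id)
    return mismatches


if __name__ == "__main__":
    # Hypothetical presentation tables, keyed by vdisk name.
    primary_site = {"vmfs_prod_01": 1, "vmfs_prod_02": 2, "vmfs_prod_03": 3}
    dr_site = {"vmfs_prod_01": 1, "vmfs_prod_02": 5}

    for vdisk, (p_id, d_id) in lun_id_mismatches(primary_site, dr_site).items():
        print(f"{vdisk}: primary LUN {p_id}, DR LUN {d_id}")
```

An empty result means every vdisk is presented with the same LUN ID at both sites.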

Regards,
Patrick
APT_SG
Occasional Contributor

Re: SAN Replication for Fault tolerance using EVA4400

Thank you Patrick
Rob Buxton
Honored Contributor

Re: SAN Replication for Fault tolerance using EVA4400

SRM provides an automated way of scripting the changes you need to make to bring VMs online.

We use EVA SAN-to-SAN replication. For testing, we snapshot the replicated LUNs and present those to the DR-based ESX hosts. Again, SRM can do this process for you.
We are able to isolate our DR site from the production network but keep the data replication going.
Having a known map of IP addresses to use at DR is critical.
You can simply replicate the LUNs that Virtual Center and its database are on, or maintain a second instance of VC and use that.
So, depending on the size, SRM may be the tool to consider for automating a lot of this.
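The known map of IP addresses mentioned above can be generated rather than maintained by hand if each production subnet has a same-sized counterpart at DR. The sketch below uses Python's standard `ipaddress` module and keeps the host part of each address unchanged; the subnets, VM names, and function names are illustrative assumptions, not values from this environment.

```python
import ipaddress
from typing import Dict


def map_to_dr(address: str, prod_net: str, dr_net: str) -> str:
    """Translate a production address to the same host offset in the DR subnet."""
    prod = ipaddress.ip_network(prod_net)
    dr = ipaddress.ip_network(dr_net)
    ip = ipaddress.ip_address(address)
    if prod.prefixlen != dr.prefixlen:
        raise ValueError("production and DR subnets must be the same size")
    if ip not in prod:
        raise ValueError(f"{address} is not in {prod_net}")
    offset = int(ip) - int(prod.network_address)
    return str(dr.network_address + offset)


def build_ip_map(vm_addresses: Dict[str, str], prod_net: str, dr_net: str) -> Dict[str, str]:
    """Return {vm_name: dr_address} for every VM in the production subnet."""
    return {name: map_to_dr(addr, prod_net, dr_net)
            for name, addr in vm_addresses.items()}


if __name__ == "__main__":
    # Hypothetical VMs and subnets.
    vms = {"app01": "10.1.20.15", "db01": "10.1.20.40"}
    for name, dr_ip in build_ip_map(vms, "10.1.20.0/24", "10.2.20.0/24").items():
        print(f"{name}: {vms[name]} -> {dr_ip}")
```

Keeping the host part identical makes the map easy to verify by eye and keeps per-VM differences between the sites to a minimum.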

If it's a Windows / AD environment, you may want to consider extending AD to the DR site.
Those AD servers would then already be active and the AD environment established, which could save some time during recovery.