TruCluster

Two site TRU cluster and disaster recovery project

SOLVED
Vladimir Fabecic
Honored Contributor

Current situation:
One location, 4-node TruCluster v5.1B, EVA4000 storage
Planned situation:
Two locations, 4-node TruCluster (2 nodes at each site), two storage arrays (EVA4000 at one site and EVA4100 at the other) replicated with CA (Continuous Access), dark fiber between sites

The cluster interconnect is Ethernet, and the sites are about 30 km apart.

Has anyone done a project like this?
Does anyone have a working solution similar to this?
What are the problems?
Any performance issues?
Any other suggestions?
In vino veritas, in VMS cluster
10 REPLIES
Rob Leadbeater
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

Hi,

I've never done this, but I'll guess the limiting factor will be what bandwidth you have between the two sites...

Is the same link being used for both the cluster interconnect and the storage replication ?

Cheers,

Rob
Vladimir Fabecic
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

Is the same link being used for both the cluster interconnect and the storage replication ?

Yes, it will be CWDM (as I said, dark fiber between sites). I assume that will not be a problem given the bandwidth of dark fiber (or am I wrong?).
In vino veritas, in VMS cluster
Ivan Ferreira
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

Has anyone done a project like this?
I did.

Does anyone have a working solution similar to this?
We had one, but with a Memory Channel interconnect (FC) and over a shorter distance.

What are the problems?
In our environment, running Oracle RAC, the Ethernet cluster interconnect was a real problem, so we changed to MC. If your cluster is active/passive, then go for this configuration.

Any performance issues?
If your cluster is active/passive, this will work. If you run something like Oracle Real Application Clusters, for example, it won't.

Any other suggestions?
I really don't like the Ethernet cluster interconnect.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Vladimir Fabecic
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

Thanks a lot Ivan.

There is no RAC (I hate RAC because I had a bad experience with it).
It is a cluster configuration where CAA applications are failed over between nodes (you could call it active/passive from that point of view).
Please post some details about building and testing it. How did you boot the cluster on the other site during testing? Did you have any problems with the replicated cluster root, usr and var? And please post anything else you think is important.

I found this document:
http://h30097.www3.hp.com/docs/patch/51B/bl27/HTML/apas01.html#e-d-cluster-fig
In vino veritas, in VMS cluster
Ivan Ferreira
Honored Contributor
Solution

Re: Two site TRU cluster and disaster recovery project

>>> How did you boot the cluster on the other site during testing?

No major problems, as CA replicates the device WWIDs. After the CA failover, you must:

Run the init command at the console.
Boot the system.

>>> Did you have any problems with the replicated cluster root, usr and var?

No. If you break the DR group in CA, make sure that the disk is not left in read-only mode.

You can also have the vdisks presented on both the source and the destination storage without any problem.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Vladimir Fabecic
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

>>> You can also have the vdisks presented on both the source and the destination storage without any problem.

How do you present the replicated vdisks during normal operation? How do those (read-only) disks show up in the device manager?
In vino veritas, in VMS cluster
Ivan Ferreira
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

Just present them as normal vdisks. You won't see those disks, they are "hidden" by the storage controller automagically.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Vladimir Fabecic
Honored Contributor

Re: Two site TRU cluster and disaster recovery project

Thanks a lot, Ivan.
You helped me a lot.
In vino veritas, in VMS cluster
DCBrown
Frequent Advisor

Re: Two site TRU cluster and disaster recovery project

Any other suggestions?

The critical system disks (the cluster root (/), /usr, and /var file systems) should be configured with synchronous data replication. Other application disks, however, can be configured with asynchronous data replication if it is available. For those disks, this eliminates the 60 km round-trip delay that would otherwise be added to every write.

This slightly widens the window in which data could be lost during a disaster event. It also requires more attention to the status and health of the replication link, but it is a risk that many businesses are willing to manage in exchange for faster normal operations.
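To put a rough number on that round-trip delay, here is a minimal Python sketch. It assumes light travels through glass fiber at about 200,000 km/s (roughly two thirds of its speed in vacuum) and counts propagation delay only; switch, CWDM and array latencies are ignored. The function names and the `round_trips_per_write` parameter are illustrative, not part of any HP tool.

```python
# Rough propagation-delay estimate for synchronous replication over fiber.
# Assumption: signal speed in glass fiber is about 200,000 km/s.
FIBER_KM_PER_S = 200_000.0


def round_trip_ms(one_way_km: float) -> float:
    """Propagation delay in milliseconds for one round trip over the link."""
    return 2 * one_way_km / FIBER_KM_PER_S * 1000.0


def sync_write_penalty_ms(one_way_km: float, round_trips_per_write: int = 1) -> float:
    """Extra latency added to each synchronous write.

    round_trips_per_write is an assumption: depending on the protocol,
    one replicated write may need more than one exchange across the link.
    """
    return round_trips_per_write * round_trip_ms(one_way_km)


if __name__ == "__main__":
    distance_km = 30.0  # site separation given in the original question
    print(f"Round trip over {distance_km:.0f} km: {round_trip_ms(distance_km):.2f} ms")
    print(f"Penalty with 2 round trips per write: "
          f"{sync_write_penalty_ms(distance_km, 2):.2f} ms")
```

For the 30 km separation in this thread, one round trip comes to about 0.3 ms of pure propagation delay per synchronous write, before any equipment latency is added.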


DCBrown
Frequent Advisor

Re: Two site TRU cluster and disaster recovery project

The main page at http://h30097.www3.hp.com/docs/patch/51B/bl27/HTML/apa.html has the release notes associated with enhanced distance clusters. The page includes links to requirements, recommendations and restrictions.

I prefer the attached drawing over the one referenced. With reference to the attached drawing: since the secondary volumes are not visible, access to primary volumes on Array 1 from Data Center 2 goes over a long SAN link (ISL) and may experience delays. The recommendation for this type of configuration is to bind applications that need access to disks in set P2 to nodes 3 and/or 4, and to bind applications that need access to disks in set P1 to nodes 1 and/or 2, to minimize the I/O delays associated with the cross-site SAN link length. Typically this configuration would use asynchronous replication on application disks to avoid similar delays from the back-end data replication, which runs over the same long links.
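The binding rule above can be sketched as a small lookup in Python. This is only an illustration of the placement logic, not CAA configuration: the site names, node names and the mapping of P1 to Data Center 1 and P2 to Data Center 2 are assumptions inferred from the description of the drawing, and in a real cluster the result would be expressed through the placement settings of each CAA resource.

```python
# Illustrative model of "run each application on the site that holds its
# primary volumes". All names below are hypothetical placeholders.

# Assumed layout: disk set P1 has its primaries in Data Center 1,
# disk set P2 has its primaries in Data Center 2.
PRIMARY_SITE = {"P1": "dc1", "P2": "dc2"}

# Nodes 1 and 2 sit in Data Center 1, nodes 3 and 4 in Data Center 2.
NODE_SITE = {"node1": "dc1", "node2": "dc1", "node3": "dc2", "node4": "dc2"}


def preferred_nodes(disk_set: str) -> list[str]:
    """Return the nodes that reach the disk set without crossing the ISL."""
    site = PRIMARY_SITE[disk_set]
    return sorted(node for node, s in NODE_SITE.items() if s == site)


def crosses_isl(node: str, disk_set: str) -> bool:
    """True if I/O from this node to the disk set goes over the long link."""
    return NODE_SITE[node] != PRIMARY_SITE[disk_set]


if __name__ == "__main__":
    for disk_set in PRIMARY_SITE:
        print(disk_set, "->", preferred_nodes(disk_set))
```

An application using P2 that ends up on node 1 or 2 still works, but `crosses_isl` flags that every I/O then pays the cross-site delay.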