Operating System - Tru64 Unix

Re: CA recovery of a TRUcluster environment.

 
uma_3
Advisor

CA recovery of a TRUcluster environment.

I am looking for comments regarding recovering a TRUcluster environment that has been replicated with Continuous Access (CA).

I have read the documentation on member boot disk and shared cluster file system recovery, and it has raised many questions.

When CA replicates all the TruCluster disks (member boot disks, quorum, cluster, ...), I expect to have to use new dsk devices. Is this correct?

Recovering the cluster with these device changes is the challenge. My current understanding is that the cnx partition on each replicated member boot disk would need updating. Is it possible to boot to the shell from the OS CD, mount one of the replicated member boot disks, and edit sysconfigtab to reflect the changes to clubase and vm?

Also, is it possible to edit clu_bdmgr.conf and pass it to the new member boot disk using #clu_bdmgr -h dskx clu_bdmgr.conf?

Hopefully this isn't too confusing.

Any insight into this is much appreciated.


Ivan Ferreira
Honored Contributor
Solution

Re: CA recovery of a TRUcluster environment.

We use TruCluster with CA. We recently had to fail over because of a disk group failure: 4 disks failed at the same time for no apparent reason.

When you replicate the cluster file systems and the member boot devices and then fail over to the destination storage, the device names do not change, because the WWN is also replicated.

The failover process is very simple: do the failover at the Continuous Access management console, then boot the servers again. You don't have to make any other configuration changes.

As a standard procedure after a failover, we run wwidmgr -clear all and wwidmgr -quickset again. Sometimes a member does not boot if these commands are not run.

So far, I would say the EVA is not very reliable compared with our older storage, maybe because of the FC technology. We have had several strange incidents.

We had a disk group with double protection. We turned off the storage, and after we turned it back on, disks started to fail. When the third disk failed, the disk group failed. What I can say is that Continuous Access works well, because we were able to boot everything and start working without problems from the destination EVA.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
uma_3
Advisor

Re: CA recovery of a TRUcluster environment.

Ivan,

Thanks for your response and for sharing your experience. In our lab we replicated the storage and noticed that the vdisk WWNs were the same on the two storage arrays. We also noticed that the UUIDs of the vdisks were different. From that observation we guessed that Tru64 would use the UUID to create new device files, and that we would have to manage those changes to get the cluster back up. From your feedback, the device files are unchanged and bringing the cluster up is a matter of using wwidmgr and rebooting.

I wonder whether the reason wwidmgr -quickset needs to be rerun is the difference in the UUIDs.
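One way to test the "device files are unchanged" observation is to capture the device view on a running member before the failover and compare it with the view after the members boot from the destination array. The following is only a sketch using standard Tru64 and TruCluster commands; the /tmp output file names are hypothetical, and which fields differ (if any) would depend on the configuration.

```
# Before failover, on a running cluster member (output file names are hypothetical)
hwmgr -view devices > /tmp/devices.before
hwmgr -show scsi -full > /tmp/scsi.before

# After failover and reboot from the destination array
hwmgr -view devices > /tmp/devices.after
hwmgr -show scsi -full > /tmp/scsi.after

# Compare dsk names and WWIDs, then check cluster membership
diff /tmp/devices.before /tmp/devices.after
diff /tmp/scsi.before /tmp/scsi.after
clu_get_info -full
```

If the dsk names and WWIDs match, no changes to sysconfigtab or the cnx partitions should be needed.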

Thanks again.
Ivan Ferreira
Honored Contributor

Re: CA recovery of a TRUcluster environment.

I'm not sure about that, but I think it is because of a change in the path used to access the device. I have not checked whether the device name changes at the SRM level; that is one thing I need to check during my next failover.

We will be doing a failover this weekend. I will check that information.
uma_3
Advisor

Re: CA recovery of a TRUcluster environment.

One clarification:

The world wide LUN number and the UUID are seen at the vdisk level. Another setting, at the vdisk presentation level, is the OS unit ID.

It is the OS unit ID that wwidmgr sees as the UDID.

I wonder whether wwidmgr, run from the SRM prompt, picks up the changed UUID.

Ivan Ferreira
Honored Contributor

Re: CA recovery of a TRUcluster environment.

Yes, WWN and OS_UNIT_ID are the same on replicated VDISKS.

Last weekend we did a failover. After the failover, the HBA won't see the devices, and that's why you have to run the init, wwidmgr -clear all, and wwidmgr -quickset -udid commands. But the device name at the console level or the operating system level won't change. For example, you won't have to run set bootdef_dev.
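As a rough sketch, the console steps described above might look like the following at the SRM prompt on each member. The OS unit ID value is a placeholder to be taken from the vdisk presentation, the text after each # is annotation and is not meant to be typed, and some AlphaServer models may need to be put into diagnostic mode before wwidmgr is available (check the console documentation for the platform).

```
P00>>> init
P00>>> wwidmgr -clear all                    # forget previously registered WWIDs
P00>>> wwidmgr -show wwid                    # list the WWIDs and UDIDs the HBA can see now
P00>>> wwidmgr -quickset -udid <os_unit_id>  # placeholder: OS unit ID of the member boot disk
P00>>> init
P00>>> show dev                              # confirm the boot device is visible again
P00>>> show bootdef_dev                      # confirm it still points at the expected device
P00>>> boot
```

Checking show bootdef_dev before booting costs little and catches the case where the default boot device does need to be set again.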
uma_3
Advisor

Re: CA recovery of a TRUcluster environment.

Thanks Ivan. We were also able to fail over a TruCluster environment over the weekend, following the same instructions you noted. However, feedback from the site indicated that they did have to reset bootdef_dev. Not a big deal. Your information was much appreciated. I think we can move forward from here. Thanks again.


uma_3
Advisor

Re: CA recovery of a TRUcluster environment.

Close.