HPE EVA Storage

Jonathan Harris_3
Trusted Contributor

CA: Source DR Groups show as failed

Seeing as HP's woeful support team have now sat on this for two days, having offered nothing more than 'have you tried rebooting?', I thought I might get more of a response here.

Running Continuous Access between EVA 3000s. I added an extra DR Group a couple of days ago, at which point the connection status of all DR Groups on the source EVA switched to 'Failed'.

The odd thing is that everything appears to be working ok (at least, I hope so!). All the links are up, there are no errors in the logs, none of the groups are logging, the destination EVAs show everything as working fine and I've set up new DR groups from both the source and destination EVAs without any issue.

I've tried restarting the CVE service, rebooting the appliance, suspending and resuming DR Groups, but nothing seems to clear these failure icons.

Anyone have any thoughts?
12 REPLIES
Ivan Ferreira
Honored Contributor

Re: CA: Source DR Groups show as failed

In Command View, do you see the source and destination vdisks correctly? No logging? What is the link status?

Do you see replication data using evaperf drt?

Have you tried checking the DR status using SSSU?
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Amar_Joshi
Honored Contributor

Re: CA: Source DR Groups show as failed

Hi Jonathan,

This is just a thought, so please bear with me if I am adding any more frustration to your situation.

Can I assume that your CA pairs are working? You can check by suspending the pairs: they should start logging. If they do, then CA is probably working and this is more of a cosmetic issue (either in Command View or CA-EVA). You can run (or ask HP to run) SSSU to gather the actual configuration and confirm the real status.

In the worst case you may need to restart the remote EVA (I know it's a pain) or upgrade VCS/CA.

In the extreme case, where you can't shut down the remote EVA, delete the CA pairs, reinstall CA, create new pairs and see (it might work).

Hope that's worth my 10 cents.
Owen_15
Valued Contributor

Re: CA: Source DR Groups show as failed

Hi Jonathan,

This sounds like a cosmetic issue, as mentioned. Rebooting the SMA (appliance) normally clears these issues up, but not always. The next thing to try would be to start up RSM (Replication Solutions Manager). Command View has some odd inter-relationship with this product for the CA component of operation.

Open up RSM and check the status of everything via this interface. Try creating a DR group in RSM and then deleting it again afterwards. Jump back into CVEVA and see if it has cleaned up the status of the DR Groups. It may take some time (20-30 minutes).

As a point of note, these days I always do any CA work within the RSM interface. This was on an EVA5000 on 3.028.

Good luck.

Regards
Owen
Jonathan Harris_3
Trusted Contributor

Re: CA: Source DR Groups show as failed

Many thanks for the replies. In answer to the various questions:

I can see the source and destination vdisks. There is no logging, but the link status at one end shows as failed.

I hadn't thought to check with evaperf, but after running evaperf drt, there does seem to be traffic going through the tunnels. That at least gives me a certain peace of mind!

In SSSU, for the source EVA, operationalstate on all the DR Groups shows as 'attention' and the linkstate shows as 'failed'. On the destination EVA both operational and linkstates for the DR Groups show as 'Good'.
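For anyone wanting to compare 40 DR Groups without reading them by eye, here is a minimal Python sketch that diffs the two states mentioned above between the source and destination arrays. It works on status text saved to a file or string beforehand; it does not call SSSU itself. The "key ....: value" line layout, the use of an `objectname` line to start each group, and the assumption that a DR Group carries the same name on both arrays are all assumptions made for illustration, so adjust the pattern to whatever your SSSU version actually prints. The function names are hypothetical.

```python
import re

# Assumed line shape: "key ....: value". The real SSSU layout may differ.
FIELD = re.compile(r"^\s*(\w+)\s*\.*\s*:\s*(.*?)\s*$")


def parse_dr_groups(text):
    """Return {group name: {field: value}} from saved status text."""
    groups = {}
    current = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.group(1).lower(), match.group(2)
        if key == "objectname":
            # Assumption: an objectname line opens a new DR Group record.
            current = groups.setdefault(value, {})
        elif current is not None:
            current[key] = value.lower()
    return groups


def mismatches(source, destination):
    """List (group, field, source value, destination value) where they differ."""
    out = []
    for name in sorted(source.keys() & destination.keys()):
        for field in ("operationalstate", "linkstate"):
            src_val = source[name].get(field)
            dst_val = destination[name].get(field)
            if src_val != dst_val:
                out.append((name, field, src_val, dst_val))
    return out
```

A disagreement between the two sides only shows that the arrays report different things; it does not say which side is right.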

I'm fairly sure that it is a cosmetic issue, it's just rather disconcerting when you have 40 DR Groups suddenly show up as 'failed' (especially considering the value of the data).

Unfortunately, restarting the EVAs just isn't an option, nor is deleting the DR Groups.

RSM is already running and shows a similar status to CVE (I believe RSM queries CVE for all its info). The creation of the new DR Groups mentioned in my original post was completed through RSM (and deleted too). Also, RSM resides on the same box as CVE, so that has been rebooted too.
Uwe Zessin
Honored Contributor

Re: CA: Source DR Groups show as failed

Maybe it is this problem, which is described in the CV-EVA V6.0.2 release notes. After all, you said the problem started after you added a DR Group.

""
* All DR groups display in the user interface and command line interface

The HP Command View EVA graphical user interface and the HP StorageWorks Storage System Scripting Utility (the utility) can now display more than 43 DR groups.
""
.
Amar_Joshi
Honored Contributor

Re: CA: Source DR Groups show as failed

Hi Jon,

I won't say that RSM relies on CVE, but they are querying the same object and getting the same response, "failed", and so is SSSU.

If traffic flow alone gives you peace of mind, then I would say you are sitting on a disaster. Take immediate action to resolve it. What if the secondary CA volumes are not consistent?

Get HP's highest-level attention (whatever they call it: elevation, escalation or engineering) and resolve it.
Uwe Zessin
Honored Contributor

Re: CA: Source DR Groups show as failed

> I won't say that RSM relies on CVE

But it does! The RSM server talks to the Command View EVA agent.
.
Ivan Ferreira
Honored Contributor

Re: CA: Source DR Groups show as failed

Just because you see the link as "failed" doesn't mean there is data corruption. At worst, replication data is not being transferred. But if you see traffic on the tunnel and the vdisks are not logging, then maybe it is a strange issue with the tunnel connection.
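The reasoning above can be written down as a small triage rule. This is only a sketch in Python: the class, field names and return labels are hypothetical, the inputs have to be filled in by hand from Command View, SSSU and evaperf, and a "possibly cosmetic" result is not proof that the destination copy is consistent.

```python
from dataclasses import dataclass


@dataclass
class DrGroupEvidence:
    """Observations collected by hand for one DR Group (hypothetical shape)."""
    link_state_source: str        # as reported on the source EVA
    link_state_destination: str   # as reported on the destination EVA
    logging: bool                 # is the group currently logging?
    tunnel_traffic_seen: bool     # was replication traffic seen on the tunnel?


def triage(evidence):
    """Classify a DR Group as 'ok', 'possibly cosmetic' or 'investigate'."""
    if evidence.link_state_source.lower() != "failed":
        return "ok"
    healthy_elsewhere = (
        evidence.tunnel_traffic_seen
        and not evidence.logging
        and evidence.link_state_destination.lower() == "good"
    )
    # A failed link with no contradicting evidence needs a real investigation.
    return "possibly cosmetic" if healthy_elsewhere else "investigate"
```

For the situation described in this thread (source says failed, destination says Good, no logging, traffic visible) the rule returns "possibly cosmetic", which is a reason to escalate calmly rather than a reason to stop.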

Sadly, the most likely option HP will give you (as it usually does with us) is to shut down/reboot the controllers. In some situations I have had to reboot both controllers, one at a time. You can do this on the destination EVA first to see if the link comes up.
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
Jonathan Harris_3
Trusted Contributor

Re: CA: Source DR Groups show as failed

Amardeep,

CA replicates at block-level, so for the destination blocks to be corrupt the original blocks that are replicated down need to be corrupt. If the blocks on the destination disk(s) are bad, then they are marked as such and the data reassigned.

If the local data is valid, CA is running and there's traffic flowing between sites, the chances of corruption are minimal. I already had the first two, so adding traffic flows does give me that extra peace of mind.

As an aside, there is no way for CA to report corruption anyway, so the chance of not knowing about underlying corruption is one that we take anyway, even if everything shows as running correctly.

I am currently escalating through our account management team, so hopefully I should get a decent response at some point.
Amar_Joshi
Honored Contributor

Re: CA: Source DR Groups show as failed

Hi Uwe,

Thanks for the correction. What I wanted to say is that the chances of CVE being responsible for this problem are minimal.

Hi Jonathan,

You are talking about replication, and yes, that is going to be fine, but the DR "failover" functionality comes from CVE: a failover has to be activated from the remote EVA to make the disks primary. Since there are issues, you never know what's next in the queue unless you try. So I am just adding precautions, nothing else.
Jonathan Harris_3
Trusted Contributor

Re: CA: Source DR Groups show as failed

We carried out some testing at the weekend to prove that replication was still working ok (it was) and then rebooted the master controller on the source EVA, as per the (eventual) recommendation from HP support.

This appears to have resolved the problem.

Thanks for everyone's responses on this.
Jonathan Harris_3
Trusted Contributor

Re: CA: Source DR Groups show as failed

See above.