HPE EVA Storage

Excessive data exchange retry rate

 
fh-djb
Advisor

We are using Continuous Access to replicate LUNs between geographically dispersed EVA 8100s over a dedicated 12 Mb point-to-point circuit. The LUNs have replicated without problems, but approximately every 4 hours we get the following message from ISEE:

Problem Found:
CUST: CA warning: IDCEVA8100: Excessive data exchange retry rate on the inter site link.


Full Description:
Excessive data exchange retry rate on the inter site link is preventing acceptable replication throughput: Reducing data exchange resources. The UUID of the remote DR group and the UUID of the Peer storage system is stored in the FRU section. A summary of events have been consolidated in the evidence section.

Full Corrective Action Statement:
Check fabric switch settings and inter site link quality between the Data Replication Source Site controllers and the Data Replication Destination Site controllers.

Anyone have any ideas where to start looking to isolate the problem?
Víctor Cespón
Honored Contributor

Re: Excessive data exchange retry rate

First, check the FC switches where the EVAs are connected. On the ports used for replication, look for encoding errors (Encd err outside frames, Loss Sync, Loss link) and compare those numbers to the total frames sent and received.
Then check the long-distance link. You'll need to contact the company providing it, and also check the IP routers if it's an FC-over-IP link.
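
To make the "compare to total frames" step concrete, here is a minimal Python sketch. The counter names and values are made up for illustration; read the real ones off your switch port statistics. It only expresses the error counts as a fraction of all frames, so you can tell a handful of old errors from a steadily bad link.

```python
def error_ratio(errors: dict[str, int], frames_tx: int, frames_rx: int) -> float:
    """Return total link errors as a fraction of all frames sent and received."""
    total_frames = frames_tx + frames_rx
    if total_frames == 0:
        return 0.0  # no traffic yet, nothing to compare against
    return sum(errors.values()) / total_frames


# Hypothetical counters for one replication port (not real switch output)
counters = {"enc_out": 120, "loss_sync": 3, "link_fail": 1}
ratio = error_ratio(counters, frames_tx=40_000_000, frames_rx=22_000_000)
print(f"{ratio:.2e} errors per frame")
```

Take the counters twice, a few hours apart, and compare the difference; counters that keep climbing between readings matter more than a large but static total.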

Re: Excessive data exchange retry rate

Dear colleagues, we have the same issue with 2 x EVA4400.
Moreover, we have 2 x EVA4000 with no such issues on those systems. They replicate over the same ISL, and I'm sure that the link (a direct FC cable) is OK.

What could be the problem?
Thank you
Víctor Cespón
Honored Contributor

Re: Excessive data exchange retry rate

Please be aware that the latest firmware for EVAx400 includes two data replication protocols. Each one needs different settings on the FC switches.

The HP FC replication protocol is the one that has been used until now. It needs port-based routing and in-order delivery set on the SAN switches:

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c00590193

The HP SCSI-Compliant replication protocol is the new one, included in XCS 09534000, and does not need those switch settings.

You probably created the replication between the EVA4400s without checking this.

Check the Continuous Access Implementation Guide.

http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c02491009/c02491009.pdf
Johan Guldmyr
Honored Contributor

Re: Excessive data exchange retry rate

Hi,

the OP had this setup:

"12 Mb point-to-point circuit"

Is that enough??

As far as I understand it, it works like this:

When an EVA has a DR-link and it needs to send something over it, it uses the whole link.

What if you have two EVAs on the same link?
Víctor Cespón
Honored Contributor

Re: Excessive data exchange retry rate

The "Excessive data exchange retry rate" message is usually due to a misconfiguration of the SAN switch parameters. Check the CA implementation guide; it lists which parameters you should set in each case.

The size of the DR link depends on how much data you want to replicate. It should be enough to transmit the average amount of data, which depends on how much is written to the vdisks being replicated. There is no easy way to calculate this; you have to monitor the bytes written to the vdisks.
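
As a rough sketch of that sizing check in Python: the samples below are hypothetical (timestamp, cumulative bytes written) readings, and I'm assuming the OP's "12 Mb" means 12 Mbit/s. It ignores protocol overhead, so treat the result as a lower bound on what the link must carry.

```python
def average_write_mbps(samples: list[tuple[float, int]]) -> float:
    """Average write rate in Mbit/s.

    samples: (timestamp_seconds, cumulative_bytes_written), oldest first.
    """
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (b1 - b0) * 8 / (t1 - t0) / 1_000_000


LINK_MBPS = 12.0  # assumption: "12 Mb" in the original post means 12 Mbit/s

# Hypothetical readings taken one hour apart (not real EVA data)
samples = [(0.0, 0), (3600.0, 2_700_000_000), (7200.0, 7_200_000_000)]
avg = average_write_mbps(samples)
print(f"average write rate: {avg:.1f} Mbit/s, link: {LINK_MBPS} Mbit/s")
if avg > LINK_MBPS:
    print("link too small for the average load")
else:
    print("link covers the average load")
```

Sample over at least a full business cycle (including backups and batch jobs), because a short window can badly understate the average.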

The data is copied to the other EVA in 16 KB chunks, as many as will fit on the link. If you have two replications over the same link, they will share the bandwidth, of course, like two apps on your computer downloading over the same Internet connection.
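
For a feel of the numbers, a back-of-the-envelope Python sketch. Assumptions on my part: 16 KB means 16 * 1024 bytes, the link is 12 Mbit/s, the replications split the bandwidth evenly, and latency and protocol overhead are ignored, so real throughput will be lower.

```python
CHUNK_BYTES = 16 * 1024  # assumption: 16 KB = 16384 bytes


def chunks_per_second(link_mbps: float, replications: int = 1) -> float:
    """Upper bound on 16 KB chunks per second for one replication stream.

    Assumes the link is shared evenly and ignores latency and overhead.
    """
    if replications < 1:
        raise ValueError("replications must be at least 1")
    share_bps = link_mbps * 1_000_000 / replications
    return share_bps / (CHUNK_BYTES * 8)


alone = chunks_per_second(12.0)
shared = chunks_per_second(12.0, replications=2)
print(f"one replication:  {alone:.1f} chunks/s")
print(f"two replications: {shared:.1f} chunks/s each")
```

Under these assumptions a 12 Mbit/s circuit moves well under a hundred chunks per second in total, which gives an idea of how quickly a burst of writes can back up on a link that size.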