Array Performance and Data Protection

Any way to skip or restart a replication job? (Speed issues)

 
Occasional Advisor


Hey all, we've noticed that our arrays (AF-7000 running 5.0.10.0-742719-opt) have severe replication issues if the replication process is interrupted in any way. We have a number of LUNs that get hourly snapshots, which then replicate at the top of each hour. When all is well, snapshot replication across our site-to-site VPN runs at upwards of 500 Mbit/sec; it has worked reliably for years.

Sunday saw the global CenturyLink outage, which impacted our site-to-site VPN. Once that event was over, the VPN was stable and fast, but the arrays crawled along for over 12 hours at a few Mbit/sec while completing the outstanding replication jobs. Once those were done, replication immediately resumed normal speed. Support tunneled in while this was happening and noted that iperf reported 300-500 Mbit/sec of bandwidth, but did not diagnose further before the problem resolved itself.

Today we had another network event, this time lasting only a few seconds. The VPN re-established in less than 10 seconds, but that was all it took to throw the array into the same weird state, where it's cranking along at a whopping 3 Mbit/sec. I'm watching it with the CLI (stats --replication --partner) and it's bouncing around from zero to 1 MB of data, averaging 3 Mbit/sec. I stopped and resumed replication; it stops, then resumes at the same horrible speed.
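For anyone wanting to sanity-check an average like this, here is a minimal sketch. It assumes byte counts have been copied by hand from the stats output at a fixed interval; the function names, sample values, and backlog size are hypothetical and do not come from the array.

```python
def average_mbit_per_sec(bytes_per_interval, interval_seconds=1.0):
    """Average throughput in Mbit/s from byte counts sampled at a fixed interval."""
    if not bytes_per_interval:
        raise ValueError("need at least one sample")
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    total_bits = sum(bytes_per_interval) * 8
    elapsed_seconds = len(bytes_per_interval) * interval_seconds
    return total_bits / elapsed_seconds / 1_000_000


def hours_to_drain(backlog_gib, mbit_per_sec):
    """Rough time in hours to replicate a backlog (in GiB) at a steady rate."""
    if mbit_per_sec <= 0:
        raise ValueError("mbit_per_sec must be positive")
    backlog_bits = backlog_gib * 1024**3 * 8
    return backlog_bits / (mbit_per_sec * 1_000_000) / 3600


if __name__ == "__main__":
    # Hypothetical per-second samples bouncing between zero and 1 MB.
    samples = [0, 1_000_000, 125_000, 375_000]
    rate = average_mbit_per_sec(samples)
    print(f"average: {rate:.1f} Mbit/s")
    # Hypothetical 10 GiB backlog at the degraded rate and at the normal rate.
    print(f"degraded: {hours_to_drain(10, rate):.1f} h")
    print(f"normal:   {hours_to_drain(10, 500):.2f} h")
```

Comparing the degraded and normal estimates for the same backlog shows why a few queued hourly snapshots can take many hours to clear at this rate.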

I'm sure that, just like Sunday, it will go back to normal whenever it decides to finish. I'm curious whether there's a way to force it to skip the current snaps, or to abandon what has been done and start fresh on the current snapshot, as I suspect that would also clear this issue.

Anyone else seen similar behavior?

HPE Pro

Re: Any way to skip or restart a replication job? (Speed issues)

The array does not drop speed after recovering from a network outage. This looks like an ISP issue. We have seen ISPs route the VPN through a different path after an outage, so it is recommended to check with your ISP about this behaviour.

I would suggest contacting Nimble Support again. They will do a deep dive to verify that everything is fine from the array's perspective.

1. Is this unidirectional or bidirectional replication?

2. Which network is used for replication (management or data), and what is the port speed?

3. Have you set any throttle for replication?

You can pause and resume replication in the volume collection section.


I work for HPE


Occasional Advisor

Re: Any way to skip or restart a replication job? (Speed issues)

Thanks; I've got a new ticket open. The issue was only occurring in one direction; the opposite array returned to normal performance at the conclusion of the network event. Replication runs over the management interfaces, with no throttling.