StoreVirtual Storage

Primesync - Incremental Snap taking too long

DodiB
Occasional Advisor

Hi,

I'm working on getting a remote snapshot set up using primesync. I've got the first snapshot copy of the volume done from A-B and then from B-C; from my understanding this is effectively the full copy. This snapshot size is approx. 100 GB (thin with no replication).
I've been trying to get the incremental remote copy from A-C working and it seems to be doing something unexpected (at least by me). The snapshot size is 10 GB (thin with 2-way replication), so that's about 5 GB to copy to the remote site. The copy was running for about 18 hours with a bandwidth of 1 Mbps set. The data copied was almost complete, at nearly 5 GB, but the progress of the job was showing as 27%?? I'd watched this for a while and decided I must have done something wrong when I set up the remote snapshot, so I cancelled it and restarted it, but it seems to be behaving the same way.
Is this expected behaviour? It seems like it may be scanning the remote volume to ensure that block consistency is there, but I didn't expect this to happen, and the time it is taking is a factor as I can't leave replication running at this bandwidth during the week and during the day.

Cheers
DB
10 REPLIES
Gauche
Trusted Contributor

Re: Primesync - Incremental Snap taking too long

By my math that copy should take at the very least 11.37 hours to complete because of the bandwidth setting.
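
For anyone checking the arithmetic, here is a rough sketch of that estimate. It assumes about 5 GB to transfer at a sustained 1 Mbps with no protocol overhead, and the function name `transfer_hours` is made up purely for illustration. The 11.37 figure appears to match binary units (1 GB = 1024^3 bytes, 1 Mbps = 1024^2 bits/s); decimal units give a slightly lower number.

```python
def transfer_hours(size_bytes: float, rate_bits_per_sec: float) -> float:
    """Minimum hours to move size_bytes at a sustained rate, ignoring overhead."""
    if rate_bits_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_bytes * 8 / rate_bits_per_sec / 3600


GIB = 1024 ** 3    # bytes in a binary gigabyte (assumption about the units used)
MIBIT = 1024 ** 2  # bits in a binary megabit (assumption about the units used)

binary_hours = transfer_hours(5 * GIB, 1 * MIBIT)
decimal_hours = transfer_hours(5e9, 1e6)

print(f"{binary_hours:.2f} h with binary units")
print(f"{decimal_hours:.2f} h with decimal units")
```

Either way the lower bound is a little over 11 hours for the data alone.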

I think it is reporting a lower % than you see in data complete because that % also takes into account the scanning of the snapshot being copied, not just the copy.

Also, check your management group bandwidth. The remote copy bandwidth is dependent on the MG bandwidth.
For example, if your remote copy bandwidth is set to 1 Mbps but the management group bandwidth of either management group (sending or receiving) is lower, say 0.5 Mbps, the copy will go at the slower speed because the bandwidth limiters must be enforced.
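
As a rough illustration of that rule (a sketch of the behaviour described here, not of how SAN/iQ implements it), the copy runs at the lowest of the three limits. The function name and the 4.0 Mbps value for the sending group are hypothetical; the 1 Mbps and 0.5 Mbps values are the ones from the example above.

```python
def effective_copy_rate_mbps(remote_copy_mbps: float,
                             sending_mg_mbps: float,
                             receiving_mg_mbps: float) -> float:
    """Return the rate the copy can actually reach: the slowest limiter wins."""
    return min(remote_copy_mbps, sending_mg_mbps, receiving_mg_mbps)


# Remote copy set to 1 Mbps, sending group at a hypothetical 4 Mbps,
# receiving group limited to 0.5 Mbps.
rate = effective_copy_rate_mbps(1.0, 4.0, 0.5)
print(f"effective rate: {rate} Mbps")
```

So raising only the remote copy setting would not help while one of the management groups is capped lower.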

Hope that helps.
Adam C, LeftHand Product Manager
DodiB
Occasional Advisor

Re: Primesync - Incremental Snap taking too long

Hi Gauche,

I ended up just letting this run, and in the end it ran for a number of days and copied the whole volume from what I could tell. That is not how I expected this to work at all, but I thought that maybe I had done something wrong. I'm doing another primesync just now but it seems to be behaving the same way. Is there an up-to-date detailed procedure for this, as the current document seems to be based on SANiQ 6.5? One of the things that's unclear is the creation of the snapshot from A-C: is it correct that I would need to delete the source snapshot on the B mgmt group that corresponds to the C full snap before I can use the target volume on C as the target volume for the incremental snap from A-C?

Cheers
DB
Bryan McMullan
Trusted Contributor

Re: Primesync - Incremental Snap taking too long

What was the exact amount of data that was being copied for the remote snapshot? (The one after the prime?)

You can't really estimate correctly based on snapshot size what the remote snap transfer size will be as it does a remote differential and just transfers the changed bytes. I've also seen many times that the estimated time and % will be off, sometimes greatly. When you hit that 5GB mark where it looks almost done, what is the estimated time remaining on the transfer?

Gauche's estimate of about 11.5 hours looks pretty good to me (based on my estimates using my 30 remote snaps), but that would be at a sustained 1 Mbps transfer rate. Depending on your site-to-site connection (is it a true point-to-point, VPN over the internet, etc.), you may not be reaching that rate at all times.
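
To put a number on that, here is a small sketch that works backwards from the figures in the original post (nearly 5 GB copied in about 18 hours) to the average rate the link actually delivered. It assumes decimal units and that all 18 hours were spent transferring data, which may not be true if part of the time went on scanning; `average_rate_mbps` is a made-up helper name.

```python
def average_rate_mbps(bytes_copied: float, elapsed_hours: float) -> float:
    """Average throughput in Mbps (10**6 bits/s) over the elapsed time."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return bytes_copied * 8 / (elapsed_hours * 3600) / 1e6


# Roughly 5 GB copied in roughly 18 hours, as reported in the first post.
observed = average_rate_mbps(5e9, 18)
print(f"average rate: {observed:.2f} Mbps")
```

Under those assumptions the average works out to roughly 0.6 Mbps, noticeably below the 1 Mbps that was configured.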

What is your cluster bandwidth set to on both sides? If it is set too low, that may cause additional delay as well. The recommendation that I've received from LHN support is 4MB for normal operation.

Bryan McMullan
Trusted Contributor

Re: Primesync - Incremental Snap taking too long

Sorry DB, I didn't realize your original question was from 4 months ago. Ignore my previous response.

As for your new question, I don't believe that to be the case. When you are setting up a snap from A to C, can you just name the remote volume differently? That should prevent you from needing to delete it. Unless I'm not really understanding the question.

Here's the latest copy that I'm aware of for the remote snaps:

http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01727843/c01727843.pdf?jumpid=reg_R1002_USEN
DodiB
Occasional Advisor

Re: Primesync - Incremental Snap taking too long

Hi Bryan,

Thanks for the reply and the link. Unfortunately I don't see anything in that guide to help in this particular scenario (although it'll be useful for others!). The primesync method I'm trying to use here is centred on using a temporary mgmt group that I physically move between the sites so that I don't have to take days to replicate the first snapshot over the WAN. Where I'm having a problem is the final part of this. So I have three sites:
Site A (Source) - Site B (Primesync) - Site C (Target)
I want to copy VolumeA on Site A to Site C and set up scheduled remote snapshots.
1. I set up Site B at the same location as Site A.
2. I do a remote snap from A-B so I have a remote copy of VolumeA on Site B called SiteB_RC_VolumeA.
3. I then shut down the SiteB storage server.
4. I drive it to Site C and put it on the network there.
5. I set up a remote copy from SiteB to SiteC so I end up with SiteC_RC_VolumeA.
(Up to here all is working fine and as expected)
6. I now try to establish remote snapshots between SiteA_VolumeA and SiteC_RC_VolumeA, on the assumption that SiteA will understand that SiteC_RC_VolumeA is an existing remote snapshot of VolumeA and will simply do an incremental remote snapshot.
7. In order to make SiteC_RC_VolumeA visible to the remote snapshot wizard as a valid target, I need to delete the snapshot on SiteB that was created when doing the remote snapshot of SiteB_RC_VolumeA to SiteC_RC_VolumeA.
8. I run through the wizard and start replication.
However, this is where it falls down: it seems to end up being a full snapshot, which is what I was trying to avoid originally. The docs for this 'feature' are a little old and unfortunately don't go into massive detail about the exact steps to do between Site B and Site C (http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01750188/c01750188.pdf?jumpid=reg_R1002_USEN)

Cheers
DB
Bryan McMullan
Trusted Contributor

Re: Primesync - Incremental Snap taking too long

I think I see where you are going. When you do the snapshot to site B and then shut the node down and move it to site C, are you changing the cluster on it? Or is this really just one node, where maybe only the IP changes when you drive it from site B to C?

If you are adding it to a different cluster, it will lose all data that was on it previously and the data that is on the new cluster overwrites the old data as it is restriped. Then when you try to push it again from site A to C, it has to make a brand new copy.

If it is just a single node cluster, it makes no sense that it is recreating the full push. Though I would think you may be better off just grabbing the node from site B and trying it at site A to guarantee it is working, and then moving it to site C (if that is possible). Let it run through a few remote snaps, as it will be extremely fast on a LAN.

DodiB
Occasional Advisor

Re: Primesync - Incremental Snap taking too long

Hi Bryan,

I'm keeping the management group and just changing the IP on B when I move it to SiteC, so each site effectively represents a separate management group.
At this stage the replication is so far through, and as this is production, I'm not really keen on doing many more retries.
It should work and I don't see why it isn't, which is the frustrating bit. It seems like one of those situations where a small detail or intricacy is all that's stopping it from working. Pity, as it's a great feature to have, but once bitten, twice shy, as they say!

Cheers
DB
Bryan McMullan
Trusted Contributor

Re: Primesync - Incremental Snap taking too long

DB,

You are right, it should be working fine. The system should have no idea that you are moving the node from one site to the other if you are not changing anything but the IP.

We recently did a datacenter move in July, and I moved our production cluster (35 nodes) from one location to another and updated all the IPs. All remote snaps continued to work as expected with no issue. We did the same with our remote site last year and did not experience what you are seeing.

Is the remote snapshot you are doing scheduled? Or are you manually starting the snapshot? If it is manual each time, I don't think it will do a differential and will instead copy everything.

If you are doing it manually, could you try setting up a scheduled remote snapshot and when you get the prime over, just pause the schedule. When you move the node from site B to site C, then re-enable the schedule. That should allow the differential copy to work.

Hope that helps!

Bryan
BlueBerryIT
Occasional Visitor

Re: Primesync - Incremental Snap taking too long

Hi there,

Did you ever get this working? I have a need to do this for a customer and would love to know if you managed to get it working. There seems to be very little information out in the ether.

Thanks,

Richard