Shadow Copy Breaks

Uwe Zessin · ‎10-01-2004

Volker,

has the behavior of a shadow copy operation changed? In the old days it did _always_ read from both disks and only wrote differences to the copy target. I remember this was _very_ visible in an environment with FDDI.

In 1997 I was working on a project where we were mirroring 50 GigaByte between two data centers. It was stored on two HS211(?) FDDI servers (each one a pair of AlphaServer 1000 with the disk storage connected via DSSI-based HS1CP controllers).

A new copy of a dropped member took about 6 hours. Somebody configured a different stripe size between both storage arrays. We corrected it and restarted a copy. Of course the disks looked completely different now and the copy took about 11 hours!


Ian Miller. · ‎10-02-2004

Shadow copy reads from both and writes (if the data differs) to the copy target.
From the shadow manual:
"Read I/O requests to either side of the copy fence are serviced only from a source shadow set member.

Write I/O requests before or at the fence are issued in parallel to all members of the shadow set.

Write I/O requests, after the fence, are completed first to source members, then to copy target members."

So the Sybase process will not be reading from the shadow target disk at all.

you may want to look at
http://www2.openvms.org/kparris/s2003_volshad_perf.ppt

____________________
Purely Personal Opinion

Keith Parris · ‎10-06-2004

What type of disk controllers are involved?

What type of caching is enabled on the controllers? Read? Writeback? Read_Ahead? Maximum cache transfer size parameters (for read and write)?

Does the data on the shadow-copy target disk tend to be similar to the source disk, or is it typically very different?

Any chance you could use Mini-Copy to re-integrate the disk?

Is the active Shadow_Server process which is performing the copy operation located on a node at the site of the source disk, or of the target disk? (You may get some benefit from having the Shadow_Server at the target disk site do the Copy, so the shadow-copy writes will be local to that site.)

What is the value of the SYSGEN parameter NISCS_MAX_PKTSZ?

I understand Sybase is reading the source disk. Any chance of making that source disk a 3-member or 4-member mirrorset and spread I/Os across more spindles?

Similarly, any chance of spreading the Sybase database across multiple storage units (e.g. with host-based RAID software and striping -- RAID 0 -- actually RAID 0+1 -- arrays) to provide more spindles for Sybase reads? (and then you could continue to use SHADOW_MAX_COPY=1 to limit shadow copy activity to just one of these spindles at a time if you wish)

If I understand your situation correctly, there is only 1 source disk and 1 target disk present, so you only have 1 choice of disk to read from and the 'Site' feature for member selection for reads won't be effective until the shadow copy operation completes and there is at least one member at each of the two sites.

Keith Parris · ‎10-07-2004

One more note on the SITE feature -- this is designed for multi-site clusters where there is significant distance (and thus latency) between the sites, such that it is of more benefit to performance to direct all reads to a local shadowset member than to spread the reads across multiple members. (I worked with one such cluster, where the inter-site distance was around 400 km and the measured round-trip inter-site latency was 4.4 milliseconds, which meant remote read requests had an additional latency of 4.4 milliseconds compared with local read requests.)

Unless your inter-site distance is large (say 100 km or more) or the inter-site latency is high (say a millisecond or more), it may be better to simply spread reads across the disks at both sites.
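As a rough cross-check of those figures, here is a minimal sketch of the propagation-only round trip over fiber. It assumes light travels at about 200,000 km/s in fiber and that the cable run equals the quoted site distance; both are simplifying assumptions, and the function name is purely illustrative.

```python
# Assumption: signal speed in fiber is roughly 200,000 km/s, i.e. 200 km per ms.
FIBER_KM_PER_MS = 200.0

def round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time for a fiber path of the given length."""
    return 2 * distance_km / FIBER_KM_PER_MS

for km in (100, 400):
    print(f"{km} km -> {round_trip_ms(km):.1f} ms round trip")
```

Under these assumptions 400 km gives 4 ms, close to the 4.4 ms measured above, and 100 km gives about 1 ms, which lines up with the "100 km or a millisecond" rule of thumb.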

Given your Sybase application seems to be sensitive to read performance, and assuming your inter-site distance is not large, you may not want to use the SITE feature at all. Alternatively, you may wish to carefully select your own READ_COST values so the 2nd site will start taking some of the load when the 1st site starts to get busy. READ_COST is added to queue length when selecting a member for a read, so if, for example, you had the READ_COST for remote members set higher by 2 than the READ_COST for local members, reads would normally go to the local member unless the queue depth on the local disk got up to 2; then some reads would start going to the remote member.
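The selection rule described here can be sketched as a toy model. This is not the Shadowing driver's code: the READ_COST values (1 and 3), the member names, and the tie-breaking rule (ties go to the first member listed) are all assumptions made for illustration.

```python
def pick_member(members):
    """Return the name of the member with the lowest READ_COST + queue length.

    members: list of (name, read_cost, queue_length) tuples.
    Ties go to the earliest entry in the list (an assumption of this sketch).
    """
    return min(members, key=lambda m: m[1] + m[2])[0]

# Hypothetical costs: the remote member is set 2 higher than the local one.
for local_queue in range(4):
    members = [("local", 1, local_queue), ("remote", 3, 0)]
    print(f"local queue depth {local_queue}: read goes to {pick_member(members)}")
```

With these values the local member wins while its queue is shallower than 2, the two members tie at a depth of 2, and the remote member wins once the local queue is deeper than that.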

Wim Van den Wyngaert · ‎10-24-2004

So the conclusion is that there is no real way to slow down the shadow copy.

Of course there are ways to move the load to other disks but this is not a solution to my (general) problem.

Wim


Keith Parris · ‎10-25-2004

I don't think it's correct to say there is no way to throttle a Copy operation. (I had hoped you would supply answers to the questions I posed in my first post so that additional suggestions could be made.)

Re: Why can we throttle Merges but not Copies?

A Volume Shadowing Copy operation is designed to restore redundancy, and it is initiated under the direct control of the system manager (by doing a MOUNT command). So Shadowing tries to do its best to restore complete redundancy as soon as possible. If redundancy isn't as important as performance, one could defer the start of the Copy operation.

A Merge, on the other hand, will be initiated by events beyond the control of the system manager (a system crash or failure on a node which has a shadowset mounted), and may occur at times which may be inconvenient, so the initial design, as well as later additions like the SHAD$MERGE* logical names, were intended to minimize the impact of the Merge on user application I/Os.

Re: What can we do to throttle Copy work?

SHADOW_MAX_COPY is one control knob, but since you have the Sybase data on only a single disk unit, this has limited control.

If you used host-based RAID software to form a stripeset of shadowsets (RAID 0+1 array), you could leave SHADOW_MAX_COPY set to 1 and only one of the "n" shadowsets within that array would be doing a Copy at once, and also, as each of them completed in turn, that fraction of the total disk area would then become available for Sybase reads. This means performance would increase over time during the Copies. With only a single disk unit, the target unit only becomes available to service read operations once the entire area has been copied.

There are also a number of optimizations noted in my presentation from Amsterdam on Volume Shadowing Performance cited earlier in this thread. You can "undo" (or do the "opposite" of) some of those optimizations if they tend to favor Copy operations over Sybase reads. For example, if most of the Sybase reads are random, controller read-ahead caching would benefit Copy operations more than Sybase reads, so you might want to turn read-ahead caching off, at least during the Copy. Reads seem to be what Sybase needs most here, so turning off write-back caching on the Copy target unit might slow the Copy. Similarly, if Sybase reads tend to be very small (we know that Shadowing Copy operations will be large: 127 blocks in size), then using a smaller value for NISCS_MAX_PKTSZ might slow down the Copy.

If the data tends to be very different at the time you re-introduce the target disk, then one optimization is to do a BACKUP/PHYSICAL operation beforehand to make the data (mostly) identical, which changes the target disk operations from a combination of reads and writes to a sequential stream of reads, but that would tend to run faster and have higher impact on Sybase reads. So in your case, where it is the source volume where performance is more critical, it might tend to throttle the Copy operation and have a lower impact on the source volume if you instead did an $INITIALIZE/ERASE on the target volume before adding it to the shadowset.

If the data tends to be very similar at the time you re-introduce the target disk, then it would be best to use a Mini-Copy if possible.

Wim Van den Wyngaert · ‎10-25-2004

Keith,

I didn't answer your questions because I am not interested in solving 1 case but I want to solve all cases.

So, the best solution (for me) is a backoff in case of heavy usage of the disk (e.g. when the throughput on the disk is higher than x, suspend the copy until it drops back below x). Perhaps also based on controller-level throughput?

We use mini-copy where possible and have max-copy on 1 (and the other node on 0). I'm not going to play with the controller parameters/settings.

Thanks anyway.

Wim


Wim Van den Wyngaert · ‎04-03-2005

Made a simple solution myself.

$ Begin:
$ set proc shadow_server/susp      ! pause the copy
$ wait 00:00:00.1                  ! time spent suspended
$ set proc shadow_server/nosusp    ! let the copy run again
$ wait 00:00:00.1                  ! time spent running
$ goto begin

Run this while the shadow copy is busy and DIO rate goes from 50 to 25 (on my test machine). Make the first wait .2 and DIO goes down to 19. .5 results in DIO of 9.

It's a farmer's trick but it works.
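The reported numbers are roughly what a simple duty-cycle model predicts. A minimal sketch, assuming the copy's DIO rate scales with the fraction of each loop iteration during which SHADOW_SERVER is not suspended (the function name and the model itself are assumptions, not something measured here):

```python
def expected_dio(base_rate, suspended_s, running_s=0.1):
    """Scale the unthrottled DIO rate by the fraction of each loop spent running."""
    return base_rate * running_s / (running_s + suspended_s)

# (first wait in seconds, DIO rate reported in the post)
reported = [(0.1, 25), (0.2, 19), (0.5, 9)]
for suspended_s, observed in reported:
    model = expected_dio(50, suspended_s)
    print(f"first wait {suspended_s}: model {model:.1f}, reported {observed}")
```

The model gives 25, about 16.7 and about 8.3 against the reported 25, 19 and 9, so the throttle behaves close to proportionally on this test machine.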

Wim


Robert Brooks_1 · ‎04-08-2005

Wim wrote . . .

Begin:
$ set proc shadow_server/susp
$ wait 00:00:00.1
$ set proc shadow_server/nosusp
$ wait 00:00:00.1
$ goto begin

Run this while the shadow copy is busy and DIO rate goes from 50 to 25 (on my test machine). Make the first wait .2 and DIO goes down to 19. .5 results in DIO of 9.

It's a farmers trick but it works.

--

Enjoy that hack while you can; we've made a change that will make the SHADOW_SERVER process unsuspendable.

We're also making it unswappable; we've had a couple of cases where bad things happened when that process attempted to outswap to a shadowed device where writes were temporarily halted while HBMM bitmaps were being cleared. It resulted in a deadlock condition.

As a result, we're making every effort to ensure that the SHADOW_SERVER will always be available. With the advent of HBMM, the SHADOW_SERVER is a very important process; even if you are not using HBMM, the SHADOW_SERVER is now needed for any mounting of a shadow set.

The above changes will first be available in future MANAGE and SHADOWING kits for V7.3-2 and V8.2. The first shipping release to contain these changes will be V8.2-1 (I64 only) and V8.3 (Alpha and I64).

We are aware of the need to put the brakes on full copies, just as one can with full merges.

We're looking at what can be done.

-- Rob Brooks

Wim Van den Wyngaert · ‎04-10-2005

Thanks for the warning. Hope the brakes with ABS get installed before 8.4 (if I ever upgrade it will be around 2007).

Wim


Jan van den Ende · ‎04-10-2005

ABS = Anti Blockier System.
To those unfamiliar with that German name: the more common English branding for the concept is ALB for Anti Lock Braking.

hth

Proost.

Have one on me.

Jan

Don't rust yours pelled jacker to fine doll missed aches.

Wim Van den Wyngaert · ‎04-10-2005

Anti-lock Brake System (ABS)(Toyota.com).

I guess it is ABS in English ...

Wim


Jan van den Ende · ‎04-10-2005

Thanks, Wim,
I didn't know that one :-(
Shows one is never too old to learn.

Proost.

Have one (Duvel) on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.

Wim Van den Wyngaert · ‎06-15-2005

Another possibility in 7.3+.

$ mc sysman class add sha /cpu=0-23=1%
$ set proc /id=xxx /class=sha
where xxx is the id of shadow server.

But it won't help much because the process only takes 1-3% CPU.

Wim


Garry Fruth · ‎06-15-2005

Speeding up the shadow copy may be the solution. I question the idea that a single shadow copy consumes so many resources that it severely affects performance. Take a closer look at Ian's response regarding the fence. The slow down may be due to the fact that disks are not in a steady state, and read performance suffers.

Is application performance poor during slow merge operations? If yes, it seems reasonable that a slow copy operation would also adversely affect performance for a similar reason - read performance beyond the fence.

BTW, there were many responses to this thread, I did not read them all. Forgive me if I have repeated someone else or headed off the mark.

Wim Van den Wyngaert · ‎06-15-2005

Garry,

The proof : the application takes 18 instead of 3 minutes to do certain things (Sybase dbcc, no other activities on the cluster).

The reason: this is an FDDI cluster and the I/O bottleneck is the FDDI link, which can only handle 10 MByte/s.

So, I should be able to tell that e.g. no more than 10% of the FDDI capacity should be used for shadow copy, when other activity is present (backoff principle ?). The cpu class could limit it to e.g. 50%. But in my case about 80% of the capacity was taken. So, 50% would be an improvement.

Wim@home

Wim

Wim Van den Wyngaert · ‎06-15-2005

Btw : the bcp operation is a WRITE operation.
So, FDDI is used.

Wim

