
Re: Shadow Copy Breaks

 
SOLVED
Volker Halle
Honored Contributor

Re: Shadow Copy Breaks

Wim,

I don't think shadow-copy is 'something special'. SHADOW_SERVER just issues huge IOs (127. blocks) as fast as possible to both disks involved in the shadow-copy, which will cost bandwidth in the FDDI link.

The CPU usage accounted to SHADOW_SERVER is not everything. Do you have MONI MODE data from the shadow-copy ? MONI MSCP on the remote site ? MONI DLOCK ?

You could change the IO-size with DEFINE/SYS SHAD$COPY_BUFFER_SIZE nn (with 31 <= nn <= 127).
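
To put rough numbers on those IO sizes, here is a back-of-the-envelope sketch in Python. It assumes 512-byte disk blocks and the nominal 100 Mbit/s FDDI rate with no protocol overhead; both are assumptions, the function names are illustrative, and real throughput will be lower.

```python
BLOCK_BYTES = 512                 # assumption: 512-byte disk blocks
FDDI_BITS_PER_SEC = 100_000_000   # assumption: nominal FDDI rate, no overhead


def io_bytes(blocks):
    """Size in bytes of one copy IO of the given number of blocks."""
    return blocks * BLOCK_BYTES


def max_ios_per_second(blocks, link_bits_per_sec=FDDI_BITS_PER_SEC):
    """IO rate at which transfers of this size would fill the link."""
    return (link_bits_per_sec / 8) / io_bytes(blocks)


# Compare the smallest, a middle, and the largest buffer size from the range above.
for blocks in (31, 64, 127):
    print(f"{blocks:3d} blocks = {io_bytes(blocks):6d} bytes/IO, "
          f"link full at about {max_ios_per_second(blocks):.0f} IOs/s")
```

Under these assumptions, roughly 190 IOs per second at 127 blocks each would already fill the link.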

Volker.

PS: I'm currently on vacation, so response time may be slower than usual ;-)
Jan van den Ende
Honored Contributor

Re: Shadow Copy Breaks

Wim,

without pointing at anything special (yet), I think something _IS_ wrong at your site.
WE also have inter-site FDDI (backed initially by 10 Mb, now by 100 Mb Ethernet),
and we have found that shadow copies have little (<50%) influence on response times, as long as we keep the number of concurrent copies at no more than three,
unless a process has heavy IO to a disk currently under copy; THEN performance may drop to 40-50%.

We have had enough opportunity to get to those stats:
our backups were done by taking one member out of the set, and adding in "the fourth member". Then the 3rd member was backed up to tape, and served as a hot-standby backup.
Since MiniCopy (and SAN) were introduced, we reverted to a more conventional scheme, since the 4-disk scheme breaks MiniCopy.

The reason for introducing the 4-disk scheme way back when was a glitch in the way one application communicated with a remote system. All was smooth in DECnet times, but the (mandated) switch to IP initially required a restore about once a week. We soon learned that swapping the disk plus a shadow copy took minutes, whereas restoring the tape and then doing a shadow copy took over an hour, and was otherwise a hindrance too. Now the app is better adapted to IP, so the balance has shifted again.

What is causing your performance loss?
No idea yet, but my bets are that something at your site is sub-optimal.
Exactly WHAT? That is the $1000 question.

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Wim Van den Wyngaert
Honored Contributor

Re: Shadow Copy Breaks

Extra info for the 1000$ question :

The shadow server was doing 1 copy at the time, and 1 disk was MSCP served (thus going over the FDDI).

It was confirmed that the normal time is 2-3 minutes and that it is now 18. Sybase is doing IOs with a size of 60 pages.

Sybase runs with prio=3, shadow_server with 4.

No logicals *shad* active except shad$merge_delay_factor (not used for copy).

The MSCP params :
MSCP_LOAD 1
MSCP_SERVE_ALL 1
MSCP_BUFFER 16384
MSCP_CREDITS 128
MSCP_CMD_TMO 0
Wim
labadie_1
Honored Contributor

Re: Shadow Copy Breaks

Check Mscp is fine with

[OpenVMS] MSCP Disk Server, Usage And Performance Tuning


http://h18000.www1.hp.com/support/asktima/operating_systems/009324D2-0E9A6D00-1C01E7.html

or simply
$ mc agen$feedback
$ sea sys$system:agen$feedback.dat mscp

Do you have a high percentage of mscp_frag_io and mscp_wait_io
compared to mscp_total_io?
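
A minimal sketch of that comparison in Python, assuming you have already read the three counters out of the feedback file by hand. The function name and the sample numbers are made up for illustration, and nothing here parses AGEN$FEEDBACK.DAT itself.

```python
def mscp_io_percentages(total_io, frag_io, wait_io):
    """Return (fragmented %, waited %) relative to all MSCP-served IOs."""
    if total_io <= 0:
        return (0.0, 0.0)  # nothing served yet, so no meaningful ratio
    return (100.0 * frag_io / total_io, 100.0 * wait_io / total_io)


# Hypothetical counter values, not taken from any real system.
frag_pct, wait_pct = mscp_io_percentages(total_io=200_000, frag_io=9_000, wait_io=1_000)
print(f"fragmented: {frag_pct:.1f}%  waited: {wait_pct:.1f}%")
```

Fragmented and waiting requests are generally associated with the MSCP server running short of buffer space, so a noticeable percentage would make MSCP_BUFFER worth a look.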
Wim Van den Wyngaert
Honored Contributor

Re: Shadow Copy Breaks

Gerard,

Wait and frag are both zero.

I decreased the priority of the shadow server to 1. Hope this will improve Sybase throughput. In any case, can not do much harm.

Wim
labadie_1
Honored Contributor

Re: Shadow Copy Breaks

Can you take a network analyzer and monitor the protocols on your FDDI, so as to be able to say that from 9 to 10 this morning, on the FDDI link, there was
- 68 % of 60-07 (Cluster protocol)
- 23 % of Tcpip
- 3 % of ...

there was a high (or low) percentage of retransmitted frames

the top ten talkers were alpha1 and alpha2, followed by alpha12 and alpha13...

Or ask your network colleague (or yourself) to do it.

Jan van den Ende
Honored Contributor

Re: Shadow Copy Breaks



"In any case, can not do much harm."

.. and probably will not help much, either.

Since you have no CPU load problems, every process that reaches COM state will not have to wait for the CPU, and can do its thing.
One such thing is launching IO.
And those, once launched, are not influenced by process priority, only by IPL.

OTOH, if it DOES make a difference, THEN we enter interesting territory.

.. and certainly, Duvel ranks quite high on my Preferred List. Dentergems Wit also scores strongly, and Koninck, and various Tripels, and ... it is making me thirsty!


Cheers.

Have one on me!

Jan

Don't rust yours pelled jacker to fine doll missed aches.
Volker Halle
Honored Contributor

Re: Shadow Copy Breaks

Wim,

if the Sybase performance 'suffered' through a parallel shadow-copy, you have to ask yourself, which resource might have been the bottleneck:

- SHADOW_SERVER will read from the source disk and write to the target disk. So it needs some CPU cycles on the node where it's running, some on the MSCP server node, bandwidth and IOs on the local and remote disk channel and disks and lots of bandwidth on the FDDI

- Sybase also wants to be doing IOs, so it also needs CPU cycles and bandwidth and IOs to the disks involved. If the IOs are mostly reads, they should come from the local disk, if they are mostly writes, then FDDI bandwidth is also required.

I would start with making a drawing of all the components involved, try to collect some monitor data (MONI MODE, MONI MSCP, MONI SCS, MONI DLOCK) on both nodes during the shadow-copy and then try to interpret the results - looking for unusual load or behaviour.

This is not some kind of 'problem' which can easily be solved by one or two questions and a solution in this kind of forum.

Volker.
labadie_1
Honored Contributor

Re: Shadow Copy Breaks

I would add to Volker's monitor items
$ monitor rlock
labadie_1
Honored Contributor

Re: Shadow Copy Breaks

on both nodes
$ ana/sys
lck sh active
lck stat/toptree=10
may highlight some things.