Operating System - OpenVMS

Re: Backup taking locks ?

 
John Gillings
Honored Contributor

Re: Backup taking locks ?

Wim,

Rerun your test with SET WATCH FILE/CLASS=MAJOR enabled on the file creation thread. That may give you a clue to the exact sequence of events. If you have enough log file space, doing the same on the backup side may also be interesting (but without timestamps you may not be able to correlate the sequences).

However, your results will be of academic interest only. Maybe people will eventually realise that it is simply NOT POSSIBLE, even in theory, to take a reliable, useful backup of any storage that is undergoing active, uncooperative, unsynchronised changes.

It would almost be better if BACKUP/IMAGE held a doorbell lock on the volume and, at the first sign of any change, simply stopped with:

%BACKUP-F-USELESS, Volume has changed, no point in continuing

or maybe change /IGNORE=INTERLOCK to /WASTE_OF_TIME

Perhaps this would convince people who insist on taking these risks to develop a reliable backup strategy?
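
To make the "stop at the first sign of any change" idea concrete, here is a minimal sketch in Python. It is not how BACKUP or the lock manager works; the `Volume` class, its `generation` counter and the `on_block_copied` hook are all hypothetical names invented for the illustration, with the counter standing in for the doorbell lock.

```python
class VolumeChanged(Exception):
    """Raised when the toy volume is modified while a copy is in progress."""


class Volume:
    """Toy volume: a list of blocks plus a counter bumped on every write."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.generation = 0  # stands in for the "doorbell" (assumption)

    def write(self, index, value):
        self.blocks[index] = value
        self.generation += 1


def strict_copy(volume, on_block_copied=None):
    """Copy block by block, giving up as soon as any write is noticed."""
    start = volume.generation
    copy = []
    for index, block in enumerate(volume.blocks):
        if volume.generation != start:
            raise VolumeChanged(f"volume changed before block {index}")
        copy.append(block)
        if on_block_copied is not None:
            # Hook that lets a caller simulate a concurrent writer.
            on_block_copied(index)
    if volume.generation != start:
        raise VolumeChanged("volume changed while copying the last block")
    return copy
```

On a quiet volume `strict_copy` returns a complete copy; if anything writes to the volume part-way through, it raises `VolumeChanged` instead of producing a copy of unknown quality.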
A crucible of informative mistakes
Jon Pinkley
Honored Contributor

Re: Backup taking locks ?

Do you use a photocopier? No matter how good the copier is, the copies will never be perfect, but that doesn't necessarily mean the copies are not useful. The copies may not be admissible as evidence, but for many purposes, having an imperfect copy is better than no copy.

I am sure there are many people who have had disk failures and would be happy to have an imperfect copy of the drive instead of nothing.

I agree with you that any backup made of a disk that is mounted for shared write access and has active writers will have inconsistencies, as backup isn't instantaneous. Even splitting a shadow set member or using controller-based "point-in-time" copies doesn't solve the problem of synchronizing with applications and their in-memory buffers, although any point-in-time method is better than backup/ignore=interlock of an active disk.
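
A toy illustration of why a non-instantaneous copy ends up inconsistent, and why a point-in-time copy does better. This is a sketch in Python, not a model of BACKUP or of shadowing: the two-block "disk", the `transfer` writer and both copy functions are made up for the example, and it ignores the in-memory buffer problem entirely.

```python
def transfer(disk, src, dst, amount):
    """Application update touching two blocks; their sum should never change."""
    disk[src] -= amount
    disk[dst] += amount


def live_copy(disk, writer, write_after_block):
    """Copy block by block while the writer runs part-way through the copy."""
    copy = []
    for index in range(len(disk)):
        copy.append(disk[index])
        if index == write_after_block:
            writer(disk)  # the application changes the disk mid-copy
    return copy


def point_in_time_copy(disk, writer, write_after_block):
    """Freeze the contents first, then copy from the frozen image."""
    frozen = list(disk)  # stands in for a split shadow member or snapshot
    copy = []
    for index in range(len(frozen)):
        copy.append(frozen[index])
        if index == write_after_block:
            writer(disk)  # the live disk still changes, the frozen image does not
    return copy


def move_30(disk):
    transfer(disk, 0, 1, 30)


torn = live_copy([100, 0], move_30, write_after_block=0)
clean = point_in_time_copy([100, 0], move_30, write_after_block=0)
print(torn, sum(torn))    # [100, 30] 130
print(clean, sum(clean))  # [100, 0] 100
```

The live copy holds the old value of block 0 and the new value of block 1, a state the disk was never actually in. The point-in-time copy is at least a state that really existed on disk, which is all that "better" means here.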

I claim that a backup/image without /ignore=interlock of an active write-shared disk is more than a waste of time; it can cause locking problems for active applications. So while /ignore=interlock may be a waste of time if your goal is a "perfect copy", at least it is less likely to cause other problems, and you will get a best-effort copy of the blocks used by files that were present at the time of the initial index file scan. No, it isn't "best practice", but not all sites have the budget for three-member shadow sets, or EVA controllers with business copy licenses.

I am sure John Gillings has seen many cases where customers had useless backups, but my guess is that many of these backups weren't even made until after some other problem had occurred. For example, if a disk gets mounted in a partitioned cluster, any backup of that disk is still going to be corrupted. Likewise, if a drive is already getting parity errors and going into mount verification, any backup made of that disk is not going to be error free.

My point is that I am not convinced that all of the problems John Gillings has seen are due to the use of /ignore=interlock. More likely the system manager or operator ignored the need to take backups and to verify that the backups can actually be used to restore what is needed.

Also note that a backup/image of a live system disk is almost guaranteed to have more problems if /ignore=interlock is not used than if it is used. For example, these files wouldn't get copied unless /ignore=interlock is used:
SYS$COMMON:[SYSEXE]QMAN$MASTER.DAT
SYS$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$JOURNAL
SYS$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$QUEUES

I do agree that people should develop a reliable backup strategy. But blindly removing "/ignore=interlock" from your backups is not a solution to that problem.
it depends
Wim Van den Wyngaert
Honored Contributor

Re: Backup taking locks ?

We have an almost perfect backup. We monitor backup size, missing files, etc.
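
For anyone wondering what that kind of monitoring can look like, here is a rough sketch in Python. It is not our actual procedure: the function name, the 20% tolerance and the file names in the usage example are all made up, and it assumes you have already extracted a name-to-size listing from the save set by some other means.

```python
def check_backup(expected_files, saved_files, previous_total, tolerance=0.20):
    """Return a list of warnings about one backup run.

    expected_files: names that must be present in the backup
    saved_files:    dict of name -> size taken from the backup listing
    previous_total: total size of the previous run, in the same units
    tolerance:      allowed relative change in total size (assumed 20%)
    """
    problems = []

    # Files we expected to see but that are absent from the listing.
    for name in sorted(set(expected_files) - set(saved_files)):
        problems.append(f"missing: {name}")

    # A large swing in total size usually deserves a closer look.
    total = sum(saved_files.values())
    if previous_total > 0:
        change = abs(total - previous_total) / previous_total
        if change > tolerance:
            problems.append(
                f"size changed by {change:.0%} (was {previous_total}, now {total})"
            )

    return problems
```

Calling `check_backup(["A.DAT", "B.DAT"], {"A.DAT": 100}, previous_total=200)` reports `B.DAT` as missing and flags the 50% drop in size. An empty list means nothing suspicious was found, which is not the same as proof that the backup can be restored.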

Just a story ...

A DSM application of ours didn't have a transaction log. So they kept files on a different disk to redo the transactions in case the disks failed and they had to start from yesterday's backup. But much later, disks were merged and the redo files were placed on the same disk as the DSM db ...

Wim (out of work by the end of June)
Wim