Operating System - OpenVMS

How to backup a shadowed system disk ?

 
SOLVED
John Gillings
Honored Contributor

Re: How to backup a shadowed system disk ?

re: DECexchange,

>I used to do this all the time and never
>ran into a problem

Famous last words!

Just because you never ran into a problem doesn't mean there are no problems to run into!

How many times did you restore your backups?

Devices removed from shadow sets are NOT synchronized against the file system. There is NO guarantee that an open file is in a consistent state. Similarly, any file backed up under /IGNORE=INTERLOCK cannot be trusted.

Yes, you'll sometimes (maybe even mostly) get away with it, but I've seen attempts to restore backups taken this way result in unbootable systems and lost data.

Do you feel lucky? Is that a reasonable attitude for dealing with your business critical systems and data?

BACKUP is not a command, it's a PLAN which needs to be thoroughly tested and verified. You need to formulate a backup strategy that takes into account your data. The most important part is the RESTORE. Think about that and work backwards to figure out what you need to save.
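One way to make "test the restore" concrete is to record checksums of the files while the source is quiet, restore to a scratch location, and compare. The following is only a minimal sketch in Python; the function names `manifest` and `compare` are hypothetical, and it assumes both trees are reachable as ordinary directories, which is not how a BACKUP saveset is normally handled on OpenVMS.

```python
import hashlib
from pathlib import Path


def manifest(root: Path) -> dict:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare(expected: dict, restored: dict):
    """Return (missing, changed): files absent after the restore, and
    files present but with different contents."""
    missing = sorted(set(expected) - set(restored))
    changed = sorted(
        name for name in expected
        if name in restored and expected[name] != restored[name]
    )
    return missing, changed
```

Calling `compare(manifest(source_dir), manifest(restored_dir))` and getting two empty lists shows only that the bytes came back unchanged. It says nothing about whether the application data was consistent at the moment it was saved.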
A crucible of informative mistakes
Jon Pinkley
Honored Contributor

Re: How to backup a shadowed system disk ?

While what John Gillings says is technically correct in that an online backup made with /IGNORE=INTERLOCK isn't guaranteed to work, and neither is a shadowset member that is removed while files are open, either of these options is better than no backup, or an online backup made without the /ignore=interlock switch.

What you will get if you attempt to do an online image backup without specifying /ignore=interlock is only the files that were not open for write. And it can also lead to other processes failing when they try to open files that are being backed up. That could explain the recent thread about a batch job failing due to not being able to open the authorization file.

Please see the shadowing manual, chapter 7 "Guidelines for Using a Shadow Set Member for Backup" for a list of the reasons that such a backup may not be 100% correct. If you limit what is being done to the system disk while you do your backups, or split the member off, you can reduce your risks substantially.

In John's VMS Journal article, he states that prior to 7.3, there were no guarantees about the state of a removed shadowset member. Since I don't have easy access to documentation that old, I will have to accept what he states, but I am not sure exactly what was different prior to 7.3. Perhaps it was only the documentation that changed. Perhaps John can enlighten us.

Bottom line: Make sure you have a good backup as a failsafe restore point. As John states, the best time to get that is immediately after a significant event like an upgrade of the operating system or the application of patches. And this should be done while the system disk is not being used for anything but the source of the backup. Then use some of the techniques John discusses in his Technical Journal article, like convert/share, to get online copies of the files that change (but note that if users are being added while you do this, even convert/share won't guarantee that the RIGHTSLIST and the SYSUAF are synchronized). Also use something like Jess Goodman's DISPLAY_JOBS to snapshot the state of the jobs in the batch queue, and have command procedures that will recreate your batch and print queues, queue forms and, if you use them, queue characteristics.

I don't agree that doing online backups with /ignore=interlock is a waste of time and tape, but just like "any file backed up under /IGNORE=INTERLOCK cannot be trusted", any advice you get on a forum can't be trusted.

Good Luck,

Jon
it depends
John Gillings
Honored Contributor

Re: How to backup a shadowed system disk ?

re Jon:

>In John's VMS Journal article, he states
>that prior to 7.3, there were no
>guarantees about the state of a removed
>shadowset member. Since I don't have easy
>access to documentation that old, I will
>have to accept what he states, but I am
>not sure exactly what was different prior
>to 7.3. Perhaps it was only the
>documentation that changed. Perhaps John
>can enlighten us.

Prior to V7.3 when you dismounted a shadowset member it was simply dropped out of the shadowset with no regard to what I/Os may have been in flight. In theory you could have had a disk that wasn't even mountable. Remember, the original purpose of shadowing was data redundancy, NOT helping take backups.

Post V7.3, the dismount takes care to ensure the DISK STRUCTURE and FILE SYSTEM of the removed disk is in a consistent state. This was an engineering reaction to the reality that people were using shadowing for backups.

HOWEVER, this is the limit of what shadowing can do for data consistency. There's no way it can reach up into applications to get open files into some kind of predictable consistent state.

I take Jon's point that something is better than nothing, but still stress that in the OpenVMS world the idea is we do things correctly! The tools are there. With understanding of your data and what you're trying to achieve, you CAN engineer good backups. Unfortunately there are no magic wands to do it for you.
A crucible of informative mistakes
Hoff
Honored Contributor

Re: How to backup a shadowed system disk ?

BACKUP /IGNORE=INTERLOCK allows silent data corruptions in the output saveset.

Silent. Data. Corruptions.

If you want to piece together and recover a disk rebuilt from such a saveset, be prepared for active files to contain inconsistent data.

I had originally found the mechanism was rather hazardous, and then I chatted with the engineer that was then maintaining BACKUP. That was a real wake-up call around the risk.

John and I perpetrated the warnings that exist in the current (newest) OpenVMS documentation.

Oracle Rdb and such capabilities do have a mechanism for this, though RMS doesn't have a direct analog of Rdb's RMU and its on-line capabilities.

If you're doing an on-line BACKUP of active files, well, recognize it is not without risk.
Jon Pinkley
Honored Contributor

Re: How to backup a shadowed system disk ?

I don't remember anyone claiming that backup/ignore=interlock was without risks. Neither is driving to work, but many people do that every work day and the vast majority of them return each night. Backup/ignore=interlock lets backup open files with a null lock, i.e. without regard for any other access to the file. The blocks that backup will copy from that file may be 100% garbage or 100% correct, and since there is absolutely no coordination with the process(es) that have the file open, there is no way for backup to tell. It just copies the blocks that are "used", i.e. the blocks up to the EOF currently recorded in the file header.
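To illustrate why an uncoordinated block copy can come out internally inconsistent, here is a toy simulation in Python. It is not how BACKUP is implemented. The 4-byte "blocks", the two-balance file layout, and the names `copy_uncoordinated` and `transfer` are all assumptions made up for the example.

```python
import struct

BLOCK = 4  # bytes per "block"; tiny so the example is easy to follow


def copy_uncoordinated(disk: bytearray, writer_steps: dict) -> bytes:
    """Copy disk block by block with no locking. Just before block n is
    read, writer_steps[n] (if present) runs and may modify any block."""
    saved = bytearray()
    for offset in range(0, len(disk), BLOCK):
        step = writer_steps.get(offset // BLOCK)
        if step is not None:
            step(disk)
        saved += disk[offset:offset + BLOCK]
    return bytes(saved)


def transfer(disk: bytearray) -> None:
    """Application update touching both blocks: move 10 from A to B."""
    a, b = struct.unpack(">II", disk)
    disk[:] = struct.pack(">II", a - 10, b + 10)


# Block 0 holds balance A, block 1 holds balance B; they always total 200.
disk = bytearray(struct.pack(">II", 100, 100))

# The writer runs after block 0 has been copied but before block 1 is read.
saved = copy_uncoordinated(disk, {1: transfer})

live_a, live_b = struct.unpack(">II", disk)     # the live file still totals 200
saved_a, saved_b = struct.unpack(">II", saved)  # old A combined with new B
print("live :", live_a, live_b, "total", live_a + live_b)
print("saved:", saved_a, saved_b, "total", saved_a + saved_b)
```

The live file is consistent both before and after the update. The saved copy shows a state that never existed on disk, and nothing in the copy itself reveals that.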

My argument is that those blocks in the vast majority of cases are much more useful than no backup. Is it possible that an indexed file when restored from such a backup will be internally inconsistent? ABSOLUTELY. However, it is still much better than what you have to work with if you had not specified /ignore=interlock and the file was open for write by another process.

However, I know of cases where no backups were being done regularly because "they are not guaranteed to be accurate, so we only do backups when we get a chance to take the system down." Likewise, I am afraid that there are probably sites doing online backups and not using /ignore=interlock because of the dire warnings in the FAQ and on this forum. And of course they aren't following the advice to test the backup, so they aren't aware of how unusable the backups are until they lose a disk drive. My point is that they have a much, much better chance of being able to recover most of the important data if they use /ignore=interlock than if they do not, if the only backups they are doing are online backups. I AM NOT RECOMMENDING an online "live" backup as the only backup, especially since there are relatively easy ways to get internally consistent copies of RMS indexed files as long as they are open for shared access. I am only stating that an incomplete backup is better than no backup.

I am not exactly sure what Hoff means by "silently corrupt". If you turn messages off then you will not be warned that a file was open for write. It is true that there is no indication of any problem at the time the file is restored; backup just restores the blocks that were saved to the saveset and resets the file attributes. It is not using RMS to read the non-saveset files record by record or to write them back to disk. And there is no way to determine the severity of the warning; it makes no difference if the file is open for update, but no activity has happened since the open, or if an indexed file with multiple keys is being actively loaded. In both cases backup will give the same "%BACKUP-W-ACCONFLICT, DEV:[DIR]FILE.TYP;VER is open for write by another user" message. If it is possible to modify the file while backup is copying it to a saveset, without backup giving that warning, I am not aware of the method (without cheating and using logical or physical I/O).
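Since every such file produces the same warning text, a captured BACKUP log can at least be reduced to a list of files that deserve a closer look after a restore. Below is a minimal sketch in Python, assuming the log has been saved as plain text and that the message has exactly the form quoted above. The function name and the sample file specifications are made up for illustration.

```python
import re

# Matches the warning exactly as quoted in this thread and captures the
# file specification that follows the comma.
ACCONFLICT = re.compile(
    r"^%BACKUP-W-ACCONFLICT, (?P<file>\S+) is open for write by another user",
    re.MULTILINE,
)


def files_open_for_write(log_text: str) -> list:
    """Return the file specs flagged as open for write, in log order."""
    return [match.group("file") for match in ACCONFLICT.finditer(log_text)]


# Hypothetical log excerpt; the device, directory and file names are invented.
sample_log = """\
%BACKUP-W-ACCONFLICT, DKA0:[DATA]ORDERS.IDX;3 is open for write by another user
(other log output)
%BACKUP-W-ACCONFLICT, DKA0:[DATA]AUDIT.LOG;12 is open for write by another user
"""

for spec in files_open_for_write(sample_log):
    print(spec)
```

The list cannot say which of those files were actually being modified during the copy, so it is a starting point for checks on the restored files, not a verdict.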

There is no question that it is possible to get indexed files saved to a saveset that, when restored, will fail an analyze/rms. This is especially true for large files that take multiple minutes to back up to tape. That is the reason I stated that a backup of a removed member is better than a backup/ignore=interlock. The window of opportunity for failure is much larger when you are doing an online backup with no coordination of the files being backed up.

My complaint with the argument to not back up the system disk on a semi-regular basis is that you will likely not have a "current" backup handy, or worse, the "reference backup" has been misplaced, is offsite, or has been physically damaged at the time you need it. If you have relatively frequent backups, the probability that you will be able to find a good usable copy is also improved. And most backups of system disks don't take much tape with tape drives that were made in the last 5 years. So even if you don't use the copy you make on a weekly basis for your disaster recovery, it is still likely to have some useful info on it, if nothing more than copies of log files and the latest changes made to command files used for startup, etc.

Jon
it depends
Ian Miller.
Honored Contributor

Re: How to backup a shadowed system disk ?

If you do /IGNORE=INTERLOCK then I think it is possible for the file to be altered while being backed up without any warning messages being issued. IIRC it could happen in a cluster, but I may be mistaken.

The problem with /IGNORE=INTERLOCK is that it mostly works (produces a valid backup) but sometimes it does not and you can't tell the difference until you restore a file.

I agree fully with John's comment that what is needed is a RECOVERY plan, not a BACKUP plan.
Having been in the situation of recovering a system after a disk failure and discovering the backups were inadequate, I'm more careful about these things.

The sequence previously described with a third shadow set member is the answer to the original question, especially if you are on VMS 7.3 or later and you ensure the application can be quiesced.
____________________
Purely Personal Opinion