05-09-2009 02:35 PM
Has backup/image/ignore=interlock become useless?
A simple site, with a system disk and a data disk, each a 2-drive shadow set. The procedures simply perform an image backup to tape using, for example, the following:
$ backup/ignore=(label,interlock)/image/verify -
dsa1: mkb600:dsa1.sav /media=compac/norewind
The backup is performed at idle times (no one logged in) and there is the occasional report of files marked for backup (all expected) and accessed for write (also expected). There appear to be no other errors or warnings reported, expected or not.
The problem is that when the tape is examined, the saveset is valid, but there are MANY files missing. The first detected instance was that an early part of a directory is copied, but not the rest of the directory. Entire subdirectories are also missing. And there does not (yet) appear to be a pattern.
As I mentioned, I am continuing to investigate and will provide additional information as it becomes available. And of course I should mention... OpenVMS/Alpha V8.3 on a DS20.
I am looking to see if there are any patches that might apply. What prompted this message was something that I did discover. I came across the following line in the V8.3 documents describing the "/IGNORE qualifier":
"Also, because of the way BACKUP scans directories, any activity in a directory (such as creating or deleting files) can cause files to be excluded from the backup."
Now, if this is what is happening here, then I am not impressed. For something like this to happen without any warnings, errors, or even informational messages is not what I've come to expect from OpenVMS!! I've been using OpenVMS for a lot of years, and I don't ever remember reading this before. A quick scan of previous (pre-8.x) documents appears not to include this statement, so I have to assume it is recent.
It leaves me wondering about the "way BACKUP scans directories", and if it is known to "cause files to be excluded from the backup" then why wasn't it addressed?!
Anyways. Any suggestions or insights are welcome.
thnx
\bill
05-09-2009 04:11 PM
Re: Has backup/image/ignore=interlock become useless?
This detail has been in the OpenVMS FAQ for a very long time, and I've made myself somewhat of a nuisance on this topic (see the other thread going here in the forums, and see the comp.os.vms newsgroup) pointing out the risks of the qualifier.
Silent data corruptions.
There has been a request to get the hazards more clearly documented, and it looks like the risks have finally made it into the manuals. (The older documentation tends to presume you knew that the interlocks were present for a reason: to flag questionable data access.) This is the same basic reason there has been a longstanding recommendation to use standalone BACKUP (OpenVMS VAX), or to boot the CD (OpenVMS Alpha) or DVD (OpenVMS I64) or another system disk, to get a backup of an OpenVMS system disk.
05-09-2009 06:12 PM
Re: Has backup/image/ignore=interlock become useless?
Thanks. Yes, I am well aware of the silent data corruptions possible with /ignore=interlock. I have dealt with it on many a system recovery. My concern is not with files being corrupted, since those are identified when the saveset is created, and can (should) be appropriately handled in the rare event of a recovery.
My concern is that files simply do not appear in the saveset. And when I said many, I mean hundreds of small data files. Directories in the saveset contain varying numbers of the files that they should: many are there, many are not. Some entire directories are missing. And none of the files are open during the backup.
It is a puzzler. Sadly, backups are slow, idle time is rare, and so testing is a tad tedious.
\bill
05-09-2009 07:06 PM
Re: Has backup/image/ignore=interlock become useless?
BACKUP/IGNORE=INTERLOCK has always been useless. Indeed, any attempt to BACKUP an active disk is mostly useless. This is not a fault in BACKUP, it's a fault in expectations.
There are many ways that changes in a directory could prune off large branches in the directory tree, with no way to guarantee it will even be detected. There are many ways files can change between the start of a backup operation and the completion. Some are detectable as potentially affecting the state of the backup, some are not.
BACKUP/IMAGE is really only useful for saving and restoring a static system disk. Any potentially changing files need to be saved independently. Any application data needs to be handled by the application, NOT the operating system. Only the application can know when the data is in a quiescent state. Backup should be an architecturally integral part of any serious application.
This is not the fault of OpenVMS or any other operating system; it's a simple issue of time. Things change many orders of magnitude faster than state can be saved, so it's simply not possible, even in theory, to have a generic, covers-all-cases mechanism for creating a backup that can be restored with the system in a guaranteed known state.
There's an OpenVMS Technical Journal article (in V1?) covering some of the issues. The take home message is stop thinking in terms of getting the data off the system. Turn it around, think about how you will restore your system if something fails, work out what you'll need and work backwards to figure out how to save it.
05-10-2009 01:45 AM
Re: Has backup/image/ignore=interlock become useless?
bill,
of course, Hoff and John G. are very right!
Yet your situation need not be as bleak as it obviously is now. And implicitly John G. indicated such:
>>>
or any other operating system, it's a simple issue of time.
<<<
And the main reason for my much more optimistic view you gave yourself:
>>>
each a 2-drive shadow set.
<<<
So, if you dismount one member of the set, mount that (process-private to avoid label conflict), and backup THAT drive, you will have brought the time issue down to only those activities that modify different locations on disk, and have already started but not yet finished.
Orders of magnitude less likely than such changes between reading a directory and processing what has to be done according to that info. Or processing a (database, RMS, ...) index and processing the associated data. Or ... (any non-atomic activity or activity involving different disk locations.)
And HostBasedMiniMerge is fully integrated into VMS (patched 7.3-2 and) 8.x, so any pre-existing issues with merge performance have vanished.
Bottom line: modify your backup to profit from shadowing, and 99% +++ of your issues are past.
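A minimal sketch of that split-and-backup sequence in DCL. The member device name ($1$DKA200:), the volume label (DATA1), and the choice of minicopy for returning the member are assumptions for illustration only; check each qualifier against the Volume Shadowing for OpenVMS manual for your version before relying on it.

```dcl
$! Hypothetical names: DSA1: is the shadow set, $1$DKA200: is one of its
$! two members, DATA1 is the volume label. Substitute your own.
$!
$! 1. Remove one member. /POLICY=MINICOPY asks shadowing to keep a write
$!    bitmap so the member can later be returned without a full copy.
$ DISMOUNT/POLICY=MINICOPY $1$DKA200:
$!
$! 2. Mount the removed member process-private (no /SYSTEM) and read-only.
$ MOUNT/NOWRITE $1$DKA200: DATA1
$!
$! 3. Back up the now-static member; /IGNORE=INTERLOCK is not needed.
$ BACKUP/IMAGE/VERIFY/IGNORE=LABEL_PROCESSING -
        $1$DKA200: MKB600:DSA1.SAV/SAVE_SET/MEDIA_FORMAT=COMPACTION/NOREWIND
$!
$! 4. Release the member and return it to the shadow set.
$ DISMOUNT/NOUNLOAD $1$DKA200:
$ MOUNT/SYSTEM DSA1: /SHADOW=($1$DKA200:) DATA1 /POLICY=MINICOPY
```

While the member is out, the shadow set runs on a single drive, so the backup window is also a window of reduced redundancy.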
Success.
Proost.
Have one on me.
jpe
05-10-2009 03:29 AM
Re: Has backup/image/ignore=interlock become useless?
>Thanks. Yes, I am well aware of the silent data corruptions possible with /ignore=interlock. I have dealt with it on many a system recovery. My concern is not with files being corrupted, since those are identified when the saveset is created, and can (should) be appropriately handled in the rare event of a recovery.
I'd have to assume you're not familiar with /IGNORE = INTERLOCK because you're (still) using it. (I thought it was bad and was discussing getting the badness better documented, and while talking with the then-current maintainers of the BACKUP utility, I realized I hadn't understood half of the possible badness here.)
>My concern is that files simply do not appear in the saveset. And when I said many, I mean hundreds of small data files. Directories in the saveset contain varying numbers of the files that they should: many are there, many are not. Some entire directories are missing. And none of the files are open during the backup.
Those interlocks were designed and implemented for a reason. (The same sort of model holds with the cluster quorum scheme; it wasn't implemented to cause folks boot or run-time problems, that stuff was implemented to prevent data corruptions.)
I'm not sure which I'd consider better here: entirely missing, or silently corrupt.
>It is a puzzler. Sadly, backups are slow, idle time is rare, and so testing is a tad tedious.
How to split an OpenVMS software RAID-1 shadowset volume is in the host-based volume shadowing manual, IIRC. That (greatly) reduces the window, but you can still have the potential for inconsistency corruptions.
With OpenVMS, the only way this archival stuff can be done (reliably) is either with the applications quiescent, or with application-integrated archival support. BACKUP /IGNORE=INTERLOCK can't reliably copy a system disk (which is how I realized there were problems early on), and HBVS might (though this is usually rare, we are looking at enterprise applications) miss part of a multi-block or cached or inflight change. (StorageWorks disks could drop multiblock writes; that was the reason that the shelves and the controllers could optionally have batteries.)
I get reasonably good and consistent backups off the local OpenVMS and Unix databases because I use the databases, and because the databases have archival support. The applications - the databases, in this case - have archival processing integrated.
05-10-2009 04:17 AM
Re: Has backup/image/ignore=interlock become useless?
The issues here are as Hoff, John, and Jan have mentioned.
An amplification on what Jan commented about the RAID set, however, is in order. Actually, it is a combination of something John and Hoff noted and the splitting of the RAID set.
Often, the best solution to backup issues is to add a scratch volume to the RAID set, temporarily increasing it (in this case, from two to three members). When the three members are fully up-to-date, disconnect the third member, remount it privately with writes disabled, and make the backup from the private copy (/IGNORE=INTERLOCK will not be necessary).
However, one must be careful that the volume is quiescent when disconnecting the temporary shadow set member. If a directory is being updated at the precise instant that the disconnect is happening, the disconnected shadow set member will also have the directory in an inconsistent state. There is no magic here.
That said, the pause in system activity is straightforward to architect, because the disconnect can be done very quickly.
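A rough DCL sketch of the add, wait, and disconnect steps, assuming a scratch disk at the hypothetical device name $1$DKA300: and a volume label of DATA1. The polling loop relies on the F$GETDVI item SHDW_CATCHUP_COPYING; confirm the copy state with SHOW DEVICE DSA1: as well, since this is an illustration and not a tested procedure.

```dcl
$! Hypothetical names: DSA1: is the shadow set, $1$DKA300: is the scratch
$! disk (its contents are overwritten by the shadow copy), DATA1 is the label.
$!
$! Add the scratch disk as a third member; a full copy to it begins.
$ MOUNT/SYSTEM DSA1: /SHADOW=($1$DKA300:) DATA1
$!
$! Poll once a minute until the new member is no longer a copy target.
$WAIT_COPY:
$ IF F$GETDVI("$1$DKA300:","SHDW_CATCHUP_COPYING")
$ THEN
$     WAIT 00:01:00
$     GOTO WAIT_COPY
$ ENDIF
$!
$! Quiesce the applications at this point, then disconnect the third member.
$ DISMOUNT $1$DKA300:
$!
$! Remount it privately and read-only, and back up from the private copy.
$ MOUNT/NOWRITE $1$DKA300: DATA1
$ BACKUP/IMAGE/VERIFY $1$DKA300: MKB600:DSA1.SAV/SAVE_SET
$ DISMOUNT/NOUNLOAD $1$DKA300:
```

Because the original two members never leave the set, redundancy is unchanged during the backup; the cost is a full copy to the scratch disk each time unless minicopy is used.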
Often, what allows people to "get away" with backing up system volumes with /IGNORE=INTERLOCK is that they "know" that only a small set of files on THEIR system volume are actually ever modified (e.g., SYSUAF, error logs), and they take separate steps to preserve those files (e.g., using CONVERT/SHARE and other utilities).
- Bob Gezelter, http://www.rlgsc.com
05-10-2009 02:10 PM
Re: Has backup/image/ignore=interlock become useless?
As for backing up the system disk on a regular schedule, I usually don't bother with that.
No point, really.
I do back up the system disk once in a while (after ECO kits or upgrades, or significant configuration changes), but I do archive the core files (see the SYLOGICALS.TEMPLATE file) regularly.
But the system disk in most OpenVMS configurations doesn't change all that often.
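As an illustration of archiving a couple of those core files while they are open, here is a short sketch using CONVERT/SHARE. The destination DISK$ARCHIVE:[SYSFILES] is hypothetical, the two files shown are only examples, and the sketch assumes they live in their default SYS$SYSTEM location; a site that has redirected them with logical names would use those instead.

```dcl
$! Hypothetical destination directory; substitute your own.
$! /SHARE lets CONVERT read a file that other processes have open.
$ CONVERT/SHARE SYS$SYSTEM:SYSUAF.DAT     DISK$ARCHIVE:[SYSFILES]SYSUAF.DAT
$ CONVERT/SHARE SYS$SYSTEM:RIGHTSLIST.DAT DISK$ARCHIVE:[SYSFILES]RIGHTSLIST.DAT
```

The copies can then be saved with an ordinary BACKUP of the archive directory, with no interlock override needed.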
05-10-2009 06:08 PM
Re: Has backup/image/ignore=interlock become useless?
And ahead of its time, Host Based Shadowing is.
Simply remove a disk and back it up; with host-based minimerge, it should go back quickly.
With the flexibility of adding and removing members, these operational issues have a simple solution.
That said, I have restored 100s of systems backed up with /ignore=interlock and frankly, they've always worked, though I always point out it's unsupported.
EMC says host based shadowing is obsolete, but it has nothing that solves operational issues like host based shadowing. Me thinks it is they just don't want to bother coding a long word.
If you want active backups, they generally are part of an application, like Oracle has its own backup, and RDB, with transaction journals and such.
Bob
05-10-2009 08:06 PM