
Robert Atkinson
Respected Contributor

VMS 7.2-2 Patches - A word of warning

I recently applied a set of patches to an ES40 system disk. Afterwards, our backup system log files complained of missing files.

When we delved in further, we found that the VMS$COMMON alias had been corrupted. This was possibly because I tried to install new patches without first installing PCSI v1.

The problem is that most sites would not know that the aliases are corrupted, as VMS backup does not give you an error message.

The easiest way to tell is to do a SHOW DEVICE /FILES SYS$SYSDEVICE. If the files show as [SYSCOMMON...] as opposed to [VMS$COMMON...] then your alias is corrupted.
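
On a busy system the listing is long, so the check can be narrowed with SEARCH. This is only a minimal sketch: it assumes the account has enough privilege to see the names of all open files, and the scratch file name is a placeholder, not anything from the original report.

```dcl
$! Capture the open-file listing for the system disk (scratch file name is hypothetical)
$ SHOW DEVICE/FILES/OUTPUT=SYS$SCRATCH:SYSDISK_FILES.TMP SYS$SYSDEVICE
$! Any line found here is a file whose path was rebuilt as [SYSCOMMON...]
$ SEARCH SYS$SCRATCH:SYSDISK_FILES.TMP "[SYSCOMMON"
$ DELETE SYS$SCRATCH:SYSDISK_FILES.TMP;*
```

If SEARCH finds no matches, the open files are resolving through [VMS$COMMON...] as they should.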

I am still pushing HP to find out how this happened, so that other people can be warned, so I'll post any details I get back.

Robert.
6 REPLIES
labadie_1
Honored Contributor

Re: VMS 7.2-2 Patches - A word of warning

This is an old problem, IMHO unrelated to the PCSI V1 patch.

See for example

[OpenVMS] Checking the Alias Directory Structure on the System Disk

http://h18000.www1.hp.com/support/asktima/operating_systems/00999B63-3371E460-1C02A1.html

and

Product installations in a cluster may not install or function correctly when the directory alias is incorrect

http://h18000.www1.hp.com/support/askkcs/hpcg/215_0_141552576_3692579.html



Robert Atkinson
Respected Contributor

Re: VMS 7.2-2 Patches - A word of warning

Labadie, I agree that the problem has been around since VMS v5, and is usually caused by a restore of the system disk.

In our case, the system disk was fine before we applied the patches, but corrupted afterwards. We didn't do a restore, so I've come to the conclusion that it must have been the patches themselves, or the way we installed them!

Rob.
John Gillings
Honored Contributor

Re: VMS 7.2-2 Patches - A word of warning

Robert,

If you're certain that the backlinks were correct before the ECO and changed afterwards, then please log an urgent case with your local CSC. Perhaps you have a backup of the pre-upgrade disk? We could restore it and rerun your patches to reproduce the problem.

$ DUMP/HEADER/BLOCK=COUNT:0 SYS$SYSDEVICE:[000000]VMS$COMMON.DIR

may give a clue as to what happened.
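
To pick out just the relevant fields, the header dump can be written to a file and searched. This is a sketch under a few assumptions: that the listing labels the fields "File name" and "Back link file identification", that the MFD (000000.DIR) has its usual file ID of (4,4,0), and that the scratch file name, which is a placeholder, is acceptable on your system.

```dcl
$! Dump only the file header (no data blocks) to a scratch file
$ DUMP/HEADER/BLOCK=COUNT:0/OUTPUT=SYS$SCRATCH:COMMON_HDR.TMP -
      SYS$SYSDEVICE:[000000]VMS$COMMON.DIR
$! Show the name recorded in the header and where the backlink points
$ SEARCH SYS$SCRATCH:COMMON_HDR.TMP "File name","Back link"
$ DELETE SYS$SCRATCH:COMMON_HDR.TMP;*
```

On a healthy system disk the header should carry the name VMS$COMMON.DIR with a backlink to the MFD. A name of SYSCOMMON.DIR, or a backlink pointing at one of the [SYSx] roots, suggests the alias has been swapped.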

It's difficult to see how or why PCSI would mess with directory backlinks. I'm not saying it's impossible, just that history says it's likely to be something else.

There are many ways to fix the backlinks; the simplest and safest is to use the freeware utility DFU.
A crucible of informative mistakes
Robert Atkinson
Respected Contributor

Re: VMS 7.2-2 Patches - A word of warning

John, thanks for the response.

I've asked our support company (ICM) to follow this up with HP.

We do have an image of the system disk pre-patches, which I'll make available to HP as soon as they are ready.

Although I'd said the easiest way to check was 'show device/files', I also did a 'dump/head' and found that the alias between SYSCOMMON and VMS$COMMON was completely reversed.

This was documented in one of the VAX articles, so it was quite easy to fix - just a very simple rename.

The worry was that the system disk backups we were taking were probably corrupt and useless, but there was nothing to tell us that. It was by chance that we noticed error messages in our log files from TapeSys, which did report the errors.

Why doesn't VMS Backup report such a critical problem?

Rob.
John Gillings
Honored Contributor

Re: VMS 7.2-2 Patches - A word of warning

Rob,

>Why doesn't VMS Backup report such a critical problem?

As far as BACKUP is concerned, there is no problem. The backlinks are legal, if a bit odd. The file system is intact and all files are accessible (though not via all directory paths). It's OpenVMS that can't deal with it. The *real* problem was the decision to call the "real" file VMS$COMMON and all the aliases SYSCOMMON. If they'd just called them all SYSCOMMON, it wouldn't be *possible* to get the backlinks wrong! So, it's OpenVMS DATA corruption, rather than file system structural corruption.

Recent versions of ANALYZE/DISK know about it and can report it. Most incarnations can be fixed with two RENAMEs, but some can't. DFU also knows about it and can fix all permutations.

Oh, if you're worried about this happening again, there is a way to eliminate the problem entirely. Simply create a top level alias called SYSCOMMON. This will ensure that no matter where the backlinks point, you will still be able to find files. It's more of a mask than a fix.

$ SET DEF SYS$SYSDEVICE:[000000]
$ SET FILE/ENTER=SYSCOMMON.DIR VMS$COMMON.DIR

Note this isn't "officially" supported, but it works. If you understand the nature of the problem, you'll understand why this fixes the problem.
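
To confirm the extra entry was created, both names can be listed with their file IDs. A minimal sketch: the two entries should show the same file ID, because they are two directory entries for one file.

```dcl
$ SET DEFAULT SYS$SYSDEVICE:[000000]
$! Both names should report an identical file ID
$ DIRECTORY/FILE_ID SYSCOMMON.DIR,VMS$COMMON.DIR
```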
John Gillings
Honored Contributor

Re: VMS 7.2-2 Patches - A word of warning

Rob,

One other thing... Your backups won't be useless, regardless of the state of the backlinks. First off, BACKUP knows about system disks and explicitly corrects these backlinks. Second, the system is bootable; it's only utilities that reconstruct file names from FIDs that fail. Third, if you ever restore from BACKUP, you should be doing an ANALYZE/DISK [/REPAIR] immediately after booting anyway.

I agree that it's a relatively serious problem that needs to be addressed. Finding and fixing whatever caused your event is worth some considerable effort, so please make sure you follow it up. But keep in mind that the OpenVMS file system is resilient enough to continue "mostly" working even with this type of corruption.
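
A sketch of that post-restore check, assuming the installed version of ANALYZE/DISK_STRUCTURE reports this condition (older versions may not). The first pass only reports; the second makes repairs, so review the report before running it.

```dcl
$! Report-only pass: nothing on the disk is changed
$ ANALYZE/DISK_STRUCTURE SYS$SYSDEVICE:
$! Repair pass, once the report has been reviewed
$ ANALYZE/DISK_STRUCTURE/REPAIR SYS$SYSDEVICE:
```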