Operating System - OpenVMS

Re: File identification problem.

 
Honored Contributor

Re: File identification problem.

Barring cases of operational errors...

This (still) reeks of file system, directory cache, or volume corruption; of a nasty executive-mode file system or I/O caching bug; of a memory or processor error; of an unshadowed disk block error; or of a controller-, clustering-, or firmware-related issue within the storage.

And no, these sorts of cases variously won't end well.

While I disagree with John G's position on the need to reboot servers (IMO, striving for longer server uptimes can be a deceptively poor practice, and longer uptimes can be one of the better outward signs of lurking managerial, operational and application stability issues), I do agree with John here that a reboot should not be expected to fix cases such as this one.
Trusted Contributor

Re: File identification problem.

Now that the "dil" mystery has been solved, I go with Murali on looking at an on-disk directory corruption. The right command to follow this up would have been:

$ DUMP/DIRECTORY SY0:[CUP.LIVE]DAT.DIR

For lack of information, let me guess: this is a directory where temporary files are created and deleted at a high rate.

Which program/process/procedure (FTP, NFS, who-knows) is creating/deleting these files?

Is this a VMScluster node?

/Guenther
Honored Contributor

Re: File identification problem.

Dave,

I will second Hoff's and Guenther's comments, with the added question: what is the storage configuration? What disks? What controllers?

- Bob Gezelter, http://www.rlgsc.com
Trusted Contributor

Re: File identification problem.

As 'tsgdavid' said earlier, an ANALYZE/DISK would be useful.

I wouldn't use /REPAIR immediately. Run it without first in order to understand the magnitude of the problem. It shouldn't take much more than 5 minutes to run unless things are very sick.

Also, are you running a defragger on the disk in question? I have seen instances where defraggers corrupted files, but admittedly not for several years.

Finally, how is the disk used and with what kind of IO rates? The symptoms you described could occur if there's a job that renames files and you happened to find the file just before the rename but when you looked again it couldn't be found under the old name.
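One way to test the rename theory is to compare file IDs over time, since a rename on an ODS-2/ODS-5 volume keeps the file ID and changes only the directory entry. The following is a minimal sketch, not a definitive procedure. It assumes the affected directory is SY0:[CUP.LIVE.DAT] (inferred from the DAT.DIR path given earlier in the thread), and the BEFORE.LIS and AFTER.LIS listing names are hypothetical.

```dcl
$ ! Snapshot the directory with file IDs (assumed path, see above).
$ DIRECTORY/FILE_ID/OUTPUT=SYS$LOGIN:BEFORE.LIS SY0:[CUP.LIVE.DAT]*.*;*
$ ! ... wait until the file "disappears", then take a second snapshot ...
$ DIRECTORY/FILE_ID/OUTPUT=SYS$LOGIN:AFTER.LIS SY0:[CUP.LIVE.DAT]*.*;*
$ ! A file ID that shows up under a different name points to a rename.
$ DIFFERENCES SYS$LOGIN:BEFORE.LIS SYS$LOGIN:AFTER.LIS
```

If the same file ID appears in both listings under different names, a renaming job is the likely explanation rather than corruption.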
Honored Contributor

Re: File identification problem.

Dave,
I agree with John that ANALYZE/DISK without /REPAIR should be done first.
However, if you plan to run ANALYZE/DISK without the /REPAIR qualifier,
then I would recommend using the /LOCK qualifier to avoid any false alarms.
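Put together, the report-only pass suggested above might look like the sketch below. It assumes SY0: is the affected volume (taken from the DUMP command earlier in the thread) and that the OpenVMS version in use supports /LOCK_VOLUME, which is the full name of the /LOCK qualifier; check HELP ANALYZE /DISK_STRUCTURE on the system before relying on it.

```dcl
$ ! Report-only pass: /NOREPAIR (the default) changes nothing on the volume.
$ ! /LOCK_VOLUME holds off file system updates during the scan, so files
$ ! created or deleted mid-run are not reported as errors.
$ ! SY0: is an assumption; substitute the real device name.
$ ANALYZE/DISK_STRUCTURE/NOREPAIR/LOCK_VOLUME SY0:
```

While the volume is locked, expect file creation, deletion and extension on it to stall, so pick a quiet period if the disk is busy.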

John,
>> Also, are you running a defragger on the disk in question?
>> I have seen instances where defraggers corrupted files, but
>> admittedly not for several years.
This sounds interesting.
Can you give me some more details about which defraggers (DFO, DFU ...) were involved and
how (in which scenarios) a defragger can cause a directory file to get corrupted?

Regards,
Murali
Let There Be Rock - AC/DC
Honored Contributor

Re: File identification problem.

Any 3rd-party caching products in use?

Do the disks have multiple paths?

-- Rob
Trusted Contributor

Re: File identification problem.

Murali, the disk defragger that I had problems with was many years ago. It was a third party product - might have been Raxco, I'm having trouble recalling it.

I'm not even sure that it was strictly a defragger problem. It might have been errors on the disk that were made worse with the files being defragged (and obviously new disk blocks written).

We ended up with ANALYSE/DISK reporting the very nasty "multiply allocated blocks" (as in "multiple-ly") and I had to examine about 150 files to work out which of two files the blocks actually belonged to.
Honored Contributor

Re: File identification problem.


>defragger that I had problems with was many
>years ago

Yes, MANY years, like the mid-1980s! The last one I caught red-handed was called "Rabbit7" in about 1990.

For the first decade or so of OpenVMS history, the Digital party line was that defragmentation was unnecessary, and the company did not have a defrag product (other than image backup & restore). In one of the rare cases of OpenVMS management responding to market forces, they finally realised that regardless of technical pontificating about the features of the file system, people WANTED to defrag files, if only for the warm fuzzy feeling of everything neat and tidy.

OpenVMS introduced the "MoveFile" primitive function at the XQP level which gave a supported and reliable mechanism to correctly synchronise defragmentation as an atomic and crash proof operation. I think it was around V6, early 1990s.

As well as the Digital product (known by various names, DFG, DFO, PFO...), the 3rd-party products changed from rather dodgy hacks to using the proper interface, and most (though perhaps not all) defragger-related disk corruption vanished overnight. Support centres no longer saw file systems with "muddy rabbit tracks" all over them.
A crucible of informative mistakes
Honored Contributor

Re: File identification problem.

>OpenVMS introduced the "MoveFile" primitive function at the XQP level which gave a supported and reliable mechanism to correctly synchronise defragmentation as an atomic and crash proof operation. I think it was around V6, early 1990s.

---
It was definitely before V6; V5.5 or V5.5-2, I think, 1992ish.


-- Rob
Honored Contributor

Re: File identification problem.

John MCL, John Gillings,
Thanks for the information.
Good to know the history about the XQP's MOVEFILE primitive.

>> might have been Raxco, I'm having trouble recalling it.
Maybe you are referring to PD (PerfectDisk).

>> We ended up with ANALYSE/DISK reporting the very nasty "multiply
>> allocated blocks" (as in "multiple-ly") and I had to examine about 150 files
>> to work out which of two files the blocks actually belonged to.
Ahh. Must have been a painful experience.

It would be interesting to know whether Dave has any defraggers installed on
his system. If so, is the disk or file enabled for defragmentation?

Regards,
Murali
Let There Be Rock - AC/DC