Operating System - OpenVMS

Re: possible reasons of a file loss

 
mustafa_12
Frequent Advisor

possible reasons of a file loss

Hi,

Is there any possible reason for a file loss in OpenVMS? The loss has occurred twice, both times on a Monday, though on different dates. I checked the Sunday evening backup and the file was there. However, when I checked the Monday evening backup, the file was absent. Between these two backups, the only commands run on the directory containing these files were "analyze /disk /repair" and "purge". Is it possible that one of these commands is responsible for the file loss?

Or is there any way to trace the user's activity history to find out any deletion commands that they entered?

Best Regards...
Steven_101
Advisor

Re: possible reasons of a file loss

Probably the easiest way is to set up auditing for successful deletions. You could set up a job, called by the backup, to do a

$ set audit/audit/enable=file_access=(success:delete)

then on Monday night do a

$ set audit/audit/disable=file_access=(success:delete)

But be careful! If you have a lot of file deletions, this could cause your security audit journal file to grow very large and use up the disk space. I'd recommend moving the journal file to a disk with lots of free space.

You can then go through accounting and check for that file being deleted
Ian McKerracher_1
Trusted Contributor

Re: possible reasons of a file loss

Hello Mustafa,

Could you tell us what version of OpenVMS you are running, what type of machine it is, and whether there were any software changes (upgrades, patches, applications, etc.) prior to the first file disappearance?

Thanks,

Ian

Peter Quodling
Trusted Contributor

Re: possible reasons of a file loss

If you have done an ANALYZE/DISK_STRUCTURE/REPAIR, then check the [SYSLOST] directory on that disk...
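
A minimal sketch of that check, assuming the affected disk is DKA700: (a placeholder device name; substitute your own):

$! List anything a repair pass has placed in [SYSLOST] on the disk
$ DIRECTORY/FULL DKA700:[SYSLOST]

If DIRECTORY reports that the directory does not exist, that would suggest the repair pass has not recovered any lost files into it.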

q
Leave the Money on the Fridge.
Robert_Boyd
Respected Contributor

Re: possible reasons of a file loss

A small correction to your suggestion Steven. After collecting Audit information in the Audit Journal file you will want to use ANALYZE/AUDIT instead of accounting to extract the information for the relevant activity.

Also, it would be more efficient use of resources to only enable security auditing for deletion of the files in directories that are being affected rather than system wide.

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Wim Van den Wyngaert
Honored Contributor

Re: possible reasons of a file loss

Check the output of anal/dis/rep.
It should mention the file if it moved it.

Wim
Willem Grooters
Honored Contributor

Re: possible reasons of a file loss

If there is a version limit on the directory, older versions are purged (deleted) automatically when a new version of the file is created, so you will find only that number of versions.
PURGE, without the /KEEP qualifier, leaves only the latest version.
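
A rough sketch of how both points could be checked; the device, directory, and file names are placeholders, not values from this thread:

$! DIRECTORY/FULL on the directory file itself lists its default version limit
$ DIRECTORY/FULL DKA700:[000000]MYDIR.DIR
$! Keep the three newest versions of each file and log every version removed
$ PURGE/KEEP=3/LOG DKA700:[MYDIR]*.*

Running PURGE with /LOG in the regular job also records exactly which versions it removed, provided the job keeps its log file.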

Of course, any user or application activity could have caused this. As others have said, auditing is one way to trace it. Examining accounting may also give you a hint (about the user or job).

Have you thought about the /DELETE qualifier in the BACKUP command? It could be a remnant of a symbol set somewhere before the backup... I know it's unlikely, but without knowing the backup qualifiers used, I cannot rule it out.

Willem
Willem Grooters
OpenVMS Developer & System Manager
mustafa_12
Frequent Advisor

Re: possible reasons of a file loss

First of all, thank you all for your kind help. Secondly, I forgot to send the OS details. These are:

DEC AXPVMS VMS V7.3-2
DEC AXPVMS VMS732_UPDATE V4.0
DEC AXPVMS VMS732_SYS V7.0
....
Fibre SCSI patch V3.0 (actually it is in update v4.0 patch)
....
... (and the others)

The disks are in HSG80 Compaq storage and are connected to the cluster via a SAN switch.

Peter, there is no directory named syslost on the relevant disk.
Wim, in the output of analyze/disk/repair, there is no clue about deletion of the file. Is it possible that analyze/disk/repair deletes the file without logging any information in the output file?
Willem, there is no /delete qualifier in our backup scripts.

Dale A. Marcy
Trusted Contributor

Re: possible reasons of a file loss

I would edit the ACL list on one of the files using the following:

$ EDIT/ACL disk:[directory]filename1.ext

Replace the lowercase items with the values from your system for the file. Make the first entry look like the following:

(AUDIT=SECURITY,OPTIONS=PROTECTED,ACCESS=DELETE+SUCCESS)

I would then copy this ACL to the other files using the following command (This assumes that you do not currently have differing ACL lists on the files. If you do, then you will need to modify each list as above.):

$ SET SECURITY/ACL/LIKE=(NAME=disk:[directory]filename1.ext) disk:[directory]filename2.ext

After completing the above, wait for the file to disappear. Then do the following:

$ ANALYZE/AUDIT/EVENT_TYPE=DELETE/FULL/SINCE=dd-mmm-yyyy:hh:mm SYS$MANAGER:SECURITY.AUDIT$JOURNAL

Replace the lowercase with the date/time when the file was last known to be present. The above assumes the audit journal has the default filename and is in the default location.
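
One prerequisite worth checking, offered as a sketch: an audit ACE only produces records while the ACL event class is enabled for auditing. The file specification below is the same placeholder used above, and the audit commands need the SECURITY privilege:

$! Show which event classes are currently enabled for alarms and audits
$ SHOW AUDIT
$! Enable auditing of ACL-requested events if ACL is not already listed
$ SET AUDIT/AUDIT/ENABLE=ACL
$! Confirm that the audit ACE is present on the file
$ SHOW SECURITY disk:[directory]filename1.ext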
Garry Fruth
Trusted Contributor

Re: possible reasons of a file loss

I like Dale's answer. It works well with the scenario you describe. It is narrowly focused, and does not risk adding a lot of data to the audit log. Don't forget to assign points.
Willem Grooters
Honored Contributor

Re: possible reasons of a file loss

Some more reasons I thought about.

Is it possible that the file was renamed, possibly into another directory on the same disk? (You cannot rename to another disk.)
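
A quick way to look for it, sketched with a placeholder device and file name:

$! Search every directory on the disk for any version of the file
$ DIRECTORY DKA700:[000000...]FILENAME.EXT;*

This only finds the file if it kept its name; if the name may have changed as well, a wildcard on the name or type would be needed.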

No experience here, just deducing (and playing the devil's advocate); this may not apply to your environment:
If you have GNV on the system and a user runs the Unix rm (remove) command, it removes all versions of the file that he has DELETE access to (versions without delete access, the protected ones, are not deleted; I just checked). The same applies to mv (the equivalent of DCL RENAME): the file is copied but the original is not deleted (that might be down to my older GNV version, 1.14.8(0)).

Willem Grooters
OpenVMS Developer & System Manager
Phillip Thayer
Esteemed Contributor

Re: possible reasons of a file loss

I don't know if this helps, but I had a situation once where the previous system manager (who really didn't know what he was doing) had set up a symbol named PURGE to be equal to DELETE (i.e. PURGE==DELETE). One of the first things I did when I got there was purge the directory of a user who had too many files, and I was shocked to notice that the logging (I always do a PURGE/LOG) was saying Deleted instead of Purged. I aborted the command immediately, found out what was going on, and corrected it.

Check for a symbol PUR*GE or something like that.
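
For example, a sketch that lists any local or global symbols whose names start with PUR or DEL:

$! Wildcards are accepted in the symbol name
$ SHOW SYMBOL/LOCAL PUR*
$ SHOW SYMBOL/GLOBAL PUR*
$ SHOW SYMBOL/GLOBAL DEL*

It is worth running these in the same environment as the job itself (for instance from inside the batch procedure), since symbols defined in login procedures can differ between accounts.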

Phil
Once it's in production it's all bugs after that.
mustafa_12
Frequent Advisor

Re: possible reasons of a file loss

Dear Willem, there is no GNV on our system.
Dear Phillip, PURGE is not defined as another command.

thanks...
Willem Grooters
Honored Contributor

Re: possible reasons of a file loss

Mustafa,

You may not even be aware that symbols have been set up, or changed to something quite different from what you expect! I have seen the following happen in a procedure run in batch under the SYSTEM account, at around 4 AM.
As you know, a queued command file cannot be changed without requeueing it, and this procedure was meant to get around that "problem". The first parameter named the command procedure to execute, but what that procedure actually did was decided during the day. The purpose was to clean up certain directories:

$! Read in parameters.
$! P1 = extra command file
$! P2 = Condition_value
$ Subfile = p1
$ Condition_value = p2
$!
$ SET DEFAULT TEMPDIR: ! which is a SYSTEM logical
$!
$! do a thing or two
$ IF some_var .eqs. Condition_value
$ THEN
$ @'Subfile' ! do the changed procedure
$ ENDIF
$!
$ DEL *.*;* ! to clean up.

Of course, "since it was a daily job of no real importance (TEMP = 'can be deleted')", it was queued /NOLOG...

However, one day this sub-procedure was overwritten by the one used to clean up a user's login directory on logout. In a user environment it would do no harm, but in this SYSTEM environment it was devastating:

The procedure contained SET NOVERIFY and SET NOON for obvious reasons.
Some GLOBAL symbols were set from within an executable, so you couldn't see them when scanning command files. For some of them this made sense in the original environment, since they were based on data from a database.
One of them was DELETE, normally set to DEL* = DELETE/CONFIRM (in SYLOGIN.COM), but after deletion of that symbol it was set to DELETE/NOLOG. Don't ask why. It was there.
The procedure started with SET DEFAULT SYS$LOGIN (which places SYSTEM in SYSMGR) and didn't restore the environment after the run.
Obvious for a user's logout procedure, but you can imagine the panic the next morning...
Willem Grooters
OpenVMS Developer & System Manager
Antoniov.
Honored Contributor

Re: possible reasons of a file loss

Hi,
Do you have any news from ANALYZE/AUDIT?

Antonio Vigliotti
Antonio Maria Vigliotti
mustafa_12
Frequent Advisor

Re: possible reasons of a file loss

I have set up an auditing ACL on the file that keeps disappearing, and I run analyze/audit on a regular basis. However, I don't know whether it is good or bad luck :), but there has been no deletion since then.
Arch_Muthiah
Honored Contributor

Re: possible reasons of a file loss

Mustafa,

Have you checked whether a file expiration date is set for the lost file? As we know, the RETENTION period (min[,max]) specified on the $ SET VOLUME ... command is a delta time. I suspect this because you lost that specific file twice at the same point in the week (Monday).

File expiration is a file system feature that is available only on Files-11 Structure Level 2 disks. Do you have this file system?

Expiration dates aid the disposal of seldom-used files when you use certain DCL commands and utilities such as $ DIRECTORY, $ BACKUP, etc.
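
As a sketch of how the expiration dates could be inspected (the device and directory names are placeholders):

$! Show the expiration date of every file in the directory
$ DIRECTORY/DATE=EXPIRED DKA700:[MYDIR]*.*;*
$! List only the files whose expiration date has already passed
$ DIRECTORY/EXPIRED/BEFORE=TODAY DKA700:[MYDIR]*.*;*

An expired file is not removed by the file system itself; something still has to delete it, for example a cleanup job that selects files with /EXPIRED.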

My second suspicion is that files occasionally lose their directory links because of disk corruption, hardware problems, or user error.

These lost files stay in the same location on the disk volume, but they are no longer linked to any directory. They can be found easily with the $ ANALYZE command, as other forum members suggested. But you have to specify the name of the disk in the $ ANALYZE command; the SYSLOST directory that records any lost files will be created on that disk.

$ ANALYZE/DISK_STRUCTURE/REPAIR/CONFIRM DKA700:
This command creates the SYSLOST directory on DKA700. Give it another try...

Archunan
Regards
Archie
mustafa_12
Frequent Advisor

Re: possible reasons of a file loss

When I performed "dir /full", the "Expires:" attribute was "". I also performed "analyze /disk /repair", but it did not recover the file.

Thanks...