Deleting a File that is Open

 
HDS
Frequent Advisor

Deleting a File that is Open

Hello.

On OpenVMS V8.3 on Files-11 ODS-2 Disks.

I have one process that, using a Fortran executable, OPENs an indexed file as SHARED, ACCESS=READ. No CLOSE is performed [yet] and one record has been read as RRL/NLK, Query-Locking is DISABLED. Write-Through caching is enabled on that file.

Another process comes along (actually a subprocess of the parent that opened the file) and, using DCL $DELETE, successfully deletes that file. This deletion occurs explicitly for the purpose of "restoring" a series of files by re-loading a bunch of RMS files from a backup area. We understand and accept that the file originally OPENed has been replaced, even though the process that opened it may not be aware of that [yet].

We know and understand that this behavior works, albeit...it is somewhat curious. (I will get to a question about that momentarily.)

Now, instead of that subprocess using DCL to $DELETE those files, we use hardware based data replication to perform a SNAP RESTORE (EMC Technology). Forgoing a detailed description of that process, I will simply state that in order to do that, that target device needs to be dismounted.

However, with that file OPENed, although its deletion is possible, the device that holds it cannot be dismounted. The error, as one would expect, is:
%DISM-W-CANNOTDMT, DISKX1 cannot be dismounted
%DISM-W-USERFILES, 1 user file open on volume

If it is any help, this behavior can be reproduced by simply having two sessions...
- On #1, DCL $OPEN/READ a file.
- On #2, DCL $DELETE that file
- On #1, read record(s).
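As a rough sketch, the three steps above might look like this in DCL. The [TEST] directory, the file name, and the INFILE logical name are placeholders for illustration only, not names from our application:

```dcl
$! Session 1: open the file read-only and leave it open
$ OPEN/READ INFILE DISKX1:[TEST]DELETEME.TXT
$!
$! Session 2: delete the file while session 1 still has it open
$ DELETE DISKX1:[TEST]DELETEME.TXT;*
$!
$! Session 1: the read still works, and the open file still shows on the volume
$ READ INFILE RECORD
$ SHOW DEVICE/FILES DISKX1:
$!
$! Session 1: only after this CLOSE does the file stop blocking a dismount
$ CLOSE INFILE
```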

Now, for my questions:
1) After using DCL $DELETE to delete that file from under the process that has it open, I can still see the FCB and the F11 arbitration lock. I can even see the file using $SHOW DEVICE/FILES (without BYPASS, the file name is displayed as blank; with BYPASS, the file name is shown). If there is an easy way to explain how this "temporary file" and/or "file marked for deletion" behavior allows that file name to be seen and that FCB to still exist with that lock, I would very much appreciate hearing it. Is it just that the directory entry is deleted but the file itself remains until the CLOSE? (I see that when the file is OPENed for WRITE, it cannot be deleted.)

2) [My real question] Other than having the process that performed the OPEN actually CLOSE the file, or having its image terminated or the process deleted, is there another way I could $DISMOUNT that device, say by pulling the rug out from under that now-temporary file? In other words, could the executing subprocess detect the file (maybe via $SHOW DEVICE/FILES) and tell the owning process to let go?

A tough one to explain. I am hoping that I did a good enough job at it to make sense.

Note that I am experimenting with simply adding a CLOSE of that LUN prior to launching that subprocess. This would be a simple solution, but it may open the proverbial "can of worms", as this code is very old and very much in use throughout our application. Such changes, no matter how simple, are sometimes frowned upon.

Many thanks in advance.

-Howard-
Jan van den Ende
Honored Contributor

Re: Deleting a File that is Open

Howard,

At this time I can not experiment enough to be definitive, but is
$ DISMOUNT/ABORT ! suitably privileged

any help?

Worth a try?

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
HDS
Frequent Advisor

Re: Deleting a File that is Open

Hello.

Yes...that was very much worth a try. I had not even thought of that one.

Unfortunately, no luck.

$ dismount/cluster diskx1
%DISM-W-CANNOTDMT, DISKX1 cannot be dismounted
%DISM-W-USERFILES, 1 user file open on volume

$ dismount/cluster/abort diskx1
%DISM-W-CANNOTDMT, DISKX1 cannot be dismounted
%DISM-W-USERFILES, 1 user file open on volume


Thank you, though. I wish that would have worked; it would have made my task rather simple.

-H-
Hoff
Honored Contributor

Re: Deleting a File that is Open

Grabbing files from underneath active code is Not Good.

As for your earlier question, the request to delete the (open) file is completed upon the close or (upon system crash) upon a subsequent disk analysis and repair. Not before.

And with the latter question, is this your own local Fortran code running? If it is your code and as we have been discussing with regularity here recently, the "best" approach is to modify your Fortran code to assist with or to perform a consistent backup.
HDS
Frequent Advisor

Re: Deleting a File that is Open

Hello.

Yes. It is our home-grown code. And, yes...I completely agree that it would be certainly best to just modify that code...and I do agree that such is a best practice...but...

Maybe I am just looking for solutions that really either do not exist or just simply should never be done as a general practice. I admit that this is very possible, but I figured that it would never hurt to ask.

I had mentioned that I would/could just add a CLOSE of that file to the offending routine; unfortunately, sometimes making even the simplest of changes to pre-existing, often-used modules causes issues that would never have been considered during a testing effort. I suppose that I would have liked to identify some "neat trick" or something that would allow me to forgo making changes that may have adverse effects elsewhere.

Oh well...

I do appreciate your information.

Thank you.

-H-
Jess Goodman
Esteemed Contributor

Re: Deleting a File that is Open

If a file is open for read-only access with shared-write allowed by one process, and another process deletes the file, what actually happens is that the file is "marked-for-delete".

The file appears to be deleted already because the directory entry for it is removed, but in fact the file still exists, as SHOW DEVICE/FILES will show. DFU UNDELETE /MARKED can be used to restore the directory entry and "unmark" the file.

Otherwise when the file is closed by the read-only process then it is actually deleted. If the system crashes before the file is closed then the file becomes a "lost" file, along with still being marked-for-delete. ANALYZE/DISK/REPAIR or DFU VERIFY /FIX can be used to delete these lost files.
I have one, but it's personal.
Robert Gezelter
Honored Contributor

Re: Deleting a File that is Open

Howard,

I concur with Hoff and the others, modifying the code to close the file is the definitive option. In fact, the reason it is the "best practice" is that it is the only solution that reliably works as intended without side effects.

I have seen many situations where programs "deleted" files out from under running programs, and have untangled more than a few undesired side effects (e.g., lost database updates). An emergency measure? Perhaps, but so is just doing a STOP/ID=.

Additionally, if I were modifying the code to close the file, I would also verify that the code has a check that the file is actually opened. If that trap is missing, I would add it at the same time, and verify that it actually works as intended with a test jacket. The most likely problem from closing the file is an attempt to read a now-closed file, which would be dealt with by that trap.

- Bob Gezelter, http://www.rlgsc.com
Jon Pinkley
Honored Contributor
Solution

Re: Deleting a File that is Open

HDS,

What do you expect to happen to the application that has the file open if you somehow force a dismount? It isn't going to be able to continue without error.

There are several approaches, the simplest, but most obtrusive to the application, being to kill the process and restart it after you have taken your snapshot.

If you must dismount the disk, that's the only choice I can think of without modifications to the application. Note that if the application only has the file open for read, that shouldn't cause a consistency problem with the file in question, but if you are trying to achieve a clean dismount to "flush", then you will need to have all files on the volume closed, cluster wide.

A cleaner approach would be to have a mechanism to notify the application that it needs to close the file. One method is a "doorbell" lock, essentially the application opening the file for read access would take a PR lock on an agreed upon resource name and have it specify a blocking AST to be delivered if an incompatible lock is requested for the resource. The application would then be responsible to save RMS context (so it can restore when the snapshot is complete), close the file, release the PR lock or convert it to NL, and then requeue another lock and wait for it before reopening the file.

Synchronizing AST-level and non-AST-level code isn't trivial in the general case. Robert Gezelter has several presentations that describe a method where everything is done at AST level and thus avoids synchronization problems: a type of event-driven cooperative multitasking, with ASTs starting the processing and the code returning to $hiber when the processing is done.

Easier, but with more overhead is to have the application open/read/close the file every time it reads a record, with error handling to wait and try again if it gets a device unavailable error on open. It really depends on what the indexed file is used for, and how often it is read, that will determine if that is an acceptable solution.
The opens and repositioning can be optimized somewhat by opening by FID, and using RFA, but it is still expensive compared to just reading a record.
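A minimal sketch of that open/read/close-with-retry pattern, written in DCL only to show the control flow (the real application is Fortran, so this is not its code). The file specification, key value, retry limit, and wait interval are all hypothetical placeholders:

```dcl
$! Hypothetical names: file spec, key value, and retry limits are placeholders
$ tries = 0
$try_open:
$ OPEN/READ/SHARE=READ/ERROR=open_failed INFILE DISKX1:[APP]LOOKUP.IDX
$ READ/KEY="ABC123"/ERROR=read_failed INFILE RECORD
$ CLOSE INFILE                 ! nothing stays open between reads
$ WRITE SYS$OUTPUT RECORD
$ EXIT
$!
$open_failed:
$ save_status = $STATUS        ! keep the OPEN failure status
$ tries = tries + 1
$ IF tries .GE. 12 THEN EXIT 'save_status'
$ WAIT 00:00:05                ! device may be dismounted for the restore
$ GOTO try_open
$!
$read_failed:
$ save_status = $STATUS
$ CLOSE INFILE
$ EXIT 'save_status'
```

Because the file is closed after every read, the only window in which it can block a dismount is the read itself. The cost is one open and one keyed lookup per record.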

There isn't a magic bullet that will cause all applications to close their files, wait for a "go ahead", then re-open the files, restore the previous RMS state, and continue.

Application checkpointing was planned for VAX/VMS V4, but the problem was much more difficult than was originally envisioned, and it never happened. There probably is no way to do it in a backward compatible way. There have been many things that were planned but never released. Some examples: QIO server, VMS snapshot services.

Jon
it depends
Robert Brooks_1
Honored Contributor

Re: Deleting a File that is Open

There have been many things that were planned but never released. Some examples: QIO server, VMS snapshot services.

--

The world should be glad that the QIO Server was never released.

It was a classic example of a project that collapsed under its own weight, even after it was trimmed back to about 25% of the original plan.

-- Rob
Robert Gezelter
Honored Contributor

Re: Deleting a File that is Open

Howard,

The introductory session that Jon referenced is "Introduction to AST Programming". It was last presented at the 2000 Compaq Enterprise Technology Symposium in Los Angeles. The slides from that presentation are available at http://www.rlgsc.com/cets/2000/435.html

- Bob Gezelter, http://www.rlgsc.com
Hoff
Honored Contributor

Re: Deleting a File that is Open

In addition to Bob G's materials referenced earlier, there's http://labs.hoffmanlabs.com/node/617 that might get you going on ASTs, and there are links off to processor cache coherency and barriers and interlocked instructions and synchronization, and off to various code examples from there.

The usual distributed coordination and interprocess notification technique here would involve locks, the lock manager, and http://labs.hoffmanlabs.com/node/492 and such.

A distributed I/O mechanism (whether in the form of QIOserver or one of the more widely available grid engines such as MPI or BOINC or Xgrid) or access to checkpoint-restart and related features would be useful here, as would an API for synchronizing applications and caches with BACKUP activities or with a shadowset split DISMOUNT. That written, these mechanisms are not available with OpenVMS. RMS and the lock manager are available, however, and the application itself can be coded to take the necessary steps to ensure a reliable and consistent archive of the application data. If you're interested in adding transactional control and the ability to run a journal, there are both integrated (though separately licensed) options and there are add-on options and application protocols (e.g., Paxos) that can be used.
John Gillings
Honored Contributor

Re: Deleting a File that is Open

Howard,
Just winding back... I'm trying to understand what you're attempting to achieve.

You have process 1 with an open channel to a file.

Process 2 deletes the file, and you're expecting process 1 to magically look at a new file with the same name? Won't happen!

Answering your direct questions...

Looking at the deleted file. Yes, the directory entry has been removed and the file is marked for delete. Yes, you can dump the header of the file, using the FID. Get the FID from SDA:

Channel  CCB       Window    Status  Device/file accessed
-------  ---       ------    ------  --------------------
0010     7FF06000  00000000          DSA1:
0020     7FF06020  88B03C40          DSA1:(2745,8382,0)

So, the index number is 2745. The simplest way to find the header is with DUMP/FILE. You need an offset into INDEXF.SYS. I thought there was a /INDEX qualifier? Use the index number as a first guess at the block, then treat the difference between the header you land on and the one you wanted as the offset:

$ DUMP/FILE DSA1:[000000]INDEXF.SYS /BLOCK=(COUNT:1,START:2745)

look for resulting FID:
Virtual block number 2745 (00000AB9), 512 (0200) bytes File identification: (2183,5,0)

$ WRITE SYS$OUTPUT 2*2745-2183
3307

This is the block number of the header in INDEXF.SYS:

$ DUMP/FILE DSA1:[000000]INDEXF.SYS /BLOCK=(COUNT:1,START:3307)
...
File identification: (2745,8382,0)
...
File characteristics: Marked for delete
...
Identification area
File name: DELETEME.TXT;1
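
The 2*2745-2183 step above can be spelled out as follows. This is only a sketch of the arithmetic, and it assumes the headers sit in file-number order with no gaps between the block first dumped and the one wanted; the symbol names are made up for illustration:

```dcl
$ fid_wanted = 2745                    ! index number taken from SDA
$ fid_found  = 2183                    ! index number seen when dumping VBN 2745
$ offset = fid_wanted - fid_found      ! headers are shifted by this many blocks
$ header_vbn = fid_wanted + offset     ! same as 2*2745 - 2183
$ WRITE SYS$OUTPUT "offset=''offset', header VBN=''header_vbn'"
```

With the numbers from this example the offset is 562 and the header VBN is 3307, matching the second DUMP command.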

You can work out the directory the file was in by following the backlink.

DFU can do a lot of the hunting for you.

Question 2...

Terminating the image is simple. Use SHOW DEVICE/FILE to list the processes with open files. Use STOP/IMAGE to terminate the image, or STOP/ID to terminate the process. Simple matter to write a procedure to do it automatically, BUT you need to wait until the images run down and the files are closed before dismounting the disk. Killing just the image may also lead to other images executing, opening other files, etc...

Also, beware process permanent files. These will not be closed just by terminating the image. You'll need to take out the process to close them.
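
A bare-bones outline of that sequence, done by hand rather than as a finished procedure. The PID is a hypothetical placeholder to be read from the SHOW DEVICE/FILES output, the wait interval is arbitrary, and stopping another user's process needs suitable privileges:

```dcl
$! 1. See which processes still have files open on the volume
$ SHOW DEVICE/FILES DISKX1:
$!
$! 2. Stop the owning process (PID below is hypothetical; take it from step 1)
$ STOP/IDENTIFICATION=2040011C
$!
$! 3. Give rundown time to finish, then confirm nothing is left open
$ WAIT 00:00:05
$ SHOW DEVICE/FILES DISKX1:
$!
$! 4. Only now attempt the dismount
$ DISMOUNT/CLUSTER DISKX1:
```

Note that in the scenario described, the file is held by the parent of the subprocess doing the restore, and stopping the parent takes its subprocesses with it, so this would have to be driven from an unrelated process.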
A crucible of informative mistakes
Hein van den Heuvel
Honored Contributor

Re: Deleting a File that is Open

Jon Pinkley wrote: "What do you expect to happen to the application that has the file open if you somehow force a dismount? It isn't going to be able continue without error."

IMHO this is the most important observation. You might as well shoot the process. Only if you know the file will never ever be used again could you possibly get away with an outside tool that puts some code in the pool to close the file (a dummy FAB with BLN, COD and IFI filled in) and queues an AST to the process to run that code in its context. This is _hinted_ at in http://www.openvms.compaq.com/doc/82final/5841/5841pro_013.html
" Code that must perform other operations in another process's context (for instance, to execute a system service to raise a target process's quotas) can be written as an OpenVMS Alpha or OpenVMS I64 executive image, as described in Section 4.7.2 "

As Howard himself indicated and many echoed... Just close the file after the operation, with the performance overhead of that (if you need to re-open the same file), and the software revision price to pay.

Async, on request, file close/reopen may be cute but overkill.


John Gillings wrote:

"I thought there was a /INDEX qualifier?"

There is. /IDENTIFIER

But the implementation is broken. (IMHO!)
It uses F$FID_TO_NAME, which is cute but not asked for, and then refuses to make do if that effort fails. F$FID_TO_NAME visits the directory in vain.

So in the marked-for-delete case you get:

$ dump/head/id=72197 sys$login
%DUMP-E-READFIDHEADER, error reading file header for file ID (72197,179,0)
-SYSTEM-W-NOSUCHFILE, no such file

Or when a file is locked exclusively you'll get:

$ dump/head/id=72197 sys$login
%DUMP-E-OPENIN, error opening _EISNER$DRA3:[DECUSERVE_USER.HEIN]TMP.TMP;1 as input
-RMS-E-FLK, file currently locked by another user

The DUMP/FILE_HEADER method against INDEXF.SYS works in both cases. Below you'll find a little command file to facilitate the math.
Usage example:
$ @DUMP_FILE_HEADER_BY_ID.COM sys$login: 72197
:
File characteristics: Marked for delete
:

hth,
Hein.

$ type DUMP_FILE_HEADER_BY_ID.COM
$!
$! read_file_header.com Hein van den Heuvel, July 2006
$!
$! This command file (Hack!) dumps the header of a file,
$! even if it is locked or deleted... if you have the file ID (SDA> SHOW PROC/CHAN )
$!
$! P1 = device, P2 = file number (first part of the file ID)
$!
$! Use the file ID to look up the corresponding file header in INDEXF.SYS
$!
$if p1.eqs."" .or. p2.eqs.""
$then
$ write sys$error "Please provide a DEVICE and FILE-ID as arguments"
$ exit
$endif
$dev = f$parse(p1,,,"device")
$id = f$integer(p2)
$!
$! The file ID is an offset into INDEXF.SYS.
$! INDEXF.SYS starts out with 4 clusters, followed by
$! a bitmap holding one bit for each potential file header.
$! As for every VBN, the count starts at 1.
$!
$indexf_bitmap_vbn = (f$getdvi(dev,"cluster") * 4) + 1
$ibmapsize = (f$getdvi(dev,"maxfiles") + 4095) / 4096
$header_vbn = indexf_bitmap_vbn + ibmapsize + id - 1
$dump/file_header 'dev'[000000]indexf.sys /blo=(start='header_vbn',count=1)
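
To see the arithmetic in the command file above with concrete numbers, here is a sketch that plugs in made-up volume parameters. The cluster size of 16 and the maximum of 400000 files are assumptions for illustration only; the real procedure gets both from F$GETDVI:

```dcl
$ cluster  = 16          ! hypothetical cluster size
$ maxfiles = 400000      ! hypothetical maximum file count
$ id       = 72197       ! file number from the usage example
$!
$ indexf_bitmap_vbn = (cluster * 4) + 1           ! bitmap starts after 4 clusters
$ ibmapsize = (maxfiles + 4095) / 4096            ! 4096 header bits per block
$ header_vbn = indexf_bitmap_vbn + ibmapsize + id - 1
$ WRITE SYS$OUTPUT "bitmap VBN=''indexf_bitmap_vbn', bitmap blocks=''ibmapsize', header VBN=''header_vbn'"
```

With these assumed values, DCL's truncating integer division gives a bitmap VBN of 65, a bitmap size of 98 blocks, and a header VBN of 72359.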


HDS
Frequent Advisor

Re: Deleting a File that is Open

Wow.

Such a wealth of information. Thank you all.

As a whole, you have been able to provide supporting information to my position that I am going to have to change the application. I figured that it was worth asking if there was some sort of "trick" that I could use here without having to modify the application, but...looks like there is not. That is okay.

To respond to some very valid points:
- If that file could be deleted, the application [as I have found] does not go back and attempt to re-read from that lun...so there is no attempt to "read that newly replaced file." I agree, that could never work. I had to tear down the application code (this section of the module is deep within well-nested subroutines) and have since found that there are no attempts to re-read from that lun without first checking to verify that the lun is open. So, the good thing (for me) is that the pointer does not need to be maintained. This very much simplified my getting "approvals" to add and test the simple adding of that single CLOSE statement. We are all a tad surprised that it was not there already. The performance aspect aside (this is not something that happens very often and, when it does, it is just once), we here internally believe also that it is the best approach.

- We (as a development department) are working more and more with AST synch methods...I agree...that would have been nice to have already in the code here. We are working towards more advanced and more flexible synch methods. The links to the presentations will be very useful to me and, likely, several others. Thank you for them. Unfortunately, adding that technique here will require more mods to legacy code than I am permitted to make, so such could not be done here in this case.

I truly thank all of you for all of this information. For me, this forum continues to be like "the wrench to the plumber."

-H-
HDS
Frequent Advisor

Re: Deleting a File that is Open

Hello.

I am grateful for all of the responses. In spite of the fact that no magic trick solution exists for me on this one, the responses were very informative and supported my initial instinctive hunch.

Thank you all.

-H-