Operating System - OpenVMS

What process is writing to what big or many little files?

SOLVED
Clark Powell
Frequent Advisor

What process is writing to what big or many little files?

We've all had this problem: some process is filling up a disk, and we have to find it and kill it before the disk is full. It might be writing one big file, or it might be writing many little ones. I've been lucky and found the culprit before the disk was full, but on a busy system this can be difficult. So I'm fishing for better ideas/programs for identifying the process that's filling the disk, on a system where MON PROC/TOPDIO shows a lot of processes, most or all of which are not guilty, and where SHOW DEVICE/FILE yields a list of a hundred files. Does anybody have a good trick for doing this?
11 REPLIES
Phillip Thayer
Esteemed Contributor

Re: What process is writing to what big or many little files?

I've had good luck with:

DIR/SIZE=ALL/DATE/SELECT=SIZE=MINIMUM=100000 [*...]*.*;*/OWNER

This will give a list of all the files larger than 100,000 blocks, as well as when each was created and who the owner is. The number can be adjusted to whatever value you need. Aside from that, I would suggest writing a command procedure that uses SHOW DEVICE/FILES/OUTPUT=tmp-file-name, then opens tmp-file-name and reads each file name, looking for files larger than a value passed in P1 to the procedure.

As for creating a large number of small files: if they all have the same name, then you could use SET FILE/VERSION_LIMIT=n to automatically keep more than that number of versions from being present in the directory at a time. If the file names are not the same... well, I will have to think on that one a little more.
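The command procedure Phil describes (capture a file listing, then check each file against a size threshold) can be sketched outside DCL as well. This is a hypothetical Python illustration, not Phil's actual procedure: it assumes you have already reduced a captured SHOW DEVICE/FILES or DIRECTORY listing to plain file names, one per line, and it uses a 512-byte OpenVMS disk block as the unit.

```python
import os

BLOCK_SIZE = 512  # an OpenVMS disk block is 512 bytes


def files_over_threshold(names, min_blocks):
    """Return (name, blocks) pairs for files at least min_blocks in size.

    `names` is an iterable of file paths, e.g. parsed out of a captured
    SHOW DEVICE/FILES listing.  Files that vanished between the listing
    and the stat() call are silently skipped.
    """
    hits = []
    for name in names:
        try:
            size = os.stat(name).st_size
        except OSError:
            continue  # file deleted or inaccessible since the listing
        blocks = -(-size // BLOCK_SIZE)  # round up to whole blocks
        if blocks >= min_blocks:
            hits.append((name, blocks))
    # largest files first, like DIR/SELECT=SIZE=MINIMUM sorted by size
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

The threshold plays the role of P1 in Phil's procedure; on a real system you would run this repeatedly and watch which names grow between passes.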

Phil
Once it's in production it's all bugs after that.
Jan van den Ende
Honored Contributor

Re: What process is writing to what big or many little files?


if many small files MIGHT be the issue,
$ DIR/SIZE/SIN="-0:10"
will show any files less than 10 minutes old.
Adjust the time value to your needs, but realise that evaluating the dates of MANY files, especially in BIG directories, requires some time itself.

OTOH, this WILL in one pass eliminate the "many small file creation" option.

hth,

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Jess Goodman
Esteemed Contributor

Re: What process is writing to what big or many little files?

Using the DFU freeware utility is much quicker for searching a disk than using DIRECTORY [*...].

$ DFU SEARCH device: -
/SIZE=MINIMUM=10000 /ALLOCATED -
/CREATED=SINCE=-2 !large files last 2 hours

$ DFU SEARCH device: /CHARACTERISTICS=DIR -
/SIZE=MINIMUM=100 !large directories

$ DFU DIRECTORY device: /VERSION=1000 !many versions
I have one, but it's personal.
Jim_McKinney
Honored Contributor
Solution

Re: What process is writing to what big or many little files?

Will DFU locate and report on files not entered into a directory (f.ex. a file that results from opening a device spooled to an intermediate disk that is associated with a print queue)?
Hein van den Heuvel
Honored Contributor

Re: What process is writing to what big or many little files?

Large files are probably easy enough to spot, but lots of little ones would be harder to catch. Things like RMS stats are useless for that. As are DIRIO and BUFIO counts. The SDA LCK tool might catch volume lock activity?

A directory for recent files is a reasonable approach. DIR/SINCE [*...] is probably too slow for that though. DFU is recommended instead.

DFU is fast because it walks INDEXF.SYS and deals with directories as an afterthought (LIB$FID_TO_NAME).
Thus it will also catch temp files which are not entered into a directory, or which perhaps move between directories.

Hein.
Robert Gezelter
Honored Contributor

Re: What process is writing to what big or many little files?

Clark,

The solution is quotas, in this particular case, disk quotas.

Disk quotas stop the offending process dead. The trade-off is that it is possible to terminate a job which needed "just one more block"; that is the cost of the safety of preventing a runaway process from filling a disk.

I often recommend that clients consider this approach. In these relatively disk space rich days, I recommend that the quotas be somewhat high (say 2+ times the anticipated amount), but any reasonable quota short of infinity will have the desired effect.

My first encounter with this type of problem was back on about VAX/VMS Version 2.0. My officemate left something running overnight, and it produced a bit more printed output to the printer (spooled to the system device) than intended. In the morning, the Operations manager was, to put it politely, rather upset.

This problem occurs with both spooled files and disk resident files, and the solution is, admittedly, far older than VMS.

- Bob Gezelter, http://www.rlgsc.com
Hein van den Heuvel
Honored Contributor

Re: What process is writing to what big or many little files?

Yeah, disk block quotas are the right way to solve this.
I just have not been on a system that had them enabled for ages.

Even if you set them to near infinite, you can still use the report, and specifically the DIFFERENCE between a current report and 'yesterday's' report, to find the heavy users (yesterday could also be last hour or last week, of course).
You'd want a Perl or DCL script to report the changes by username.

Of course this is only useful if distinct usernames are actually being used, like they should be.
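The diff-the-quota-reports idea above can be sketched briefly. This is a hypothetical illustration (Hein suggests Perl or DCL; Python is used here), and it assumes you have already parsed the two quota reports into simple username-to-blocks-used mappings:

```python
def quota_growth(previous, current):
    """Report per-username growth in disk blocks between two snapshots.

    `previous` and `current` map username -> blocks used, as parsed
    from two quota reports taken at different times.  Returns
    [(username, delta)] pairs, heaviest growth first; usernames that
    appear only in `current` count from zero.
    """
    deltas = []
    for user, used in current.items():
        delta = used - previous.get(user, 0)
        if delta > 0:
            deltas.append((user, delta))
    return sorted(deltas, key=lambda pair: pair[1], reverse=True)
```

Run against an hourly snapshot, the top of this list is the username to go look at with SHOW DEVICE/FILES.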

Hein.
Clark Powell
Frequent Advisor

Re: What process is writing to what big or many little files?

Great Ideas! Thanks.
Mark Hopkins_5
Occasional Advisor

Re: What process is writing to what big or many little files?

You can also use the XFC SDA extension to find a runaway process doing IO. This requires recent versions, i.e. the XFC remedials for V7.3-2 and following, released last November.

In SDA -

SDA> xfc set trace/select=level=2

Since the display here is 132 columns, I've included an example in an attachment.

A little explanation:

The level=2 trace level starts XFC recording reads, writes, and file system calls into XFC. Higher levels of tracing turn on more detailed debugging trace points. For reads and writes, the starting and ending point of each IO is captured; in addition, the latency of the IO is reported. The trace displays as much of the file name as possible, along with the PID of the process doing the IO. We get the VBN and IO size as well. The first number on the line is a sequence number of the trace entry (not very interesting). The second number is a sequence number which is incremented for every IO (all IOs, not just XFC IOs). This allows you to match up IO starts and completions on very busy systems.

The latency is measured using the system cycle counter and is not valid if the IO completes on a different processor. The latency only includes time within XFC itself; it does not include overhead added by the QIO call or RMS. Cache hits are noted with an asterisk next to the operation description.

The overhead for the level 2 tracing is small, but measurable. The trace entries are kept in a ring buffer containing only 2000 entries. At this time, there isn't any way to increase the size of the buffer at runtime (maybe next version).

In SDA, the XFC SHOW VOLUME/BRIEF and XFC SHOW FILE/BRIEF commands may also be useful in this kind of situation.

To set the tracing back to the default:

SDA> XFC SET TRACE/SELECT=LEVEL=1


Mark