Operating System - HP-UX

disk space "disappeared" with memory-leak process

 
SOLVED
Wei_9
Occasional Advisor

Our 2 GB /opt file system was full, but when we ran
du -ks . in /opt, it reported only about 1 GB of files.

We killed a process and the disk space under /opt was released.

What caused the /opt file system to fill up?

We know that process has a memory leak, but why would that fill the /opt file system?

How do we fix the problem?

12 REPLIES
Steve Bonds
Trusted Contributor

Re: disk space "disappeared" with memory-leak process

Even if a file is deleted, UNIX doesn't release the disk space until the file is closed. Someone removed a file that was open and growing, and once the process was killed the system freed the space.
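
A minimal Python sketch of this behaviour, for illustration only (it is not from the original post, and the function name is made up). It writes to a temporary file, removes the name while the descriptor is still open, and reports what the kernel still tracks through that descriptor.

```python
import os
import tempfile


def demo_unlinked_file(size=1024 * 1024):
    """Write `size` bytes, unlink the file while it is open, and inspect it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)          # the name is gone, so du can no longer find it
        info = os.fstat(fd)      # the open descriptor still reaches the inode
        return {
            "name_exists": os.path.exists(path),
            "link_count": info.st_nlink,
            "size_bytes": info.st_size,
        }
    finally:
        os.close(fd)             # only after the last close can the space be freed


if __name__ == "__main__":
    print(demo_unlinked_file())
```

The file has no name and no links, yet its data is still intact until the descriptor is closed.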

-- Steve
A. Clay Stephenson
Acclaimed Contributor

You fix the problem by having the application developers fix the code. It doesn't surprise me that sloppy programmers with memory leaks also place temporary files in non-standard directories. The "correct" location is /var/tmp (or at least /tmp, but that is discouraged). I strongly suspect that they are creating temp files in the current working directory, so you could probably help the situation by cd'ing to another directory before starting the application.
If it ain't broke, I can fix that.
Michael Steele_2
Honored Contributor

Memory leaks, by definition, consume physical memory. Occasionally they get bad enough to consume swap.

Writing to disk is not a memory leak.

But if your file system /opt is filling up, then attach the output of these commands and I'll help:

quot /opt

find /opt -xdev -ctime 0 -exec ll {} \;

du -k /opt | sort -rn | head -30

find /opt -name core
Support Fatherhood - Stop Family Law
Steven E. Protter
Exalted Contributor

One thing, temporary processes should not be using space on /opt.

Even after you kill the processes, the space is still reserved in many instances.

The prior posts are right, the app needs to be dealt with. Seems there were not good rules for the programmers on this project.

A reboot will clear the space if I am right.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Wei_9
Occasional Advisor

The problem is that when I run

bdf /opt (2 GB), the file system is 100% full.

Then I run du -ks under /opt,
and it shows LESS than 1 GB.

Where has the other 1 GB gone?

We checked with the developers about the process:
the program has NO file activity under /opt. The process is a daemon. We need to recycle this daemon every two weeks to release the disk space.

Michael Steele_2
Honored Contributor

You have a runaway process. And because /opt is a part of the O/S, only a reboot will clear this.
Support Fatherhood - Stop Family Law
Steve Bonds
Trusted Contributor
Solution

The developers are wrong about the program not having any files open on /opt if killing this process releases the space.

The third-party program "lsof" can show you the files that a process has open, even if they have been deleted. (Obviously they won't have a filename any more, but you'll be able to tell which filesystem they were on.)

bdf shows the actual allocated blocks on the filesystem. "du" only shows allocations that still have a filename associated with them; this is why you see the difference.
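
As a rough illustration of the two views (a Python sketch, not what bdf or du actually do internally): statvfs asks the filesystem how many blocks are allocated, while a directory walk can only add up files that still have a name. The function names are hypothetical, and the 512-byte unit for st_blocks is an assumption that holds on Linux but should be checked on other systems.

```python
import os


def used_bytes_by_filesystem(mount_point):
    """Allocated space as the filesystem itself reports it (the bdf/df view)."""
    vfs = os.statvfs(mount_point)
    return (vfs.f_blocks - vfs.f_bfree) * vfs.f_frsize


def used_bytes_by_names(mount_point):
    """Space reachable through directory entries (the du view)."""
    root = os.lstat(mount_point)
    seen = {root.st_ino}
    total = root.st_blocks * 512          # assumption: st_blocks is in 512-byte units
    for dirpath, dirnames, filenames in os.walk(mount_point):
        same_fs_dirs = []
        entries = [(d, True) for d in dirnames] + [(f, False) for f in filenames]
        for name, is_dir in entries:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue                  # vanished or unreadable entry
            if st.st_dev != root.st_dev:
                continue                  # another filesystem is mounted here
            if is_dir:
                same_fs_dirs.append(name)
            if st.st_ino not in seen:     # count hard-linked files only once
                seen.add(st.st_ino)
                total += st.st_blocks * 512
        dirnames[:] = same_fs_dirs        # do not descend into other filesystems
    return total
```

Calling both functions on the same mount point (for example "/opt") and subtracting the second result from the first gives an estimate of space held by files that no longer have a name, plus filesystem metadata overhead.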
Wei_9
Occasional Advisor

processName 29836 userId 2w VREG 64,0x5 1205529600 16392 /opt (/dev/vg00/lvol5)

Above is the lsof output. How can we determine why the developers' program is using the /opt file system? It also uses /opt/oracle and /opt/tuxedo for the Oracle and Tuxedo libraries.
Wei_9
Occasional Advisor

Also, I just noticed that our developers and their programs do not have permission to write files in /opt.

/opt is owned by bin:bin.

So how can the process be using /opt?
Vitek Pepas
Valued Contributor

They may have permission to write to one of the subdirectories, or the program may be owned by root and have the set-user-ID (setuid) bit set.
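
Both possibilities can be checked with a few lines of Python. This is only a sketch with made-up helper names; note that os.access tests the real user ID of whoever runs the script, so it would need to be run as the daemon's user to be meaningful.

```python
import os
import stat


def is_setuid(path):
    """True if the set-user-ID bit is set on the file at `path`."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def writable_dirs(root):
    """Yield directories under `root` in which the current user can create files."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Creating a file needs both write and search (execute) permission.
        if os.access(dirpath, os.W_OK | os.X_OK):
            yield dirpath
```

For example, list(writable_dirs("/opt")) would show where under /opt the current user can write, and is_setuid can be pointed at the daemon's executable.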
Michael Steele_2
Honored Contributor

lsof -p pid

-or-

lsof -p 29836 (* your runaway process pid *)

This will list all open files.

Alternately,

# lsof /opt (* this may provide too much info *)
Support Fatherhood - Stop Family Law
Steve Bonds
Trusted Contributor

The "2w" in the lsof listing corresponds to file descriptor 2, which is stderr. My guess is when the process is started the stderr gets redirected somewhere in the /opt filesystem (not necessarily immediately under /opt), then this file is removed, perhaps to "save space". The offset reported by lsof ("1205529600") is also suspiciously close to the 1GB you report as "missing" in the du output.
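
To make that reading of the lsof line concrete, here is a small Python sketch that splits the line into lsof's usual columns (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) and converts the size/offset to GiB. The parser is hypothetical and deliberately simple: it assumes a plain decimal SIZE/OFF value, as in the line posted above.

```python
import re

SAMPLE = "processName 29836 userId 2w VREG 64,0x5 1205529600 16392 /opt (/dev/vg00/lvol5)"


def parse_lsof_line(line):
    """Split one lsof output line into the fields used in this thread."""
    command, pid, user, fd_field, ftype, device, size_off, node, name = line.split(None, 8)
    match = re.match(r"(\d+)([a-z]?)", fd_field)   # e.g. "2w" is descriptor 2, write mode
    return {
        "command": command,
        "pid": int(pid),
        "fd": int(match.group(1)) if match else None,
        "mode": match.group(2) if match else "",
        "size_off": int(size_off),                 # assumption: plain decimal value
        "name": name,
    }


entry = parse_lsof_line(SAMPLE)
gib = entry["size_off"] / 2**30
print(f"fd {entry['fd']} ({entry['mode']}): {gib:.2f} GiB")  # fd 2 (w): 1.12 GiB
```

1205529600 bytes is about 1.12 GiB, which lines up with the roughly 1 GB that bdf counts and du does not.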

Go look at the method by which this program is started and I bet you'll see the problem.

A more correct way to do this is to redirect stdout/stderr to /dev/null if not needed, or to a file in /var if needed.
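
A sketch of that kind of start-up wrapper in Python, under the assumption that the daemon is launched from a script. The function name and any log path passed to it are hypothetical, and this is not how the program in this thread is actually started.

```python
import subprocess


def start_quietly(argv, log_path=None):
    """Start `argv` with stdout/stderr sent to `log_path`, or to /dev/null if None."""
    if log_path is None:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    # Append mode: if the log is later truncated, new output continues at the
    # new end of the file instead of at the old offset.
    with open(log_path, "ab") as log:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
        )
```

If the log does need trimming while the daemon runs, truncating it in place releases the blocks, whereas removing it leaves them allocated until the daemon exits, which is the situation described in this thread.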

The best way to do it is to follow the conventions for daemon programming described in the excellent book "Advanced Programming in the UNIX Environment" by W. Richard Stevens. In my copy, this is chapter 13. It contains all kinds of good information on how to avoid problems with daemon processes.