System Administration

why diskspace claimed in /var?

SOLVED

Hi,

HP-UX 11iv1 and Oracle 10.2.0.4

We've noticed that the available space in /var is getting lower during the day. In the morning '%used' is around 11% but as time goes by:

Filesystem          kbytes    used   avail %used Mounted on
/dev/vg00/lvol8    4718592 2470384 2233040   53% /var

Yesterday %used was 99%.

A look with lsof suggested that frmweb processes could be the cause.

Excerpt from the output of # /usr/local/bin/lsof /var:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
frmweb 27122 oracle mem REG 64,0x8 532 142 /var/spool/pwgr/status
frmweb 27122 oracle 28u unix 64,0x8 0t0 23953 /var/spool/sockets/pwgr/client27122 (0x566bf580)
frmweb 27122 oracle 45u REG 64,0x8 1268464 24115 /var (/dev/vg00/lvol8)
frmweb 27123 oracle mem REG 64,0x8 532 142 /var/spool/pwgr/status
frmweb 27123 oracle 28u unix 64,0x8 0t0 23957 /var/spool/sockets/pwgr/client27123 (0x83c57a00)
frmweb 27903 oracle mem REG 64,0x8 532 142 /var/spool/pwgr/status
frmweb 27903 oracle 28u unix 64,0x8 0t0 23803 /var/spool/sockets/pwgr/client27903 (0x72d2b340)
frmweb 27903 oracle 45u REG 64,0x8 18676144 23807 /var (/dev/vg00/lvol8)

The REG would mean a regular file but only /var is mentioned without a filename.

We are able to redirect the claim by these processes onto another filesystem.

Can anyone shed some light on what is happening here? Can we limit the claim (how)?

TIA,
Roland
15 REPLIES
Laurent Menase
Honored Contributor

Re: why diskspace claimed in /var?

You should run

du -sk /var/*

every 20 minutes and see where the space is growing. Usually it is /var/adm/syslog that grows.

Then look at how the usage evolves during the day.
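
One way to compare two such snapshots is sketched below. The function name du_growth and the snapshot paths are made up for the example, and it assumes du prints "size<TAB>path" lines, which is worth checking on your own system first.

```shell
#!/bin/sh
# Sketch only: du_growth and the snapshot file names are hypothetical.

# du_growth OLD NEW
# Print how many KB each entry grew between two `du -sk` snapshots,
# largest growth first. Entries that only exist in NEW are printed in full.
du_growth() {
    awk -F'\t' '
        FILENAME == ARGV[1] { old[$2] = $1; next }   # first file: remember old sizes
        ($2 in old) && $1 > old[$2] { printf "%d\t%s\n", $1 - old[$2], $2 }
        !($2 in old)                { printf "%d\t%s\n", $1, $2 }
    ' "$1" "$2" | sort -rn
}

# Possible use, every 20 minutes (not run here):
#   du -sk /var/* > /tmp/var.du.old
#   sleep 1200
#   du -sk /var/* > /tmp/var.du.new
#   du_growth /tmp/var.du.old /tmp/var.du.new
```

Keep in mind that du only counts files that still have a directory entry, so space held by files that were already removed but are still open will not show up in these snapshots.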
TTr
Honored Contributor

Re: why diskspace claimed in /var?

Have you looked in /var/tmp? Processes create and delete temporary files in there. Also, do you do any printing on this server? If so, /var/spool/lp can grow as well. The fact that /var usage goes up and down by itself indicates that the environment is running quite smoothly on your server: the processes are cleaning up after themselves. Check with the du command as mentioned. You may still find some stuff left behind that you can clean up.

Re: why diskspace claimed in /var?

Hi Laurent,

We have checked for files in /var but nothing came out of it. As I've said: "We are able to redirect the claim by these processes onto another filesystem."

We then see the same behaviour there. When the frmweb processes go away, %used drops back to the value observed in the morning.

For clarification I should add this is an Oracle application server.

Re: why diskspace claimed in /var?

Hi TTr,

Nothing unusual in /var/tmp and no printing from this server.
Our /var is about 4 GB large and usage is observed to rise to 99%.
If you look at my first post you can see that PID 27903 has claimed 18 MB.

While %free was shrinking we checked /var for new or growing files, but nothing turned up.
James R. Ferguson
Acclaimed Contributor
Solution

Re: why diskspace claimed in /var?

Hi Roland:

A very common programming technique is to create a temporary file and immediately unlink() it. This leaves the file (and its space) available for the duration of the program but automatically causes its removal when the program using it terminates. One advantage is that no epilog (cleanup) code needs to be written.
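
As a rough illustration of the technique (not what frmweb does internally, which this thread does not show), the shell sketch below opens a scratch file on descriptor 3, removes its name, and keeps writing to it. The function name and the scratch path are made up for the demo.

```shell
#!/bin/sh
# Sketch only: demo_unlinked_tmp and the scratch file name are hypothetical.

demo_unlinked_tmp() {
    scratch="${TMPDIR:-/tmp}/unlink-demo.$$"

    exec 3<>"$scratch" || return 1   # create the file, open it read/write on fd 3
    rm -f "$scratch"                 # unlink it: the name is gone, the inode is not

    echo "scratch data" >&3          # writes still consume space on the filesystem

    if [ -e "$scratch" ]; then
        echo "name still present"
    else
        echo "name gone, descriptor still open"
    fi

    exec 3>&-                        # closing the last descriptor frees the space
}

demo_unlinked_tmp
```

While descriptor 3 is open, ls and du cannot see the file, but bdf still counts its blocks as used.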

Do:

# lsof +D /var +L1

Look for any files with an NLINK value of zero (0). These are files with a zero link count that will vanish when the last process using them terminates. The SIZE/OFF column gives the size in bytes of the file in question.
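
To total that space per process, something like the sketch below could be used. The function name sum_unlinked is invented, and it assumes the column layout lsof prints when +L is given (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME), so check the field positions against your own output.

```shell
#!/bin/sh
# Sketch only: sum_unlinked is a hypothetical helper.
# Assumes field 5 = TYPE, field 7 = SIZE/OFF, field 8 = NLINK.

# Read lsof output (with the NLINK column) on stdin and print
# "COMMAND PID BYTES" for unlinked regular files, largest first.
sum_unlinked() {
    awk '
        $5 == "REG" && $8 == 0 { bytes[$2] += $7; cmd[$2] = $1 }
        END { for (pid in bytes) printf "%s %s %.0f\n", cmd[pid], pid, bytes[pid] }
    ' | sort -k3,3nr
}

# Possible use (not run here). The lsof manual describes the form
# "+aL1 <file_system>" for selecting unlinked open files on one filesystem:
#   lsof +aL1 /var | sum_unlinked
```

The totals are only as good as the SIZE/OFF column: where lsof reports an offset instead of a size, the figure is an approximation.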

Regards!

...JRF...
Bill Hassell
Honored Contributor

Re: why diskspace claimed in /var?

/var is the fastest growing filesystem in HP-UX and requires constant monitoring and maintenance. The output from lsof is not useful because it does not indicate the size of the directories. Use this command:

# du -kx /var | sort -rn | head -20

Now you can see where the space is being allocated. You might have /var/spool/lp very large due to printer problems. Or /var/mail might be very large due to mail messages. Most of the system logfiles are kept in /var/adm/. You need to find the biggest directories first, then look at the files in those directories. You may find that frmweb is not the problem at all.

Before making any changes, run a backup of vg00. Then start by looking at /var/adm and /var/adm/syslog to see if you have extremely large (dozens of MB) logfiles. Use the command:

# ll /var/adm | sort -rnk5 | head -20

You need to decide how to trim those big files. Don't remove them! Many are required for proper operation of the system and require specific permissions and ownership. Instead, make a copy of the current file (/tmp is a possible location) and then truncate the file using something like this:

# bdf /var
# cp /var/adm/syslog/syslog.log /tmp/syslog.log.2009-10-02
# cat /dev/null > /var/adm/syslog/syslog.log

Now you will have recovered some space in /var:

# bdf /var

Most production systems have auditing requirements to keep system logfiles for a year or more so compress the old log:

# compress /tmp/syslog.log.2009-10-02

And this will reduce the size of the old log. Then move the compressed file back to /var/adm/syslog. The reason is that /tmp may be erased during a bootup and you want all the related log files in the same directory:

# mv /tmp/syslog.log.2009-10-02.Z /var/adm/syslog

Repeat the cp/cat/compress/mv commands for other big logfiles.
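
The repeated steps could be wrapped in a small function, roughly as sketched here. The name trim_log and its arguments are invented, the gzip fallback is an addition for systems without compress, and any lines written to the log between the copy and the truncate are lost.

```shell
#!/bin/sh
# Sketch only: trim_log is a hypothetical helper, not an HP-UX command.

# trim_log LOGFILE [STAGING_DIR]
# Copy the log to a staging directory, truncate the original in place,
# compress the copy and move it back next to the log.
trim_log() {
    log=$1
    stage=${2:-/tmp}
    name=$(basename "$log").$(date +%Y-%m-%d)

    cp "$log" "$stage/$name" || return 1    # 1. keep a dated copy
    cat /dev/null > "$log"                  # 2. truncate; owner and mode stay as they are

    # 3. compress the copy: compress(1) adds .Z, gzip adds .gz
    if command -v compress >/dev/null 2>&1; then
        compress -f "$stage/$name" && suffix=.Z
    else
        gzip -f "$stage/$name" && suffix=.gz
    fi || return 1

    # 4. move the compressed copy back beside the live log
    mv "$stage/$name$suffix" "$(dirname "$log")/"
}

# Possible use (not run here):
#   trim_log /var/adm/syslog/syslog.log /tmp
```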

You may find that /var/adm/sw is quite large (from the previous du command). Use the cleanup command to commit old patches that have been superseded several times. A good starting point is:

# cleanup -c 2

This will remove archive information about patches that have been replaced (superseded) more than 2 times.

There are other directories in /var that may require work. Use the du listing above to determine where the most space is being used. For help cleaning up /var, post a listing of the above du command and we can suggest fixes for the space. Note that maintaining free space on /var, /tmp, /usr and / is one of the most important tasks for a system administrator. Oracle will shut down (sometimes with errors) when certain filesystems fill up.


Bill Hassell, sysadmin

Re: why diskspace claimed in /var?

Hi James,

I guess I have to ask our DBAs whether they can limit the claim from these frmweb processes. Thanks for the push in the right direction.

@Bill: Hi Bill,
I'm afraid housekeeping is not going to do the trick here. Like I've said, in the morning I can have more than 3 GB free and a couple of hours later %used on /var could be at 99%.
TTr
Honored Contributor

Re: why diskspace claimed in /var?

I think sockets do not use disk space but rather memory buffers. I think you should look for actual files. Check periodically with "du" during the 99% var usage. You should still try and clean up so that you have more disk space available to be used by running processes.
Matti_Kurkela
Honored Contributor

Re: why diskspace claimed in /var?

/var/spool/pwgr/status and /var/spool/sockets/pwgr/client* are related to the pwgrd daemon, i.e. the Password and Group Hashing and Caching Daemon.

/var/spool/pwgr/status is the pwgrd status file: I think it should never grow in size significantly.

/var/spool/sockets/pwgr/client* are Unix sockets that allow the application to use the pwgrd's cache. These should only consume directory entries, not any real disk space.

The use of pwgrd is a feature of the PAM authentication libraries, so any program that uses the authentication facilities of the system may be accessing /var/spool/pwgr/status and creating its own /var/spool/sockets/pwgr/client* socket.

If you suspect pwgrd is consuming excessive disk space and you aren't using NIS or LDAP and don't have a very large number of local Unix user accounts, you can disable pwgrd entirely.

# sh /sbin/init.d/pwgr stop

Then edit /etc/rc.config.d/pwgr and change the PWGR value from 1 to 0.

MK
Dennis Handly
Acclaimed Contributor

Re: why diskspace claimed in /var?

>The REG would mean a regular file but only /var is mentioned without a filename.

If these are the removed files, that may make sense?
I've seen other cases where you need to do a find(1) to get the names:
find /var -inum 24115 -o -inum 23807

Note that the file offset of the larger one is about 18676144.

>We are able to redirect the claim by these processes onto another filesystem.

You can, how? TMPDIR?
That is probably all you can do.

Re: why diskspace claimed in /var?

Hi TTr / Matti,

my question is limited to these processes:
frmweb 6531 oracle 45u REG 64,0x8 653744 0 8351 /var (/dev/vg00/lvol8)
frmweb 7315 oracle 45u REG 64,0x8 179501488 0 4898 /var (/dev/vg00/lvol8)
frmweb 8297 oracle 45u REG 64,0x8 426944 0 8355 /var (/dev/vg00/lvol8)
frmweb 8422 oracle 45u REG 64,0x8 16222640 0 6451 /var (/dev/vg00/lvol8)

Hi Dennis,

Yes, you are right: we do the redirection by redefining TMPDIR.

Part of my question is still unanswered:
Can we limit the claim (how)?
TTr
Honored Contributor

Re: why diskspace claimed in /var?

If you removed files while they were in use and in the "open" state, their space was not released.

One of two things is usually happening. Either the process is still active and still writing to the files, in which case you have to stop the process that is holding them open. Then the files will disappear and the space will be released.
Or the process that uses the deleted open file is hung, in which case you have to kill it. Sometimes the process will not respond to any kill signal, and the only way to get rid of it and reclaim the space is to reboot the server.
Dennis Handly
Acclaimed Contributor

Re: why diskspace claimed in /var?

>Can we limit the claim (how)?

Possibly by using TMPDIR to point to a filesystem with quotas. But this may just abort those processes.
Have you found out what's in these files? If they are logs, perhaps you can configure them to not be so verbose?
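
A minimal sketch of the TMPDIR redirection follows. The wrapper name run_with_tmpdir and the directory in the usage comment are assumptions, and how the frmweb processes actually inherit their environment on an Oracle application server is not covered in this thread.

```shell
#!/bin/sh
# Sketch only: run_with_tmpdir and /oratmp are hypothetical.

# run_with_tmpdir DIR COMMAND [ARGS...]
# Run COMMAND with TMPDIR pointing at DIR, after checking DIR is usable.
run_with_tmpdir() {
    dir=$1
    shift
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "unusable TMPDIR: $dir" >&2
        return 1
    fi
    env TMPDIR="$dir" "$@"
}

# Possible use (not run here), with /oratmp on its own filesystem:
#   run_with_tmpdir /oratmp some_start_script
```

This only moves the scratch files. A program that ignores TMPDIR will keep writing where it did before, and a full target filesystem (or a quota) will make the writes fail rather than shrink them.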

Re: why diskspace claimed in /var?

Hi TTr,

please take the time to read James' post in this thread. He gave a very clear explanation of what is happening.

Hi Dennis,

We suspect this space is used by Oracle Forms processes to manipulate data. For example, a form has a button that selects all records matching certain criteria, which can then be sorted on date or whatever. The sort needs the whole set of records and not just a subset.

The application programmers are going to redesign the forms by making use of temporary tables.

For now we're going to add an extra filesystem and redefine TMPDIR for those frmweb processes.

Thank you all very much for your time and thoughts.
