Operating System - HP-UX

Re: /var file system is full

 
SOLVED
Stephan Louw
Occasional Contributor

/var file system is full

Hi All

I'm having a bit of a problem on my production database server. The /var file system reports as full when running the 'bdf' command, but the 'du' command reports otherwise.

I trimmed all possible log files, but the /var fs just keeps filling up as soon as I free up space. I am concerned that there might also be a "memory leak" somewhere, as 'glance' has been reporting memory utilization at 99% for two weeks now. Additionally, I found some samx processes running which I cannot kill using a plain 'kill PID'.


Please find the command output below:


# bdf
Filesystem kbytes used avail %used Mounted on
/dev/vg00/lvol3 1572864 541872 1023000 35% /
/dev/vg00/lvol1 1835008 261032 1561784 14% /stand
/dev/vg00/lvol8 8912896 8912896 0 100% /var
/dev/vg00/lvol7 10682368 2640160 7979440 25% /usr
/dev/vg00/lvol5 2097152 1018272 1070512 49% /tmp
/dev/vg00/lvol6 11599872 4597496 6947696 40% /opt
/dev/vg00/lvol4 1048576 18544 1022048 2% /home


# du -sk /* |sort -n
0 /bin
0 /cdrom
0 /lib
0 /lost+found
0 /net
8 /AUTO
8 /history
1832 /home
150128 /etc
162224 /sbin
204928 /dev
244232 /stand
1001152 /tmp
1063128 /var ---> (1GB?)
2615520 /usr
6202938 /opt


# du -sk /var/* |sort -n
0 /var/X11
0 /var/empty
0 /var/home
0 /var/lost+found
0 /var/news
0 /var/uucp
8 /var/tombstones
16 /var/statmon
16 /var/tmp
24 /var/dt
40 /var/run
96 /var/yp
104 /var/preserve
112 /var/spool
544 /var/vx
1016 /var/sam
1440 /var/mail
21536 /var/stm
22448 /var/jail
147768 /var/opt
867952 /var/adm



more /var/adm/syslog/syslog.log:
.......
Oct 1 15:01:21 BWMAINRX vmunix: vxfs: NOTICE: msgcnt 77 mesg 001: V-2-1: vx_nospace - /dev/vg00/lvol8 file system full
(1 block extent)
.......



# ps -ef | grep sam
root 3803 3597 0 22:37:33 pts/4 0:00 grep sam
root 20860 20857 0 Oct 1 ? 0:00 sh -c /usr/sam/lbin/samx -C -s monitorConfiguration /usr/lib/X11/Xserver/misc/sam/C/sam.ui
root 20861 20860 255 Oct 1 ? 34822:44 /usr/sam/lbin/samx -C -s monitorConfiguration /usr/lib/X11/Xserver/misc/sam/C/sam.ui
root 23489 1 0 Oct 1 ? 0:00 /usr/sbin/stm/uut/bin/tools/monitor/dm_memory_asama
root 23630 1 0 Oct 1 ? 0:00 /usr/sbin/stm/uut/bin/tools/monitor/msamon
root 23595 1 0 Oct 1 ? 0:00 /usr/sbin/stm/uut/bin/tools/monitor/ia64_corehw_asama


I would like to fix the problem without rebooting the server.

- Will it help, and can I unmount /var and remount it while the server is in production?

- How can I determine which process is filling up all my RAM? (My database software is configured to use only 17GB at most; the total is 20GB.)

thx






8 REPLIES
Ivan Krastev
Honored Contributor

Re: /var file system is full

Check with find for:
- larger files
- last modified files

Check /var/adm/syslog/* for any large files and truncate or remove them.

regards,
ivan
James R. Ferguson
Acclaimed Contributor
Solution

Re: /var file system is full

Hi Stephan:

> The /var file system reports as full when running the 'bdf' command, but the 'du' command reports otherwise. I trimmed all possible log files, but the /var fs just keeps filling up as soon as I free up space.

OK, there is probably a process with an open temporary file (see below):

> I am concerned that there might also be a "memory leak" somewhere, as 'glance' has been reporting memory utilization @ 99% for two weeks now.

That's another issue unto itself. It doesn't take a memory leak, either, to consume all of your memory.

> Additionally, I found some samx processes running which i cannot kill using straight 'kill PID'.

That too isn't an issue. A simple 'kill' can be trapped by a process. It may take a particular signal (e.g. 'kill -HUP') to terminate the process. If a 'kill -9' won't remove a process, then that process is probably waiting for an I/O or other kernel event.

Try regaining some space by using 'cleanup -c 1' to remove rollback copies of superseded patches.

# cleanup -c 1

This is the principal safe way to regain space in '/var/adm/sw' and it can return significant amounts.

If a process opens a file and immediately 'unlink's it (removes it), then you won't see any evidence of the file in a simple 'ls' listing. The process can then write to the file for the whole of the process's life, consuming disk space as it goes. This is actually a common technique. When the last process using the file closes it, all allocated disk blocks are returned to the system for reuse. The 'lsof' utility, available from the HP-UX Porting Center, can help find the files and processes involved if this is the case.
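
The open-then-unlink behaviour described above can be reproduced safely with a throwaway file. This is only a minimal sketch: the scratch path under /tmp and the variable names are made up for the illustration and have nothing to do with the actual process on the poster's server.

```shell
#!/bin/sh
# Demonstration: a file that is unlinked while still open keeps its disk blocks.
# The path below is a hypothetical scratch file, not anything from this thread.
demo=/tmp/unlink_demo.$$

exec 3>"$demo"          # open the file for writing on descriptor 3
rm "$demo"              # unlink it; the name disappears immediately

# The name is gone, so ls and du can no longer see the file...
if [ -e "$demo" ]; then visible=yes; else visible=no; fi
echo "visible in directory: $visible"

# ...but the descriptor is still valid, and writes still consume space.
echo "still writing after unlink" >&3
write_status=$?
echo "write exit status: $write_status"

exec 3>&-               # closing the last descriptor releases the blocks
```

With lsof installed, 'lsof +aL1 /var' should list open files on /var whose link count has dropped below 1, which is exactly this kind of file; check the lsof man page for the build you download before relying on it.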

Regards!

...JRF...
Johnson Punniyalingam
Honored Contributor

Re: /var file system is full

Hi Stephan Louw,

Step 1: Find large files under /var:

find /var -xdev -type f -size +5000000c -exec ll {} \; | sort -nk 5

Step 2: Are there any undelivered or failed emails queued by your sendmail service?

If so, stop the sendmail service, clear the undelivered emails, then monitor the /var file system to see whether it still grows.

Step 3: Clean up /var/adm:

# cleanup -c 2

It prompts you: "Would you still like to commit these patches?" Answer y.

That is how you clean up /var/adm.

In /var/adm/syslog, try deleting some very old syslog log files.

Hope this gives you some ideas.

Rgds,
Johnson
Problems are common to all, but attitude makes the difference
Trng
Super Advisor

Re: /var file system is full

Hi Stephan,

Check these two directories:

147768 /var/opt
867952 /var/adm

Check for any crash files inside them, and check for OVO logs inside /var/opt and clear them.


rgds
skr
administrator
Bill Hassell
Honored Contributor

Re: /var file system is full

> du -kx /var/* | sort -n

You need to change your command to:

du -kx /var | sort -rn | head -20

The /var/* will only summarize each top level directory. You need to have du walk down through all the sub-directories. sort -rn will sort largest at the top of the list. head -20 shows just the top 20.

Your first task is to go after /var/adm -- it is the logfile location. Start with:

du -kx /var/adm | sort -rn | head -20

If /var/adm is much larger than the directories below it, there are very large logfiles in /var/adm itself. Find them with:

ll /var/adm | sort -rnk5 | head -20

A very large logfile indicates two possibilities:

1. The file has never been trimmed and represents years of logging;

2. or a program has run wild and created a massive logfile.

Look at the largest logfiles to see if something bad is happening. Then trim the logs you have looked through. If you are required to keep logs, always do this:

gzip -c logfile > archive/logfile.20081026.gz
cat /dev/null > logfile

This will make a compressed copy of the log along with today's date and zero the logfile.
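
The archive-then-zero routine can be wrapped in a small function so the date stamp is generated instead of typed. This is a rough sketch only: the function name archive_and_truncate and its two arguments are invented for the example. Note that gzip needs -c to write the compressed data to standard output; without it, gzip replaces the logfile with logfile.gz and the redirect captures nothing.

```shell
#!/bin/sh
# archive_and_truncate LOGFILE ARCHIVE_DIR   (hypothetical helper, not an HP-UX tool)
# Writes a dated, compressed copy of LOGFILE into ARCHIVE_DIR, then zeroes LOGFILE.
archive_and_truncate() {
    log=$1
    dir=$2
    stamp=$(date +%Y%m%d)                 # e.g. 20081026

    mkdir -p "$dir" || return 1

    # -c sends the compressed data to stdout and leaves the original in place.
    gzip -c "$log" > "$dir/$(basename "$log").$stamp.gz" || return 1

    # Truncate in place rather than rm, so a process that still has the log
    # open keeps writing to the same (now empty) file.
    cat /dev/null > "$log"
}
```

For example, 'archive_and_truncate /var/adm/syslog/syslog.log /var/adm/syslog/archive' would be one way to call it; anything written to the log between the gzip and the truncate is lost, so run it at a quiet moment.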

> samx


Bill Hassell, sysadmin
Dennis Handly
Acclaimed Contributor

Re: /var file system is full

>reports as full when running the 'bdf' command, but the 'du' command reports otherwise.

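Obviously du(1) is lying to you, which is why you usually can't trust it for this: du only counts files it can reach by name, so space held by open but unlinked files shows up in bdf and not in du.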
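
To put a number on the disagreement, the figures from the bdf and du output posted at the top of the thread can simply be subtracted. A small sketch, with variable names chosen only for this illustration; the result is approximate, since file system overhead also accounts for a little of the difference.

```shell
#!/bin/sh
# Figures copied from the bdf and du output earlier in the thread (KB).
bdf_used_kb=8912896     # "used" column for /dev/vg00/lvol8 (/var)
du_kb=1063128           # du -sk total for /var

# Space that is allocated but that du cannot find by walking the directories.
hidden_kb=$((bdf_used_kb - du_kb))
echo "unaccounted: ${hidden_kb} KB"

# Same figure in GB (1 GB = 1048576 KB), to one decimal place.
hidden_gb=$(awk -v kb="$hidden_kb" 'BEGIN { printf "%.1f", kb / 1048576 }')
echo "unaccounted: ${hidden_gb} GB"
```

A gap of that size on an 8.5GB file system is far more than bookkeeping overhead, which is what points at unlinked files still held open by a process.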

>I am concerned that there might also be a "memory leak" somewhere, as 'glance' has been reporting memory utilization @ 99% for two weeks now.

As JRF said, memory leaks don't cause excessive disk space growth, only swap growth.

>I found some samx processes running which i cannot kill using straight 'kill PID'.

If you aren't using sam and haven't had it up for 25 days, you should kill -9 those two processes. You might first also want to find out what PID 20857 is. It seems to be doing "monitorConfiguration", something useful and long running?

>can I unmount /var and remount it while the Server is in Production?

Of course not. If you could unmount it, you wouldn't have the problem, i.e. you would already have killed the runaway process.

>How will I determine which process is filling up all my RAM (My Database Software is configured to use only 17GB at max - Total is 20GB)

Filling up RAM is a good thing and most likely unrelated to your /var/ issue. What are your disk cache kernel parms? Also provide the "swapinfo -tam" output.

So as JRF said, you need to download lsof and use it to find the process.
Stephan Louw
Occasional Contributor

Re: /var file system is full

Thx James

your part on how processes use files was very informative.

I managed to resolve the /var issue by taking the following actions:


- ran 'fuser -c /var' to determine what processes have files open in /var

- I then used 'ps -ef' to check what the processes are (samx was the culprit!)

- then I used 'kill -9 PID' to kill the two running samx processes

- bdf then reported /var to be only 13% utilized
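
The fuser and ps steps above can be combined into one loop. This is a sketch under assumptions, not the exact commands that were run: the function names pids_only and list_holders are invented here, and the filter assumes the usual fuser behaviour of printing PIDs on standard output and the file name and usage letters on standard error.

```shell
#!/bin/sh
# pids_only: keep just the numeric PIDs from fuser-style output on stdin.
# Tokens such as "/var:" or a trailing usage letter ("20860o") only appear
# when stdout and stderr are merged; this filter tolerates both cases.
pids_only() {
    tr -s ' \t' '\n\n' | sed -n 's/^\([0-9][0-9]*\)[a-z]*$/\1/p'
}

# list_holders FILESYSTEM: print a full ps line for each process using it.
list_holders() {
    for pid in $(fuser -c "$1" 2>/dev/null | pids_only); do
        ps -fp "$pid"
    done
}
```

Running 'list_holders /var' and reading the command column should be enough to spot an unexpected long-running process before deciding whether to kill it.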

Points assigned
Stephan Louw
Occasional Contributor

Re: /var file system is full

thx all