Operating System - HP-UX

du and bdf doesn't match, 70 gb space missing

 
Robert Tang
Occasional Contributor

Hi there,

Has anyone had this problem before?

# bdf /u26
Filesystem          kbytes     used    avail %used Mounted on
/dev/vg18/lvol1  106020864 93894000 12090048   89% /u26

There are 2 directories under /u26:
/u26/oracle has 13 GB and /u26/patches has 1.6 GB, but bdf is showing 93 GB used.
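
To put a number on the gap, the bdf and du figures from this post can be compared directly. This is only a rough sketch: it assumes the du totals are in units of 1048576 KB, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the post; variable names are illustrative only.
bdf_used_kb=93894000      # "used" column from bdf /u26
du_total_gb=14.6          # 13 GB (/u26/oracle) + 1.6 GB (/u26/patches)

# Convert bdf's KB to GB (assumption: 1 GB = 1048576 KB) and subtract du's total.
missing_gb=$(awk -v used="$bdf_used_kb" -v du="$du_total_gb" \
    'BEGIN { printf "%.1f", used / 1048576 - du }')

echo "Space used according to bdf but not visible to du: ${missing_gb} GB"
```

Under that assumption the difference comes out in the same range as the roughly 70 GB mentioned in the subject line.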

The Response Center said it is caused by deleting open files while the Oracle process is running, but this is the Oracle home and only listeners are running from it, so there should be no open files that could have been deleted.

Any ideas anyone?

Thanks a lot in advance! Points guaranteed!




The end is near
17 REPLIES
Michael Tully
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

This will occur when a running process has a file or files open that have been physically removed by another process. The space will not be returned until the process that had the file(s) open in the first place is terminated.
Anyone for a Mutiny ?
Michael Tully
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

One further thought, you could use the 'lsof' command to assist you. You can get it from here, and it is easy to download and install without an outage.

# lsof -p

http://hpux.connect.org.uk/hppd/hpux/Sysadmin/lsof-4.71/
Anyone for a Mutiny ?
Thierry Poels_1
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

Hi,

Differences between du and bdf are generally caused by deleting files that were still in use. The free space will only be reclaimed when the process exits (normally or after a kill).
To find out who/what:
- fuser can show you which processes are active in that filesystem (might be many)
- lsof

good luck,
Thierry.
All unix flavours are exactly the same . . . . . . . . . . for end users anyway.
KapilRaj
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

I would use

lsof |grep '' and check which processes have opened files from there and then try restarting them ...

It happened because someone deleted a file that a process still had open.

Kaps
Nothing is impossible
A. Clay Stephenson
Acclaimed Contributor

Re: du and bdf doesn't match, 70 gb space missing

Do a man 2 unlink. This is the underlying system call that is invoked when a rm is done. It will explain your situation perfectly.
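
The behaviour described in unlink(2) can be reproduced in a few lines of shell. This is a minimal sketch, not HP-UX specific; the scratch directory and file name are placeholders invented for the demonstration.

```shell
#!/bin/sh
# Scratch directory; the path is a placeholder for this demo.
dir="${TMPDIR:-/tmp}/unlink_demo.$$"
mkdir "$dir" || exit 1
f="$dir/trace.log"

exec 3>"$f"                 # open the file on descriptor 3 and keep it open
echo "written before rm" >&3

rm "$f"                     # unlink(2): removes the name, not the open file

# The name is gone, so ls and du no longer see the file ...
if [ -e "$f" ]; then name_state="present"; else name_state="gone"; fi

# ... but the descriptor is still valid and the file can keep growing.
if echo "written after rm" >&3; then write_state="ok"; else write_state="failed"; fi

exec 3>&-                   # last close: only now are the blocks released
rmdir "$dir"

echo "name: $name_state, write after rm: $write_state"
```

While descriptor 3 is open, a tool that walks directories (du) cannot count the file, whereas a tool that asks the file system for its block totals (bdf) still does.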
If it ain't broke, I can fix that.
Robert Tang
Occasional Contributor

Re: du and bdf doesn't match, 70 gb space missing

Thanks to all the gurus who replied.

One quick question: there are a lot of Oracle trace logs that are deleted periodically. Are those files "open files"?

I was told the Oracle process opens a trace file, writes to it, and closes it right away, so deleting these files will not cause a problem. Is that true?

Thanks
The end is near
Bill Hassell
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

You can always delete a file that is still open. The file handle inside the program still knows where the file is located and can read/write to it for hours (months too). But the file really doesn't disappear and become usable free space until all programs have closed the file. This is standard Unix behavior.

Now you can recover space from a large file by replacing the contents with nothing (/dev/null). This is regardless of whether the file is open or not. Now what such a command will do to the program is unknown -- that must be defined by the program's author. There will indeed be open files if there are executables that are running from those directories, and it doesn't matter if the files are open for read-only.
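
A small sketch of the "replace the contents with nothing" approach follows. It assumes the holder opened the file in append mode; the directory and file names are placeholders, and descriptor 3 merely stands in for a program that keeps its log open.

```shell
#!/bin/sh
dir="${TMPDIR:-/tmp}/truncate_demo.$$"
mkdir "$dir" || exit 1
f="$dir/big.log"

exec 3>>"$f"                # stand-in for a program holding its log open (append mode)
echo "some log data that takes up space" >&3

cat /dev/null > "$f"        # replace the contents with nothing; the name stays

if [ -s "$f" ]; then size_state="non-empty"; else size_state="empty"; fi

echo "new entry" >&3        # the holder can still write after the truncation
after_lines=$(wc -l < "$f" | tr -d ' ')

exec 3>&-
rm -r "$dir"
echo "after truncation: $size_state; lines written afterwards: $after_lines"
```

If the holder did not open the file in append mode, its next write lands at its old offset and the file becomes sparse, so how a given program reacts still has to be checked case by case.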

So to recover the space, you need to download lsof (the fuser command is useless in this situation) from http://hpux.connect.org.uk/. This command will discover exactly which processes have open files on a given filesystem (or file).


Bill Hassell, sysadmin
Robert Tang
Occasional Contributor

Re: du and bdf doesn't match, 70 gb space missing

I tried to locate the files referred to by a process identified by lsof, but none of those files (on /u26) have been deleted. So where are the deleted files?

Please see the output:


# lsof -p 13600
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
oracle 13600 oracle cwd DIR 64,0x180001 8192 110114 /u26/oracle/OraHome3/dbs
oracle 13600 oracle txt REG 64,0x180001 75751280 110177 /u26/oracle/OraHome3/bin/oracle
oracle 13600 oracle mem REG 64,0x8 532 2191 /var/spool/pwgr/status
oracle 13600 oracle mem REG 64,0x7 12794 1578 /usr/lib/tztab
oracle 13600 oracle mem REG 64,0x7 141016 8635 /usr/lib/pa20_64/libxti.2
oracle 13600 oracle mem REG 64,0x7 703232 7793 /usr/lib/pa20_64/libnsl.1
oracle 13600 oracle mem REG 64,0x7 1860512 4349 /usr/lib/pa20_64/libc.2
oracle 13600 oracle mem REG 64,0x7 227000 4373 /usr/lib/pa20_64/libm.2
oracle 13600 oracle mem REG 64,0x7 24032 4356 /usr/lib/pa20_64/libdl.1
oracle 13600 oracle mem REG 64,0x7 20472 7796 /usr/lib/pa20_64/libnss_dns.1
oracle 13600 oracle mem REG 64,0x7 168272 4379 /usr/lib/pa20_64/libpthread.1
oracle 13600 oracle mem REG 64,0x7 49056 4383 /usr/lib/pa20_64/librt.2
oracle 13600 oracle mem REG 64,0x7 1044464 4351 /usr/lib/pa20_64/libcl.2
oracle 13600 oracle mem REG 64,0x180001 7761240 113596 /u26/oracle/OraHome3/lib/libjox9.sl
oracle 13600 oracle mem REG 64,0x180001 10240 110109 /u26/oracle/OraHome3/lib/libskgxn9.sl
oracle 13600 oracle mem REG 64,0x180001 6144 99038 /u26/oracle/OraHome3/lib/libodmd9.sl
oracle 13600 oracle mem REG 64,0x7 294056 4340 /usr/lib/pa20_64/dld.sl
oracle 13600 oracle 0u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 1u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 2u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 3u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 4u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 5u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 6u REG 64,0x180001 38430 114240 /u26 (/dev/vg18/lvol1)
oracle 13600 oracle 7u REG 64,0x180001 38430 114240 /u26 (/dev/vg18/lvol1)
oracle 13600 oracle 8u CHR 3,0x2 0t0 66 /dev/null
oracle 13600 oracle 9u REG 64,0x180001 1038 73813 /u26 (/dev/vg18/lvol1)
oracle 13600 oracle 10u unix 64,0x8 0t0 3576 /var/spool/sockets/pwgr/client13600 (0x54d60340)
oracle 13600 oracle 11u REG 64,0x180001 24 114254 /u26/oracle/OraHome3/dbs/lkQA01
oracle 13600 oracle 12u REG 64,0x180001 667648 95386 /u26/oracle/OraHome3/rdbms/mesg/oraus.msb
oracle 13600 oracle 13u REG 64,0xe0002 1073750016 9 /ic01_13/qa01/undotbs01.dbf
oracle 13600 oracle 14u REG 64,0xe0002 536879104 8 /ic01_13/qa01/system01.dbf
oracle 13600 oracle 15u REG 64,0x500006 8589942784 6 /sb09_02/qa01/stard01.dbf
oracle 13600 oracle 16u REG 64,0x500005 4294975488 7 /sb09_05/qa01/starx01.dbf
oracle 13600 oracle 17u REG 64,0x500005 8589942784 6 /sb09_05/qa01/undotbs02.dbf
oracle 13600 oracle 18u REG 64,0x500005 8589942784 5 /sb09_05/qa01/starx02.dbf
oracle 13600 oracle 19u REG 64,0x500003 8589942784 5 /sb09_03/qa01/stard02.dbf
oracle 13600 oracle 20u REG 64,0x500004 8589942784 5 /sb09_04/qa01/stard03.dbf
oracle 13600 oracle 21u REG 64,0x500003 1073750016 8 /sb09_03 (/dev/vg50/lvol3)
oracle 13600 oracle 22u REG 64,0x500004 1073750016 6 /sb09_04 (/dev/vg50/lvol4)
oracle 13600 oracle 23u REG 64,0x500003 4294975488 7 /sb09_03/qa01/temp01.dbf
The end is near
Jdamian
Respected Contributor

Re: du and bdf doesn't match, 70 gb space missing

Try the following command line:

fuser -c /u26 2>/dev/null | sed 's/[ ][ ]*/ -p /g' | xargs ps -x
Bill Hassell
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

For this particular process, look for files that are open in the program but do not appear in the directory. These files cannot be seen with typical Unix commands because their names have been removed from the directory (but not the space that they occupy). The easiest way to recover this disk space is to stop the process(es) that have these files open. The second way is to unlink the file--but as mentioned before, some process(es) still have these files open so it may crash the program. A reboot will also recover all the space from deleted files.
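
One way to pick such files out of lsof output is sketched here, using a few lines copied from the lsof -p 13600 listing posted earlier in the thread. It rests on an assumption: that a regular file whose NAME column shows only the mount point followed by the device in parentheses is one for which lsof could not find a path, which is what a removed file looks like.

```shell
#!/bin/sh
# A few lines copied from the lsof -p 13600 listing earlier in the thread.
lsof_sample='oracle 13600 oracle 6u REG 64,0x180001 38430 114240 /u26 (/dev/vg18/lvol1)
oracle 13600 oracle 7u REG 64,0x180001 38430 114240 /u26 (/dev/vg18/lvol1)
oracle 13600 oracle 9u REG 64,0x180001 1038 73813 /u26 (/dev/vg18/lvol1)
oracle 13600 oracle 10u unix 64,0x8 0t0 3576 /var/spool/sockets/pwgr/client13600 (0x54d60340)
oracle 13600 oracle 11u REG 64,0x180001 24 114254 /u26/oracle/OraHome3/dbs/lkQA01
oracle 13600 oracle 21u REG 64,0x500003 1073750016 8 /sb09_03 (/dev/vg50/lvol3)
oracle 13600 oracle 22u REG 64,0x500004 1073750016 6 /sb09_04 (/dev/vg50/lvol4)'

# Keep regular files whose NAME column is "<mount point> (<device>)".
nameless=$(printf '%s\n' "$lsof_sample" |
    awk '$5 == "REG" && NF == 10 && $10 ~ /^\(\/dev\// {
             printf "pid=%s fd=%s bytes=%s inode=%s fs=%s\n", $2, $4, $7, $8, $9
         }')

printf '%s\n' "$nameless"
nameless_count=$(printf '%s\n' "$nameless" | grep -c 'fd=')
echo "entries without a path name: $nameless_count"
```

On a live system the same awk filter could be fed from lsof /u26 instead of the sample text. In the listing above, the nameless entries on /u26 are only a few kilobytes each, so this particular process is probably not the one holding the missing space and the other processes on the filesystem would need the same check.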


Bill Hassell, sysadmin
Ted Buis
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

I would have thought that there might be a different cause. The default in HP-UX used to be that 10% of a file system was always reserved and available only to root. This is how I once managed to get bdf to report 111% utilization (1/0.9 as a percentage). So bdf didn't report what was really available, but what an ordinary user could access. I don't know enough about how du handles this. Does bdf still work the same way? What about du?
Mom 6
Smaran
Occasional Advisor

Re: du and bdf doesn't match, 70 gb space missing

My program contains the following code:
fflush(stdout);
fflush(stderr);
rename(debug_filename,debug_oldfilename);
fd = open(debug_filename, O_RDWR|O_CREAT|O_TRUNC, PERM);
close(2);
dup(fd);
close(1);
dup(fd);
Can this code cause this type of problem?
If so, what is the solution?

Thanks in advance
Fred Ruffet
Honored Contributor

Re: du and bdf doesn't match, 70 gb space missing

For this thread, I would have thought of a problem similar to the one in this thread:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=629399

As for your question, Smaran, it shouldn't cause a problem. You close stdout and stderr, and this won't waste any space.

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Smaran
Occasional Advisor

Re: du and bdf doesn't match, 70 gb space missing

Thanks Fred,
but my problem is that my program renames the current file AAA to, say, XXX, and then another process moves the file XXX to a different file system.
I am not getting the space back when I should; I only get it back if I terminate the process.
How can I ensure that no process is using the file XXX before another process moves it to a different file system?
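
The effect of a rename on an open descriptor can be sketched in shell. The names AAA and XXX follow the post; the scratch directory is a placeholder, and descriptor 3 stands in for whatever descriptors the program holds on its debug file.

```shell
#!/bin/sh
dir="${TMPDIR:-/tmp}/rotate_demo.$$"
mkdir "$dir" || exit 1
log="$dir/AAA"; old="$dir/XXX"

exec 3>"$log"               # the program's open debug file
echo "line 1" >&3

mv "$log" "$old"            # same file system, so this is a plain rename(2)
echo "line 2" >&3           # descriptor 3 followed the file: this lands in XXX

exec 3>"$log"               # reopen on the same descriptor; the old one is closed
echo "line 3" >&3

old_lines=$(wc -l < "$old" | tr -d ' ')
new_lines=$(wc -l < "$log" | tr -d ' ')

exec 3>&-
rm -r "$dir"
echo "XXX has $old_lines lines, AAA has $new_lines line"
```

A rename does not detach any descriptor, so every descriptor that still refers to XXX, including duplicates made with dup and copies inherited by child processes, keeps its blocks allocated after another process copies it to a different file system and unlinks the original.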

Prashant Zanwar_4
Respected Contributor

Re: du and bdf doesn't match, 70 gb space missing

Use fuser -u /xxx/XXX before transferring. If any processes are using it, just kill them and then transfer.
Thanks and regards
Prashant
"Intellect distinguishes between the possible and the impossible; reason distinguishes between the sensible and the senseless. Even the possible can be senseless."
Smaran
Occasional Advisor

Re: du and bdf doesn't match, 70 gb space missing

As a constraint, I cannot shut down my application or kill the process.
But I believe I am closing the file descriptor with the close system call, so why is this problem still occurring?
How can I ensure that the file is not held open by any process?
Here is the code, in case it helps:
fflush(stdout);
fflush(stderr);
close(2);
close(1);
close(fd);
rename(debug_filename,debug_oldfilename);
fd = open(debug_filename, O_RDWR|O_CREAT|O_TRUNC, PERM);
close(2);
dup(fd);
close(1);
dup(fd);
Prashant Zanwar_4
Respected Contributor

Re: du and bdf doesn't match, 70 gb space missing

Use lsof in your program to check that the file is not in use.
Rgdz
Prashant
"Intellect distinguishes between the possible and the impossible; reason distinguishes between the sensible and the senseless. Even the possible can be senseless."