
Re: vx_nospace

 
SOLVED
Ron Irving
Trusted Contributor

vx_nospace

Good morning gents!!

Here's what I got: Attempted to do a patch run on an 11.31 system yesterday. The swinstall failed the analysis, due to the / filesystem not having enough space. That is neither here nor there. I have to work with the filesystem to get more space allocated to /. Now, bdf shows / at 100% full. I am not in a position to extend / at this time. Where did the space go? Right now there's 1024MB allocated to /. Any ideas where the space went?

Regards,

Ron
Should have been an astronaut.
25 REPLIES 25
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

Have you tried using "du -kxs /*" to find the big directories?
Does that add up to 1 Gb?
Ron Irving
Trusted Contributor

Re: vx_nospace

Hi Dennis, and thanks!!

Here's what I got on that:

root@fobapp / =#du -ksx /
1027184 /

That's it. Any ideas?
Should have been an astronaut.
Manix
Honored Contributor

Re: vx_nospace

check the large files in /

# find / -size +10000c -xdev -exec ll {} \; | sort -rn -k 5

or try replacing "ll" with "ls -l"
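
For anyone reusing the find above, here is a minimal sketch of the same idea as a small function. The name big_files and its two arguments are made up for illustration, and it assumes a find that accepts -xdev and the "c" size suffix, as the command above does.

```shell
# List regular files larger than a byte threshold on ONE filesystem,
# biggest first. big_files is a hypothetical helper name.
big_files() {
    dir=$1
    min_bytes=$2
    # -xdev keeps find from descending into other mounted filesystems;
    # field 5 of "ls -l" output is the file size in bytes.
    find "$dir" -xdev -type f -size +"${min_bytes}c" -exec ls -l {} \; |
        sort -rn -k 5
}
```

For the case in this thread that would be called as: big_files / 10000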

HP-UX been always lovable - Mani Kalra

Re: vx_nospace

Ron,

>> root@fobapp / =#du -ksx /
>> 1027184 /
>>
>> That's it. Any ideas?

Yes, you missed off the asterisk (*) from the end of Dennis' command... try

du -ksx /*

HTH

Duncan

I am an HPE Employee
Ron Irving
Trusted Contributor

Re: vx_nospace

Guys!!

Thanks for your prompt responses. Unfortunately, I had to come back to my room. No vpn here, so I'll have to go at it again later when I'm recovered from whatever plague I have contracted.

Thank you again...I will update this tomorrow, (Sunday)

Ron
Should have been an astronaut.
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

>Here's what I got on that: 1027184 /

How much space does "bdf /" show?
Ron Irving
Trusted Contributor

Re: vx_nospace

Hey all....I dragged my sick ass back in.

#bdf shows

/dev/vg00/lvol3 1024 1024 0 100% /
/dev/vg00/lvol1 1792 181 1598 10% /stand
/dev/vg00/lvol8 15360 4499 10778 29% /var
/dev/vg00/lvol7 5792 2873 2896 50% /usr
(first few lines.)

The output of the find command shows the first entry,
-rw-r--r-- 1 root root 5037965144 Feb 5 17:28 nwmgr_apa.log

Now, that is 'supposed' to be in the /tmp directory, but the find command shows it being in the root directory (?). Should I try moving it temporarily to see?

Should have been an astronaut.
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

>#bdf shows
>/dev/vg00/lvol3 1024 1024 0 100% /

What bdf gives you megabytes?

>The output of the find command shows the first entry,
>-rw-r--r-- 5,037,965,144 Feb 5 17:28 nwmgr_apa.log

That shows 5 Gb, is it a sparse file?
What does "ll -e nwmgr_apa.log" show?

>but the find command shows it being in the root directory(?) Should I try moving it temporarily to see?

If you copy it to another filesystem and it is still open, you'll lose the handle on the file. You'll need to stop that process before you move that file.
You could use: /usr/sbin/fuser -u nwmgr_apa.log
to find the process that has it open.
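
If "ll -e" is not available, a rough sparse-file check is to compare the length that ls reports with the space that du says is allocated. This is only a sketch: the function names are made up, and filesystems that compress data or allocate lazily can blur the comparison.

```shell
# Apparent size in KB (rounded up), taken from field 5 of "ls -ln".
apparent_kb() {
    bytes=$(ls -ln "$1" | awk '{print $5}')
    echo $(( (bytes + 1023) / 1024 ))
}

# KB actually allocated on disk, as reported by du.
allocated_kb() {
    du -k "$1" | awk '{print $1}'
}

# Succeeds (exit 0) when fewer KB are allocated than the length implies,
# which is the usual sign of a sparse file.
is_sparse() {
    [ "$(allocated_kb "$1")" -lt "$(apparent_kb "$1")" ]
}
```

Example: is_sparse nwmgr_apa.log && echo "probably sparse"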
Bill Hassell
Honored Contributor

Re: vx_nospace

To fix the / directory, you need to show the largest directories in order:

du -kx / | sort -rn | head -20

This will show you the big directories, which is much more useful. The two largest directories should be /sbin and /etc, and together they should add up to about 90% of the total for /. An average system might have about 100 MB for /sbin and 100-300 MB for /etc. That means that on your system, more than 500 MB is in the wrong location. To check just the / directory for a junk file:

ll / | sort -rnk5 | head

The output of the find command shows a 5 GB file which couldn't fit into /, so the file is sparse -- but regardless, the file does NOT belong in /. Indeed, no log files ever go into /. This is a common symptom when the root user's HOME directory is / (a very bad place for it to be).

Fixes:
1. Change the logging for APA to /var/adm
2. Move root's HOME to /root
3. Clean out all files from /. The / directory should contain nothing but directories.

You may need to add another 1 GB to /usr.
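
The two checks above (largest directories, then stray files sitting directly in /) can be wrapped up roughly like this. The function names and the default of 20 lines are illustrative choices, not part of any standard tool.

```shell
# Show the N largest directories under a starting point without crossing
# into other mounted filesystems. largest_dirs is a hypothetical name.
largest_dirs() {
    dir=$1
    count=${2:-20}
    # du -k reports sizes in KB; -x stays on the filesystem holding "$dir".
    du -kx "$dir" | sort -rn | head -n "$count"
}

# Show the largest plain files sitting directly in a directory, which is
# where stray logs and dumps in / tend to turn up.
largest_files_here() {
    dir=$1
    # Keep only regular files (lines starting with "-"), sort on the size field.
    ls -l "$dir" | grep '^-' | sort -rn -k 5 | head
}
```

Example: largest_dirs / 20 and then largest_files_here /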


Bill Hassell, sysadmin
Ron Irving
Trusted Contributor

Re: vx_nospace

Hi!!

Thanks for your responses!! I'll be going through them today and try to clean things up.

Here's what caused the issue. I ran a swinstall from SMH, which failed, due to insufficient space in the / directory. Before I launched the install, / was sitting pretty at around 50 - 60%. When the analysis finished, we're at 100% and complaining. Is this a clue? I'm no Sherlock Holmes, but maybe you detectives out there can read between the lines. Is there a particular logfile or something?

Another question I have is why would a swinstall be dumping files into / anyway?

Crazy but curious,

Ron
Should have been an astronaut.

Re: vx_nospace

Ron,

I have *no idea* why nwmgr_apa.log is in / - but I do know there is a fix in the B.11.31.40 release of APA (Auto-Port Aggregation) to stop this debug file growing at such a rate.

Maybe try that as a starter...

HTH

Duncan

I am an HPE Employee

Re: vx_nospace

oh, and I doubt nwmgr_apa.log is going to contain much of interest to you, unless you are trying to debug issues in the network stack, so I would also probably just run a "fuser" on the file, and if no-one has it open, delete it.

HTH

Duncan

I am an HPE Employee
Ron Irving
Trusted Contributor

Re: vx_nospace

nwmgr_apa.log is located in the /tmp directory, so it has no bearing on the issue at hand. The issue is, why, when running an swinstall through system management homepage, did the / directory fill up, and with what? During the analysis, it complained that there was not enough space in /. Before I started the install, there was at least 50% available in /, and after the analysis, it had shot up to 100%.

The main question is, why would a swinstall need / anyway? If I knew the processes involved, I could delete the files and regain the space.

The / mount point for lvol3 had 1GB allocated to it. That should be sufficient, shouldn't it?

Thank you!

Ron
Should have been an astronaut.
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

>nwmgr_apa.log is located in the /tmp directory, so it has no bearing on the issue at hand.

You said you found it with the above find(1), that means it is in the / filesystem. And your above incomplete bdf shows you probably don't have a /tmp filesystem.

>there was at least 50% available in /, and after the analysis, it had shot up to 100%.

Have you used tail(1) on nwmgr_apa.log to see if there are lots of recent messages?
Also, just copying that sparse file will fill up the disk.

>why would a swinstall would need / anyway?

Because that's where /tmp is?

>The / mount point for lvol3 had 1GB allocated to it. That should be sufficient, shouldn't it?

Perhaps not for /tmp/.
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

Oh, I may have confused you with "du -kxs /*". This will print out other filesystems and take forever.
Use Bill's version.
Ron Irving
Trusted Contributor

Re: vx_nospace

Sorry Dennis. Here's the complete bdf:

root@fobapp / =#bdf
File-System                 Mbytes    Used   Avail %Used Mounted on
/dev/vg00/lvol3               1024    1024       0  100% /
/dev/vg00/lvol1               1792     181    1598   10% /stand
/dev/vg00/lvol8              15360    4500   10777   29% /var
/dev/vg00/lvol7               5792    2873    2896   50% /usr
/dev/vg03/lvu04             139264  122996   15407   89% /u04/oracle
/dev/vg01/lvu02             139264  126045   12413   91% /u02/oracle
/dev/vg00/lvoracle           76800   60660   15311   80% /u01/oracle
/dev/vg00/lvol4               8192    5250    2920   64% /tmp
/dev/vg00/lvol6               8800    5032    3739   57% /opt
/dev/vg00/lvol5               8192      13    8115    0% /home
nfs:/home/appbackup         247636   57208  177849   24% /appbackup
nfs:/backup                  50397    2057   47828    4% /backup
fobdb:/u02/oracle/StageR12  122880   93515   27530   77% /u03/oracle/StageR12

/tmp is its own mountpoint of lvol4. I attempted to mv the mwmgr_apa.log to another directory, and it only affected /tmp, and not /. As I said before, the / directory ONLY filled up after I ran the swinstall.

Here's the tail of the mwmgr_apa.log:
root@fobapp /tmp =#tail mwmgr_apa.log
321:success perform ioctl HACR_GET ppa=0
321:success perform ioctl HACR_GET ppa=0
321:success perform ioctl HACR_GET ppa=0
321:success perform ioctl HACR_GET ppa=1
321:success perform ioctl HACR_GET ppa=1
321:success perform ioctl HACR_GET ppa=1
321:success perform ioctl HACR_GET ppa=3
321:success perform ioctl HACR_GET ppa=3
321:success perform ioctl HACR_GET ppa=3
Exit apa_netmgr_main, ret=0
Should have been an astronaut.
Ron Irving
Trusted Contributor

Re: vx_nospace

oops...and here's the tail of nwmgr_apa.log:
root@fobapp /tmp =#tail nwmgr_apa.log
321:success perform ioctl HACR_GET ppa=0
321:success perform ioctl HACR_GET ppa=0
321:success perform ioctl HACR_GET ppa=0
321:success perform ioctl HACR_GET ppa=1
321:success perform ioctl HACR_GET ppa=1
321:success perform ioctl HACR_GET ppa=1
321:success perform ioctl HACR_GET ppa=3
321:success perform ioctl HACR_GET ppa=3
321:success perform ioctl HACR_GET ppa=3
Exit apa_netmgr_main, ret=0

Please let me know if you need anything else.

Ron
Should have been an astronaut.
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

>/tmp is it's own mountpoint of lvol4.
>the / directory ONLY filled up after I ran the swinstall.

Ok, start all over:
1) What was the result of Manix's find?
2) What was the result of Bill's du?
3) Find open & deleted files: lsof +aL1 /
Ron Irving
Trusted Contributor

Re: vx_nospace

Hi and thanks again!!

I've attached a couple of files. One is Manix's find, find.out, and Bill's du is du.out. I have to install lsof on this to run it.

Thank you!!
Should have been an astronaut.
Ron Irving
Trusted Contributor

Re: vx_nospace

Oops...and here's the du output.

Thanks!!
Should have been an astronaut.
Dennis Handly
Acclaimed Contributor
Solution

Re: vx_nospace

>One is Manix's find and Bill's du

They both point right to it:
---------- 1 root sys 837,074,944 Feb 4 07:17 /u03/Stage
817496 /u03

This doesn't belong under /.
Since there are no permissions, it could have been in the process of being copied by ftp when it was aborted?

>I have to install lsof on this to run it.

No need now.
Bill Hassell
Honored Contributor

Re: vx_nospace

> The output of the find command shows the first entry,
> -rw-r--r-- 1 root root 5037965144 Feb 5 17:28 nwmgr_apa.log

You must not have copy-pasted the find command. "find / -xdev" will not search mounted filesystems such as /tmp. It's always useful to show the complete bdf output, at least for vg00.

So zero the nwmgr file and put in the patch for APA. There is no reason to save it, especially at 5GB.

So the /u03 is the problem and SMH wasn't the cause. You (or another root user) created the /u03 directory but did not mount an lvol to the directory. So when files were created/copied into this directory, it was actually / that filled up. This might have been an ftp or scp copy or maybe restored from tape. In fact, it may have been full before you started SMH. That's why you always need to monitor filesystems and be very careful about restoring files and directories.
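
One way to catch this before copying or restoring anything is to check that the target directory really is a mount point. The sketch below assumes a df that supports the POSIX -P output format and paths without spaces; is_mountpoint is a made-up name.

```shell
# Succeeds (exit 0) when "$1" is itself a mount point, judged by the
# "Mounted on" column of POSIX-format df output.
is_mountpoint() {
    # Resolve symlinks so the comparison uses the real path.
    dir=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    mounted_on=$(df -P "$dir" | awk 'NR == 2 {print $6}')
    [ "$mounted_on" = "$dir" ]
}
```

Example: is_mountpoint /u03 || echo "/u03 is just a directory in its parent filesystem"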

To fix this, hopefully you haven't been running Oracle in this state. If you are, shut it down. Then create the appropriate lvol (2GB? whatever). Mount it to a temporary directory, perhaps /mnt1 (create it). Then move all the files and directories from /u03 to /mnt1. This is a good (and fast) way:

mkdir /mnt1
lvcreate -L 2000 -n lvu03 vgsomething
newfs /dev/vgsomething/rlvu03
mount /dev/vgsomething/lvu03 /mnt1
cd /u03
find . | cpio -pudlmv /mnt1
find /u03 -type f | wc -l
find /mnt1 -type f | wc -l

find /u03 -type d | wc -l
find /mnt1 -type d | wc -l

You run the wc counts to verify that the source and destination counts are the same. Note that /mnt1 will have one additional directory: /mnt1/lost+found because /u03 in the / directory was not a mountpoint.
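
The count comparison can be scripted so that lost+found at the top of the new filesystem is left out of the tally. This is a sketch with made-up function names; it only compares counts, not file contents.

```shell
# Count entries of one type (f = files, d = directories) under a tree,
# skipping a top-level lost+found directory if one exists.
count_type() {
    find "$1" \( -path "$1/lost+found" -prune \) -o -type "$2" -print | wc -l
}

# Succeeds (exit 0) only when file and directory counts match in both trees.
same_counts() {
    for kind in f d; do
        [ "$(count_type "$1" "$kind")" -eq "$(count_type "$2" "$kind")" ] ||
            return 1
    done
}
```

Example: same_counts /u03 /mnt1 && echo "counts match"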

Once the copying is done and verified, you can remove the files and directories in /u03:

rm -rf /u03/*

If there are dot-files (like .ora3 or .myfile, etc), move these manually to /mnt1. Now /u03 is an empty directory that will become your /u03 mountpoint.

umount /mnt1
mount /dev/vgsomething/lvu03 /u03

Now your / directory is back to normal. Be sure to edit /etc/fstab to add the lvu03 lvol.


Bill Hassell, sysadmin
Dennis Handly
Acclaimed Contributor

Re: vx_nospace

>Bill: created the /u03 directory but did not mount an lvol to the directory.

Perhaps nothing needs to be done except for a few directories? There is an NFS mount in a subdirectory:
fobdb:/u02/oracle/StageR12 122880 93515 27530 77% /u03/oracle/StageR12

So clean up:
ll -d /u03/!(oracle) /u03/oracle/!(StageR12)

If you do this partial cleanup, you may want to make them read/only so nobody is tempted to fill up root.
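
The !(pattern) form above is a ksh extended glob. In a shell without it, roughly the same listing can be sketched with a loop. The name list_except is made up, and unlike the ksh pattern this version also lists dot-files.

```shell
# List every entry in a directory except the one named "$2".
list_except() {
    dir=$1
    skip=$2
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # Unmatched globs stay literal, so skip anything that does not exist.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        # Compare only the last path component against the name to skip.
        [ "${entry##*/}" = "$skip" ] && continue
        ls -ld "$entry"
    done
}
```

Example: list_except /u03 oracle; list_except /u03/oracle StageR12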
Ron Irving
Trusted Contributor

Re: vx_nospace

That was it guys!! I spun everything up this morning, and unmounted the nfs share. Sure enough...there was that Stage file sitting there...staring at me with its evil grin. Deleted file, and bdf now shows 22%. (I overestimated)

Thanks everyone for your help!!!
Should have been an astronaut.