
Re: vmunix

 
Keith Bevan_1
Trusted Contributor

vmunix

I am looking for advice on the source and resolution of the following error, which is occurring occasionally on our HP9000 L2000 running HPUX 11.11 :-

vmunix: file: table is full

I think it may be a kernel parameter (hopefully a configurable one) that is being exceeded when a large number of users are connected to the server and the maximum number of open files is reached.

Any advice would be greatly appreciated.

Thanks

Keith
You are either part of the solution or part of the problem
8 REPLIES
Jeff Schussele
Honored Contributor

Re: vmunix

Hi Keith,

Yep, it's one of:

1) nfile - most likely
2) maxfiles - soft limit
3) maxfiles_lim - hard limit

Probably #1 above.

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Sanjay Kumar Suri
Honored Contributor

Re: vmunix

Check these parameters:

nfile, ninode, nproc.

Also run

#sar -v 5 5

to get some info.

sks
A rigid mind is very sure, but often wrong. A flexible mind is generally unsure, but often right.
Kent Ostby
Honored Contributor

Re: vmunix

The file table is sized by the kernel parameter "nfile".

Best regards,

Kent M. Ostby
"Well, actually, she is a rocket scientist" -- Steve Martin in "Roxanne"
Robert-Jan Goossens
Honored Contributor
Solution

Re: vmunix

Hi Keith,

Take a look at this doc,

Document description: vmunix: file: table is full
Document id: HPUXKBRC00008909

http://www5.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000064128770

Regards,

Robert-Jan
David Burgess
Esteemed Contributor

Re: vmunix

Keith,

It sounds like you need to up the value of nfile in the kernel.

You can run

sar -v 5 5

to look at the file table now, but the problem may have gone away. Under the adm user's crontab we run 2 jobs to collect stats:

0,5,10,15,20,25,30,35,40,45,50,55 * * * 0-6 /usr/lib/sa/sa1
5 23 * * 1-5 /usr/lib/sa/sa2 -s 1:00 -e 23:40 -i 300 -A

(See man sa1)

Then you can run

sar -v

to get historical data and see exactly when the problem occurred and what filled up.

HTH

Regards,

Dave.
Sriram Narayanaswamy
Occasional Advisor

Re: vmunix


If you're talking about a message printed in syslog,
it is indeed printed when the global file descriptor
table is found to be full.

You can probably improve the situation by increasing
a kernel tunable like nfile.

Geoff Wild
Honored Contributor

Re: vmunix

Yes - as others have said - increase nfile.

Here's some info on Kernel Problems:


KERNEL PROBLEMS

Common Kernel Parameters which need to be modified & associated errors

nfile, ninode, nproc, maxusers:

when these parameters need to be increased you will see errors in
/var/adm/syslog/syslog.log in the format:
"vmunix: file table is full" (or "proc table is full", or "inode table is full")
users may see errors such as "file table overflow"

nproc: maximum number of system wide processes
nfile: maximum number of files that can be open simultaneously system wide.
3 nfile entries will be used for each process
(stdin, stdout, stderr), 2 entries for each pipe (stdin, stdout)
ninode: maximum number of inodes kept in main memory

these parameters can be monitored with:
# sar -v 5 5 (sample 5 times at 5 second interval)
which will produce output in the format:

16:45:12  text-sz  ov  proc-sz  ov  inod-sz  ov  file-sz  ov
16:45:17  N/A     N/A  131/276   0  476/476   0  420/800   0

In this example we are using 131 out of 276 entries in the nproc table,
420 out of 800 entries in the nfile table. The inode table is used for
DNLC (directory name lookup cache). Since the inode table is used for
cache, sar and glance will typically show usage at 90 - 100 %. If the
customer is not seeing errors: "inode table is full" they probably do not
need to increase ninode. There is a utility dnlcount which will show
actual ninode usage, see GLP211-2 for more information.
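
As a rough illustration of reading that output, the sketch below parses one `sar -v` data line in the column layout shown above and reports how full each table is. The function name `parse_sar_v_line` is made up for this example and is not part of any HP tool; it assumes the columns always appear in the order text, proc, inode, file.

```python
def parse_sar_v_line(line):
    """Parse one data line of `sar -v` output in the layout shown above.

    Returns a dict mapping table name -> (used, size, overflows).
    Tables reported as N/A (text-sz in the sample) are skipped.
    """
    fields = line.split()
    names = ["text", "proc", "inode", "file"]  # assumed column order
    usage = {}
    # fields[0] is the timestamp; the rest are (used/size, ov) pairs.
    for i, name in enumerate(names):
        size_field = fields[1 + 2 * i]
        ov_field = fields[2 + 2 * i]
        if size_field == "N/A":
            continue
        used, size = (int(x) for x in size_field.split("/"))
        usage[name] = (used, size, int(ov_field))
    return usage


sample = "16:45:17 N/A N/A 131/276 0 476/476 0 420/800 0"
for name, (used, size, ov) in parse_sar_v_line(sample).items():
    pct = 100.0 * used / size
    print(f"{name}: {used}/{size} ({pct:.1f}% used, {ov} overflows)")
```

A non-zero ov count for the file table, or usage that climbs close to the table size at busy times, would be the signal to raise nfile.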

As shipped, nfile, ninode, and nproc are formulas which depend on a variable
called maxusers. Increasing maxusers will increase the values of nfile,
ninode, nproc, npty, and nstrpty. To eliminate errors:
a. double maxusers (shotgun approach, not recommended)
b. double the appropriate parameter

kernel cost (ie, RAM used per entry):
nfile: 30 bytes nproc: 180 bytes ninode: 286 bytes
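
To get a feel for what "double the appropriate parameter" costs in RAM, here is a small sketch that uses only the per-entry figures quoted above. The helper name `extra_kernel_bytes` is hypothetical, and the byte counts are the ones from this note, so treat the result as an estimate.

```python
# Per-entry kernel cost in bytes, taken from the figures quoted above.
BYTES_PER_ENTRY = {"nfile": 30, "nproc": 180, "ninode": 286}


def extra_kernel_bytes(param, old_value, new_value):
    """Estimate the additional RAM used when a table grows from old_value to new_value."""
    return (new_value - old_value) * BYTES_PER_ENTRY[param]


# Doubling nfile from the 800 entries seen in the sar sample to 1600.
print(extra_kernel_bytes("nfile", 800, 1600), "bytes")
```

On these figures, doubling nfile is cheap compared with doubling maxusers, which scales several tables at once.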


maxdsiz, maxtsiz, maxssiz

maxdsiz: maximum size of a data segment (all data which a program accesses)
(maximum size 944 MB)
common errors indicating maxdsiz may be underconfigured:
out of memory; ENOMEM from malloc
maxtsiz: maximum size of a text segment (the compiled program)
maxssiz: maximum size of the instruction stack (grows dynamically as a program
executes, maximum size 80 MB)

note: These values are entered in bytes. SAM will display a hexadecimal value,
however you can enter values in decimal format.

semmns, semmni

semmns: maximum number of user-accessible semaphores
- errors indicating underconfigured: semget ENOSPC errors in syslog, or
messages such as not enough semaphores or can't allocate semaphores from app
semmni: maximum number of semaphore identifiers

to view current usage:
# ipcs -as
you will see a column labeled "ID", this corresponds to semmni
(each number counts as one). To the right the column labeled NSEMS corresponds
to semmns (each number counts as its value).
symptom: customer can run 1 or 2 instances of a database, fails when
attempting to open nth instance - check semmns
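
The counting rule above can be sketched as follows. The input is simply the list of NSEMS values read off the `ipcs -as` listing, one per semaphore set; the values used here are invented for illustration and are not real output, and `semaphore_usage` is a hypothetical helper name.

```python
def semaphore_usage(nsems_per_set):
    """Tally semaphore usage from the NSEMS column, one value per ID row.

    Each row counts as one identifier toward semmni;
    each row counts as its NSEMS value toward semmns.
    """
    ids_used = len(nsems_per_set)
    sems_used = sum(nsems_per_set)
    return ids_used, sems_used


# Illustrative NSEMS values only (not taken from a real system).
ids_used, sems_used = semaphore_usage([1, 10, 25])
print(f"identifiers in use (compare with semmni): {ids_used}")
print(f"semaphores in use (compare with semmns): {sems_used}")
```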

npty, nstrpty
npty: Specifies the maximum number of pseudo-tty data structures
available on the system. (used by telnet at 10.x, at 11.0 telnet
uses nstrpty)
nstrtel: Specifies the number of telnet device files that the kernel can
support for incoming telnet sessions.
nstrpty: maximum number of streams based pseudo-tty data structures available
on the system (used by rlogind)

to create more device files (500 total in this example):
# cd /dev
# insf -n 500 -C pseudo
- or -
# insf -d ptys -n 500
# insf -d ptym -n 500
# insf -d pts -s 500 -e -v (11.0 only)

# ls /dev/pty | wc -w
# ls /dev/ptym | wc -w

common errors: "connection refused" from telnet
"maximum number of users already logged in" from telnet
"unable to allocate pty" from remsh

dbc_max_pct, dbc_min_pct, nbuf, bufpages

dbc_min_pct: minimum ram used for buffer cache (default 5%)
dbc_max_pct: maximum ram used for buffer cache (default 50%)
on systems with > 2 GB RAM it is recommended dbc_max_pct be
set to 10 - 20 %
nbuf: number of static buffer headers, provided for backward
compatibility, should be set to 0
bufpages: pages of static buffer cache, provided for backward
compatibility, should be set to 0





maxfiles, maxfiles_lim
maxfiles: soft limit for number of files a process can have open simultaneously
maxfiles_lim: hard limit for number of files a process can have open
simultaneously

maxuprc: maximum number of processes for an individual user
symptom: user receives error: no more processes, or cannot fork
no error in syslog

maxswapchunks, swchunk
maxswapchunks: maximum number of swap chunks
swchunk: swap chunk size default = 2048 (this should not be modified)
maximum swap = maxswapchunks * swchunk * DEV_BSIZE
DEV_BSIZE = 1024 bytes, so on a system with swchunk at the default of 2048
and maxswapchunks = 256:

maximum swap = 256 * 2048 * 1024 = 536,870,912 bytes = 512 MB

to increase the amount of swap that can be configured increase maxswapchunks
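
The same arithmetic as a short sketch, which also works backwards from a target swap size to the maxswapchunks value needed. The function names are made up for this example; DEV_BSIZE and the swchunk default are the values given above.

```python
DEV_BSIZE = 1024  # bytes, per the note above


def max_swap_bytes(maxswapchunks, swchunk=2048):
    """Maximum configurable swap in bytes: maxswapchunks * swchunk * DEV_BSIZE."""
    return maxswapchunks * swchunk * DEV_BSIZE


def chunks_needed(target_bytes, swchunk=2048):
    """Smallest maxswapchunks whose limit covers target_bytes (rounds up)."""
    chunk_bytes = swchunk * DEV_BSIZE
    return -(-target_bytes // chunk_bytes)  # ceiling division


print(max_swap_bytes(256) // (1024 * 1024), "MB")
print(chunks_needed(4 * 1024 ** 3), "chunks for 4 GB")
```

With swchunk left at 2048, each chunk is 2 MB, so the target swap in MB divided by 2 gives the minimum maxswapchunks.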


maxvgs: the maximum number of volume groups which can be configured


shmmax: the maximum shared memory segment size (system wide)
limits are as follows:
10.01, 10.10: 1.75 GB (quadrants 3 & 4)
10.20: 2.75 GB (quadrants 3 & 4, quadrant 2 w/PHKL_16751)
application must be relinked as type EXEC_MAGIC then chatr'd to type
SHMEM_MAGIC. An individual segment cannot exceed 1 GB, however the
application can use several contiguous segments which are treated as one.
11.0: 32 bit: 1 GB per individual segment, 2.75 GB total
11.0: 64 bit: 1 TB per individual segment, 4 TB total

errors: shmget: not enough space, application may hang, or give errors
such as not enough memory, or insufficient table space
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.
Keith Bevan_1
Trusted Contributor

Re: vmunix

Thanks for all the recommendations made.

Just waiting for some downtime to change 'nfile' and rebuild the kernel, following the current review of the sar stats.

** No more posts thank-you **

Keith
You are either part of the solution or part of the problem