Operating System - HP-UX
nico_20
Occasional Advisor

nfile keep on increasing

Hi,

I need help with my DB machine, an rp4440.

I am facing a "file table overflow" problem that crashes Oracle. I increased nfile to solve the problem. However, when I check sar, I see that the number of open files keeps increasing, while the inode and process counts stay stable.

yesterday
14:01:04 text-sz ov proc-sz ov inod-sz ov file-sz ov
14:01:06 N/A N/A 658/24020 0 2622/194208 0 134296/362358 0
14:01:08 N/A N/A 658/24020 0 2624/194208 0 134303/362358 0
14:01:10 N/A N/A 659/24020 0 2624/194208 0 134422/362358 0
14:01:12 N/A N/A 658/24020 0 2624/194208 0 134590/362358 0
14:01:14 N/A N/A 658/24020 0 2624/194208 0 134772/362358 0


today:
09:54:05 text-sz ov proc-sz ov inod-sz ov file-sz ov
09:54:07 N/A N/A 653/24020 0 2615/194208 0 140862/362358 0
09:54:09 N/A N/A 653/24020 0 2615/194208 0 140861/362358 0
09:54:11 N/A N/A 653/24020 0 2615/194208 0 140861/362358 0
09:54:13 N/A N/A 654/24020 0 2615/194208 0 140862/362358 0
09:54:15 N/A N/A 656/24020 0 2615/194208 0 140862/362358 0
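
For reference, output like this comes from sar's system-table report, e.g. a 2-second interval with 5 samples:

sar -v 2 5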

Is there any solution, or should I just increase nfile to a million?

My kernel settings are:
maxusers=3000
nproc=24020
nfile=362358

Oracle max datafiles = 1200
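
To check the current settings I use something like this (kmtune on 11.x; kctune replaces it on 11.23 and later):

kmtune -q nproc
kmtune -q nfile
kmtune -q ninode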


Please help.

Nico
RAC_1
Honored Contributor

Re: nfile keep on increasing

It depends on how the database is being used. We have a setting of 600000, but in our case the database is very heavily used. Consult your DBAs, monitor the real-time usage, and set it accordingly.
There is no substitute to HARDWORK
nico_20
Occasional Advisor

Re: nfile keep on increasing

Hi RAC,

Do you see the same situation, where nfile usage increases steadily and never decreases?

I am concerned that it will reach the limit, and it is hard to reboot the machine just to retune the parameter.
RAC_1
Honored Contributor

Re: nfile keep on increasing

The answer is: it depends. What apps/programs/databases does your system run? The right values change depending on that. Oracle also has a setting that specifies the maximum number of Oracle processes. Consider that and set nfile accordingly.
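
For example, something like this shows Oracle's process ceiling (a sketch, assuming OS authentication as SYSDBA):

echo "show parameter processes" | sqlplus -s "/ as sysdba"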
There is no substitute to HARDWORK
nico_20
Occasional Advisor

Re: nfile keep on increasing

Right now, the nfile value is derived from nproc:

nfile = 15*nproc + 2048
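
(With nproc = 24020, that works out to 15*24020 + 2048 = 362348, which is essentially my configured nfile of 362358.)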


For an Oracle database, I think nfile should instead depend on the max number of datafiles:

nfile = Oracle max no. of datafiles + max processes

Am I correct?
nico_20
Occasional Advisor

Re: nfile keep on increasing

(revised)

Right now, the nfile value is derived from nproc:

nfile = 15*nproc + 2048


For an Oracle database, I think nfile should instead depend on the max number of datafiles:

nfile = Oracle max no. of datafiles * max processes
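
(For example, with our 1200 datafiles and a hypothetical PROCESSES limit of 300, that would be 360,000 file handles.)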

Am I correct?
Frank de Vries
Respected Contributor

Re: nfile keep on increasing

Hi Nico

Your values look sufficiently high to me.
maxusers = 3000 and nproc = 24020 -- that is really quite a large resource allocation in the Unix kernel for calling apps.

I would be very surprised if the DBs and apps needed to use all these resources. (Unless the documentation for your apps says otherwise, I would need to see proof.)

I think it is more likely that something is not right with the way file handles are opened and then never closed. This looks more like a bug or bad code than a matter of not having tuned your kernel correctly.

If that assumption is right, then it is not advisable to increase your kernel resources: you would only allocate more waste to the problem, and it could lead to an unstable kernel. So I would not touch the kernel side.

Did you install all the patches for your apps and Oracle, or even for Unix itself? If I were you, I would concentrate on this rather than on tuning the kernel. It is not normal for your system to need so many nfile entries; in my experience that looks wrong, and increasing the limit could have the opposite effect.

Do you notice any other symptoms that would give you a clue? CPU usage? Forks? Paging? Client connections?

Which version of Oracle do you use? What parameters are in the spfile (or pfile)?

What apps are running? Is it SAP, WebSphere, or something else?

Hope this inspires you to start looking in the right direction :)
Look before you leap
Bill Hassell
Honored Contributor

Re: nfile keep on increasing

The number of open files is increasing because one or more programs are requesting these files. Whether this is normal for those programs will require help from the programmer. It is a simple task to create a bad program that simply loops around endlessly creating new files or opening existing files. Finding this bad program will take some work. Since the open files are about 140,000 in your example and there are only 656 programs running, there is a bad program (or programs) really messing around with your system.

Since this is a production machine and exhausting the nfile limit is a big problem, you can try raising nfile to 500,000, but I suspect you have a runaway process, so it will still be a problem. I would watch sar -v while you shut down the Oracle instances and see if there is a dramatic drop in file usage. If not, then whatever is left running is probably the culprit. Note that these may not be disk files -- they could be network ports, with a bad network application causing all the trouble.
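
A rough way to do that, and then to see who actually holds the descriptors (assuming lsof is installed -- it is not part of stock HP-UX):

# watch the file table while the instances come down
sar -v 5 60

# rank processes by number of open descriptors (count, PID, command)
lsof | awk '{print $2, $1}' | sort | uniq -c | sort -rn | head -20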


Bill Hassell, sysadmin
nico_20
Occasional Advisor

Re: nfile keep on increasing

Thanks all,

I suspect it is an Oracle problem, because we use MTS for all applications.

We tried shrinking the shared servers and found that the nfile count dropped.

Before our test, the nfile usage was as follows:

15:36:20 text-sz ov proc-sz ov inod-sz ov file-sz ov
15:36:23 N/A N/A 659/24020 0 2637/194208 0 143279/362358 0
15:36:26 N/A N/A 659/24020 0 2636/194208 0 143277/362358 0
15:36:29 N/A N/A 659/24020 0 2636/194208 0 143277/362358 0
15:36:32 N/A N/A 659/24020 0 2636/194208 0 143277/362358 0
15:36:35 N/A N/A 659/24020 0 2636/194208 0 143277/362358 0

After changing the number of shared servers from 480 to 240 and then back to 480, the sar output was as follows:

15:45:27 text-sz ov proc-sz ov inod-sz ov file-sz ov
15:45:30 N/A N/A 498/24020 0 2477/194208 0 105652/362358 0
15:45:33 N/A N/A 523/24020 0 2501/194208 0 105773/362358 0
15:45:36 N/A N/A 536/24020 0 2515/194208 0 105843/362358 0
15:45:39 N/A N/A 560/24020 0 2538/194208 0 105960/362358 0
15:45:42 N/A N/A 579/24020 0 2556/194208 0 106039/362358 0

The SQL statements were as follows:

ALTER SYSTEM SET SHARED_SERVERS=240 SCOPE=BOTH;

then, after 10 seconds,

ALTER SYSTEM SET SHARED_SERVERS=480 SCOPE=BOTH;
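
The drop also shows up in the process list; something like this counts the shared server processes (named ora_sNNN_<SID>) before and after:

ps -ef | grep -c '[o]ra_s[0-9]'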


Nico
Bill Hassell
Honored Contributor

Re: nfile keep on increasing

That seems like a good correlation. Unless there are some guidelines in the Oracle knowledge library, I would say that the large file count must be normal, so increasing nfile to 500k or larger should be fine. However, your nproc value is far too large -- you're using only 654 out of 24020, so I would drop nproc to 2000 or so. I also see that maxusers is set to a large value. This is a pseudo-parameter: it is never used by the kernel itself but appears in several formulas that SAM uses to size other parameters. It is seldom accurate for sizing large systems, so most sysadmins replace the formula in a parameter with a simple number and adjust as needed.


Bill Hassell, sysadmin
nico_20
Occasional Advisor

Re: nfile keep on increasing

Without the formulas, it is hard to set a reasonable value for a particular kernel parameter.

For example, the maximum value of nfile depends on the available memory, and I don't know how large this parameter needs to be for our system.

We set a large value for nproc because nfile is derived from nproc:

nfile = (15*nproc) + 2048


Nico
Bill Hassell
Honored Contributor

Re: nfile keep on increasing

Actually, the maxusers value is quite inaccurate for several parameters. It is true that nfile somewhat depends on nproc, and the formula assumes that each process will open 15 files. But in your case, your 650 processes have opened 140,000 files, or about 215 files per process -- the formula fails miserably. Your nproc is also far too large at 24,000; my guess is that setting maxusers to 3000 forced nproc far too high. That is why most sysadmins replace each formula with a fixed number. In your case, setting nproc to 2000 and nfile to 500,000 should be quite reasonable. Also note that your ninode value is probably way too high. It should be less than 2000 or so, as it refers only to HFS in-core inodes, but due to a very outdated formula it may be in the tens of thousands on your system.

Having some of these parameters extremely large is not really a problem, except that the kernel will take quite a bit more RAM. If you have plenty of RAM, you can just increase nfile to keep ahead of actual file usage. But don't increase maxusers -- it will just keep pushing other parameters far too high.
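
On 11.x the change looks roughly like this (a sketch -- kmtune edits /stand/system, and static tunables need a kernel rebuild and reboot; on 11.23 and later, kctune can set many tunables without a rebuild):

kmtune -s nproc=2000
kmtune -s nfile=500000
kmtune -s ninode=2000
mk_kernel            # build the new kernel
kmupdate             # install it for the next boot
shutdown -r -y 0     # reboot onto the new kernel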


Bill Hassell, sysadmin
Frank de Vries
Respected Contributor

Re: nfile keep on increasing

I agree with most of what Bill is saying, and it makes a lot of sense -- especially his comments about maxusers. Totally agree!

However, I can't understand why nfile has to be as large as 500,000. I would opt for something much smaller:
if maxusers = 2000,
then nproc = 20,000,
and nfile = 250,000 should more than suffice!

I personally subscribe to a different philosophy: do not sacrifice your system resources to one particular app's code. You need to be strict. If you have to hand over this many resources to keep things working, to my mind that only makes the problem bigger sooner or later.

One of the functions of resource limits, to my mind, is to keep a lid on bad code and looping programs. That, to me, is part of good tuning practice. Otherwise you are just telling every programmer: yeah, it is okay to write shitty code, but don't worry, we will tune it somehow!
(Not while I am in charge :)

The other interesting paradox I came across: if one has so much RAM to allocate to huge resources, then why use MTS (Multi-Threaded Server) at all? It is an offspring of the days when memory was tight and processes had to be shared as much as possible. I am surprised people are still using this technology.

I would keep it simple. (But if your application demands this kind of configuration, then I feel for you :) That was a tough purchase, then.)

Anyway, that is the opinion of one person, which happens to be mine.

Still, it may be interesting for you to learn about different points of view.

Look before you leap