Operating System - HP-UX

Database crashed due to open files

 
Coolmar
Esteemed Contributor

We have a server (hpux 11i) with multiple databases running. Our production database shut down due to the SMON process terminating with an error. The following is on the Oracle site in relation to the error:

This can be caused because the system has either exceeded the number of open files at the user level or system wide.

sar -v shows the number of open files nowhere near the limit... but that would just be "system wide". How can I find out whether we hit the limit at the user level? I imagine with lsof, but how can I tell what my user-level maxfiles is?
Any other ideas? It may not be open files at all... that is just what Oracle is saying.

Thanks,
James R. Ferguson
Acclaimed Contributor

Re: Database crashed due to open files

Hi Sally:

You need to consider the soft and hard limits for open files, otherwise known as 'maxfiles' and 'maxfiles_lim', along with the global limit imposed by 'nfile'.

# kmtune -l

...will report your current kernel settings.

You can use 'glance' and select a process and toggle "F" for a Files view to see the files associated with a process, too.

Regards!

...JRF...
Coolmar
Esteemed Contributor

Re: Database crashed due to open files

Is it maxfiles and maxfiles_lim that give the user level file limits?
Jeff_Traigle
Honored Contributor
Solution

Re: Database crashed due to open files

There's no per-user open file limit as far as I know. The system-wide limit is controlled by nfile and the per-process limit is controlled by maxfiles and maxfiles_lim. So you need to watch the open files for the smon process to see if it's reaching the per-process limit.
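
One way to watch a single process's open files, sketched in Python under some assumptions: lsof is installed (it is not part of a base HP-UX install), its default column layout puts the FD column fourth, and numeric FD entries such as "5u" are the ones that count against the per-process limit. The function names and the sample output are made up for illustration; they are not taken from the system in this thread.

```python
import subprocess


def count_numeric_fds(lsof_output: str) -> int:
    """Count rows of default `lsof -p PID` output whose FD column is numeric.

    Entries such as cwd, txt and mem are not file descriptors, so they are skipped.
    """
    count = 0
    for line in lsof_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Assumed default columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if len(fields) >= 4 and fields[3][0].isdigit():
            count += 1
    return count


def open_fds_for_pid(pid: int) -> int:
    """Run `lsof -p PID` and count the numeric descriptors it reports."""
    result = subprocess.run(
        ["lsof", "-p", str(pid)], capture_output=True, text=True
    )
    return count_numeric_fds(result.stdout)


# Made-up sample output, used only to show the counting logic.
SAMPLE = """
COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
example 12345 nobody  cwd    DIR   64,3     1024  100 /example
example 12345 nobody  txt    REG   64,3   500000  101 /example/bin/example
example 12345 nobody    0r   CHR    3,2      0t0  102 /dev/null
example 12345 nobody    5u   REG   64,5  1048576  103 /example/data/file01
"""

if __name__ == "__main__":
    print(count_numeric_fds(SAMPLE))
```

Comparing the count for the smon process against maxfiles over time would show whether it is getting close to the per-process limit.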
--
Jeff Traigle
James R. Ferguson
Acclaimed Contributor

Re: Database crashed due to open files

Hi Sally:

'maxfiles_lim' is the maximum level to which a non-root user can increase their 'maxfiles' value. Thus, 'maxfiles_lim' is called a "hard" limit and 'maxfiles' is called the "soft" limit.
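
To see the soft and hard limits that actually apply to a running process, a small sketch using Python's standard resource module may help. It assumes a Python interpreter is available on the box, and that the RLIMIT_NOFILE soft and hard values start out at 'maxfiles' and 'maxfiles_lim' unless something in the process's ancestry changed them. The helper names are illustrative only.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit (open files per process):", describe(soft))
    print("hard limit (ceiling for the soft limit):", describe(hard))
```

Limits are inherited from the parent process, so this should be run from the same environment that starts the database (for example, as the oracle user from the shell used to launch the instance) for the numbers to be meaningful.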

Regards!

...JRF...
James R. Ferguson
Acclaimed Contributor

Re: Database crashed due to open files

Hi (again) Sally:

Jeff makes a good point. I used the word "user" to refer to a *process* -- a poor, non-rigorous choice of words.

Regards!

...JRF...

Coolmar
Esteemed Contributor

Re: Database crashed due to open files

So our maxfiles and maxfiles_lim are both at 2048; therefore, Oracle (for example) can have no more than 2048 processes running at a time?

Thanks for all the responses!
Patrick Wallek
Honored Contributor

Re: Database crashed due to open files

No.

maxfiles and maxfiles_lim control the number of FILES each process can have open.

There is a different kernel variable which controls the number of processes each user can have. That is maxuprc.
James R. Ferguson
Acclaimed Contributor

Re: Database crashed due to open files

Hi Sally:

Per your last post, *not Oracle*, but rather per *process*, no more than 2,048 open files at a time.

Regards!

...JRF...
A. Clay Stephenson
Acclaimed Contributor

Re: Database crashed due to open files

The key to knowing which limit you are bumping into is to examine the errno value set by the open() system call. This value should be logged in your Oracle alert logs and will be a small integer.

If the system-wide limit is reached, errno is set to ENFILE (23); if the per-process limit is reached, errno is set to EMFILE (24).
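
As an illustration of that distinction, here is a minimal Python sketch that maps the errno of a failed open to the limit that was hit. The function name and message strings are made up for the example; only the errno constants come from the standard library.

```python
import errno


def classify_open_failure(err: OSError) -> str:
    """Tell which open-file limit, if any, an OSError from open() points to."""
    if err.errno == errno.ENFILE:
        return "system-wide file table full (ENFILE)"
    if err.errno == errno.EMFILE:
        return "per-process descriptor limit reached (EMFILE)"
    # Anything else is unrelated to the open-file limits.
    return "other error: " + errno.errorcode.get(err.errno, str(err.errno))


if __name__ == "__main__":
    try:
        open("/nonexistent/path/for/demo")
    except OSError as exc:
        print(classify_open_failure(exc))
```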

See 'man 2 open' for details. Learning which errno values system calls can set will go a long way towards understanding many UNIX problems.

If it ain't broke, I can fix that.
TwoProc
Honored Contributor

Re: Database crashed due to open files

I'm thinking you hit the "nfile" limit.

This is the TOTAL number of files that can be open on the system, across all users and all processes. The problem is that each user session will go off and open files for reading, and this will happen PER USER connection. So, if you've got really big processes that traverse data in 20 tablespaces, each with an average of 10 data files and 5 index data files, that's 20x10 + 20x5 = 300 open files for that ONE user. Have 500 users? You could reach 150 thousand open files at once during peak periods, and Oracle processes will definitely start failing once you exceed the limit. I've been totally amazed by how fast this resource can be consumed, and I've had to adjust nfile up quite heavily.

Increase the number of nfile resources you've got (try another 10 or 15% for starters) and you should fix your problem.
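
The back-of-the-envelope estimate above can be written out as a short Python sketch. The figures are the ones from the post (20 tablespaces, 10 data files and 5 index files each, 500 users); the function name is made up, and the model is a worst case that assumes every session opens every file.

```python
def open_files_per_session(tablespaces, data_files_each, index_files_each):
    """Worst-case files one session opens if it touches every tablespace."""
    return tablespaces * (data_files_each + index_files_each)


per_session = open_files_per_session(20, 10, 5)  # 20x10 + 20x5
peak_total = per_session * 500                   # 500 concurrent sessions

if __name__ == "__main__":
    print("open files per session:", per_session)
    print("peak open files, all sessions:", peak_total)
```

Comparing a figure like this against the file-table usage reported by sar -v gives a rough idea of how much headroom nfile leaves.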

We are the people our parents warned us about --Jimmy Buffett
Coolmar
Esteemed Contributor

Re: Database crashed due to open files

John,

Thanks for the response, but if nfile was exceeded, wouldn't that show up in the "sar -v" data? I checked it out for the whole day and it wasn't even close to hitting the max.

Sally
Raynald Boucher
Super Advisor

Re: Database crashed due to open files

Hello Sally,

Can you show me what the actual error message was?
When you say many databases, do you mean many Oracle instances or many Oracle schemas within an Oracle instance (how many smon processes run at one time)?
This almost looks like the Oracle db_files parameter value (in the init.ora file) was exceeded.
Coolmar
Esteemed Contributor

Re: Database crashed due to open files

Here is the error we got, and then the DB shutdown...

ORA-00474: SMON process terminated with error
Coolmar
Esteemed Contributor

Re: Database crashed due to open files

Here is more of the error:

Errors in file /u1/oracle/admin/h21q/bdump/pmon_27192_h21q.trc:
ORA-00474: SMON process terminated with error
Tue May 2 10:00:15 2006
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 27192

Instance name: db
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 27192, image: (PMON)

*** 2006-05-02 10:00:15.392
*** SESSION ID:(1.1) 2006-05-02 10:00:14.452
error 474 detected in background process
ORA-00474: SMON process terminated with error
TwoProc
Honored Contributor

Re: Database crashed due to open files

Yes, you're right: your sar output shows nfile usage. And IF you had it running the whole time, then that is not the problem. Excuse the posting.

From my reading of multiple reports of ORA-474, it seems to indicate ONLY that SMON shut you down, not why SMON shut you down.

I'm wondering what's in the trace file mentioned in your last post, can you post it?
We are the people our parents warned us about --Jimmy Buffett
Coolmar
Esteemed Contributor

Re: Database crashed due to open files

Hi John,
What I posted above is what was in that trc file....can't find anything else.
Here is the output of "sar -q" around the time the db crashed:

00:00:03 runq-sz %runocc swpq-sz %swpocc
08:42:00 5.0 99 101.0 100
08:44:00 3.4 93 105.8 100
08:46:01 4.2 97 118.5 100
08:48:00 4.4 96 132.2 100
08:50:00 3.9 95 126.8 100
08:52:01 3.9 95 139.1 100
08:54:00 4.6 99 151.8 100
08:56:01 4.9 98 155.0 100
08:58:00 4.9 99 149.9 100
09:00:01 4.4 95 149.3 100
09:02:01 5.0 95 145.1 100
09:04:02 3.1 87 147.1 100
09:06:01 2.6 82 155.0 100
09:08:02 2.1 65 162.4 100
09:10:01 2.6 78 156.8 100
09:12:02 3.1 82 166.2 100
09:14:02 4.5 96 159.0 100
09:16:01 5.3 97 160.9 100
09:18:01 5.5 99 152.3 100
09:20:01 4.5 98 150.6 100
09:22:02 5.3 99 156.4 100
09:24:01 5.1 99 155.3 100
09:26:01 5.0 98 157.8 100
09:28:01 4.8 99 147.7 100
09:30:02 5.0 98 138.5 100
09:32:01 4.7 98 120.2 100
09:34:01 4.6 99 129.8 100
09:36:01 5.1 98 131.5 100
09:38:01 3.6 88 123.4 100
09:40:02 2.5 77 123.0 100
09:42:01 5.3 99 123.7 100
09:44:01 6.1 100 125.8 100

I am thinking the issue was more due to a lack of resources and the SMON died and therefore, the db then shutdown. An lsof also showed over 4000 processes for that particular instance.
TwoProc
Honored Contributor

Re: Database crashed due to open files

Well, I've got nothing here. That error is just saying that SMON shut you down, but nothing from before. You're going to have to call Oracle and open a support call on this one.

This is analogous to the idiot light on your car dashboard going off and telling you that "your engine just burned up from excessive heat."

:-)

We are the people our parents warned us about --Jimmy Buffett