
Commands not working!!!

 
Yogesh M Puranik
Valued Contributor

Commands not working!!!


Hi All,

I have an HP-UX 11.11 box that I cannot log in to. When I try to telnet, the screen disappears, although I can still ping the server's IP. Fortunately, I still have one session open on it, but no command produces any output; each one fails with this error:

sh: The fork function failed. Too many processes are already running.

Please help me out, as this server is on the critical server list. Many thanks in advance!!

Rgds
Yogesh
Dave Hutton
Honored Contributor

Re: Commands not working!!!

You ran out of processes. You need to adjust a kernel parameter:

maxuprc or nproc.

What does sar -v show?

If you can even run that command?

Any processes you can kill to free up a few?
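
If sar -v does run, its proc-sz column reports the process table as used/max. Below is a minimal sketch of turning that field into a percentage with shell arithmetic only. The function name table_usage_pct and the sample figures are made up for illustration; they are not output from this server.

```shell
# Convert a "used/max" pair, such as the proc-sz field of sar -v,
# into a whole-number percentage. Uses only shell built-ins.
table_usage_pct() {
    used=${1%%/*}     # text before the slash
    max=${1##*/}      # text after the slash
    [ "$max" -gt 0 ] 2>/dev/null || return 1   # reject missing or zero size
    echo $(( used * 100 / max ))
}

# Hypothetical sample value: 4090 entries in use out of nproc=4096.
table_usage_pct 4090/4096
```

A result close to 100 would suggest the system-wide table (nproc) is nearly exhausted; a low figure alongside fork failures points more towards the per-user limit (maxuprc).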
A. Clay Stephenson
Acclaimed Contributor

Re: Commands not working!!!

Your problem is that the system-wide process limit, nproc, has been reached. You must shut down and rebuild the kernel with a larger nproc, or determine why so many processes are running.

For example, if you had a script that did this:

while [[ 1 -eq 1 ]]
do
sh &
done

you would be spawning large numbers of processes very quickly (although in this case you would hit the per-user process limit, maxuprc, first).

If it ain't broke, I can fix that.
Yogesh M Puranik
Valued Contributor

Re: Commands not working!!!


Dave and Clay, thanks for the replies!! But as I already said, I am not able to get output from any command. And the kernel parameters you mention are static parameters.

Is there any way out of this issue without a reboot???


Rgds

Yogesh
Sandman!
Honored Contributor

Re: Commands not working!!!

See if your users can help you out by logging out of their sessions. This might give you just enough wiggle room to do stuff.

~cheers
Tim Nelson
Honored Contributor

Re: Commands not working!!!

Yogesh,

As Sandman mentioned, you need to get some processes to stop somehow so you can log in and start troubleshooting.

If asking users to exit does not help, or if the issue is a runaway process (or processes), then your only last resort is to crash the system.

Keep trying until you have no time or option left except to crash it. Hopefully once the system is back up and running you do not end up with the same issue.

A. Clay Stephenson
Acclaimed Contributor

Re: Commands not working!!!

Generally, when this happens, unless a user is able to terminate some processes, the only fix is to yank the power.
If it ain't broke, I can fix that.
TwoProc
Honored Contributor

Re: Commands not working!!!

If you're out of process space, then you've got a whole lot of something running, probably your main app or apps. I'm pretty sure whatever app you're running has a way to shut things down. For example, if you've got an operator logged on already, can he shut down the job queues or schedules for your main application?
Can you get enough people off to shut down the database running on the server? You should be able to find a connection/admin screen to something and kill off some jobs and/or job requests. Do that until you've got enough procs left to get an admin logon powerful enough to take down your application, database, web servers, whatever. With that you should have enough room to do a full shutdown of all users and apps. At that point you should be able to increase nproc, build a new kernel, and reboot.

The point is: at this stage, try to avoid just a TOC or pulling the plug.
We are the people our parents warned us about --Jimmy Buffett
Matti_Kurkela
Honored Contributor

Re: Commands not working!!!

You would have to do something - anything - that causes some processes to shut down. You can only execute the shell's internal commands: those don't require creating a new process with the fork() system call.

If the problem is caused by a "fork bomb" (i.e. anything that behaves like Clay's example script), you must get all the copies of that fork bomb process stopped at once.

This may be possible with a command:
kill -STOP -2
If you manage to execute this as root, it should freeze all processes owned by anyone other than root. You can then use
kill -CONT
to selectively unfreeze those processes that are not part of this problem. The kill command is one of the shell's internal commands, so you should be able to use it.
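
A minimal sketch of the freeze/unfreeze cycle described above, applied to a single throwaway process rather than to every process on the box. The helper names freeze, thaw and is_alive are made up for illustration. The practice run starts a sleep, which itself needs a fork, so it is meant for rehearsing on a healthy test system, not on the wedged server.

```shell
# Wrappers around the shell's built-in kill; none of these fork.
freeze()   { kill -STOP "$1"; }            # suspend the process
thaw()     { kill -CONT "$1"; }            # let it run again
is_alive() { kill -0 "$1" 2>/dev/null; }   # signal 0 only tests existence

# Practice run against a harmless background process.
sleep 60 &
target=$!
freeze "$target" && echo "stopped $target"
thaw "$target" && echo "resumed $target"
kill -TERM "$target"                       # now terminate it for real
wait "$target" 2>/dev/null || true         # reap it; ignore its exit status
is_alive "$target" || echo "$target is gone"
```

A stopped process still occupies its slot in the process table, which is why a kill that actually terminates something is still needed afterwards.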

But after freezing the processes, your process table is still full and you still can execute the shell's internal commands only. The next step would then be to identify, unfreeze (if necessary) and kill some system process. Sendmail would be good for this, as it would be easy to identify by reading /var/run/sendmail.pid.

(Does anyone have an idea how to read a file using only the internal commands of the POSIX shell, and without needing to fork() any processes? )

When you have even one free slot in your process table, you can run most commands normally. The first command should be "ps -ef", then you can see what's going on and find out some other processes to kill, to get yourself a safety margin.

If this is not doable for some reason, the last resort is to crash the machine. You should use the TOC button at the back of the machine or the TC command of the GSP/MP, not the power switch: the TOC creates a crash dump, which can be analyzed to find out what processes caused this problem.

MK
Yogesh M Puranik
Valued Contributor

Re: Commands not working!!!

Thanks all,

I collected a crash dump through the GSP> TC command and rebooted. Now going through the crash dump!!!

Thanks everybody for advice!!

Rgds
Yogesh
Bill Hassell
Honored Contributor

Re: Commands not working!!!

This is a classic runaway process or script (assuming you have a normal value for nproc). This is the very reason that maxuprc exists -- it limits a single user's ability to run more than maxuprc processes at the same time. Of course, if the mistake was made by a root user, there is no prevention as root has all privileges. This is one of the very important reasons not to use the root login for testing, especially for critical servers.

The crash dump will have the process table and that will be the answer as to what process ran wild and used up all the entries in the process table.


Bill Hassell, sysadmin
Dennis Handly
Acclaimed Contributor

Re: Commands not working!!!

>MK: (Does anyone have an idea how to read a file using only the internal commands of the POSIX shell, and without needing to fork() any processes?)

Try:
kill $(< /etc/mail/sendmail.pid )
Some others?
/etc/opt/ipf/ipmon.pid
/etc/syslog.pid
/etc/mail/sendmail.pid
/etc/sfd.pid
/var/opt/sfmdb/pgsql/postmaster.pid
/var/run/syslog.pid
/var/run/sshd.pid
/var/run/hpvmmonlogd.pid
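
A variant of the same idea that relies only on the read built-in, which should behave the same way in any POSIX shell. This is a sketch under assumptions: the function name read_pid and the global variable PID are made up for illustration, and the check that the file holds only digits is an added precaution, not something from the thread.

```shell
# Set the global variable PID from the first line of the file named in $1.
# Uses only built-ins (read, [, case), so no new process is needed.
# Returns non-zero if the file is unreadable or does not hold a plain number.
read_pid() {
    PID=
    [ -r "$1" ] || return 1
    # read fails on a file with no trailing newline but still fills PID
    read -r PID < "$1" || [ -n "$PID" ] || return 1
    case $PID in
        ''|*[!0-9]*) PID=; return 1 ;;   # empty or contains a non-digit
    esac
    return 0
}

# Intended use (commented out so the sketch itself kills nothing):
# read_pid /var/run/sendmail.pid && kill "$PID"
```

The result is handed back in a variable rather than printed, because capturing output with a command substitution may itself require a fork in some shells.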
Raynald Boucher
Super Advisor

Re: Commands not working!!!

This has happened here; it was caused by a developer trying to execute COBOL source code by mistake.

The guy typed "program.pco" instead of "vi program.pco".
The file had execute permission and contained many comment lines (i.e. * in column 7). The shell expanded each * into the names of the files in the directory and tried to run them, and so on.

Problem was temporarily resolved by asking him to terminate his session and then fixing umasks and permissions for everyone.
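
To see why a stray * is dangerous when a source file gets run as a script, here is a minimal sketch that prints what the shell would attempt for a line consisting of a bare *, without executing anything. The function name show_star_expansion is made up for illustration.

```shell
# Print the command and arguments the shell would build from a bare "*"
# when run in directory $1. Nothing is executed.
# The body runs in a subshell so the caller's directory is left alone.
show_star_expansion() (
    cd "$1" || exit 1
    set -- *                  # the same pathname expansion the shell applies
    echo "command: $1"        # first file name becomes the command
    shift
    echo "arguments: $*"      # the remaining names become its arguments
)

# Example: inspect the current directory.
show_star_expansion .
```

If the first name in the directory is itself executable, each comment line starts another program, and if that program is the same source file the process count snowballs.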

Hope this helps.
Otherwise reboot!
Bob E Campbell
Honored Contributor

Re: Commands not working!!!

(Does anyone have an idea how to read a file using only the internal commands of the POSIX shell, and without needing to fork() any processes? )


I assume you mean how to do that without using cat? Something like:

while IFS= read -r LINE
do
echo "$LINE"
done < file

Should work just fine.