Operating System - HP-UX

Re: ksh spawns another ksh process - RECURSIVELY

 
B. Chapman
Frequent Advisor

ksh spawns another ksh process - RECURSIVELY

Help! Has anyone ever seen this situation?

.
.
.
anair2 8397 8396 0 04:13:57 ? 0:00 -ksh
anair2 8223 8222 0 04:13:56 ? 0:00 -ksh
anair2 8087 8086 0 04:13:55 ? 0:00 -ksh
anair2 8475 8474 0 04:13:58 ? 0:00 -ksh
anair2 7634 7633 0 04:13:53 ? 0:00 -ksh
anair2 6761 6760 0 04:13:47 ? 0:00 -ksh
anair2 8129 8128 0 04:13:56 ? 0:00 -ksh
anair2 8159 8158 0 04:13:56 ? 0:00 -ksh
anair2 8432 8431 0 04:13:58 ? 0:00 -ksh
anair2 8308 8307 0 04:13:57 ? 0:00 -ksh
anair2 8496 8495 0 04:13:58 ? 0:00 -ksh
anair2 8417 8416 0 04:13:58 ? 0:00 -ksh
anair2 8392 8390 0 04:13:57 ? 0:00 -ksh
anair2 8455 8454 0 04:13:58 ? 0:00 -ksh
anair2 8042 8041 0 04:13:55 ? 0:00 -ksh
anair2 8435 8434 0 04:13:58 ? 0:00 -ksh
anair2 8487 8486 0 04:13:58 ? 0:00 -ksh
anair2 8387 8386 0 04:13:57 ? 0:00 -ksh
anair2 6743 6742 0 04:13:47 ? 0:00 -ksh
anair2 8310 8309 0 04:13:57 ? 0:00 -ksh
anair2 8422 8421 0 04:13:58 ? 0:00 -ksh
.

.
.

It's resulting in a full process table - and then I get nothing but "fork" errors like:

.profile[10]: The fork function failed. Too many processes already exist.

So - I'm in a bit of a quandary. My question - could this be something originating from INSIDE the server (a running process) - or from OUTSIDE the server (eg: an application persistently logging in every second).

Thanks in advance for any insight/help.

Later,
Ben.
bchapman@telcordia.com
Pupil_1
Trusted Contributor
Solution

Re: ksh spawns another ksh process - RECURSIVELY

The first thing is to have a script kill all of the user's current processes.

Then check the user's .profile for any incorrect sourcing!
There is always something new to learn everyday !!
Steven E. Protter
Exalted Contributor

Re: ksh spawns another ksh process - RECURSIVELY

Shalom,

Check and see if a script is running that spawns a copy of itself.

Also make sure /etc/inittab was not used to spawn the script.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
B. Chapman
Frequent Advisor

Re: ksh spawns another ksh process - RECURSIVELY

I have been able to limp into the system via the console, and get a dump of all of those processes - AND KILL THEM. But, as fast as I can kill them, they're back - with new PIDs. Also - I can only submit commands with "exec " - which, after they run, I get booted out of my console (and then I have to wait 5 minutes for a re-login process).
A. Clay Stephenson
Acclaimed Contributor

Re: ksh spawns another ksh process - RECURSIVELY

My best GUESS is that user anair2 has a problem in his .profile. I would knock the box down to single user, then manually mount /usr and /var, and then disable anair2's login. You can then allow the system to boot normally and investigate the problem, because anair2 will no longer be able to log in.
If it ain't broke, I can fix that.
Peter Nikitka
Honored Contributor

Re: ksh spawns another ksh process - RECURSIVELY

Hi,

you have to kill the processes of the user 'anair2' as root, something like this:

ps -uanair2 | awk 'NR>1 {print $1}' | xargs kill

- call with different signals; in your situation, kill -9 as a second choice
- call more than once
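
For what it's worth, the two suggestions above (escalate the signal, repeat the call) can be sketched as one small function. This is only an illustration and not something from the thread: kill_pids is a made-up name, and the one-second grace period is an arbitrary assumption. It relies only on the POSIX forms kill -s and kill -0.

```shell
#!/bin/sh
# Send SIGTERM to each PID, wait briefly, then SIGKILL whatever is still alive.
# Usage: kill_pids PID...
# kill_pids is a hypothetical helper name, not a system command.
kill_pids() {
    for pid in "$@"; do
        kill -s TERM "$pid" 2>/dev/null
    done
    sleep 1                              # grace period; the length is arbitrary
    for pid in "$@"; do
        # kill -0 sends no signal; it only tests whether the PID still exists
        if kill -0 "$pid" 2>/dev/null; then
            kill -s KILL "$pid" 2>/dev/null
        fi
    done
    return 0
}
```

It could be fed from the pipeline above, e.g. kill_pids $(ps -uanair2 | awk 'NR>1 {print $1}'). With the process table already full, even the sleep may fail to fork, so treat this as a sketch rather than a cure.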

Cause of your problem: sometimes a single * in a bin directory is enough...

mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"
spex
Honored Contributor

Re: ksh spawns another ksh process - RECURSIVELY

Hi Ben,

Give this a shot:

# exec echo '/usr/bin/false' >> /etc/shells && passwd -e /usr/bin/false anair2 && UNIX95= kill $(ps -u anair2 -o pid=)

Change 'kill' to 'kill -9', if need be.

PCS
B. Chapman
Frequent Advisor

Re: ksh spawns another ksh process - RECURSIVELY

All - I appreciate the ideas - and I'm following up - SLOWLY but SURELY. Here's the problem - I've killed inetd, and killed sshd (so now only console/RS-232 login is allowed) - and every time I log in to the console, it succeeds - but takes over 10 minutes. Here are the messages (so you get the idea of my pain):

Console Login: root
Password:
Last successful login for root: Mon Aug 21 11:36:43 EST5EDT 2006 on console
Last unsuccessful login for root: Mon Aug 21 11:47:05 EST5EDT 2006 on console
Please wait...checking for disk quotas
could not execute quota command
/etc/profile[31]: The fork function failed. Too many processes already exist.
/etc/profile[48]: The fork function failed. Too many processes already exist.
/etc/profile[72]: The fork function failed. Too many processes already exist.
/etc/profile[143]: The fork function failed. Too many processes already exist.
/etc/profile[143]: test: Specify a parameter with this command.
.profile[10]: The fork function failed. Too many processes already exist.
.profile[20]: The fork function failed. Too many processes already exist.
WARNING: YOU ARE SUPERUSER !!

profit9# exec /usr/bin/ps -ef > /UTIL/p9.ps-ef.lst



That's the problem - I'm only able to run one command every 10 minutes. Any ideas? Can I string them together w/ ";"? BTW - /UTIL is a common NFS mountpoint that I have on another HP-UX system - so I can then analyze the file and build a killscript to run thru "exec" again.
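
One way to get at the inside/outside question from a saved listing is to follow a runaway ksh back through its parents and see whether the chain ends at a login daemon (sshd, inetd) or at some local script. A minimal sketch, assuming the usual ps -ef column order (UID PID PPID ...); ancestry is a hypothetical helper name, not an HP-UX command.

```shell
#!/bin/sh
# Walk the parent chain of one PID using saved `ps -ef` output on stdin.
# Usage: ancestry PID < saved-ps-ef-listing
# Assumes column 2 is the PID and column 3 is the PPID.
ancestry() {
    awk -v start="$1" '
        $2 ~ /^[0-9]+$/ { ppid[$2] = $3; line[$2] = $0 }   # skips the header row
        END {
            pid = start
            # Stop at PID 0 or 1, at a PID missing from the listing,
            # or after 64 hops (guard against a corrupt listing).
            for (hops = 0; hops < 64 && (pid in line); hops++) {
                print line[pid]
                if (pid + 0 <= 1) break
                pid = ppid[pid]
            }
        }'
}
```

For example, ancestry 8397 < /UTIL/p9.ps-ef.lst, with the PID replaced by one that appears in the current listing. The first line printed is the process itself and each following line is its parent.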
spex
Honored Contributor

Re: ksh spawns another ksh process - RECURSIVELY

Ben,

This will kill processes owned by anair2 in ascending order of pid:

# exec ps -u anair2 | awk '!/PID/{print $1}' | sort -n | xargs kill

(UNIX95 variable left unset on purpose.)

PCS
A. Clay Stephenson
Acclaimed Contributor

Re: ksh spawns another ksh process - RECURSIVELY

I don't think you have any choice but to shut down and bring the box up in single user mode.
If it ain't broke, I can fix that.
B. Chapman
Frequent Advisor

Re: ksh spawns another ksh process - RECURSIVELY

PCS,

I tried - but to no avail (note - my $PATH isn't even set upon login to console - so I have to use explicit paths to all commands):

profit9# exec /usr/bin/ps -u anair2 | /usr/bin/awk '!/PID/{print $1}' | /usr/bin/sort -n | /usr/bin/xargs kill
sh: The fork function failed. Too many processes already exist.

(and - that error - is generic for any commands that I try)

Any other ideas? I'm pretty close to "exec reboot" - and dealing w/ the fallout upon restart (Oracle db's recovering). Luckily this is only devel/test server - but still - how do I ensure this situation doesn't arise again? Somebody spoke earlier about a "*" (bad filename) in a "bin" directory - what's that about?
A. Clay Stephenson
Acclaimed Contributor

Re: ksh spawns another ksh process - RECURSIVELY

This really has the look of a script that is calling itself over and over. It could even be /etc/profile although I doubt it. Kill the box, bring it up in single user and check anair2's .profile.
If it ain't broke, I can fix that.
Bill Hassell
Honored Contributor

Re: ksh spawns another ksh process - RECURSIVELY

You did not have a correct PATH because the process table overflowed and therefore your profiles (/etc/profile and .profile) did not run. Your system is very unstable at this point and Oracle will soon terminate with a similar error (no more processes can be run). You can easily add a minimum PATH after you log in with the command:

export PATH=/usr/bin:/usr/sbin

Make sure you DO NOT run exec reboot!!! You want a graceful shutdown and reboot will kill everything ungracefully. Run the command shutdown -y 0 then interrupt the boot process and issue hpux -is when you get to the ISL prompt. Once on the system,

1. rename the user's $HOME directory to something like /home/anair2BAD

2. remove any entry in /usr/spool/cron/crontabs/ with this user's name.

3. Lock the user's account so no login can take place from anywhere - the easiest is to change the login name on anair2's line in the passwd file to something like anair2BAD.

Now boot up and see that everything starts correctly. Finally, examine the anair2BAD directory to see if there is a .sh_history file and if so, look at the end of the file for the possible reason(s) for the runaway script.

Another thing to do is to limit the number of processes a user can run with the kernel parameter maxuprc to prevent a future mess like this, especially for a production machine. NOTE: maxuprc is used to count the total number of processes owned by one user. This means that it must be set high enough to accommodate Oracle and other applications, but low enough to prevent a runaway like this.
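
maxuprc is a kernel tunable, so setting it is not shown here. As a complementary sketch (an assumption, not part of the advice in this thread), a per-user process count can be checked against a threshold from a plain list of process owners. over_limit is a hypothetical name; on HP-UX the input could come from UNIX95= ps -e -o user=, since the -o option needs UNIX95 set.

```shell
#!/bin/sh
# Print "user count" for every user owning more than LIMIT processes.
# Usage: over_limit LIMIT   (reads one user name per line on stdin)
# over_limit is a hypothetical helper; it does not enforce anything.
over_limit() {
    awk -v limit="$1" '
        NF { count[$1]++ }                       # ignore blank lines
        END {
            for (u in count)
                if (count[u] > limit + 0) print u, count[u]
        }'
}
```

Whatever threshold is used would need the same care as maxuprc itself: high enough for Oracle and the other applications, low enough to catch a runaway early.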


Bill Hassell, sysadmin
B. Chapman
Frequent Advisor

Re: ksh spawns another ksh process - RECURSIVELY

Many thanks to everyone that helped me out on this - I'll assign points after I give the final scoop on this.

I did end up having to punt and:
-exec reboot (sorry Bill - saw your post one minute too late)
-hpux -is
-mount /var /usr /home
-lockout "bad" user
-examine "bad" user .sh_history
-shutdown -r 0

It did turn out to be a rogue program (well, accidentally). A new developer executed an Oracle PRO*Cobol program right at a shell prompt. No problem - except that the file starts out like this:

IDENTIFICATION DIVISION.
PROGRAM-ID. AR0129.
AUTHOR. JOHN DOE.
DATE-WRITTEN. DEC, 2003.
*********************************************************
** COPYRIGHT * 2003 XXXX. ALL RIGHTS RESERVED *
*********************************************************
** XXXX PROPRIETARY - INTERNAL USE ONLY *
.
.
.

So - now I know what happens when the shell interprets all those asterisks (*) - it launches a new ksh for all files in the current directory - for all asterisks (and then I think it's trying to execute all of the other .pco files in the dir - which recursively digs the hole deeper). I was able to recreate this by executing the code myself (sh -x) BEFORE bringing up all the db's/appservers. Sure enough - another SSH session's "ps -ef" showed masses of ksh processes under my ID. Ctrl-C *did* seem to stop the madness immediately though. (Why couldn't that developer have done that earlier???) Oh well.
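
The expansion itself is easy to see safely. A minimal sketch, assuming a scratch directory holding two made-up file names (AR0130.pco is invented for illustration); it only prints what the shell would treat as the command and its arguments, and it runs nothing. If that first file is executable and reachable through PATH, the shell would run it as yet another script, which would match the recursion described above.

```shell
#!/bin/sh
# Show what the shell sees when it reads the COBOL comment line
#   ** COPYRIGHT *
# as a command. Works in a throwaway directory so nothing real is touched.
show_expansion() {
    dir=${TMPDIR:-/tmp}/globdemo.$$
    mkdir "$dir" || return 1
    (
        cd "$dir" || exit 1
        : > AR0129.pco
        : > AR0130.pco                   # hypothetical second source file
        # Each unquoted * (or **) is replaced by the sorted list of file names;
        # the first word of the result is what the shell would try to run.
        set -- ** COPYRIGHT *
        echo "command: $1"
        echo "args: $*"
    )
    rm -rf "$dir"
}
show_expansion
```

Here the "command" comes out as AR0129.pco, i.e. the first file in the directory, with every other file name passed along as an argument.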

Gotta love Oracle though - 6 DB's that I had to slam down - all started/recovered w/ no problems. Call it a day...

Again - many thanks for all the help on this.

Later,
Ben Chapman.
bchapman@telcordia.com
B. Chapman
Frequent Advisor

Re: ksh spawns another ksh process - RECURSIVELY

points submitted - closing thread
A. Clay Stephenson
Acclaimed Contributor

Re: ksh spawns another ksh process - RECURSIVELY

One minor (well maybe not so minor) point: The file you posted is not a program but rather the source file for a program where '*' is a comment and is perfectly legal. The problem is that this user tried to execute this source file, which means that the execute bit was set on a source file (or that this user explicitly did something like "sh xx.cbo"). Setting the execute bit on a source file, as opposed to the actual executable, is state-of-the-art stupid and this developer needs to go to UNIX kindergarten (even if taught by you) so that other equally unintentionally dangerous events are avoided. Of course, being in a beginning UNIX class may not help because I was sitting in on one where the instructor told the class to just set everything to 777 to "make things easier". After she finished I politely begged to differ.

If it ain't broke, I can fix that.
Bill Hassell
Honored Contributor

Re: ksh spawns another ksh process - RECURSIVELY

Clay echoed one of my hot buttons for all sysadmins: 777 is a bad and dangerous number!!! Find them, fix them and chastise anyone who tries to 'fix' problems that way. Setting the execute bit on a sourcecode file just took your system down. That's why umask=0 is also braindead. Users should be forced to use umask 077 so they can unlearn their bad habits.

But 666 is no better!! When you set the global write bit, you have told every user on your system that they have permission to totally trash the contents of your file at any time.
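
The "find them" part can be sketched with plain find. This is an illustration under assumptions: the function names are made up, the *.pco default pattern is taken from the file type mentioned earlier in the thread, and only the owner's execute bit is tested.

```shell
#!/bin/sh
# List regular files under DIR that anyone on the system may write to
# (covers both 777 and 666).
# Usage: world_writable DIR
world_writable() {
    find "$1" -type f -perm -002 -print
}

# List source files under DIR that have the owner execute bit set.
# Usage: executable_sources DIR [PATTERN]   (PATTERN defaults to *.pco)
executable_sources() {
    find "$1" -type f -name "${2:-*.pco}" -perm -100 -print
}
```

For example, world_writable /home followed by executable_sources /home would give a list to review by hand before any chmod is run.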

Don't get me started on -9.


Bill Hassell, sysadmin