
Re: script hangs

 
SOLVED
Troyan Krastev
Regular Advisor

script hangs

Hi Everybody,

I have a "simple" scripting problem:
these are 2 scripts - troy_app and troy_start:
util01:/tmp/ha_app #cat troy_app
#!/usr/bin/sh
sleep 99999
exit 0

util01:/tmp/ha_app #cat troy_start
#!/usr/bin/sh
echo "Starting Troy Application"
/tmp/ha_app/troy_app &
echo "Started Troy Application"
exit 0

If I run troy_start, everything is normal:
util01:/tmp/ha_app #./troy_start
Starting Troy Application
Started Troy Application
util01:/tmp/ha_app #ps -ef | egrep 'sleep|troy'
root 10608 1 0 23:05:57 pts/ta 0:00 /usr/bin/sh /tmp/ha_app/troy_app
root 10610 10608 0 23:05:57 pts/ta 0:00 sleep 99999

But if I run it like this:
util01:/tmp/ha_app #out=`./troy_start`
it simply hangs.

Can you please help me?
Thanks in advance,
Troy
22 REPLIES
RAC_1
Honored Contributor

Re: script hangs

Backticks are used for execution of a command, so when you do `something`, it will first expand and execute. In your case, `./troy_start` will never expand. So what you are getting is correct.
There is no substitute to HARDWORK
Jan de Haas_3
Frequent Advisor

Re: script hangs

Hi Troy,

I remember there is a patch that resolves incorrect behaviour when processes are started in the background from a script.
Instead of starting the process in the background, the script actually waits for the background process to finish, instead of proceeding further after starting the process in the bg.

I don't recall which patch it was, but you should be able to see if this is the case by just reducing the sleep 99999 line in your script.

You can also check where the script 'hangs' by starting your script with a 'set -x' as its very first line.

If I recall which patch it was, I'll let you know...
Troyan Krastev
Regular Advisor

Re: script hangs

Jan,
You are right, if I get rid of sleep it works. The problem is that it fails on SUN also. Is it really a bug, or is it a "feature"? I'll search for the patch.

RAC,
This failed script comes from RAC - Real Application Cluster from Oracle :-)
Troyan Krastev
Regular Advisor

Re: script hangs

Jan,

for set -x:
/ha_app #out=`/ha_app/troy_start`
+ echo Starting Troy Application
+ /ha_app/troy_app
+ sleep 99999999999
+ echo Started Troy Application
+ exit 0

And it hangs there :-(
Muthukumar_5
Honored Contributor
Solution

Re: script hangs

When you use `command | script` or $(command|script), the substitution completes only when the command's execution is done.

With your example problem in a changed manner:

# cat > script1
sleep 5
# cat > script2
echo "Starting Troy Application"
./script1 &
echo "Started Troy Application"
exit 0
# chmod u+x script?
# ./script2
Starting Troy Application
Started Troy Application
#
# echo `sh -x ./script2`
+ echo Starting Troy Application
+ ./script1
+ echo Started Troy Application
+ exit 0
Starting Troy Application Started Troy Application

It displays the contents only after the sleep operation completes. That does not mean it is hanging. `./script2` is executed as a foreground process of the ` ` operation.

A hang means that the process will not end even if you press ctrl+d or ctrl+c.

Hope you got the point. Else revert back again.
Easy to suggest when don't know about the problem!
Troyan Krastev
Regular Advisor

Re: script hangs

Thanks Muthukumar,

This explains why it happens. You are right, it is not a true "hang", but in my case I need it to return right away, without waiting for the program to finish.

Thanks again,
Troy
Howard Marshall
Regular Advisor

Re: script hangs

Here is what I think you have going on.

You are starting your shell script, which based on the first line calls a new shell.
You then start the app script in the background in that new shell, but it is in the background for that new shell, so the new shell can't end until the background process finishes.

It works from your command line because you are still in the same shell that has the process running in the background.

Change your script to use nohup /tmp/ha_app/troy_app &

And it will start the app script in its own shell that will not keep the current shell locked up and the startup script will be allowed to end normally.

Hope this makes sense and helps some.

H
Troyan Krastev
Regular Advisor

Re: script hangs

Thanks Howard, it doesn't help. The script still "hangs".

I simply changed the template script that comes from Oracle. This is the original:

if [ "$START_APPCMD2" != "" ]; then
out=`$START_APPCMD2`
if [ $? -ne 0 ]; then
postevent $ERROR_PRIORITY "start 2: $out"
exit 1
fi
fi
;;

I replaced out=`$START_APPCMD2` with $START_APPCMD2
Howard Marshall
Regular Advisor

Re: script hangs

I tested something very similar that I wrote quickly to make sure it would work and it does on my machine. I am running 10.20 though. However it worked just fine with and without the nohup.

Perhaps give it a shot running ksh instead of sh and see if that helps things along.

Howard
Troyan Krastev
Regular Advisor

Re: script hangs

Hi Howard,

I tested the script on HP-UX11.0 and SUN 9. No success.

-Troy
Sandman!
Honored Contributor

Re: script hangs

Troy,

How about running your command in the background within a subshell...

util01:/tmp/ha_app # (out=`./troy_start`) &

cheers!
Howard Marshall
Regular Advisor

Re: script hangs

Here are the scripts I wrote and my test method

I started two xterminal windows and had both on the screen at the same time
In one window I started a tail -f on out.out so I could watch it in real time
In the other window I started go1 from the command line

It printed starting application and app started and I got the prompt back almost immediately. The file go2.log has the date in it which is the only output from go2 that would come to the screen. And in the first window I was able to watch the out.out file grow by a b c d e f g and program ended after I got the prompt back in the 2nd window

If your system doesn't work the same way, barring a patch or some configuration I don't know anything about, you would have to try to get someone else who has the same level of OS to try it.


go1

#!/usr/bin/sh

#echo "starting application"

nohup ./go2 > ./go2.log 2>&1 &

#echo "app started"

exit 0


go2

#!/usr/bin/sh

date
echo program begins >> out.out
for i in a b c d e f g
do
echo $i >> out.out
sleep 2
done
echo program ended >> out.out
echo "\n\n" >> out.out
Troyan Krastev
Regular Advisor

Re: script hangs

Thanks Howard,

This works great. Let me go back to oracle and do it without modifying the original script.

Thanks again,
Troy
Troyan Krastev
Regular Advisor

Re: script hangs

Hi Howard,

Can you please try one more thing for me? In "go1" replace
nohup ./go2 > ./go2.log 2>&1 &
with
nohup ./go2 &
and try out=`./go1`
It seems that this is causing the problem.
Howard Marshall
Regular Advisor

Re: script hangs

Oh,

Now I see. Very simple answer there. No. How's that for simple.

What you have going on here is that you are calling a subshell to resolve the output of a command. The problem is, as long as any process that was started from that subshell is still running, there is still potential for more output to resolve, so it won't end until all child processes have ended. This is not a bug; it's the way it's supposed to work.
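
The difference between the two variants of go1 can be sketched as follows. This is only an illustration under assumptions: it relies on a system that has `mktemp -d` and `date +%s`, and the script names and the 3-second sleep are made up. The idea it demonstrates is that command substitution reads the command's standard output until every process holding that output open has closed it, so a background child that inherits the output keeps the substitution waiting, while one whose output is redirected does not.

```shell
#!/bin/sh
# Hypothetical demo: names, paths and the 3-second sleep are illustrative only.
dir=$(mktemp -d)

# Variant 1: the background process has its output redirected.
cat > "$dir/start_redirected" <<'EOF'
#!/bin/sh
echo "Starting"
sleep 3 > /dev/null 2>&1 &
echo "Started"
exit 0
EOF

# Variant 2: the background process inherits the script's standard output.
cat > "$dir/start_inherit" <<'EOF'
#!/bin/sh
echo "Starting"
sleep 3 &
echo "Started"
exit 0
EOF

chmod +x "$dir/start_redirected" "$dir/start_inherit"

t0=$(date +%s)
out_redirected=$("$dir/start_redirected")   # returns as soon as the script exits
t1=$(date +%s)
out_inherit=$("$dir/start_inherit")         # returns only after sleep finishes
t2=$(date +%s)

redirected_secs=$((t1 - t0))
inherit_secs=$((t2 - t1))
echo "redirected: ${redirected_secs}s, inherited: ${inherit_secs}s"

rm -rf "$dir"
```

Both substitutions capture the same two lines of text; only the time taken to return differs.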

The nohup command disassociates a process from its login terminal, not from its parent process id. The same goes for any child processes it spawns. The only way a process started with nohup will become the top of its process tree is if the login shell goes away (you log out); then ownership is handed over to whatever process started that shell (usually init).

I know I am not explaining this very well, someone else may be able to do a better job of it, perhaps they will.

Either way, a huge explanation about why it does what it does isn't what you need; what you need is how to make it go.

I am not sure why you would want to do this from the command line, since all it would do if it did work is set the variable out to whatever go1 would print to the screen, but don't let my lack of vision hamper you.

You could start go1 redirecting its output to a log file then set the contents of that file to out

go1 > go1.log 2>&1
out=`cat go1.log`
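
A slightly fuller sketch of the same log-file idea, which also keeps the exit status that the Oracle template tests with `$?`. The start command here is a made-up stand-in (it prints one line and leaves a `sleep` running), and `mktemp -d` is assumed to be available; none of these names come from the Oracle script.

```shell
#!/bin/sh
# Hypothetical stand-in for the real start command: prints a message,
# leaves a background process running, and exits 0.
dir=$(mktemp -d)
cat > "$dir/start_cmd" <<'EOF'
#!/bin/sh
echo "app started"
sleep 3 &
exit 0
EOF
chmod +x "$dir/start_cmd"

log="$dir/start.log"

# Output goes to a file, so nothing waits on the background process.
"$dir/start_cmd" > "$log" 2>&1
status=$?

# Read the captured text back only after the start command has returned.
out=$(cat "$log")

if [ "$status" -ne 0 ]; then
    echo "start failed: $out" >&2
fi
echo "status=$status out=$out"

rm -rf "$dir"
```

Because the background process writes to the log file rather than to the substitution, the script gets both the message and the exit status without waiting for the application to end.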


The only other thing that I can think of that will let you do this is not really pretty and involves the inittab file. You set an init level of a letter like b and have init call the startup script when you start init b, and that will fall through because after you call init it's done and there are no more child processes to wait on, but that's complex. I'll help you with it if you must, though.

It occurs to me though that you may be running go1 from a script. Well, if that's the case, you can change it to something similar to the above with a short delay to let go1 do its thing, or something complicated like a touch file to let you know it's done. Of course, if you're calling go1 from a script and all it's really doing is calling go2, why not just call go2 directly from that script?

I don't know. If you can give me more detail about what you're trying to do and why you need out to be equal to what, I will try to help more. If it's just a case of the scripts that came with Oracle and you want to use them as is, I would say don't bother. I have had to modify many vendor scripts, especially startup scripts, some even to make them work at all, much less work correctly, and even more to work from cron or init on system startup.

H

Troyan Krastev
Regular Advisor

Re: script hangs

Thanks Howard,

I don't know if you are familiar with MC/ServiceGuard. Oracle is releasing SW in 10gR2 that is supposed to replace MC/SG. And this is the way they start the application you define to be High Available.
It works like this: they run your "start_script" and start monitoring the processes that you define. There is a timeout for the start script, and if it doesn't return they initiate recovery. This is where I am getting the problem. The script never returns. See the history of the thread to see the original Oracle script.

-Troy
Howard Marshall
Regular Advisor

Re: script hangs

I am familiar with ServiceGuard but not with the Oracle software.

Let me see if I understand it.

There is a startup script in one of the RC directories that starts the Oracle HA software on this system and presumably one that starts the same software on a backup server and those two try to communicate and decide which system has the primary responsibility to host the Oracle Database.

The HA software on that system then executes whatever startup script you specify and monitors the processes (among other things I hope) you tell it to monitor.

The catch is that your startup script has to complete and end with the correct exit code, otherwise the HA software doesn't know the database has started correctly, assumes the worst and tries to start it on the backup server.

If I have that about right let me pose a thought process, though I don't know how to apply it to the real world without a test box that is completely under your control.

Some processes, the Oracle database engine included I think, are written so that they will attach themselves to a new parent ppid, thus freeing up the old ppid and not holding it open. That allows the shell program that initiated the process to end without any interruption to the process itself.

The test program that you are testing with, sleep, as the main program will not automatically re-attach itself without zombie-ing to allow the calling script to fall through. I can't think of a way to force it either; not that there isn't a way, I just don't know it.

I can't think of any other unimportant process that will attach itself to init (pid 1) off the top of my head. You may be able to play around with cron, but I would certainly back up all the crontab files before I started playing with it, and restore them and restart cron cleanly when done. The best case is to have a play box with Oracle installed so you can see how it behaves, not something similar.

Someone else may be able to think of a better process or heck, even write you one (I'm not that good at programming)

I don't know if I am making that clear or not.


H
Troyan Krastev
Regular Advisor

Re: script hangs

Howard, take a look at this. It is not related to the Database at all. They are making external application High Available:
http://download-east.oracle.com/docs/cd/B19306_01/rac.102/b14197/crschp.htm#i999391
Howard Marshall
Regular Advisor

Re: script hangs

Hey Troy,

I read that page you linked to (though not as carefully as you probably have). It looks like some pretty neat stuff. What I didn't see is anywhere that it referenced running anything like the out=`script` format.

Mostly what it seems to me is that it wants something very much like a regular rc start up script with a few enhancements like the check option and making sure that the stop option returns a 0 error code even if the app isn't running any more.

It also says not to start, monitor, or stop oracle with the scripts. Apparently it handles the oracle database engine on its own or in some other manner.

If there is something in particular I should read give me a paragraph or line number.

H
Troyan Krastev
Regular Advisor

Re: script hangs

Hi Howard,

Sorry I didn't respond immediately. I was on vacation.
So, I am attaching the script generated by Oracle to control my application. Again, it is not a database; it is a completely independent external application that can be clustered with Oracle software.
Search for START_APPCMD2 and see how they execute it. I changed START_APPCMD to make it work.

-Troy
Howard Marshall
Regular Advisor

Re: script hangs

What it looks to me like they are doing (only in start) is setting out to some text string that means something to the reader (that would probably be you), indicating the failure to start correctly. That's what the if $? is all about. If you want to be able to dynamically alter what it says, you may be able to try something like

$START_APPCMD | read out

I think stating the command that way will allow it to fall through on success. You may have to capture the exit code, though, since I don't know which exit code you would pick up: the one from the start command or the one from read. A little quick testing can solve that one.
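
A quick test of that exit-code question might look like the sketch below. It assumes a POSIX-style shell with no `pipefail` option set, and the failing command is a made-up stand-in for the start command. In such a shell the status of a pipeline is the status of its last command, which here is `read`.

```shell
#!/bin/sh
# Hypothetical stand-in for the start command: prints one line, then fails
# with exit code 3.
sh -c 'echo "could not start"; exit 3' | read line

# The status seen here belongs to read (the last command in the pipeline),
# not to the command on the left-hand side.
pipe_status=$?
echo "pipeline status: $pipe_status"
```

So the failure of the start command is not visible through `$?` in this form. Note also that some shells (bash and dash, for example) run the `read` in a subshell, in which case the variable it sets is not visible afterwards; ksh runs the last pipeline stage in the current shell.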

The cmd2 line where it says out=`$START_APPCMD2` would have to be changed also if it did the same thing. I am reasonably sure they only did it that way because it's set to nothing and that's the simplest way to script it.

Bottom line. All it really needs is if the script succeeds in starting the software it falls through with an exit code of 0. if it doesn't it needs to set the variable out to something that might give you a clue as to why and exit with a non 0 exit code. You don't have to follow their code exactly. They suggest that it may have to be modified or optimized early in the comments when they suggest changing or adding environment variables.

Pretty much you can take out anything that happens between the 'start') line and the ;; line and put in whatever it takes to start your app, and perhaps some calls to their postevent logger, and the whole thing should work fine.

H
Troyan Krastev
Regular Advisor

Re: script hangs

This is what I did. I changed the script to make it work.

Thanks again,
Troy