
Re: script hangs

 
SOLVED
Troyan Krastev
Regular Advisor

Re: script hangs

Hi Howard,

I tested the script on HP-UX11.0 and SUN 9. No success.

-Troy
Sandman!
Honored Contributor

Re: script hangs

Troy,

How about running your command in the background within a subshell...

util01:/tmp/ha_app # (out=`./troy_start`) &

cheers!
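
One caveat with the background-subshell approach, sketched here under the assumption of a POSIX shell: the assignment happens in a separate process, so the calling shell never sees the value. The `echo` below is only a stand-in for `./troy_start`.

```shell
#!/bin/sh
out=unset

# The assignment runs in a background subshell, i.e. in a separate process.
( out=`echo "hello from the launcher"` ) &
wait    # let the subshell finish

# The parent shell's copy of the variable is untouched.
echo "out in parent shell: $out"
```

The prompt comes back immediately, but anything the script printed is lost unless it is also written to a file.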
Howard Marshall
Regular Advisor

Re: script hangs

Here are the scripts I wrote and my test method

I started two xterminal windows and kept both on the screen at the same time.
In one window I ran tail -f on out.out so I could watch it in real time.
In the other window I started go1 from the command line.

It printed "starting application" and "app started", and I got the prompt back almost immediately. The file go2.log has the date in it, which is the only output from go2 that would otherwise come to the screen. In the first window I was able to watch the out.out file grow by a, b, c, d, e, f, g and "program ended" after I got the prompt back in the second window.

If your system doesn't work the same way, then barring a patch or some configuration I don't know anything about, you would have to get someone else who has the same OS level to try it.


go1

#!/usr/bin/sh

#echo "starting application"

nohup ./go2 > ./go2.log 2>&1 &

#echo "app started"

exit 0


go2

#!/usr/bin/sh

date
echo program begins >> out.out
for i in a b c d e f g
do
echo $i >> out.out
sleep 2
done
echo program ended >> out.out
echo "\n\n" >> out.out
Troyan Krastev
Regular Advisor

Re: script hangs

Thanks Howard,

This works great. Let me go back to oracle and do it without modifying the original script.

Thanks again,
Troy
Troyan Krastev
Regular Advisor

Re: script hangs

Hi Howard,

Can you please try one more thing for me? In "go1" replace
nohup ./go2 > ./go2.log 2>&1 &
with
nohup ./go2 &
and try out=`./go1`
It seems that this is causing the problem.
Howard Marshall
Regular Advisor

Re: script hangs

Oh,

Now I see. Very simple answer there: no. How's that for simple?

What you have going on here is a subshell whose output the backquotes are waiting to collect. The problem is that as long as any process started from that subshell is still running with that output open, there is still potential for more output, so the backquotes won't return until all of those child processes have ended. This is not a bug; it's the way it's supposed to work.
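
A minimal sketch of that behavior, assuming a POSIX shell and a `date` that supports `+%s` (not guaranteed on every system). The 3-second `sleep` is a hypothetical stand-in for go2. The backquotes read from a pipe and only return once every process holding the write end of that pipe has closed it:

```shell
#!/bin/sh
# Case 1: the background job inherits the pipe that the backquotes read from,
# so the substitution cannot finish until the job exits.
start=`date +%s`
out=`sleep 3 & echo started`
end=`date +%s`
inherited=`expr $end - $start`

# Case 2: the job's output is redirected, so nothing else holds the pipe open
# and the substitution returns as soon as echo has finished.
start=`date +%s`
out=`sleep 3 > /dev/null 2>&1 & echo started`
end=`date +%s`
redirected=`expr $end - $start`

echo "inherited stdout: ${inherited}s, redirected: ${redirected}s"
```

This is why the version of go1 that sends go2's output to go2.log returns at once, while the bare `nohup ./go2 &` version does not.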

The nohup command protects a process from the hangup signal sent when its login terminal goes away; it does not change its parent process ID. The same goes for any child processes it spawns. A process started with nohup is only handed over to another parent (usually init) when the process that started it goes away, for example when you log out of the login shell.

I know I am not explaining this very well, someone else may be able to do a better job of it, perhaps they will.

Either way, a huge explanation of why it does what it does isn't what you need; what you need is how to make it go.

I am not sure why you would want to do this from the command line, since all it would do if it did work is set the variable out to whatever go1 would print to the screen, but I won't let my lack of vision hamper you.

You could start go1 with its output redirected to a log file, then set out to the contents of that file:

go1 > go1.log 2>&1
out=`cat go1.log`
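
A self-contained sketch of that workaround, assuming a POSIX shell with `mktemp -d` and `date +%s` available. The go1 written here is a hypothetical stand-in that starts a 3-second background job without redirecting it:

```shell
#!/bin/sh
# Work in a scratch directory so the sketch leaves nothing in the current one.
dir=`mktemp -d`
cd "$dir" || exit 1

# Hypothetical launcher: starts a background job that keeps its stdout.
cat > go1 <<'EOF'
#!/bin/sh
echo "starting application"
nohup sleep 3 &
echo "app started"
EOF
chmod +x go1

start=`date +%s`
./go1 > go1.log 2>&1      # the shell waits for go1 itself, not for its children
out=`cat go1.log`         # cat exits at once, so the backquotes return at once
end=`date +%s`
elapsed=`expr $end - $start`

echo "$out"
echo "returned after ${elapsed}s"
```

The trade-off is that out only holds whatever had been written to the log by the time cat ran; anything the background job prints later lands in go1.log but not in the variable.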


The only other thing I can think of that will let you do this is not really pretty and involves the inittab file. You define an init level with a letter like b and have init call the startup script when you run init b. That will fall through, because once you have called init it's done and there are no more child processes to wait on, but that's complex. I'll help you with it if you must, though.

It occurs to me, though, that you may be running go1 from a script. If that's the case, you can change it to something similar to the above, with a short delay to let go1 do its thing, or something more complicated like a touch file to let you know it's done. Of course, if you're calling go1 from a script and all it's really doing is calling go2, why not just call go2 directly from that script?

I don't know. If you can give me more detail about what you're trying to do and why you need out to be equal to that output, I will try to help more. If it's just a case of the scripts that came with Oracle and you wanting to use them as is, I would say don't bother. I have had to modify many vendor scripts, especially startup scripts, some just to make them work at all, much less work correctly, and even more to make them work from cron or init on system startup.
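
The touch-file idea could look roughly like this, assuming a POSIX shell with `mktemp -d`. The marker name app.ready, the 10-try limit, and the 1-second "startup work" are all hypothetical:

```shell
#!/bin/sh
dir=`mktemp -d`
flag="$dir/app.ready"      # hypothetical marker file

# Stand-in for the application: a second of startup work, then touch the marker.
( sleep 1; touch "$flag" ) > /dev/null 2>&1 &

# Poll for the marker, giving up after a fixed number of tries.
tries=0
status=timeout
while [ "$tries" -lt 10 ]
do
    if [ -f "$flag" ]
    then
        status=ready
        break
    fi
    sleep 1
    tries=`expr $tries + 1`
done

echo "application status: $status"
```

Compared with a fixed delay, polling returns as soon as the application reports in and still gives a clear failure when it never does.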

H

Troyan Krastev
Regular Advisor

Re: script hangs

Thanks Howard,

I don't know if you are familiar with MC/ServiceGuard. Oracle is releasing software in 10gR2 that is supposed to replace MC/SG, and this is the way it starts the application you define as highly available.
It works like this: it runs your "start_script" and starts monitoring the processes that you define. There is a timeout for the start script, and if the script doesn't return, recovery is initiated. This is where I am having a problem: the script never returns. See the history of the thread for the original Oracle script.

-Troy
Howard Marshall
Regular Advisor

Re: script hangs

I am familiar with ServiceGuard but not with the Oracle software.

Let me see if I understand it.

There is a startup script in one of the RC directories that starts the Oracle HA software on this system and presumably one that starts the same software on a backup server and those two try to communicate and decide which system has the primary responsibility to host the Oracle Database.

The HA software on that system then executes whatever startup script you specify and monitors the processes (among other things, I hope) you tell it to monitor.

The catch is that your startup script has to complete and end with the correct exit code; otherwise the HA software doesn't know the database has started correctly, assumes the worst, and tries to start it on the backup server.

If I have that about right let me pose a thought process, though I don't know how to apply it to the real world without a test box that is completely under your control.

Some processes, the Oracle database engine included I think, are written so that they attach themselves to a new parent (ppid), thus freeing up the old parent and not holding it open. That allows the shell program that initiated the process to end without any interruption to the process itself.

The test program you are testing with, sleep, as the main program will not automatically re-attach itself without zombie-ing to let the calling script fall through. I can't think of a way to force it either; not that there isn't a way, I just don't know it.

Off the top of my head I can't think of any other unimportant process that will attach itself to init (pid 1). You may be able to play around with cron, but I would certainly back up all the crontab files before I started playing with it, and restore them and restart cron cleanly when done. The best case is to have a play box with Oracle installed so you can see how it behaves, not something similar.

Someone else may be able to think of a better process or heck, even write you one (I'm not that good at programming)

I don't know if I am making that clear or not.


H
Troyan Krastev
Regular Advisor

Re: script hangs

Howard, take a look at this. It is not related to the database at all. They are making an external application highly available:
http://download-east.oracle.com/docs/cd/B19306_01/rac.102/b14197/crschp.htm#i999391
Howard Marshall
Regular Advisor

Re: script hangs

Hey Troy,

I read that page you linked to (though not as carefully as you probably have). It looks like some pretty neat stuff. What I didn't see is anywhere that it referenced running anything like the out=`script` format.

Mostly it seems to me that it wants something very much like a regular rc startup script with a few enhancements, like the check option and making sure that the stop option returns a 0 exit code even if the app isn't running any more.
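
A rough sketch of such a script, not taken from the Oracle documentation: the name app_action, the pid-file handling, and the `sleep 30` stand-in for the real application are all assumptions. It assumes a POSIX shell with `mktemp -d`.

```shell
#!/bin/sh
dir=`mktemp -d`

# Hypothetical action script; "sleep 30" stands in for the real application.
cat > "$dir/app_action" <<'EOF'
#!/bin/sh
pidfile="$0.pid"
case "$1" in
start)
    # Redirect all output so a caller capturing stdout is not kept waiting.
    nohup sleep 30 > "$0.log" 2>&1 < /dev/null &
    echo $! > "$pidfile"
    exit 0
    ;;
stop)
    # Report success even if the application is already gone.
    [ -f "$pidfile" ] && kill `cat "$pidfile"` 2> /dev/null
    rm -f "$pidfile"
    exit 0
    ;;
check)
    # Exit 0 only if the recorded process is still alive.
    [ -f "$pidfile" ] && kill -0 `cat "$pidfile"` 2> /dev/null
    exit $?
    ;;
*)
    echo "usage: $0 {start|stop|check}" >&2
    exit 1
    ;;
esac
EOF
chmod +x "$dir/app_action"

# Exercise each option and record its exit code.
"$dir/app_action" start; start_rc=$?
"$dir/app_action" check; running_rc=$?
"$dir/app_action" stop;  stop_rc=$?
"$dir/app_action" stop;  stop_again_rc=$?
"$dir/app_action" check; stopped_rc=$?
echo "start=$start_rc check=$running_rc stop=$stop_rc stop-again=$stop_again_rc check-after-stop=$stopped_rc"
```

The start branch redirects stdin, stdout and stderr of the background job, which is what lets the script return promptly even when its caller captures the output.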

It also says not to start, monitor, or stop oracle with the scripts. Apparently it handles the oracle database engine on its own or in some other manner.

If there is something in particular I should read give me a paragraph or line number.

H