Operating System - HP-UX

Re: premature script end with at

 
SOLVED
Vladimir Tihu
Advisor

premature script end with at


I have a script that calls other scripts; it is scheduled to run every day with at. A couple of days ago it started to fail after one of these calls, at the point of returning from the called script: the called script runs to completion, but control is NOT returned to the calling script afterwards. When I run the main script at the prompt it runs fine; when I try it again with at (e.g. at -f script now) it fails in the same way.
I have an HP-UX 11 OS and use the sh shell.
Can anybody help? This behaviour seems very unusual to me.

Thanks
19 REPLIES
John Meissner
Esteemed Contributor

Re: premature script end with at

Have you tried cron? I schedule all my jobs/scripts with cron.
All paths lead to destiny
Bill Douglass
Esteemed Contributor

Re: premature script end with at

Could be a path problem. Verify the path to each command in the failing script, as well as the path to the script itself. If necessary, you can set and export PATH at the start of your script. Alternatively, you can source your .profile

. /home/username/.profile

in your main script.

Are you capturing all output from your scripts to a log file? This could help too.

Finally, check mail file for the user that the script is running as. It should have any output from the script there.
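
A minimal sketch of the PATH and logging suggestions, assuming a POSIX-style sh. The PATH value, the function name run_logged and the log file location are placeholders for illustration, not something taken from this thread.

```shell
# Run a command with an explicit PATH and append everything it prints,
# plus its exit status, to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    (
        # Assumed PATH; adjust to wherever the commands really live.
        PATH=/usr/bin:/usr/sbin:/bin
        export PATH
        echo "PATH=$PATH"
        "$@"
        status=$?
        echo "exit status: $status"
        exit "$status"
    ) >>"$log" 2>&1
}

# Hypothetical use inside the main script:
#   run_logged /tmp/backup_main.log /scripts/backup_files_v2 r
```

Because the subshell exits with the command's own status, the caller can still test $? after run_logged returns.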
Pete Randall
Outstanding Contributor

Re: premature script end with at

Vladimar,

No answers, only questions:

1) What changed the other day - any idea?

2) Does it always fail on the same called script?



Pete
Vladimir Tihu
Advisor

Re: premature script end with at


All paths are given as absolute, so it cannot be a path problem; also, I captured the output and noticed that the inner script reaches its return point but the calling script doesn't continue after this. I stress that this happens ONLY when I schedule it with at; when I run it manually (at the prompt) it doesn't exhibit the problem.
For Pete, basically nothing changed except perhaps that the main script grew a little bigger with some new lines and I inserted a capture of the return value after the call to the inner script (let ret=$?) and yes, it happens at the same point.

Pete Randall
Outstanding Contributor

Re: premature script end with at

Vladimir (Sorry for the mis-spelling last time),


Since I'm still without answers, I would suggest undoing the changes to the calling script and seeing if that cures the problem. If you have a backup of the script and that works, then we know it was something in the changes and we can start going through them one by one.


Pete
Dario_1
Trusted Contributor

Re: premature script end with at

Hi!

Are you using the same user when you execute the script manually as when you execute it using at?

If not, check the permissions on the executable.

Regards,

DR
Vladimir Tihu
Advisor

Re: premature script end with at


I removed the problem call and it worked.
I put it back again and it didn't work.
Here is the relevant sequence in the main script:

...
/scripts/backup_files_v1 r
echo bebe>/scripts/temp/titia
/scripts/backup_files_v2 r
...

The problem call is /scripts/backup_files_v2 r.

Here is /scripts/backup_files_v2:

#backup_files_v2_AA_01
chost=`hostname`
if [ "$chost" = "v2" ]; then
if [ -f /scripts/locks/backup_files_v2.lock ]; then
/scripts/utils.lib.cmd delay_secs 1200;
if [ -f /scripts/locks/backup_files_v2.lock ]; then return 1; fi;
fi
else
if [ `remsh v2 -l root "if [ -f /scripts/locks/backup_files_v2.lock ]; then echo 0; else echo 1; fi"` -eq 0 ]; then
/scripts/utils.lib.cmd delay_secs 1200;
if [ `remsh v2 -l root "if [ -f /scripts/locks/backup_files_v2.lock ]; then echo 0; else echo 1; fi"` -eq 0 ]; then return 1; fi
fi
fi
if [ "$chost" = "v2" ]; then
touch /scripts/locks/backup_files_v2.lock
else
remsh v2 -l root "touch /scripts/locks/backup_files_v2.lock"
fi
cat /var/spool/cron/atjobs/* | grep \#backup_files_v2_AA_01
if [ $? -eq 1 ]; then case $1 in r) :;; n) at -f /scripts/backup_files_v2 4:10am `date | cut -f 1 -d " "` + 1 days;; *) at -f /scripts/backup_files_v2 now + 1 days; esac;
fi
if [[ "$2" = "" || "$2" = "s" ]]; then
/scripts/backup_files v2 /var/adm/syslog nb4 /BACKUP_ORA/temp "-r" /BACKUP_ORA/back_fils
/scripts/backup_files v2 /var/adm/wtmp nb4 /BACKUP_ORA/temp "" /BACKUP_ORA/back_fils
/scripts/backup_files v2 /var/adm/btmp nb4 /BACKUP_ORA/temp "" /BACKUP_ORA/back_fils
fi
if [[ "$2" = "" || "$2" = "l" ]]; then
if [ "$chost" = "v2" ]; then
ctl=`/scripts/prepare_logs;echo $?`
else
ctl=`remsh v2 -l root "/scripts/prepare_logs;echo \\\$?"`
fi
ctl3=`echo $ctl | cut -d" " -f3`
ctl2=`echo $ctl | cut -d" " -f2`
if [[ "$ctl3" = "" && "$ctl2" = "1" ]]; then
if [ "$chost" = "v2" ]; then
rm /scripts/locks/backup_files_v2.lock
else
remsh v2 -l root "rm /scripts/locks/backup_files_v2.lock"
fi
return 1;
fi
if [ "$ctl3" = "2" ]; then
/scripts/backup_files v2 $ctl2 nb4 /BACKUP_ORA/temp "-r" /BACKUP_ORA/back_fils
if [ "$chost" = "v2" ]; then
rm -r $ctl2
else
remsh v2 -l root "rm -r $ctl2"
fi
if [ "$chost" = "v2" ]; then
ctl=`/scripts/prepare_logs;echo $?`
else
ctl=`remsh v2 -l root "/scripts/prepare_logs;echo \\\$?"`
fi
ctl3=`echo $ctl | cut -d" " -f3`
ctl2=`echo $ctl | cut -d" " -f2`
fi
if [ "$ctl3" = "0" ]; then
/scripts/backup_files v2 $ctl2 nb4 /BACKUP_ORA/temp "-r" /BACKUP_ORA/back_fils
if [ "$chost" = "v2" ]; then
rm -r $ctl2
else
remsh v2 -l root "rm -r $ctl2"
fi
fi
fi
if [ "$chost" = "v2" ]; then
rm /scripts/locks/backup_files_v2.lock
else
remsh v2 -l root "rm /scripts/locks/backup_files_v2.lock"
fi
echo bebeb>/scripts/temp/titia

After calling it from the main script, /scripts/temp/titia contains "bebeb"; nevertheless, the main script does not continue after this point!
Pete Randall
Outstanding Contributor

Re: premature script end with at

Vladimir,

You "echo bebeb>/scripts/temp/titia" before you call /scripts/backup_files_v2, and echo it again at the end of /scripts/backup_files_v2, so you should see bebeb twice. Do you?


Pete
Vladimir Tihu
Advisor

Re: premature script end with at


Before calling backup_files_v2 I issue echo bebe>/scripts/temp/titia (NOT bebeb), and at the end of backup_files_v2 I echo bebeb>/scripts/temp/titia, so /scripts/temp/titia is overwritten and its contents change from bebe to bebeb; I see this bebeb and conclude that backup_files_v2 finished successfully.
Pete Randall
Outstanding Contributor

Re: premature script end with at

Sorry - my mistake, Vladimir. At this point, I don't see anything that seems wrong to me.

Have you tried my suggestion about running the backup (pre-changes) version of the calling script?


Pete
Dave La Mar
Honored Contributor

Re: premature script end with at

Just a note of what has burned me in the past -
Is there an environment variable the executing user has that is not set when the script is run via the at command?
As Pete mentioned, we don't know what changed from the last time it did work using the at scheduler.
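
One way to check for such differences is to dump the environment once from the login shell and once from an at job, then compare the two files. The sketch below is only an illustration: the function name dump_env and all file names are hypothetical.

```shell
# Write the sorted environment, the umask and whether stdin is a terminal
# to the file named in $1.
dump_env() {
    {
        env | sort
        echo "umask=$(umask)"
        if [ -t 0 ]; then
            echo "stdin_is_tty=yes"
        else
            echo "stdin_is_tty=no"
        fi
    } >"$1"
}

# Hypothetical use, with this function saved in /scripts/dump_env.sh:
#   at the prompt:  . /scripts/dump_env.sh; dump_env /tmp/env.login
#   under at:       echo '. /scripts/dump_env.sh; dump_env /tmp/env.at' | at now
#   compare:        diff /tmp/env.login /tmp/env.at
```

The stdin line is included because standard input is one of the things that differs between a terminal session and a scheduled job, even when no environment variables are used.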

Best of luck and hang in there; we'll get it.

Regards,
dl
"I'm not dumb. I just have a command of thoroughly useless information."
Vladimir Tihu
Advisor

Re: premature script end with at


Yes, the backup script runs into the same problem now: when I remove the call it works fine, when I put it back I get the error. I don't use environment variables, only parameters (e.g. the r in the call to backup_files_v2). To me it seems more like a subtle bug in the shell in the at context on this particular machine (?!). I remember that in the past I had the same problem with another script (called script ending well but calling script not continuing), and the only way I could find to get rid of it was to put the whole thing in a C program that performed system calls, so it worked even with at (?!). I really have no idea about it. Maybe there is an OS patch for this?
Pete Randall
Outstanding Contributor

Re: premature script end with at

Vladimir,

I would accept that theory except for the fact that it did work. Unless you applied a patch that caused it to not work, I don't think a patch is suddenly going to make it start working again. Because it was working before, I believe that something has to have changed and the only thing we've identified so far has been the script itself. Run the old version.


Pete
Vladimir Tihu
Advisor

Re: premature script end with at


Pete, my turn: I would accept your explanation except that it simply doesn't work. I tested just now: I wrote a C program containing a single call, system("/scripts/backup_files_all"); then I made another script called "a" containing a call to the C executable, /scripts/a.out; then I scheduled the a script the same way as the main script before, with "at -f /scripts/a now", and the result is that it WORKS. Going back to "at -f /scripts/backup_files_all" (my main script, which calls backup_files_v2), it DOESN'T WORK. So there seems to be something strange going on and, I repeat, not for the first time. No patches were applied between script calls and there were no other changes in the environment. The old version of the script is simply not relevant; the relevant thing is that the called script finishes well, but on returning to the calling script something happens, I suppose IN THE SHELL OR A SYSTEM MODULE OR STACK or whatever; and the same script call embedded in a C program works. What can it be? Has nobody experienced something like this before?!
Pete Randall
Outstanding Contributor

Re: premature script end with at

OK, Vladimir, I give up. I will be forever puzzled by the fact that it used to work and suddenly stopped, but I'm obviously without an explanation. Have you tried opening a call with the Response Center?


Pete
Vladimir Tihu
Advisor

Re: premature script end with at


Can it be a problem with the system stack? The old version simply had fewer variables and calls, so maybe the shell stack in the at context is too small and the return address gets destroyed, whilst in the context of the C program the stack allocation is different?
About the fact that the old version worked, I think if I go to the previous versions of ALL the scripts called and implied it should work again, but I cannot do this because I developed new versions of many scripts and going back to simpler versions would not be acceptable for all.

Thanks

Hai Nguyen_1
Honored Contributor
Solution

Re: premature script end with at

Vladimir,

1. Can you change the order of the called scripts (assuming that they are independent of one another) to see how the main script behaves?

Or:
2. Can you run the troubling called script in the background and add the "wait" command right after it?
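
Suggestion 2 could look like the sketch below. called_script is a stand-in function so that the example is self-contained; in this thread it would be the call /scripts/backup_files_v2 r.

```shell
# Stand-in for the troublesome called script (hypothetical).
called_script() {
    sleep 1
    echo "called script done"
}

main_sequence() {
    echo "before call"
    called_script &     # start the called script in the background
    wait                # block until all background children have finished
    echo "after call"   # the calling script continues here
}

main_sequence
```

One side effect worth knowing: in a non-interactive shell, a command started with & has its standard input attached to /dev/null unless it is redirected explicitly, so the backgrounded script cannot read from the caller's standard input.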

Hai
Vladimir Tihu
Advisor

Re: premature script end with at


Hai, YES, it works with wait! Finally an idea that gets around this mystery, even if it doesn't explain it, and without requiring a C program (changing the call order made no difference). Thanks.
Now another problem related to this: which is the best way to capture the return value of the called scripts when they run in background?

Good practical solution, thanks a lot.

Vladimir


Peter Nikitka
Honored Contributor

Re: premature script end with at

Hi,

I suspect a problem with your 'remsh' calls. If you do not need to pipe to/from the remsh input/output, I would always use 'remsh -n' to close its input.
Then many possibly open streams get closed and cannot produce hard-to-find errors.
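
Some background on why this may matter here, offered as a plausible reading that the thread itself does not confirm: at typically hands the job to the shell on standard input, so a child that also reads standard input (as remsh does unless -n is given) can swallow the remaining lines of the job, and the shell then simply reaches end of input. The sketch below uses cat as a stand-in for remsh and feeds a three-line job to sh on stdin; the </dev/null redirection plays the role of remsh -n. The function name is hypothetical.

```shell
# Feed a small job to a shell on stdin, similar to how at runs a job.
# 'cat >/dev/null' stands in for a stdin-reading command such as remsh;
# redirecting its input from /dev/null keeps it away from the job text.
run_protected_job() {
    printf '%s\n' \
        'echo before' \
        'cat >/dev/null </dev/null' \
        'echo after' | sh
}

run_protected_job

# The remsh form for one of the calls in the script above would be, e.g.:
#   remsh v2 -l root -n "touch /scripts/locks/backup_files_v2.lock"
```

Without the </dev/null on the middle line, cat may read whatever the shell has not consumed yet. Whether that includes the final echo depends on how the particular shell buffers its input.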

Concerning the retvals of bg-processes, you can do something like this:
do_bg1 &
bgpid1=$!
processing
...
wait $bgpid1
ret_of_bg1=$?
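
A runnable variant of the pattern above, assuming a stand-in job that exits with status 3 (hypothetical; replace it with the real script call):

```shell
# Stand-in for the background script; finishes with a known exit status.
do_bg1() {
    sleep 1
    return 3
}

capture_bg_status() {
    do_bg1 &
    bgpid1=$!                        # PID of the most recent background command
    # ... other processing could run here ...
    ret_of_bg1=0
    wait "$bgpid1" || ret_of_bg1=$?  # wait PID returns that job's exit status
    echo "do_bg1 returned $ret_of_bg1"
}

capture_bg_status
```

The `|| ret_of_bg1=$?` form is used only so the sketch also behaves under set -e; the plain two-line form shown above is equivalent otherwise.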


mfG Peter
The Universe is a pretty big place, it's bigger than anything anyone has ever dreamed of before. So if it's just us, seems like an awful waste of space, right? Jodie Foster in "Contact"