
Re: preremove script problem in HP-UX 11.31

 
SOLVED
Jdamian
Respected Contributor

preremove script problem in HP-UX 11.31

Hi

My product is packaged in an SD depot. It has a preremove script that kills the product's running agent before the product binaries are removed.

The preremove script just runs the ordinary script to stop the running agent.

I have successfully installed and removed the product on many nodes running HP-UX 11v1 and v2.

But I have problems when the preremove script is launched by the SD daemons on HP-UX 11v3. If I run the preremove script manually, it kills the product agent properly. If the preremove script is run by the swremove command, it does not kill the product agent.

I debugged and found the critical lines in the script.

if [ ! "${1}" -o "${1##+([0-9])" ]
then
return 2 #
fi
kill "$1"

These lines check if the first arg $1 is a numeric value (if it is not, then cancel; otherwise continue and kill).

In the debugging session (setting -x) the condition shown is

+ [ ! 4551 -o ]
+ return 2

Those lines run fine when I execute the preremove script manually, or when I type the following at the command line of an interactive session:

T=23
[ "${T}" -o "${T##+([0-9])}" ] || echo false


If I replace

-o "${1##+([0-9])" ]

by

-o "x${1##+([0-9])" != "x" ]

then the preremove script runs fine.

Then I have some questions:

why [ ! 4551 -o ] is evaluated as TRUE?
why this only occurs in HP-UX 11v3 and only when run from swremove?


Thanks in advance
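
For reference, the numeric check described in this post can also be written without test -o and without the ksh-style +([0-9]) pattern. A minimal sketch using a plain case match; is_numeric and safe_kill are hypothetical helper names, not taken from the original preremove script:

```shell
# is_numeric: hypothetical helper, not part of the original script.
# Succeeds (status 0) only if $1 is a non-empty string of decimal digits.
is_numeric() {
    case "$1" in
        '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)             return 0 ;;
    esac
}

# safe_kill: hypothetical wrapper mirroring the original logic:
# return 2 without calling kill unless the argument looks like a PID.
safe_kill() {
    is_numeric "$1" || return 2
    kill "$1"
}
```

Because case does its own pattern matching, no operands are handed to test at all, so there is nothing for the test parser to interpret.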
15 REPLIES
Dennis Handly
Acclaimed Contributor

Re: preremove script problem in HP-UX 11.31

>These lines check if the first arg $1 is a numeric value (if it is not, then cancel; otherwise continue and kill).

This seems broken. They should use "UNIX95=EXTENDED_PS ps ..." in such a way that the only output is the correct PID and nothing else.

>In the debugging session (setting -x) the condition shown is
+ [ ! 4551 -o ]
+ return 2

Any trailing spaces or newlines? Can you echo this to a file? Does the script use the same shell as you? Can you check the setting of UNIX95?

>Those lines run fine when ...
T=23
[ "${T}" -o "${T##+([0-9])}" ] || echo false

Are you missing: ! "${T}"

>If I replace: -o "${1##+([0-9])" ]
>by: -o "x${1##+([0-9])" != "x" ]
>then the preremove script runs fine.

That's easier to understand. It seems to imply no trailing spaces.

>why [ ! 4551 -o ] is evaluated as TRUE?
>why this only occurs in HP-UX 11v3 and only when run from swremove?

That's probably [ ! 4551 -o "" ].
You need to match up swremove's environment and shell.
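
One way to compare the two environments is to have the preremove script dump its environment to a file and diff that against the same dump taken from the interactive session. A rough sketch; the function name and the output paths are placeholders, not anything SD provides:

```shell
# snapshot_env: hypothetical helper; writes the sorted, exported
# environment to the file named in "$1".
# A variable exported with an empty value still appears, as "NAME=".
snapshot_env() {
    env | sort > "$1"
}

# Example use (paths are placeholders):
#   in the preremove script:    snapshot_env /var/tmp/env.preremove
#   in the interactive session: snapshot_env /var/tmp/env.interactive
#   then:                       diff /var/tmp/env.preremove /var/tmp/env.interactive
```

Only exported variables show up this way; shell-local settings would need the shell's own set output instead.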
Dennis Handly
Acclaimed Contributor

Re: preremove script problem in HP-UX 11.31

>if [ ! "${1}" -o "${1##+([0-9])" ]

You seem to be missing a "}":
if [ ! "${1}" -o "${1##+([0-9])}" ]; then

I don't have any problems with this "if" unless $1 has trailing spaces:
#!/sbin/sh
function foo {
    set -x
    echo "Passed in: $1"
    if [ ! "${1}" -o "${1##+([0-9])}" ]; then
        echo "true : $1 (not numeric)"
    else
        echo "false: $1"
    fi
}

foo 123
foo "123 "
foo abc

>ME: They should use "UNIX95=EXTENDED_PS ps ..." ...

That should be: You should use ... ;-)
Jdamian
Respected Contributor

Re: preremove script problem in HP-UX 11.31

I apologise for the mistakes in the script lines I posted.

I think the shell is behaving bizarrely.
I added the following lines to the preremove script:

test ! "345" -o "" && echo true1 || echo false1
[ ! "345" -o "" ] && echo true2 || echo false2
/usr/bin/test ! "345" -o "" && echo true3 || echo false3

"true1", "true2" and "true3" are displayed when the preremove script is run.
But if I type the same command lines in an interactive session, "false1", "false2" and "false3" are displayed instead.

I guess the environment is the root cause.

I found a variable, PRE_U95, that is used by SD and by /etc/rc.config.d/namesvrs.

P.S.: the UNIX95 variable is used by my preremove script to get customized output from the ps command.
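
Whatever is changing the result of the three test lines in this post, the compound form itself can be avoided. POSIX marks the -a and -o operators of test as obsolescent and suggests combining separate test invocations with the shell's own && and || instead. A sketch of the same condition written that way; the function name is made up for illustration, and this has not been verified against the affected 11.31 shell:

```shell
# empty_or_has_rest: hypothetical name. Same intent as
#   [ ! "$1" -o "$2" ]
# but each [ ] gets exactly one operator and one operand.
empty_or_has_rest() {
    [ -z "$1" ] || [ -n "$2" ]
}

if empty_or_has_rest "345" ""; then echo true; else echo false; fi
```

With one operator per bracket, the grouping is decided by the shell's || rather than by how test reads a four-argument expression.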
Dennis Handly
Acclaimed Contributor

Re: preremove script problem in HP-UX 11.31

>I guess the environment is the root cause.

What is your preremove environment? What is your shell?

>the UNIX95 variable is used by my preremove script to get customized output from the ps command.

How? The only way to use UNIX95 safely is to use a form similar to this:
UNIX95=EXTENDED_PS ps ....

Do NOT export UNIX95.

(When I did it, there were no differences.)
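
The difference between the prefix form and an exported variable can be seen with any child command. In this sketch sh -c stands in for ps, purely so that the child can report what it received; EXTENDED_PS is the value quoted above:

```shell
unset UNIX95

# Prefix form: the assignment is placed in the environment of this
# one command only.
UNIX95=EXTENDED_PS sh -c 'echo "during: ${UNIX95-not set}"'

# The calling shell, and any child started afterwards, is unaffected.
sh -c 'echo "after:  ${UNIX95-not set}"'
```

An exported UNIX95 would instead be inherited by every later child, including any new shell.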
Jdamian
Respected Contributor

Re: preremove script problem in HP-UX 11.31

The variables when the preremove script is run are:

variables
DDFA=0
DHCPV6CLNTD_ARGS=''
DHCPV6D=0
DHCPV6SRVRD_ARGS=''
ERASE=^H
ERRNO=25
FCEDIT=/usr/bin/ed
HOME=/
IFS='
'
INETD=1
INETD_ARGS=-l
INIT_STATE=2
LANG=C
LINENO=24
MAILCHECK=600
MROUTED=0
MROUTED_ARGS=''
NOKILLED=0
NTPDATE_SERVER=''
OPTARG
OPTIND=1
PATH=/usr/bin:/usr/sbin:/sbin
PPID=9057
PRE_U95=1
PS2='> '
PS3='#? '
PS4='+ '
PWD=/
RANDOM=2035
RWHOD=0
SDU_DEBUG_PRINT_MSGID=0
SECONDS=0
SENDMAIL_RECVONLY=0
SENDMAIL_SENDONLY=0
SENDMAIL_SERVER=0
SENDMAIL_SERVER_NAME=''
SHELL=/usr/bin/sh
SNMP_HPUNIX_START=1
SNMP_MASTER_START=1
SNMP_MIB2_START=1
SNMP_NAA_START=0
SNMP_TRAPDEST_START=1
SW_ADMIN_DIRECTORY=/var/adm/sw
SW_CATALOG=/var/adm/sw/products
SW_CMD_NAME=swremove
SW_COMPATIBLE=''
SW_CONTROL_DIRECTORY=/var/adm/sw/products/VFES/VFES-RUN/
SW_CONTROL_TAG=preremove
SW_DEFERRED_KERNBLD=''
SW_DLKM_REPLACEMENT=''
SW_HW_SUPP_BITS=64
SW_KERNEL_PATH=/stand/vmunix
SW_KERN_BIT_MODE=64
SW_LOCATION=/
SW_MODIFY_TMPDIR=/var/adm/sw/swmodify_rtmp/
SW_PATCH512B=1
SW_PATH=/usr/lbin/sw/bin:/var/adm/sw/sbin:/sbin:/usr/bin:/usr/ccs/bin
SW_POSE_AS_1123_AND_UP=1
SW_POSTDSA_TEMPFILE=''
SW_ROOT_DIRECTORY=/
SW_SESSION_IS_KERNEL=''
SW_SESSION_IS_REBOOT=''
SW_SESSION_OPTIONS=/var/tmp/BAA009056
SW_SOFTWARE_SPEC=VFES.VFES-RUN,l=/,r=1.4,a=HP-UX_B.11.00_32/64,v=,fr=1.4,fa=HP-UX_B.11.00_32/64
SW_SOURCE_DIR=''
SW_SOURCE_HOST=''
SW_SYSTEM_FILE_PATH=/stand/system
SW_VERBOSE=0
TERM=unknown
TMOUT=0
TZ=MET-1METDST
UNIX95=''
XNTPD=1
XNTPD_ARGS=''
_='\nvariables'
+ echo

The shell used is /sbin/sh
+ /usr/local/bin/lsof -p 9058
COMMAND  PID USER FD  TYPE DEVICE SIZE/OFF NODE NAME
sh      9058 root cwd DIR  64,0x3     8192    2 /
sh      9058 root txt REG  64,0x3  1403704 1758 /sbin/sh
sh      9058 root 0u  CHR  3,0x2       0t0  111 /dev/null
sh      9058 root 28u REG  64,0x5     2793   54 /opt/VFES/product/agent/bin/stop.sh

The UNIX95 variable is only used in this way:

UNIX95= ps -o pid= -o ppid= -o args= -U "${VFES_LOGIN}" -u "${VFES_LOGIN}" -G "${VFES_GROUP}"
Bob E Campbell
Honored Contributor

Re: preremove script problem in HP-UX 11.31

There is a shell library delivered at /usr/lbin/sw/control_utils. You might find the "kill_named_procs" function of interest.

While not an API, it is not going away any time soon...
Dennis Handly
Acclaimed Contributor

Re: preremove script problem in HP-UX 11.31

>The variables when the preremove script is run are:
IFS='
' # I hope this is ok
SHELL=/usr/bin/sh # this seems odd
UNIX95='' # Lots of smoke here!

>The UNIX95 variable is only used in this way:

Someone already hosed you over by setting UNIX95. Try unsetting it.
I suppose SHELL was set that way (instead of /sbin/sh) because you restarted the daemon from a user login?
It would be good to check the values of the 3 chars in IFS.
echo "$IFS" | vis -n
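
If vis is not to hand, od can show the same bytes. A small sketch, assuming an od that accepts the POSIX -A and -t options; the helper name is invented here:

```shell
# dump_hex: hypothetical helper; prints the bytes of "$1" in hex.
# printf '%s' is used so that no trailing newline is added to the data.
dump_hex() {
    printf '%s' "$1" | od -An -t x1
}

# A default IFS is space, tab, newline, i.e. the bytes 20 09 0a.
dump_hex "$IFS"
```

Anything other than those three bytes in the output would point at a modified IFS.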
Jdamian
Respected Contributor

Re: preremove script problem in HP-UX 11.31

Hi

I debugged this issue myself and can clarify the following:

1. The problem was exposed by an SD preremove script, but it also appears in interactive shell sessions.

2. It is found only in the /usr/bin/sh and /sbin/sh shells, not in the Korn shell.

3. Only the root user is affected.

4. The issue is related to the UNIX95 variable. It is "congenital": once the shell process has been created, no change to UNIX95 will correct the problem.

5. These are the tests:

a. # login as root
b. unset UNIX95 # remove UNIX95 if it exists
c. if [ ! 23 -o "" ]; then echo TRUE; else echo FALSE; fi # test command line... it should display FALSE always.

d. export UNIX95=1
e. /usr/bin/sh # creating a subshell
f. echo $UNIX95 # to check that variable is found

g. if [ ! 23 -o "" ]; then echo TRUE; else echo FALSE; fi # TRUE is displayed !!!

h. unset UNIX95
i. if [ ! 23 -o "" ]; then echo TRUE; else echo FALSE; fi # TRUE is displayed !!! even if UNIX95 is not found.

j. exit # exiting from the subshell
k. if [ ! 23 -o "" ]; then echo TRUE; else echo FALSE; fi # FALSE is shown

I infer that the UNIX95 variable causes the wrong behaviour at the moment the subshell process is created ("congenital"), and that it cannot be corrected afterwards, even by removing the UNIX95 variable.

Thanks in advance
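
The steps above can be condensed into a helper that starts a fresh shell each time, which matters because the behaviour appears to be fixed when the shell process starts. A sketch only; probe is a made-up name, and what it prints on 11.31 is whatever steps c, g and k show, not something this snippet asserts:

```shell
# probe: hypothetical helper.
#   $1 = path of the shell to start
#   $2 = optional value to export as UNIX95 before starting it
# Prints TRUE or FALSE as evaluated by the freshly started shell.
probe() {
    (
        # Subshell, so the caller's own UNIX95 setting is left untouched.
        unset UNIX95
        if [ $# -ge 2 ]; then
            UNIX95=$2
            export UNIX95
        fi
        "$1" -c 'if [ ! 23 -o "" ]; then echo TRUE; else echo FALSE; fi'
    )
}

# Example calls:
#   probe /usr/bin/sh        # UNIX95 absent when the shell starts
#   probe /usr/bin/sh 1      # UNIX95=1 present when the shell starts
#   probe /sbin/sh 1         # same, for the other shell named above
```

Running each case in its own new shell avoids the trap in steps h and i, where unsetting the variable inside an already started shell changes nothing.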
Jdamian
Respected Contributor

Re: preremove script problem in HP-UX 11.31

I apologise for a mistake:

Point 3 is wrong. This issue affects any user of the POSIX shell.