Operating System - HP-UX

application dumping core during system reboot

 
SOLVED
Srimalik
Valued Contributor

Re: application dumping core during system reboot

Yes, I have the core file (fortunately). Please find the packcore attached.
The setup on the machine has changed; I will try to recreate the setup and try tusc.

Thanks
Sri
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>I have the core file(fortunately), Please find the packcore attached.

Thanks.

>I will try to recreate the setup and try tusc.

Basically there is something wrong with the kernel or the shell. There is no way dld can figure out the path of the executable:
argv[ 0]: vxsvc
argv[ 1]: -r
argv[ 2]: /etc/vx/isis/Registry
argv[ 3]: -e
argc: 4
envp[ 0]: _=/opt/VRTSob/bin/vxsvc
...
envp[ 5]: PATH=/usr/bin:/bin:/sbin

While "_" is set, this wouldn't be true or trusted in all shells. I have no idea why argv[0] doesn't have the absolute path, especially if it isn't in PATH.

Using tusc on the case where it works may tell you something: is argv[0] the full path there?
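
To illustrate why a bare argv[0] is a dead end here, below is a minimal C sketch of the kind of lookup needed to turn argv[0] into a real path. It is not dld's actual code; `locate_executable` and its parameters are hypothetical names made up for this illustration. The assumption is the usual convention: a name containing "/" is used as is, anything else is searched for in each PATH directory.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of dld): resolve argv0 to a path.
 * A name containing '/' is taken as is; otherwise each directory in
 * path_var is tried in order. On success the path is written to out
 * and 0 is returned; -1 means the executable could not be located. */
int locate_executable(const char *argv0, const char *path_var,
                      char *out, size_t outlen)
{
    const char *p, *end;

    if (argv0 == NULL || argv0[0] == '\0' || out == NULL || outlen == 0)
        return -1;

    if (strchr(argv0, '/') != NULL) {
        if (strlen(argv0) >= outlen || access(argv0, X_OK) != 0)
            return -1;
        strcpy(out, argv0);
        return 0;
    }

    if (path_var == NULL)
        return -1;

    for (p = path_var; ; p = end + 1) {
        size_t dirlen;
        int n;

        end = strchr(p, ':');
        dirlen = (end != NULL) ? (size_t)(end - p) : strlen(p);
        if (dirlen == 0)   /* an empty PATH entry means the current directory */
            n = snprintf(out, outlen, "./%s", argv0);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, p, argv0);
        if (n > 0 && (size_t)n < outlen && access(out, X_OK) == 0)
            return 0;
        if (end == NULL)
            break;
    }
    out[0] = '\0';
    return -1;
}
```

With argv[0] set to "vxsvc" and PATH=/usr/bin:/bin:/sbin, a search like this finds nothing, because /opt/VRTSob/bin is not among the directories tried.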
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

Some other variables:
envp[14]: INIT_STATE=3
envp[22]: TERM=unknown
envp[23]: PWD=/
envp[24]: TZ=PST8PDT
envp[28]: SHLIB_PATH=/opt/VRTSob/lib/compat:/opt/VRTSob/lib:/opt/VRTSob/sig/bin:/opt/VRTSob/lib
Srimalik
Valued Contributor

Re: application dumping core during system reboot

Hi, Dennis

While I was waiting for an 11.23 machine to send you the logs,
I saw a similar problem on an 11.31 machine.

A few things to note:

Previously vxsvc was dumping core during a reboot, but now it dumps every time (even when we try to start it after the machine is up).

We are calling execvp with the first argument as "/opt/VRTSob/bin/vxsvc" in our code (vxsvc), and the exec seems to be failing.

If we do not call this exec function, it runs without a dump.

If we add /opt/VRTSob/bin to $PATH, vxsvc runs without problems (tusc logs attached).

The core is the same as the core we saw on 11.23.

(gdb) x /s *(void**)($sp-0x40) shows "vxsvc"

(gdb) x /s *(void**)($sp-0x3c) shows ""

(gdb) p /x $ret0 shows 0x0

detailed output
================
# gdb -c /opt/VRTSob/bin/core.vxsvc /opt/VRTSob/bin/vxsvc
Detected PA executable.
Invoking /usr/ccs/bin/gdbpa.
HP gdb 5.7 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00
and target hppa1.1-hp-hpux11.00.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.7 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
Core was generated by `vxsvc'.
Program terminated with signal 11, Segmentation fault.
SEGV_UNKNOWN - Unknown Error
#0 0xc4d6365c in get_origin+0x40 () from /usr/lib/dld.sl

warning: core file might be corrupted.
(gdb) bt
#0 0xc4d6365c in get_origin+0x40 () from /usr/lib/dld.sl
#1 0xc4d4ee4c in map_shlib+0x11ac () from /usr/lib/dld.sl
#2 0xc4d4ca78 in form_load_graph+0x1a4 () from /usr/lib/dld.sl
#3 0xc4d4d58c in form_load_graph+0xcb8 () from /usr/lib/dld.sl
#4 0xc4d5c020 in finish_dld_main+0x1024 () from /usr/lib/dld.sl
#5 0xc4d5f9d4 in _dld_main+0x1c8 () from /usr/lib/dld.sl
#6 0xb73c in __map_dld+0x404 ()
#7 0xaf34 in $START$+0x114 ()
#8 0xc4d63658 in get_origin+0x3c () from /usr/lib/dld.sl
(gdb) x /s *(void**)($sp-0x40)
0x7b03b064: "vxsvc"
(gdb) x /s *(void**)($sp-0x3c)
0x7b008748: ""
(gdb) p /x $ret0
$1 = 0x0
(gdb)
#####################################

The first argument to execvp is the full absolute path, but the argument to get_origin is only "vxsvc". Is this the problem?

Is it a problem in exec or in our code?
The execvp man page says that the first argument to this function should be a file name, and that the path prefix is found from the PATH environment variable.
But we are passing the full path of the executable.

Please confirm whether this is a problem with the dld code. I will also ask my colleagues about the channel for raising this with HP. Before that, I want to make sure that our code is clean. :)
####################
I am attaching the tusc logs for three cases:

1. /opt/VRTSob/bin is not in PATH on an IA machine (core dump). Filename: tusc_logs/tusc_vxsvc_bad_NOPATH_IA.txt

2. /opt/VRTSob/bin is in PATH on an IA machine (no core). Filename: tusc_logs/tusc_vxsvc_good_PATH_IA.txt

3. Same binary on a PA machine where the problem is not seen. Filename: tusc_logs/tusc_vxsvc_good_PA.txt

I think the problem is independent of IA/PA as we saw it on PA on 11.23.

I have access to both the machines now and am not going to release them before this issue is fixed.
Let me know if you want additional info, I will provide that in our morning.

I am still searching for an 11.23 machine to try with.

Thanks
Sri

Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>But now it dumps every time

That's helpful to duplicate the problem.

>We are calling execvp with first arg as "/opt/VRTSob/bin/vxsvc" in our code (vxsvc). and the exec seems to be failing.

You aren't invoking it from the shell?
If you are calling execvp, you must provide the absolute path, both in "file" and in argv[0], if you expect $ORIGIN to work.

>if we include /opt/VRTSob/bin to $PATH vxsvc runs without problems.

Yes, that would make it work.

>The first argument to execvp is full absolute path. but the args to get origin is only vxsvc. Is this the problem?

Yes, you must pass that full path twice.

>Is it a problem in exec or our code?

If you are calling execvp(2), it is your problem. (Or lack of documentation or that new error message I suggested. :-)

>As the man page of execvp says that the first argument to this function ...

You forgot to read more:
argv is ... By convention, argv must have at least one member, and must point to a string that is identical to path or path's last component.

Here the documentation is missing the fine print that says: But if you don't use the exact same string and that path isn't in PATH, then dld will abort if you use $ORIGIN. ;-)
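
A minimal C sketch of "pass the full path twice" follows. The helper names `build_exec_argv` and `exec_with_full_argv0` are hypothetical, invented for this example; they are not taken from the vxsvc source. The idea is only that the same absolute string ends up both as execvp's "file" argument and as argv[0].

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: build a NULL-terminated argv whose argv[0] is
 * abs_path itself, followed by the entries of rest (NULL or a
 * NULL-terminated array). Returns NULL with errno = EINVAL if abs_path
 * is not absolute. The caller frees the returned array. */
char **build_exec_argv(const char *abs_path, char *const rest[])
{
    size_t n = 0, i;
    char **argv;

    if (abs_path == NULL || abs_path[0] != '/') {
        errno = EINVAL;
        return NULL;
    }
    while (rest != NULL && rest[n] != NULL)
        n++;
    argv = malloc((n + 2) * sizeof *argv);
    if (argv == NULL)
        return NULL;
    argv[0] = (char *)abs_path;   /* execvp does not modify the strings */
    for (i = 0; i < n; i++)
        argv[i + 1] = rest[i];
    argv[n + 1] = NULL;
    return argv;
}

/* Exec abs_path using the same string as "file" and as argv[0].
 * Returns only on failure, with -1. */
int exec_with_full_argv0(const char *abs_path, char *const rest[])
{
    char **argv = build_exec_argv(abs_path, rest);
    int saved;

    if (argv == NULL)
        return -1;
    execvp(abs_path, argv);
    saved = errno;                /* reached only if execvp failed */
    free(argv);
    errno = saved;
    return -1;
}
```

For the case in this thread, that would mean calling exec_with_full_argv0("/opt/VRTSob/bin/vxsvc", rest) with rest holding "-r", "/etc/vx/isis/Registry", "-e" and a terminating NULL.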

>please confirm if this is a problem with the dld code, I will also ask my colleagues the channel to raise this with HP. Before that I want to make sure that that our code is clean. :)

The only thing wrong with dld is that it lacks a nicer error message. You need to fix your code.

But please contact HP so that message can be added.

>I am attaching the tusc logs for three cases

You need to add -a to get the exec args and -p to print the PID.

>1. /opt/VRTSob/bin is not in PATH on an IA machine: tusc_vxsvc_bad_NOPATH_IA.txt

It shows a good exec and a bad.

>2. /opt/VRTSob/bin is in PATH on an IA machine: tusc_vxsvc_good_PATH_IA.txt

It shows the stats for each PATH.

>3. same binary on a PA machine: tusc_vxsvc_good_PA.txt

It seems the same as 1. But the path is ./vxsvc and it knows how to look that up.
Srimalik
Valued Contributor

Re: application dumping core during system reboot

I was not able to post any message to this forum during the whole day. It always failed with some error (which I don't remember) saying "please try again".

The root cause of the problem is:
We were calling basename on argv[0] of the original program and then passing the same argv[0] to execvp.

We ignored the fact that the basename man page warns about this situation.

basename was changing "./vxsvc" to "vxsvc" and "/opt/VRTSob/bin/vxsvc" to "vxsvc".

Passing a duplicate of the string to basename and preserving the original argv[0] solves the problem.
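
A minimal sketch of that fix in C, assuming a helper of our own called `safe_basename` (a hypothetical name, not the actual vxsvc code): basename works on a private copy, so the caller's argv[0] is left untouched.

```c
#define _XOPEN_SOURCE 700
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of the last component
 * of path without touching path itself. The caller frees the result.
 * Returns NULL on allocation failure or if path is NULL. */
char *safe_basename(const char *path)
{
    char *copy, *base, *result;

    if (path == NULL)
        return NULL;
    copy = strdup(path);          /* basename() is allowed to modify this */
    if (copy == NULL)
        return NULL;
    base = basename(copy);        /* may point into copy or static storage */
    result = strdup(base);        /* copy out before freeing the scratch copy */
    free(copy);
    return result;
}
```

The original argv[0] can then still be handed to execvp unchanged, while the returned short name is used for anything that only needs "vxsvc".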

I have yet to find out whether argv[0] is changed or left unchanged by basename on the system where this problem is not reproducible.

The problem with basename exists across all platforms. Is it something that is difficult to fix? (I do not think so, but I may be wrong.)

The man page of basename says something like:
basename may change the argument passed.

Why is this behaviour inconsistent in the case of single-threaded programs?

Dennis, I have raised the suggestion given by you in previous post to HP.

I will request them to preserve the setup on the machine on which this is reproducible, in case you or somebody from HP wants to have a look at it.

Thanks
Sri
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>The root cause of problem is:
We were using basename with argv[0] of original program and passing the same argv[0] to execvp.

Ah, you weren't doing it on purpose.

>I am yet to find out that whether argv[0] is changed or kept unchanged by basename on the system on which this problem is not reproducible.

It should be simple to test. Use tusc -pa ...

>the problem with basename is across all the platforms, Is it something which is difficult to fix?

It won't be changed because the Standard says it can do that. Though I would only expect dirname(3) to modify the string.

>Why is this behaviour inconsistent in case of single threaded programs?

It shouldn't be. Single-threaded apps use a separate buffer, but threaded ones use the input buffer.

>I have raised the suggestion given by you in previous post to HP.

Thanks.

>I will request them to preserve the setup on the machine on which this is reproducible, In case

No need, we know the problem now.
Srimalik
Valued Contributor

Re: application dumping core during system reboot

The points :)
Srimalik
Valued Contributor

Re: application dumping core during system reboot

Dennis, any idea why it didn't fail on all the installations?

I am trying to write a sample program that can reproduce this issue.
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>any idea why it didn't fail on all the installations?

Perhaps it had "." in PATH?