Operating System - HP-UX

application dumping core during system reboot

 
Srimalik
Valued Contributor

Re: application dumping core during system reboot


>What are those paths
see chatr.txt in attached gz file

> and what does ldd show for them?

see ldd_after_start.txt in attached gz file

>Is this on a file system that isn't mounted until later?
>Is there any way you can run ldd just before you start your application up?

see info_during_reboot.txt in attached gz

Sri
abandon all hope, ye who enter here..
Srimalik
Valued Contributor

Re: application dumping core during system reboot

Hi, Dennis

Was the data useful?

Thanks
Sri
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>Was the data useful?

Nothing obvious.
If you still have your core file can you do:
(gdb) x /s *(void**)($sp-0x40)
(gdb) x /s *(void**)($sp-0x3c)
(gdb) p /x $ret0

These should print, in order, the input string to get_origin, the result of dld_realpath, and the result of dld_strrchr.

In your chatr(1) output I see:
dynamic /usr/lib/libstd.2
dynamic /usr/lib/libstream.2
dynamic /usr/lib/libCsup.2

But some of the ldd output has the wrong order:
/usr/lib/libCsup.2 => /usr/lib/libCsup.2
/usr/lib/libstream.2 => /usr/lib/libstream.2
/usr/lib/libstd.2 => /usr/lib/libstd.2

You may have to use "ldd -v" or chatr(1) on every shlib until you find this bad order.
(Or perhaps ldd doesn't have any ordering in its output??)
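One rough way to check the ordering question is to pull the library names out of both listings and compare them. The sketch below assumes the chatr(1) lines look like "dynamic /usr/lib/libstd.2" and the ldd lines look like "path => path", as in the excerpts above; the function names are made up for illustration. If ldd turns out not to preserve load order, a mismatch here means nothing.

```sh
# Hypothetical helpers; the input formats are assumed from the excerpts in this thread.

# Print the library paths from chatr-style lines ("dynamic /usr/lib/libstd.2").
chatr_libs() {
    awk '$1 == "dynamic" { print $2 }' "$1"
}

# Print the left-hand side of each ldd-style line ("/usr/lib/x => /usr/lib/x").
ldd_libs() {
    awk '$2 == "=>" { print $1 }' "$1"
}

# Succeed only if both listings name the same libraries in the same order.
same_order() {
    [ "$(chatr_libs "$1")" = "$(ldd_libs "$2")" ]
}
```

With the chatr(1) and ldd output of one shlib saved to two files, `same_order chatr.txt ldd.txt || echo "order differs"` flags a candidate to look at more closely.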

I do see some strange relative paths:
/usr/lib/libdld.2 => ../lib/libdld.2

But this is in the "after" list too.
Comparing the RHS of each "=>" I get matches for before and after.
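The before/after comparison of the right-hand sides can be repeated mechanically. This is only a sketch: it assumes each ldd line has the form "name => resolved-path" as shown above, and the helper names and temporary file names are invented for the example.

```sh
# Hypothetical helper: print the sorted right-hand side of each "=>" line.
ldd_rhs() {
    awk '$2 == "=>" { print $3 }' "$1" | sort
}

# Print resolved paths that appear in only one of the two listings.
# No output means the right-hand sides match.
ldd_rhs_diff() {
    before=${TMPDIR:-/tmp}/ldd_rhs.$$.before
    after=${TMPDIR:-/tmp}/ldd_rhs.$$.after
    ldd_rhs "$1" > "$before"
    ldd_rhs "$2" > "$after"
    comm -3 "$before" "$after"
    rm -f "$before" "$after"
}
```

Called as `ldd_rhs_diff before.txt after.txt` on the two saved ldd listings, an empty result corresponds to the "matches for before and after" observation.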

Your files that use $ORIGIN all appear to be in /opt/VRTSob, which appears to be:
/opt on /dev/vg00/lvol5 ioerror=mwdisable,largefiles,delaylog,dev=40000005

I did find out that one of your shlibs is illegally being built with -AA. You can't mix -AA and -AP. You have one with libCsup_v2.2 and the rest with libCsup.2.
rajdev
Valued Contributor

Re: application dumping core during system reboot

Hi,

Not sure what the issue is, but since you say it works once the system is booted, I have a couple of questions:
---> How are you starting it, i.e. from the command prompt?
---> Does it require a terminal (what are its stdin/stdout/stderr)?
---> Or is this a daemon program?
---> Have you tried running it with nohup?

Regards,
RD
Srimalik
Valued Contributor

Re: application dumping core during system reboot

Thanks, Dennis

Here is the output of the gdb commands you gave:
(gdb) x /s *(void**)($sp-0x40)
0x77ff0000: "vxsvc"
(gdb) x /s *(void**)($sp-0x3c)
0x77fce36c: ""
(gdb) p /x $ret0
$1 = 0x0
(gdb)

I cannot make out much from the output. :(

I will only be able to work on the other points on Monday. :(


Rajdev,
This exe does not use a terminal for stdout/stdin, so we don't need nohup.
The command used is exactly the same as the command used after reboot and, as I mentioned earlier, it works without problems on most of the machines.

Both of you have a great weekend. :)
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>Output of gdb commands given by you:
>(gdb) x /s *(void**)($sp-0x40)
>0x77ff0000: "vxsvc"
>(gdb) x /s *(void**)($sp-0x3c)
>0x77fce36c: ""

Basically this says that the input to get_origin (presumably argv[0]) is "vxsvc", which is not an absolute path, so dld's realpath has to do real work. Something is going wrong there: the realpath result is an empty string.
Unfortunately, dld doesn't have any checking for this case.

Years ago I hit a problem using dirname/basename on argv[0]: I had a case of a non-absolute path and the function aborted. Later I was never able to duplicate it, because ksh provided the absolute path every time I tried. Perhaps something similar is happening to you before and after the boot? Something to do with shells and exec(2)?

So the workaround may be simple: provide an absolute path to vxsvc. (I assume that's what you were running?)

In the meantime, you should contact the Response Center with your problem so they can put more checking into dld. Unfortunately if realpath doesn't work or getcwd(3) fails, dld may give you an error that it can't compute $ORIGIN and you'll just abort with a nicer message.

(You're not in a directory that gets removed out from under it?)
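As an illustration of the workaround, an rc script could refuse to start the program unless it is named by an absolute path, and could start it from a directory that is known to exist. This is a sketch under assumptions, not the actual vxsvc start script: the function names are hypothetical and the thread does not give the real install path of vxsvc.

```sh
# Hypothetical rc-script fragment; none of these names come from the real start script.

# Succeed only if the argument is an absolute path.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Start a program only if it is named by an absolute path to an executable file.
# The program is run from / in a subshell, so it never inherits a working
# directory that may have been removed and the caller's directory is unchanged.
start_absolute() {
    prog=$1
    shift
    is_absolute "$prog" || { echo "not an absolute path: $prog" >&2; return 1; }
    [ -x "$prog" ]      || { echo "not executable: $prog" >&2; return 1; }
    ( cd / && exec "$prog" "$@" )
}
```

Because the shell passes the command word through as argv[0], a program started this way sees the absolute path rather than a bare name such as "vxsvc". It does not help if the program later re-executes itself under a relative name.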

(It would have been helpful if your attachment had been named .tar.gz or .tgz instead of just .gz. It took me a while to realize it was a tarfile.)
Srimalik
Valued Contributor

Re: application dumping core during system reboot


>So the workaround may be simple, provide an absolute path to vxsvc. (I assume that's what you were running?)
Yes, I am running vxsvc.
Where do I need to give the absolute path? We are already using the absolute path in the rc script which starts vxsvc.



>(You're not in a directory that gets removed out from under it?)

I don't think so: the core is in /, and I think the working directory for an rc script is / by default.

>(It would have been helpful if your attachment was .tar.gz or .tgz instead of just .gz. It took me awhile to realize it was a tarfile.)

Will keep that in mind in future.

Sri
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>Where do I need to give the absolute path? We are already using the absolute path in the rc script which starts vxsvc.

Well, that's where I would expect it.
If not there, perhaps it forks and execs itself?
Srimalik
Valued Contributor

Re: application dumping core during system reboot

Hi Dennis,


>In the meantime, you should contact the Response Center with your problem so they can put more checking into dld. Unfortunately if realpath doesn't work or getcwd(3) fails, dld may give you an error that it can't compute $ORIGIN and you'll just abort with a nicer message.

Can't we fix this, i.e. can't we prevent the abort instead of just giving a nicer message, either in the application or in the OS?

Thanks
Sri
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

Re: application dumping core during system reboot

>can't we prevent the abort instead of giving a nicer message? either in the application or the OS.

You can't fix something that hasn't been reported. Nor can you fix something if you can't duplicate it. Currently we can only check for the symptoms of the lack of error detection in get_origin.

It might be interesting to see if exec(2) passed the absolute path to vxsvc. (Do you still have that corefile?) Or get the output of tusc to see what's different when it fails vs when it works.
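Short of a full system call trace, the start script itself can record how it is about to launch the program, so a boot-time run can be compared with an interactive one. The helper below is a hypothetical sketch. It only captures what the shell sees (command word, working directory, PATH), not what exec(2) actually passed, so the core file or a tusc trace remains the authority.

```sh
# Hypothetical helper: append the launch context of a program to a log file.
# Arguments: log file, command word the script is about to run.
log_start_context() {
    log=$1
    prog=$2
    {
        echo "prog=$prog"
        case "$prog" in
            /*) echo "absolute=yes" ;;
            *)  echo "absolute=no" ;;
        esac
        # pwd can fail if the current directory has been removed.
        echo "cwd=$(pwd 2>/dev/null || echo unknown)"
        echo "PATH=$PATH"
    } >> "$log"
}
```

Calling it once from the rc script at boot and once from a login shell, each with its own log file, gives two small files that can be compared with diff.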