Operating System - HP-UX

Srimalik
Valued Contributor

dld fails due to missing library, find out which library is missing?

Hi,
I have a core file that shows this stack:

#0 0xc0041b20 in kill+0x10 () from /usr/lib/dld.sl
#1 0xc004173c in raise+0x24 () from /usr/lib/dld.sl
#2 0xc0036814 in fatal_error+0x50 () from /usr/lib/dld.sl
#3 0xc002814c in finish_dld_main+0x103c () from /usr/lib/dld.sl
#4 0xc002b9dc in _dld_main+0x1c8 () from /usr/lib/dld.sl
#5 0xbb0c in __map_dld+0x4e4 ()
#6 0xb14c in $START$+0xd4 ()
#7 0xc004173c in raise+0x24 () from /usr/lib/dld.sl


This generally happens when a library required by the binary is missing. Is there a way to find out which library is missing using only the core file? (I do not have the error message printed by dld when this happened, and I am not able to reproduce it now.)

Are there any other cases in which we can get this trace?

Thanks
Sri
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

It is illegal to look at the corefile if dld aborts due to missing shlibs. In fact, dld now does a kill -9 on itself so you don't get one, so you just rely on the error message.

>Is there a way to find out which library is missing using only the core file (I do not have the error message printed by dld when this happened and I am not able to duplicate this now).

It isn't worth the effort. You are supposed to save the error message. The stack frame with the error is no longer there. (Quickly remove that core file, so you don't try. ;-)

>Are there any other cases in which we can get this trace?

No clue, but the intention is that you won't get corefiles for dld startup errors that can be explained by the error message.
Srimalik
Valued Contributor

>It is illegal to look at the corefile if dld aborts due to missing shlibs. In fact, dld now does a kill -9 on itself so you don't get one, so you just rely on the error message.

I will quickly delete this core file and clean up all the traces and proof of my ever having seen a dump by dld. :-D

Is this (doing kill -9) not done yet? I am seeing this dump on March09 fusion. Or is there a patch which has this change?

BTW, sometimes it's very handy to have a core dump; one may not always get the errors printed by dld (for example, something started from rc with stdout and stderr redirected to /dev/null).

Thanks
Sri


Reopen: Assign points :)
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

>Is this (doing kill -9) not done yet, I am seeing this dump on March09 fusion or is there a patch which has this change.

It may be only done for IPF and PA64?

>sometimes it's very handy when you have a core dump, one may not get the errors thrown by dld always

Except there may be nothing left in the core file to determine the error. (strings -a core may or may not help.) And if it always aborts, you'll soon remove /dev/null.
Srimalik
Valued Contributor

>It may be only done for IPF and PA64?
I am seeing this on a PA64 machine, but the binary in question is 32-bit. I assume that you mean 64-bit apps.
Let me check it on IA.
abandon all hope, ye who enter here..
Dennis Handly
Acclaimed Contributor

>I am seeing this on PA64 machine but the binary in question is 32 bit. I assume that you mean 64 bit apps.

Yes, since nearly all kernels are 64 bit by now.

strings(1) doesn't work:
$ a.out # PA32
dld.sl: Can't open shared library: /var/tmp/libfoo.sl
dld.sl: No such file or directory
Abort(coredump)
$ strings -a core | fgrep open
dbm: no open database
...

PA64:
$ a.out
/usr/lib/pa20_64/dld.sl: Unable to find library './libfoo.sl'.
Killed
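
A slightly wider net than fgrep open would be to pull every absolute path ending in .sl out of the strings output and report the ones that no longer exist on disk. This is only a sketch: missing_libs is a made-up helper name, the core file name in the usage comment is a placeholder, and as the PA32 run above shows, the name of the missing library may simply not be in the core at all.

```shell
#!/bin/sh
# missing_libs: read text on stdin (intended: the output of `strings -a core`)
# and print each absolute path ending in .sl or .sl.N that is NOT on disk.
# The helper name is hypothetical; nothing here is specific to dld internals.
missing_libs() {
    # Split each line into whitespace-separated fields and keep the ones
    # that look like an absolute shared-library path.
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\/.*\.sl(\.[0-9]+)?$/)
                print $i
    }' | sort -u |
    while IFS= read -r path; do
        # Report only paths that do not exist right now.
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Usage (core file name is a placeholder):
#   strings -a core | missing_libs
```

An empty result proves nothing: it only means no absolute .sl path found in the core is missing today. Relative names such as ./libfoo.sl are ignored by this filter.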
rick jones
Honored Contributor

I wonder if auditing would/could record the failed open() call that would likely be associated with the missing library error. Of course, enabling auditing just for that sort of thing would be a rather heavyweight response to the issue...

Not that it helps if you cannot reproduce the problem, but a tusc system call trace could have shown the failed open() call.
there is no rest for the wicked yet the virtuous have no pillows
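
If the failure can ever be reproduced under tusc, the failed open() calls can be pulled out of the saved trace with a small filter. A sketch only: failed_opens is a hypothetical helper, and the line layout it expects (open("path", ...) with ENOENT later on the same line) is an assumption about tusc's output that should be checked against a real trace. The tusc options and file names in the usage comment are likewise assumptions to verify against tusc's own usage text.

```shell
#!/bin/sh
# failed_opens: read a system call trace on stdin and print the path of
# every open() call that failed with ENOENT.
# ASSUMPTION: each call is on one line, shaped like
#   open("/some/path", O_RDONLY, 0) ..... ERR#2 ENOENT
failed_opens() {
    # Capture the quoted first argument of open(...) on lines that also
    # mention ENOENT; print nothing for any other line.
    sed -n 's/.*open("\([^"]*\)".*ENOENT.*/\1/p'
}

# Usage (options and file names are assumptions, check them locally):
#   tusc -f -o /tmp/app.tusc ./a.out
#   failed_opens < /tmp/app.tusc
```

Not every ENOENT is the culprit, since dld may probe several directories before it finds a library, so the interesting entry is a name that never opens successfully anywhere in the trace.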
Srimalik
Valued Contributor

Yes, auditing or tusc logs would have helped, but as I said, I have only the core file for now :-)
abandon all hope, ye who enter here..
ranganath ramachandra
Esteemed Contributor

> Are there any other cases in which we can get this trace?

yes. it could happen due to unresolved symbols, corrupted data in the shared library file, failure to allocate memory, failure to load the shared library due to any other reason (say, not enough shared space for text segment), etc.

you can be sure that a stack trace like this, containing a "finish_dld_main" frame is a result of dld's failure at program startup either in loading shared libraries or in resolving symbols. but most or all of the failure modes mentioned above would show you the same stack trace and tusc would not help in differentiating amongst them. the error message would be specific.

> In fact, dld now does a kill -9 on itself so you don't get one, so you just rely on the error message.

the PA32 dld continues to dump core. in any case, error messages are emitted to stderr, so as with any other program, it is best to capture stderr to a log. digging into core files for something easily explained by an error message can be unnecessary and misleading.
 
--
ranga
[i work for hpe]
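
For something started from rc with its output thrown away, capturing stderr can be as small as a wrapper like the one below. It is a minimal sketch: run_logged is a hypothetical name, and the log path and daemon path in the usage comment are placeholders, not anything from a real rc script.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Run COMMAND with stderr appended to LOGFILE, so that a startup error
# printed by the dynamic loader is kept even when nobody is watching.
# stdout is left alone; the command's exit status is passed through.
run_logged() {
    log=$1
    shift
    # Record when and what was started, so log entries can be matched to runs.
    printf '%s: starting %s\n' "$(date)" "$*" >>"$log"
    "$@" 2>>"$log"
}

# Usage in a start script (paths are placeholders):
#   run_logged /var/adm/mydaemon.err /opt/myapp/bin/mydaemon >/dev/null
```

Appending rather than truncating keeps the message from the one failed start around after later successful restarts.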

Srimalik
Valued Contributor

Close again.
abandon all hope, ye who enter here..