
coredump in exit while using roguewave stl

 
Madhuwati
Occasional Contributor

Hi,

I am using the Roguewave STL to build a C++ shared library. When a C client that loads the library exits, it dumps core as follows:

Program terminated with signal 11, Segmentation fault.
#0 0x7ad1744c in std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_C_unlink (this=0x7aa945b0) at /src/hp/build/kernel/11.23-i80_LR/ux/vbe/pa/opt/aCC/include_std/string:993
993 if (!_C_data)
(gdb) where
#0 0x7ad1744c in std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_C_unlink (this=0x7aa945b0) at /src/hp/build/kernel/11.23-i80_LR/ux/vbe/pa/opt/aCC/include_std/string:993
#1 0x7ad175c0 in std::basic_string<char,std::char_traits<char>,std::allocator<char> >::~basic_string<char,std::char_traits<char>,std::allocator<char> > (this=0x40013fd4, #free=3) at /src/hp/build/kernel/11.23-i80_LR/ux/vbe/pa/opt/aCC/include_std/string:361
#2 0x7aa6bf08 in __DestructArray+0x80 () from /usr/lib/libCsup_v2.2
#3 0x7ad235f4 in #options#904+0x2c () from /usr/lib/libvrascmd.sl
#4 0x7aa695cc in __shlTerm+0x290 () from /usr/lib/libCsup_v2.2
#5 0x7aa696a4 in __shlInit+0x44 () from /usr/lib/libCsup_v2.2
#6 0x7ad15688 in _shlInit+0x20 () from /usr/lib/libvrascmd.sl
#7 0x7aa69020 in __shlinit+0xac () from /usr/lib/libCsup_v2.2
#8 0x7aa69324 in __callInitFuncFromHandle+0xf0 () from /usr/lib/libCsup_v2.2
#9 0x7aa6b0c4 in __niamHelper+0x8c () from /usr/lib/libCsup_v2.2
#10 0x7aa6b250 in _niam_body+0x88 () from /usr/lib/libCsup_v2.2
#11 0x7aa6b35c in _niam+0x1c () from /usr/lib/libCsup_v2.2
#12 0x7afb8a08 in exit+0x70 () from /usr/lib/libc.2
#13 0x1ffa8 in die (ex=20) at volrlink.c:8680
#14 0x10c1c in do_att (actp=0x40001040, nargs=1, args=0x7f7f04f0) at volrlink.c:1488
#15 0xeef8 in main (argc=4, argv=0x7f7f04e4) at volrlink.c:525
Current language: auto; currently c++

If I use a different STL, the problem does not occur. Details of the platform and binaries follow:

Platform: 11.23 PA PI
aCC: HP ANSI C++ B3910B X.03.37.01
92453-07 linker command s800.sgs ld PA64 B.11.16.01 BE 030415

C client links to :
dynamic /usr/lib/libm.2
dynamic /usr/lib/libcl.2
dynamic /usr/lib/libdld.2
dynamic /usr/lib/libc.2

C++ shared library links to:
dynamic /usr/lib/libstd_v2.2
dynamic /usr/lib/libCsup_v2.2
dynamic /usr/lib/libm.2
dynamic /usr/lib/libcl.2

C++ binaries built with:
-O -g +W829,612,749,495 +Z -Wc,-ansi_for_scope,on -z +DA1.1 -AA -DHPUX -D__EXTERN_C__ -D_REENTRANT -D_RW_MULTI_THREAD -D_RWSTD_MULTI_THREAD -D_POSIX_C_SOURCE=199506L -D_HPUX_SOURCE

shared library creation options: -b -AA

Can someone help here?

Thanks,
- Madhu
6 REPLIES
Madhuwati
Occasional Contributor

Re: coredump in exit while using roguewave stl

I posted the same question on another forum by mistake, so I am pasting the one reply I received there:

http://h21007.www2.hp.com/cmdspp/QuestionAnswer/1%2C1764%2C762C249F-5983-4B80-902C-E7A750858136%2C00.html

Reply By Mike Stroyan :
--------------------------

That would typically be caused by a destructor that is trying to reference an address that was part of global data in another shared library. The shared libraries each get their destructors called before their text and data get unmapped. If one of the shared libraries is unloaded but one of its text or data addresses is still recorded in another library's data structure, then using that address in a destructor can cause a segmentation fault.
You could use "info share" before going into exit to see what address ranges correspond to what libraries. You could then look at the address that caused the segmentation fault and map that back to a particular shared library.
The cure is to find and remove code that leaves an address in use on an unloaded shared library. A quick workaround is to find the library that was loaded at the faulting address and link that library with "-B nodelete" so it won't actually unload.
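
To make the failure mode described above concrete, here is a toy model of it. It is not HP-UX loader code: the ToyLibrary type, the holdersOfAddressesIn function, and every address used with them are hypothetical and only illustrate the idea of asking "which still-loaded libraries keep an address that points into the library being unloaded?".

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a shared library: the address range it occupies
// and the addresses it has recorded (and may touch from a destructor).
struct ToyLibrary {
    std::string name;
    std::uint32_t start;                  // first mapped address
    std::uint32_t end;                    // one past the last mapped address
    std::vector<std::uint32_t> recorded;  // addresses this library still holds
};

// Names of the still-loaded libraries that hold at least one address
// inside the range of the library being unloaded. Each of these would be
// a candidate for a crash when its destructors run later.
std::vector<std::string> holdersOfAddressesIn(
        const ToyLibrary& unloaded,
        const std::vector<ToyLibrary>& stillLoaded) {
    assert(unloaded.start < unloaded.end);
    std::vector<std::string> holders;
    for (const ToyLibrary& lib : stillLoaded) {
        for (std::uint32_t addr : lib.recorded) {
            if (addr >= unloaded.start && addr < unloaded.end) {
                holders.push_back(lib.name);
                break;  // one dangling address is enough to flag the library
            }
        }
    }
    return holders;
}
```

In a real process the recorded addresses are not listed anywhere this neatly, which is why the "info share" approach works backwards from the faulting address instead.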

P.S. This forum is for Itanium development. Questions about PA-RISC would fit better in another forum, such as the HP-UX languages forum at
http://forums1.itrc.hp.com/service/forums/categoryhome.do?categoryId=150
or the cxx dev mailing list at
http://h21007.www2.hp.com/dspp/ml/ml_MailingLists_IDX/1,1275,,00.html#24
Madhuwati
Occasional Contributor

Re: coredump in exit while using roguewave stl

Hi Mike,

The particular address "addr" from _C_unlink(this="addr") was mapped in /usr/lib/libstd_v2.2

I could not pass -B nodelete to cc, which is used to build the C client, so I compiled the C client with aCC instead. That still did not help.

Looking at this="addr2" from the basic_string destructor, I found that this address holds a null string:
gdb > x/2c 0x40013fd4
0x40013fd4 : 0 '\000' 0 '\000'
But this seems correct, as the string has already been deleted.

Also worth noting: nm on my C client shows the symbol trans_rec_type as undefined:

bash-2.04# nm /vxrlink | grep trans_rec_type
trans_rec_type |1073823708|undef |common |$BSS$

As I said previously, I do not get this core dump if I do not use the Roguewave STL.
Mike Stroyan
Honored Contributor

Re: coredump in exit while using roguewave stl

It is not clear to me that the "addr" from _C_unlink(this="addr") was actually the address that caused the SIGSEGV. Look at the failing instruction and the register that it is using for the load or store address. That faulting address might belong to some already unmapped shared library. Or it might be a completely invalid address arising from some other error.

The /usr/lib/libstd_v2.2 library certainly should not be unmapped if a shared library that depends on it is still mapped.
Madhuwati_1
New Member

Re: coredump in exit while using roguewave stl

Hi Mike,

I tried to follow your pointers below but got completely lost.

>Look at the failing instruction and the
>register that it is using for the load or
>store address. That faulting address might
>belong to some already unmapped shared
>library. Or it might be a completely invalid
>address arising from some other error.

The core dump happened at

std::basic_string<char,std::char_traits<char>,std::allocator<char> >::~basic_string<char,std::char_traits<char>,std::allocator<char> >+0x20 ()

This instruction is mapped to libstd.

Disassembling this address gives
<_C_unlink__Q2_3std12basic_stringXTcTQ2_3std11char_traitsXTc_TQ2_3std9allocatorXTc__Fv+32>: ldw -0xc(%r31),%r21

NOTE: the earlier core was overwritten by a new one, hence the different values.

(gdb) info registers r31
r31 2
(gdb) info registers r21
r21 c2dbbc2c
(gdb) info registers r2
rp c2dbbc03
(gdb) x/c 0xc2dbbc2c
0xc2dbbc2c
<_C_unlink__Q2_3std12basic_stringXTcTQ2_3std11char_traitsXTc_TQ2_3std9allocatorXTc__Fv>:
107 'k'
(gdb) x/c 0xc2dbbc03
0xc2dbbc03
<__dt__Q2_3std12basic_stringXTcTQ2_3std11char_traitsXTc_TQ2_3std9allocatorXTc__Fv+35>:
-63 '\301'

How do I proceed further?

Thanks,
- Madhu
Madhuwati_1
New Member

Re: coredump in exit while using roguewave stl

Hi Mike,

OK, so I tried a bit harder, and this is what I found:

The instruction where we get the segmentation fault is:
0xc2f6d67c in
std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_C_unlink
(this=0x7afe75b0) at
/src/hp/build/kernel/11.23-ic71l,VM/ux/vbe/pa/opt/aCC/include_std/string:993
993 if (!_C_data)

First, we look at 'info share' to see which libraries these instruction and data addresses belong to:
(gdb) info share
/usr/lib/libvrascmd.sl
0x00000201 0xc2f00000 0xc3002000 0x7afed000 0x7b008000 0x7afedf34
/usr/lib/libstd_v2.2
0x00000201 0xc2d00000 0xc2ef3000 0x7afc9000 0x7afe8000 0x7afcaca0

Clearly the instruction at 0xc2f6d67c belongs to:
/usr/lib/libvrascmd.sl

And the data address from the this pointer (0x7afe75b0) belongs to /usr/lib/libstd_v2.2
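
The lookup done by eye above can also be written as a small sketch. The ranges are copied from the 'info share' output in this post; reading the second to fifth columns as text start, text end, data start, and data end is an assumption based on how the addresses are interpreted here, and the names ShareEntry, threadShareTable, and classify are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One row of `info share` output (column meaning is an assumption:
// text start, text end, data start, data end).
struct ShareEntry {
    std::string path;
    std::uint32_t textStart, textEnd;
    std::uint32_t dataStart, dataEnd;
};

// The two rows quoted in the post above.
std::vector<ShareEntry> threadShareTable() {
    return {
        {"/usr/lib/libvrascmd.sl", 0xc2f00000u, 0xc3002000u, 0x7afed000u, 0x7b008000u},
        {"/usr/lib/libstd_v2.2",   0xc2d00000u, 0xc2ef3000u, 0x7afc9000u, 0x7afe8000u},
    };
}

// Returns "<path> (text)", "<path> (data)", or "unmapped" for an address.
std::string classify(const std::vector<ShareEntry>& table, std::uint32_t addr) {
    for (const ShareEntry& e : table) {
        assert(e.textStart < e.textEnd && e.dataStart < e.dataEnd);
        if (addr >= e.textStart && addr < e.textEnd) return e.path + " (text)";
        if (addr >= e.dataStart && addr < e.dataEnd) return e.path + " (data)";
    }
    return "unmapped";
}
```

With this table, classify places 0xc2f6d67c in the text of libvrascmd.sl and 0x7afe75b0 in the data of libstd_v2.2, matching the reading above. "unmapped" only means the address is in neither of these two libraries, since the table holds just the rows quoted here.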

Looking at the this pointer to _C_unlink:

(gdb) x 0x7afe75b0
0x7afe75b0: 0x7aff4588
(gdb) x 0x7aff4588
0x7aff4588: 0x00000000
Now, looking at the instruction at 0xc2f6d67c:

(gdb) x/i 0xc2f6d67c
0xc2f6d67c
<_C_unlink__Q2_3std12basic_stringXTcTQ2_3std11char_traitsXTc_TQ2_3std9allocatorXTc__Fv+28>:
ldw -0xc(%r31),%r20

Next we look at registers r31 and r20. The instruction above loads a word into a general register. The displacement -0xc is added to %r31 = 8089c00:


(gdb) info registers
flags: 2f000041
r1: 8089c00 rp: c2f6d7f3
r19: 7afedf34 r20: 0
r31: 1 sar: 10

Doing that, I get:

(gdb) print/x 0x8089c00 - 0xc
$1 = 0x8089bf4

(gdb) x 0x8089bf4
0x8089bf4: Cannot access memory at address 0x8089bf4

This could be the reason why I am getting a segmentation fault.

Core was generated by `vxrlink'.
Program terminated with signal 11, Segmentation fault.

- Madhu


Mike Stroyan
Honored Contributor

Re: coredump in exit while using roguewave stl

Your gdb output actually shows that r31 contains 2 in one run and 1 in another run. That would cause the ldw instruction to be trying bad addresses like 0xfffffff5 or 0xfffffff6.
It is clear that the data loaded into r31 is junk, not an address. Perhaps you could mail me the disassembly of the crashing function at mike.stroyan@hp.com and I could have a look at where the bad data was loaded from. It is most likely that the bad data is from some object that has been deleted. The bad address is being read from an address that has been reused and overwritten.
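
The address arithmetic in this reply can be checked with a short sketch. It treats the load as base register plus displacement in 32-bit unsigned arithmetic that wraps around, which is a simplification and not a PA-RISC emulator; effectiveAddress and checkThreadValues are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Effective address of a base+displacement load such as
// `ldw -0xc(%r31),%r20`, modelled as wrapping 32-bit unsigned arithmetic.
std::uint32_t effectiveAddress(std::uint32_t base, std::int32_t displacement) {
    return base + static_cast<std::uint32_t>(displacement);
}

// Replays the register values quoted in the thread.
void checkThreadValues() {
    assert(effectiveAddress(2u, -0xc) == 0xfffffff6u);          // r31 == 2
    assert(effectiveAddress(1u, -0xc) == 0xfffffff5u);          // r31 == 1
    assert(effectiveAddress(0x8089c00u, -0xc) == 0x8089bf4u);   // r1, not r31
}
```

The last check reproduces the 0x8089bf4 value computed earlier in the thread. That value came from r1 (8089c00), whereas the instruction uses r31 as its base register.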