Operating System - HP-UX

Re: Aries coredumps/memory limits.

 
UNIXTEK
Frequent Advisor

Re: Aries coredumps/memory limits.

Hi Rajesh

I hear you. I will open a support case as well. It can just be unbelievably hard to get to the _right_ people through the proper channels in HP ;-)

Sometimes it is way easier just talking to them in here :-)
Dennis Handly
Acclaimed Contributor

Re: Aries coredumps/memory limits.

>Attached a full trace.

Are you able to use /usr/ccs/bin/gdbpa to get a stack trace from your core file?
UNIXTEK
Frequent Advisor

Re: Aries coredumps/memory limits.

tsiv@gobi|pts/2:/home/tsiv$ sudo /usr/ccs/bin/gdbpa /opt/bmc/BladeLogic/8.0/NSH/bin/rscd_full /core.rscd_full
HP gdb 5.9 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00
and target hppa1.1-hp-hpux11.00.
Copyright 1986 - 2001 Free Software Foundation, Inc.
Hewlett-Packard Wildebeest 5.9 (based on GDB) is covered by the
GNU General Public License. Type "show copying" to see the conditions to
change it and/or distribute copies. Type "show warranty" for warranty/support.
..
warning: Load module /opt/bmc/BladeLogic/8.0/NSH/bin/rscd_full has been stripped

(no debugging symbols found)...
Core was generated by `rscd_full'.
Program terminated with signal 11, Segmentation fault.
SEGV_UNKNOWN - Unknown Error
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
#0 0xd1054264 in _ZNSsC1ERKSs+0xe4 ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl

warning: core file might be corrupted.
(gdb) bt
#0 0xd1054264 in _ZNSsC1ERKSs+0xe4 ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#1 0xd1053028 in _ZN22SysinfoVideoCardParser11parseTokensERSt6vectorISsSaISsEE
+0x500 ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#2 0xd0fcc480 in _ZN19SysinfoDeviceParser11parseTokensERSt6vectorISsSaISsEE
+0x398 ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#3 0xd100ec74 in _ZN13SysinfoParser9parseLineERSs+0x4ec ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#4 0xd100cc5c in _ZN13SysinfoParser4loadEPN12plugincommon5AssetEb+0x58c ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#5 0xd1064d24 in _ZN7sysinfo12SysinfoAsset16onGetDescendantsERN12plugincommon6StreamEi+0x164 ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#6 0xd10797ec in _ZN12plugincommon5Asset21bl_OpenGetDescendantsEiPPK21blAssetAttrDescriptorP13blAssetStream+0x10c ()
UNIXTEK
Frequent Advisor

Re: Aries coredumps/memory limits.

Forgot something

from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#7 0xd105d27c in blAsset_OpenGetDescendants+0xfc ()
from /opt/bmc/BladeLogic/8.0/NSH/daal/Implementation/SystemInfo_libdaalpluginsysinfo.sl/libdaalpluginsysinfo.sl
#8 0x27da14 in + 0x23c ()
#9 0x238760 in + 0xa78 ()
#10 0x1fc158 in + 0x130 ()
#11 0x18e740 in + 0x2e0 ()
#12 0x1b8024 in + 0x814 ()
#13 0xceabb1e4 in _ZN3Rpc9RpcServer13executeMethodEPNS_10RpcContextERKSsRNS_8RpcValueES6_+0x12c () from /opt/bmc/BladeLogic/8.0/NSH/lib/librpccommon.sl
#14 0xceac572c in _ZN3Rpc22RpcTCPServerConnection14executeRequestERNS_20RpcNormalizedMessageE+0x234 () from /opt/bmc/BladeLogic/8.0/NSH/lib/librpccommon.sl
#15 0xceac4d68 in _ZN3Rpc22RpcTCPServerConnection13writeResponseERNS_20RpcNormalizedMessageEi+0x3b0 () from /opt/bmc/BladeLogic/8.0/NSH/lib/librpccommon.sl
#16 0xceac479c in _ZN3Rpc22RpcTCPServerConnection11handleEventEji+0x5b4 ()
from /opt/bmc/BladeLogic/8.0/NSH/lib/librpccommon.sl
#17 0xceac2620 in _ZN3Rpc14RpcTCPDispatch4workEd+0x5e8 ()
from /opt/bmc/BladeLogic/8.0/NSH/lib/librpccommon.sl
#18 0xceb468e8 in agent_rpc_dispatch+0x38 ()
from /opt/bmc/BladeLogic/8.0/NSH/lib/libagentrpc.sl
#19 0x6e548 in + 0x1c0 ()
#20 0x7c344 in + 0x47c ()
#21 0x7bd30 in + 0x128 ()
#22 0x7bb28 in + 0x8f8 ()
UNIXTEK
Frequent Advisor

Re: Aries coredumps/memory limits.

The funny thing is

* that if I run the IA64 binary by hand it works.
* I have other machines, where it works.
Rajesh K Chaurasia
Valued Contributor

Re: Aries coredumps/memory limits.

@UNIXTEK,

There are no clues in the tusc log. However, I noticed that process 17838 prints the following on stdout before receiving SIGSEGV:

06/08/11 13:42:3 IndexedName not found. one created is hardware.networkcard:1
06/08/11 13:42:3 IndexedName not found. one created is hardware.networkcard:2
06/08/11 13:42:3
Here comes SIGSEGV

Does this sound familiar from the application's point of view? Could this be related to the application configuration?

You could try one more debugging step:
create a $HOME/.ariesrc file with the following content

-notrans

and restart the application. It will run very slowly, but if it does not fail with the same symptoms, the error could be related to the ARIES dynamic translator. If the application still fails, the issue could be related to application configuration/setup. What is the difference in application setup between the servers where it works and the server where it fails?

Do not forget to remove the .ariesrc file after the experimental run.

Regards
-Rajesh
Dennis Handly
Acclaimed Contributor

Re: Aries coredumps/memory limits.

#0 0xd1054264 in _ZNSsC1ERKSs+0xe4

This is aborting in:
string::string(string const&)

At frame 0, you might want to try:
info reg
disas $pc-4*16 $pc+4*8
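
For reference, the arithmetic in that disas command: PA-RISC instructions are a fixed 4 bytes wide, so the range covers the 16 instructions before the faulting pc and 8 from it onward. A small Python sketch, taking pc from frame 0 of the backtrace (the helper name is made up for illustration):

```python
INSN_BYTES = 4  # PA-RISC instructions are fixed-width, 4 bytes each

def disas_window(pc, before=16, after=8):
    """Return the (start, end) addresses for `disas $pc-4*before $pc+4*after`."""
    return pc - INSN_BYTES * before, pc + INSN_BYTES * after

# pc of frame 0: #0 0xd1054264 in _ZNSsC1ERKSs+0xe4
start, end = disas_window(0xd1054264)
print(hex(start), hex(end))  # 0xd1054224 0xd1054284
```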

>that if I run the IA64 binary by hand it works.

What IA64 binary? It is the PA executable that is aborting.
UNIXTEK
Frequent Advisor

Re: Aries coredumps/memory limits.

Hi Guys

The cat is out of the bag. The BladeLogic agent is a PA-RISC 1.1 binary even on Itanium. If I instruct the agent to give me 'system info', on PA-RISC machines it will call a PA-RISC binary to fetch the info. On IA64 it will call an IA64 binary. That is what I mean by "if I run the IA64 binary by hand".

If I put
/opt/bmc/BladeLogic/8.0/NSH/bin/rscd_full -notrans

in /.ariesrc and restart the BL agent, it runs dog slow, but still produces a nice coredump ;-)

The disassembled code:

(gdb) info reg
flags: 41
r1: 7 rp/r2: d1054247 r3: 7ab11d78 r4: 7af408ec r5: 400309a0 r6: 7b0452d0
r7: 4006c368 r8: 0 r9: 4 r10: 8 r11: 10 r12: 91
r13: 2 r14: 1000 r15: 400598d0 r16: 0 r17: 0 r18: 0
r19: 7ab11d78 r20: 7b045ac8 r21: 336638 r22: 40030938 arg3/r23: 400b8854 arg2/r24: 400309a0
arg1/r25: 7b045ac8 arg0/r26: 656763 dp/gp/r27: 4002e590 ret0/r28: 65676f ret1/ap/r29: 0 sp/r30: 7b045c00
mrp/r31: d10c988b sar/cr11: 20 pcoqh: d1054264 pcsqh: 0 pcoqt: d1054268 pcsqt: 0
eiem/cr15: 0 iir/cr19: f501094 isr/cr20: 0 ior/cr21: 0 ipsw/cr22: 4000f goto: 2
sr4: 4 sr0: 7 sr1: 4 sr2: 2 sr3: 3 sr5: 5
sr6: 6 sr7: 7 rctr/cr0: 0 pidr1/cr8: 0 pidr2/cr9: 0 ccr/cr10: 0
pidr3/cr12: 0 pidr4/cr13: 0 cr24: 0 cr25: 0 cr26: 0 mpsfu_high: 7ade0098
mpsfu_low: 0 mpsfu_ovflo: 0 pad: 0 fpsr: 8000000 fpe1: 0 fpe2: 0
fpe3: 0 fpe4: 0 fpe5: 0 fpe6: 0 fpe7: 0
(gdb) disas $pc-4*16 $pc+4*8
Dump of assembler code from 0xd1054224 to 0xd1054284:
0xd1054224 <_ZNSsC1ERKSs+0xa4>: stw %r20,-0x100(%sp)
0xd1054228 <_ZNSsC1ERKSs+0xa8>: b,l 0xd1054230 <_ZNSsC1ERKSs+0xb0>,%r1
0xd105422c <_ZNSsC1ERKSs+0xac>: depwi 0,31,2,%r1
0xd1054230 <_ZNSsC1ERKSs+0xb0>: ldo 0xec(%r1),%r1
0xd1054234 <_ZNSsC1ERKSs+0xb4>: stw %r1,-0xfc(%sp)
0xd1054238 <_ZNSsC1ERKSs+0xb8>: stw %sp,-0xf8(%sp)
0xd105423c <_ZNSsC1ERKSs+0xbc>: b,l 0xd104d108 <_Unwind_SjLj_Register>,%rp
0xd1054240 <_ZNSsC1ERKSs+0xc0>: ldo -0x120(%sp),%r26
0xd1054244 <_ZNSsC1ERKSs+0xc4>: ldw -0xf0(%sp),%r19
0xd1054248 <_ZNSsC1ERKSs+0xc8>: ldw -0x164(%sp),%r20
0xd105424c <_ZNSsC1ERKSs+0xcc>: stw %r20,-0xf4(%sp)
0xd1054250 <_ZNSsC1ERKSs+0xd0>: ldw -0x168(%sp),%r20
0xd1054254 <_ZNSsC1ERKSs+0xd4>: ldw 0(%r20),%ret0
0xd1054258 <_ZNSsC1ERKSs+0xd8>: ldo -0xc(%ret0),%r26
0xd105425c <_ZNSsC1ERKSs+0xdc>: ldo -0x138(%sp),%r20
0xd1054260 <_ZNSsC1ERKSs+0xe0>: copy %r20,%r25
0xd1054264 <_ZNSsC1ERKSs+0xe4>: ldw 8(%r26),%r20
0xd1054268 <_ZNSsC1ERKSs+0xe8>: cmpib,> 0,%r20,0xd1054280 <_ZNSsC1ERKSs+0x100>
0xd105426c <_ZNSsC1ERKSs+0xec>: ldi 1,%r20
0xd1054270 <_ZNSsC1ERKSs+0xf0>: ldw -4(%ret0),%r20
0xd1054274 <_ZNSsC1ERKSs+0xf4>: ldo 1(%r20),%r20
0xd1054278 <_ZNSsC1ERKSs+0xf8>: b 0xd1054290 <_ZNSsC1ERKSs+0x110>
0xd105427c <_ZNSsC1ERKSs+0xfc>: stw %r20,-4(%ret0)
0xd1054280 <_ZNSsC1ERKSs+0x100>: stw %r20,-0x11c(%sp)
End of assembler dump.
Dennis Handly
Acclaimed Contributor

Re: Aries coredumps/memory limits.

0xd1054250 <_ZNS+0xd0>: ldw -0x168(%sp),%r20
0xd1054254 <_ZNS+0xd4>: ldw 0(%r20),%ret0
0xd1054258 <_ZNS+0xd8>: ldo -0xc(%ret0),%r26
0xd1054264 <_ZNS+0xe4>: ldw 8(%r26),%r20

This could be memory corruption. R26 has 0x656763, which is a NUL byte followed by "egc". That does not look like a data address.
It comes from r28 (0x65676f, which is "ego") minus 0xc.

And R20 has a stack address: r20: 7b045ac8
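
The check above is easy to reproduce. A minimal Python sketch, using only the register values from the info reg output (the helper name is hypothetical): it redoes the `ldo -0xc(%ret0),%r26` subtraction and renders both values as big-endian bytes, with non-printable bytes shown as a dot.

```python
def reg_as_text(value, width=4):
    """Render a 32-bit register value as big-endian ASCII; '.' for non-printables."""
    raw = value.to_bytes(width, "big")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

ret0 = 0x65676f        # ret0/r28 from `info reg`
r26 = ret0 - 0xc       # ldo -0xc(%ret0),%r26
fault_addr = r26 + 8   # address read by the faulting ldw 8(%r26),%r20

print(hex(r26), reg_as_text(r26))    # 0x656763 .egc
print(hex(ret0), reg_as_text(ret0))  # 0x65676f .ego
print(hex(fault_addr))               # 0x65676b
```

A register that decodes to readable text like this often suggests that string data has overwritten what should have been a pointer, though that is only a hint and not proof.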