08-02-2012 01:36 AM
Re: Core dump (heap corruption)
Thanks Dennis.
I will start applying the steps you provided once we receive the packcore.
Meanwhile I was browsing through the code. What I found is:
In the .c file I have:
PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();
In the .h file:
private:
static PCA_BufferAllocator* bufAllocator;
PCA_BufferAllocator creates chunks of buffers using new/malloc:
char * L_mem, L_mem = NEW_NOTHROW char[P_size];
The delete on this buffer is causing the dump. Do you have any input on whether static/global corruption may be causing this? Your comments will help me a lot.
08-02-2012 02:34 AM
>PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();
Is there an explicit check that bufAllocator isn't NULL?
>there are chunks of buffers created in pca_bufferallocator using new/malloc
So the class PCA_BufferAllocator contains pointers to these?
>delete on this buffer is causing the dump. Do you have any input on whether static/global corruption may be causing this?
What is being deleted? bufAllocator or a field within it? The latter would still be heap corruption.
The former could have the static area overwritten. But you would notice it when you get the $r32 value.
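To make the NULL question concrete, here is a minimal sketch of an explicit check around a `new(nothrow)` static. `BufferAllocator`, `Message`, `allocatorReady` and `allocateChecked` are hypothetical stand-ins, since the real PCA_BufferAllocator and PIMessage sources are not shown in this thread.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for PCA_BufferAllocator (the real class is not shown here).
class BufferAllocator {
public:
    // new(nothrow) returns NULL on failure instead of throwing.
    char* alloc(std::size_t size) { return new (std::nothrow) char[size]; }
    // Array new must be paired with array delete.
    void free(char* p) { delete[] p; }
};

// Hypothetical stand-in for PIMessage.
class Message {
public:
    static BufferAllocator* allocator() { return bufAllocator; }
    // Explicit check that the static allocator was actually created.
    static bool allocatorReady() { return bufAllocator != NULL; }
private:
    static BufferAllocator* bufAllocator;
};

// Same initialization pattern as in the thread: may leave the pointer NULL.
BufferAllocator* Message::bufAllocator = new (std::nothrow) BufferAllocator();

// Allocates through the shared allocator; returns NULL if the allocator
// itself is missing or if the buffer allocation fails.
char* allocateChecked(std::size_t size) {
    if (!Message::allocatorReady()) {
        return NULL;
    }
    return Message::allocator()->alloc(size);
}
```

A caller would test the result of `allocateChecked(n)` against NULL before using the buffer, and release it with `Message::allocator()->free(p)`.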
08-10-2012 02:40 AM
Hi Dennis,
I did run gdb on the core dump. I could find the corruption/double deletion in frame 4.
(gdb) frame 4
(gdb) info reg
gr32: 0xc00000000000048e
(gdb) x 0xc00000000000048e
0xc00000000000048e: Cannot access memory at address 0xc00000000000048e
(gdb)
This memory is allocated on the heap and multiple objects refer to this address. I suspect one of the objects frees this memory, and the corruption happens when another object tries to free it again. I suspect a corner case.
Can you please comment on how to go ahead? Thanks for your help.
08-10-2012 04:36 AM
>I did run gdb on the core dump. I could find the corruption/double deletion in frame 4.
>(gdb) frame 4
>gr32: 0xc00000000000048e
Why are you looking at frame 4? Where is your whole stack trace? I suggested looking at frames 3 .. 0.
>This memory is allocated in heap and multiple objects refer to this address location.
Why do you say that? It can't be invalid if it is in the heap. And the above typically isn't a heap address.
>Please can you comment how to go ahead.
Provide the output I requested.
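As a rough illustration of why 0xc00000000000048e does not look like a heap address: on IA-64 the top three bits of a 64-bit virtual address select the region. The region labels in this sketch are inferred only from addresses that appear in this thread (libc code at 0xc000..., application code at 0x4000..., `__gp` and the freed buffer at 0x6000..., bsp at 0x9fff...), so treat it as a heuristic and not as an authoritative HP-UX memory map. The function names are hypothetical.

```cpp
#include <cstdint>

// Region number = top three bits of a 64-bit IA-64 virtual address.
inline unsigned regionOf(std::uint64_t addr) {
    return static_cast<unsigned>(addr >> 61);
}

// Labels are assumptions inferred from the addresses seen in this thread only.
inline const char* describeRegion(std::uint64_t addr) {
    switch (regionOf(addr)) {
        case 2:  return "application text";       // e.g. 0x4000000000ff0e70
        case 3:  return "data/heap";              // e.g. 0x600000010d5518d0
        case 4:  return "stack/backing store";    // e.g. 0x9fffffffef7ffa60
        case 6:  return "shared library";         // e.g. 0xc0000000003ad110
        default: return "unknown";
    }
}
```

With these assumptions, `describeRegion(0xc00000000000048eULL)` lands in the same region as the libc addresses in the backtrace, not in the region where the heap buffers live.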
08-10-2012 05:01 AM
Dennis,
Full bt.
#0 0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
#1 0xc0000000003a84c0:0 in real_free+0x600 () from /usr/lib/hpux64/libc.so.1
#2 0xc0000000003b3210:0 in free+0x170 () from /usr/lib/hpux64/libc.so.1
#3 0x4000000000ff0e70:0 in operator delete[] ()
at build/OCSCframeworks/code/MemoryHandler/MemHandler.C:538
#4 0x4000000001150f10:0 in PCA_BufferAllocator::free ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_BufferAllocator.C:73
#5 0x4000000001145ba0:0 in PCA_DataBuffer_impl::decRef ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer_impl.C:78
#6 0x400000000113d890:0 in PCA_DataBuffer::~PCA_DataBuffer ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer.C:75
#7 0x40000000011398f0:0 in PCA_ArrayBuffer::erase ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:211
#8 0x4000000001139ac0:0 in PCA_ArrayBuffer::~PCA_ArrayBuffer ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:81
#9 0x400000000114cfd0:0 in PCA_Message_impl::~PCA_Message_impl ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message_impl.C:93
#10 0x4000000001147640:0 in PCA_Message::~PCA_Message ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message.C:74
#11 0x40000000007863e0:0 in PIMessage::~PIMessage ()
at build/OCSEPcore/code/IC32_Messages/PIMessage.C:125
#12 0x40000000007bb210:0 in SLEEMessage_ImplOf<PIMessage>::~SLEEMessage_ImplOf ()
at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessage_ImplOf.C:31
#13 0x40000000007baa30:0 in Inter_PIMessage::~Inter_PIMessage ()
at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessageOf.C:79
#14 0x40000000009475e0:0 in SlelMessageRef::setRef ()
at build/OCSEPcore/code/IC32_SLEL_common/SlelMessageRefForSlee.C:147
#15 0x40000000007d6e90:0 in Inter::Run ()
at build/OCSEPcore/code/IC32_SLEL/Interpreter.C:751
#16 0x40000000005ee820:0 in CCTX_CallContext::processMessage ()
at build/OCSEPcore/code/IC32_CallContextMgt/CCTX_CallContext.C:1480
#17 0x40000000005dfb30:0 in CCTX_CallProcessor::processRequest ()
at build/OCSEPcore/code/IC32_CallContextMgt/CallProcessor.C:733
#18 0x4000000000523560:0 in GS_AnyHandler2Task<Handler>::process ()
For frame 0:
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0
gr33: 0x600000010d5518b8
gr34: 0x1
For frame 1:
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0
gr33: 0x9fffffffef6e61b0
gr34: 0x600000010fae8f40
gr35: 0xc00000000000060f
gr36: 0xc0000000003b3210
gr37: 0x944d
For frame 2:
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0x9fffffffef6dab10
gr33: 0x9fffffffef6e0e98
gr34: 0xc000000000000713
For frame 3:
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0x600000000004d7e8
gr33: 0x60000000000537c0
gr34: 0xc000000000000205
gr35: 0x4000000001150f10
gr36: 0x6000000000053298
gr37: 0x404
Please let me know what other data I should capture for this.
Thanks :)
08-10-2012 05:10 AM
I don't see a common address in r32 in those frames. So we would have to disassemble frame #3.
Also, where is this info for frame 0?
(gdb) info reg
(gdb) disas $pc-16*12 $pc+16*4
08-10-2012 06:45 AM
Hi Dennis,
Please find the information below. Please let me know if any other info is needed. Thanks for your great support.
info reg of frame 0:
pr63: 0
gr0: 0
gr1: 0x9fffffffef6e0e98
gr2: 0x9fffffff5ffe7c00
gr3: 0x9fffffff5ffe7c00
gr4: 0
gr5: 0xc000000000000408
gr6: 0xc000000000045890
gr7: 0x9fffffffef7f8de8
gr8: 0x440
gr9: 0x450
gr10: 0x600000000019fa00
gr11: 0x440
gr12: 0x9fffffffffffe790
gr13: 0x9fffffffef6a94b0
gr14: 0x440
gr15: 0
gr16: 0x600000010f2c3728
gr17: 0xf0fff20
gr18: 0x10f5c6558
gr19: 0x60000000001a0290
gr20: 0x441
gr21: 0x6000000110bbfff8
gr22: 0xf0fff00
gr23: 0xf0fff18
gr24: 0x440
gr25: 0x1
gr26: 0
gr27: 0x100000001
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0
gr33: 0x600000010d5518b8
gr34: 0x1
br0: 0xc0000000003a84c0
br1: 0xc000000000427f40
br2: 0xc00000000029d160
br3: 0
br4: 0
br5: 0
br6: 0xc00000000042e870
br7: 0xc00000000013cd20
rsc: 0x1f
bsp: 0x9fffffffef7ffa60
bspst: 0x9fffffffef7ff7d0
rnat: 0
ccv: 0x100000001
unat: 0
fpsr: 0x9804c9a74433f
pfs: 0xc000000000000a17
(sor:0, sol:20, sof:23)
lc: 0x63
ec: 0
ip: 0xc0000000003ad110:1
cfm: 0x3
(sor:0, sol:0, sof:3)
psr: 0x3a3a616c6c
(gdb)
(gdb) disas $pc-16*12 $pc+16*4
Dump of assembler code from 0xc0000000003ad050:0 to 0xc0000000003ad150:0:
0xc0000000003ad050:0 <tree_delete+0x10>:
adds ret3=16,r33 MMB,
0xc0000000003ad050:1 <tree_delete+0x11>: nop.m 0x0
0xc0000000003ad050:2 <tree_delete+0x12>: (p6) br.ret.dpnt.many rp;;
0xc0000000003ad060:0 <tree_delete+0x20>:
ld8 r28=[ret2] MMI
0xc0000000003ad060:1 <tree_delete+0x21>:
ld8 r22=[ret3]
0xc0000000003ad060:2 <tree_delete+0x22>:
shl ret1=ret1,8
0xc0000000003ad070:0 <tree_delete+0x30>:
addl ret2=0xffffffffffffd968,gp MIB,
0xc0000000003ad070:1 <tree_delete+0x31>: mov r25=1
0xc0000000003ad070:2 <tree_delete+0x32>: nop.b 0x0;;
0xc0000000003ad080:0 <tree_delete+0x40>:
ld8 ret2=[ret2] MMI
0xc0000000003ad080:1 <tree_delete+0x41>:
cmp.ne.unc p7=r0,r28
0xc0000000003ad080:2 <tree_delete+0x42>: mov ret3=r22
0xc0000000003ad090:0 <tree_delete+0x50>:
adds r17=32,r28 MMI,
0xc0000000003ad090:1 <tree_delete+0x51>:
adds r15=32,r22
0xc0000000003ad090:2 <tree_delete+0x52>:
mov r26=r22;;
0xc0000000003ad0a0:0 <tree_delete+0x60>:
(p7) cmp.ne.unc p6=r0,ret3 MMI
0xc0000000003ad0a0:1 <tree_delete+0x61>:
add ret2=ret1,ret2
0xc0000000003ad0a0:2 <tree_delete+0x62>:
cmp.ne.and p7=r0,ret3
0xc0000000003ad0b0:0 <tree_delete+0x70>:
mov ret1=r22 MMB,
0xc0000000003ad0b0:1 <tree_delete+0x71>:
adds r23=24,r28
0xc0000000003ad0b0:2 <tree_delete+0x72>:
(p7) br.cond.dpnt.many 0xc0000000003ad160;;
0xc0000000003ad0c0:0 <tree_delete+0x80>:
or r22=r28,r22 MII,
0xc0000000003ad0c0:1 <tree_delete+0x81>: nop.i 0x0
0xc0000000003ad0c0:2 <tree_delete+0x82>: nop.i 0x0;;
0xc0000000003ad0d0:0 <tree_delete+0x90>:
ld8.s r14=[r33] MMI,
0xc0000000003ad0d0:1 <tree_delete+0x91>:
ld8 r15=[ret2]
0xc0000000003ad0d0:2 <tree_delete+0x92>:
cmp.ne.unc p6=r0,r22;;
0xc0000000003ad0e0:0 <tree_delete+0xa0>:
(p6) ld8.a ret3=[r33] MII,
0xc0000000003ad0e0:1 <tree_delete+0xa1>:
adds ret1=8,r14
0xc0000000003ad0e0:2 <tree_delete+0xa2>:
cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad0f0:0 <tree_delete+0xb0>:
(p8) ld8.s r15=[ret1] MMI,
0xc0000000003ad0f0:1 <tree_delete+0xb1>:
(p7) st8 [ret2]=r22
0xc0000000003ad0f0:2 <tree_delete+0xb2>:
(p8) chk.s.i r14,0xc0000000003ad4f0;;
0xc0000000003ad100:0 <tree_delete+0xc0>:
(p8) chk.s.m r15,0xc0000000003ad510 MI,I
0xc0000000003ad100:1 <tree_delete+0xc1>:
(p8) cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>:
(p8) adds ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>:
(p7) st8 [ret1]=r22;; M,MI,
0xc0000000003ad110:1 <tree_delete+0xd1>:
(p8) st8 [ret1]=r22
0xc0000000003ad110:2 <tree_delete+0xd2>: nop.i 0x0;;
0xc0000000003ad120:0 <tree_delete+0xe0>:
(p6) ld8.c.clr ret3=[r33] MMI,
0xc0000000003ad120:1 <tree_delete+0xe1>:
(p6) st8 [r22]=ret3
0xc0000000003ad120:2 <tree_delete+0xe2>: nop.i 0x0;;
0xc0000000003ad130:0 <tree_delete+0xf0>:
adds ret1=48,ret2
0xc0000000003ad130:1 <tree_delete+0xf1>:
cmp4.ne.unc p6=r0,r34;;
0xc0000000003ad130:2 <tree_delete+0xf2>: nop.i 0x0
0xc0000000003ad140:0 <tree_delete+0x100>:
(p6) ld8 ret2=[ret1];; M,MI
0xc0000000003ad140:1 <tree_delete+0x101>:
(p6) st8 [r33]=ret2
0xc0000000003ad140:2 <tree_delete+0x102>: nop.i 0x0
End of assembler dump.
(gdb)
08-10-2012 12:17 PM - edited 08-13-2012 01:37 AM
>find the info below
This is where it is aborting:
0xc0000000003ad0d0:0 <tree_delete+0x90>: ld8.s r14=[r33]
0xc0000000003ad100:1 <tree_delete+0xc1>: (p8) cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>: (p8) adds ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>: (p7) st8 [ret1]=r22;;
0xc0000000003ad110:1 <tree_delete+0xd1>: (p8) st8 [ret1]=r22 <<
gr9: 0x450
gr14: 0x440
r9 (ret1) is bad. It comes from r14 (r14 + 16), which in turn probably comes from r33's location.
What is "x /gx $r33"?
Looks like part of the output is missing and some is badly formatted. Try redirecting to a file and attaching it:
(gdb) set redirect-file gdb.out
(gdb) set redirect on
(gdb) set width 160
(gdb) bt
(gdb) frame 0
(gdb) info reg
(gdb) disas $pc-16*20 $pc+16*12
(gdb) frame 3
(gdb) info reg
(gdb) disas 0x4000000000ff0e70
(gdb) set redirect off
Then attach gdb.out.
08-12-2012 09:45 PM - edited 08-13-2012 02:13 AM
Hi Dennis,
Thanks for the reply.
The gdb file is attached, and the requested output is here:
(gdb) x /gx$gr33
0x60000000000537c0 <__gp>: 0x6000000000032b20
(gdb) x/gx $gr33
>gr9: 0x450
>gr14: 0x440
>r9 is bad, which comes from r14. Which probably comes from r33's location.
[Prabhakar] Can you please tell me how to conclude here that it is a bad address? By using `x $gr9`?
One more thing I would like to know: could stack corruption have left invalid values and caused this issue? Below is the code:
PCA_DataBuffer_impl::~PCA_DataBuffer_impl()
{
if ( !(flags & PCA_DataBuffer::DONT_DELETE) ) {
allocator_->free(base_);
}
}
void
PCA_DataBuffer_impl::decRef()
{
if ( --ref_count == 0 ) {
delete this;
}
}
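If a double free through the allocator is suspected, one hedged way to confirm it is to have a debugging allocator remember which buffers are still live and refuse a second free. The sketch below is only an illustration: `TrackingAllocator` and `liveCount` are hypothetical names, it is not the PCA_BufferAllocator from this thread, and it is not thread-safe.

```cpp
#include <cstddef>
#include <new>
#include <set>

// Hypothetical debugging allocator that records every buffer it hands out.
class TrackingAllocator {
public:
    // Allocates a buffer and remembers it as live; returns NULL on failure.
    char* alloc(std::size_t size) {
        char* p = new (std::nothrow) char[size];
        if (p != NULL) {
            live_.insert(p);
        }
        return p;
    }

    // Frees p only if it is currently live. Returns false, without touching
    // the heap, for a pointer that was never allocated here or was already freed.
    bool free(char* p) {
        std::set<char*>::iterator it = live_.find(p);
        if (it == live_.end()) {
            return false;  // double free or foreign pointer
        }
        live_.erase(it);
        delete[] p;
        return true;
    }

    // Number of buffers handed out and not yet freed.
    std::size_t liveCount() const { return live_.size(); }

private:
    std::set<char*> live_;  // buffers currently owned by callers
};
```

In a debug build, a `false` return from `free` is the place to log the caller or abort, which points at the second owner instead of crashing later inside libc.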
08-13-2012 02:54 AM - edited 08-13-2012 02:58 AM
>(gdb) x /gx $gr33 => 0x60000000000537c0 <__gp>: 0x6000000000032b20
This is the value for frame 3, not frame 0.
>[Prabhakar] can you tell me how to conclude here that it is bad address, using $gr9?
ret1 == $r9
For frame 3, the original r32 parm was copied to r43 and 16 subtracted:
0x4000000000ff0cd0:0 <operator delete[](void*)+0x30>: adds r43=-16,r32 free parm
gr43: 0x600000010d5518d0 free parm (delete parm-16)
gr46: 0x9fffffffef6dab10 out0 (was changed in frame 2, in0 for free)
So r43 is the region being freed.
In frame 0, the value in r33 is 3*8 bytes before that:
gr33: 0x600000010d5518b8 free parm - 8*3
So in frame 0:
(gdb) x /8gx $r33-8*2
(gdb) x /20s $r33-8*2
>Would like to know one more thing that the stack corruption may have invalid values and could have caused some issue?
Sure, anything is possible, but don't look for zebras just yet.
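The address arithmetic above can be checked mechanically. This is a small sketch using only register values quoted in this thread; the constant and function names are hypothetical.

```cpp
#include <cstdint>

// Values copied from the register dumps in this thread.
const std::uint64_t kFreeParm  = 0x600000010d5518d0ULL;  // gr43 in frame 3 (pointer passed to free)
const std::uint64_t kFrame0R33 = 0x600000010d5518b8ULL;  // gr33 in frame 0 (tree_delete)
const std::uint64_t kFrame0R14 = 0x440;                  // gr14 in frame 0
const std::uint64_t kFrame0R9  = 0x450;                  // gr9 (ret1) in frame 0

// operator delete[] subtracted 16 from its argument before calling free,
// so the original delete[] argument is the free argument plus 16.
inline std::uint64_t deleteParm(std::uint64_t freeParm) {
    return freeParm + 16;
}

// Distance in bytes from r33 up to the pointer passed to free.
inline std::uint64_t bytesBelowFreeParm(std::uint64_t freeParm, std::uint64_t r33) {
    return freeParm - r33;
}

// "adds ret1=16,r14": the faulting store address is r14 + 16.
inline std::uint64_t storeAddress(std::uint64_t r14) {
    return r14 + 16;
}
```

Here `bytesBelowFreeParm(kFreeParm, kFrame0R33)` gives 24 (3*8), `deleteParm(kFreeParm)` gives 0x600000010d5518e0, and `storeAddress(kFrame0R14)` equals `kFrame0R9`, which is why 0x450 is the address the store at tree_delete+0xd1 tried to write.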