
Re: Core dump (heap corruption)

 
prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Thanks Dennis.

 

Will start applying the steps provided by you once we receive packcore.

 

Meanwhile I was browsing through the code. What I found is:

 

In the .c file I have:

PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();

In the .h file:

 

private:
   static PCA_BufferAllocator* bufAllocator;

 

There are chunks of buffers created in pca_bufferallocator using new/malloc (char * L_mem; L_mem = NEW_NOTHROW char[P_size];). The delete on this buffer is causing the dump. Do you have any inputs on whether static/global corruption may be causing this? Your valuable comments will help me a lot.

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();

 

Is there an explicit check that bufAllocator isn't NULL?
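A minimal sketch of what such a check could look like. The class bodies, allocatorReady(), allocator() and allocateChecked() are all hypothetical, since the real PCA_BufferAllocator and PIMessage are not shown in this thread; only the static initializer matches the posted code.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in: the real PCA_BufferAllocator is not shown in this thread.
class PCA_BufferAllocator {
public:
    // new(nothrow) returns NULL on failure instead of throwing std::bad_alloc.
    char* alloc(std::size_t size) { return new (std::nothrow) char[size]; }
    void free(char* mem) { delete[] mem; }
};

class PIMessage {
public:
    // Hypothetical check, meant to be called once before any message is built.
    static bool allocatorReady() { return bufAllocator != NULL; }

    // Hypothetical accessor: debug builds trap a NULL allocator here.
    static PCA_BufferAllocator& allocator() {
        assert(bufAllocator != NULL);
        return *bufAllocator;
    }

private:
    static PCA_BufferAllocator* bufAllocator;
};

// Same initializer as in the post; it can yield NULL if memory is exhausted.
PCA_BufferAllocator* PIMessage::bufAllocator = new (std::nothrow) PCA_BufferAllocator();

// Hypothetical helper: returns NULL when either the allocator or the buffer is missing.
char* allocateChecked(std::size_t size) {
    if (!PIMessage::allocatorReady()) {
        return NULL;
    }
    return PIMessage::allocator().alloc(size);
}
```

With something like this, a failed nothrow allocation shows up as a NULL return at the call site rather than as a crash on first use of the pointer.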

>there are chunks of buffers created in pca_bufferallocator using new/malloc

 

So the class PCA_BufferAllocator contains pointers to these?

 

>delete on this buffer is causing the dump. Have any inputs that static/global corruption may causing this.

 

What is being deleted?  bufAllocator or a field within it?  The latter would still be heap corruption.

The former could have the static area overwritten.  But you would notice it when you get the $r32 value.

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

 

I did run gdb on the core dump. I could find the corruption/double deletion in frame 4.

 

(gdb) frame 4

(gdb) info reg

gr32: 0xc00000000000048e

(gdb) x 0xc00000000000048e
0xc00000000000048e:     Cannot access memory at address 0xc00000000000048e
(gdb)

This memory is allocated in heap and multiple objects refer to this address location. I suspect one of the objects is freeing this memory location, and when another object tries to free it, the corruption happens. I suspect a corner case.

Please can you comment how to go ahead. Thanks for your help.

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>I  did run gdb on the core dump. I could find the corruption/double deletion in frame 4.
>(gdb) frame 4
>gr32: 0xc00000000000048e

 

Why are you looking at frame 4?  Where is your whole stack trace?  I suggested looking at frames 3 .. 0.

>This memory is allocated in heap and multiple objects refer to this address location.

 

Why do you say that?  It can't be invalid if it is in the heap.  And the above typically isn't a heap address.

 

>Please can you comment how to go ahead.

 

Provide the output I requested.

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Dennis,

Full bt.

 

#0  0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
#1  0xc0000000003a84c0:0 in real_free+0x600 () from /usr/lib/hpux64/libc.so.1
#2  0xc0000000003b3210:0 in free+0x170 () from /usr/lib/hpux64/libc.so.1
#3  0x4000000000ff0e70:0 in operator delete[] ()
    at build/OCSCframeworks/code/MemoryHandler/MemHandler.C:538
#4  0x4000000001150f10:0 in PCA_BufferAllocator::free ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_BufferAllocator.C:73
#5  0x4000000001145ba0:0 in PCA_DataBuffer_impl::decRef ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer_impl.C:78
#6  0x400000000113d890:0 in PCA_DataBuffer::~PCA_DataBuffer ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer.C:75
#7  0x40000000011398f0:0 in PCA_ArrayBuffer::erase ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:211
#8  0x4000000001139ac0:0 in PCA_ArrayBuffer::~PCA_ArrayBuffer ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:81
#9  0x400000000114cfd0:0 in PCA_Message_impl::~PCA_Message_impl ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message_impl.C:93
#10 0x4000000001147640:0 in PCA_Message::~PCA_Message ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message.C:74
#11 0x40000000007863e0:0 in PIMessage::~PIMessage ()
    at build/OCSEPcore/code/IC32_Messages/PIMessage.C:125
#12 0x40000000007bb210:0 in SLEEMessage_ImplOf<PIMessage>::~SLEEMessage_ImplOf
    () at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessage_ImplOf.C:31
#13 0x40000000007baa30:0 in Inter_PIMessage::~Inter_PIMessage ()
    at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessageOf.C:79
#14 0x40000000009475e0:0 in SlelMessageRef::setRef ()
    at build/OCSEPcore/code/IC32_SLEL_common/SlelMessageRefForSlee.C:147
#15 0x40000000007d6e90:0 in Inter::Run ()
    at build/OCSEPcore/code/IC32_SLEL/Interpreter.C:751
#16 0x40000000005ee820:0 in CCTX_CallContext::processMessage ()
    at build/OCSEPcore/code/IC32_CallContextMgt/CCTX_CallContext.C:1480
#17 0x40000000005dfb30:0 in CCTX_CallProcessor::processRequest ()
    at build/OCSEPcore/code/IC32_CallContextMgt/CallProcessor.C:733
#18 0x4000000000523560:0 in GS_AnyHandler2Task<Handler>::process ()

 

For frame 0:

 

gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32:                  0  
 gr33: 0x600000010d5518b8  
 gr34:                0x1

 

For frame 1:

 

gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32:                  0  
 gr33: 0x9fffffffef6e61b0  
 gr34: 0x600000010fae8f40  
 gr35: 0xc00000000000060f  
 gr36: 0xc0000000003b3210  
 gr37:             0x944d

 

for Frame 2

gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32: 0x9fffffffef6dab10  
 gr33: 0x9fffffffef6e0e98  
 gr34: 0xc000000000000713

 

for frame 3 :

 

gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32: 0x600000000004d7e8  
 gr33: 0x60000000000537c0  
 gr34: 0xc000000000000205  
 gr35: 0x4000000001150f10  
 gr36: 0x6000000000053298  
 gr37:              0x404 

 

Please let me know what other data I should capture for this.

 

Thanks :)

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

I don't see a common address in r32 in those frames.  So we would have to disassemble frame #3.

 

Also, where is this info for frame 0?

(gdb) info reg

(gdb) disas $pc-16*12 $pc+16*4

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Please find the info below. Please let me know if any more info is needed. Thanks for your great support.

 

info reg of frame 0:

 

pr63:                  0  
  gr0:                  0  
  gr1: 0x9fffffffef6e0e98  
  gr2: 0x9fffffff5ffe7c00  
  gr3: 0x9fffffff5ffe7c00  
  gr4:                  0  
  gr5: 0xc000000000000408  
  gr6: 0xc000000000045890  
  gr7: 0x9fffffffef7f8de8  
  gr8:              0x440  
  gr9:              0x450  
 gr10: 0x600000000019fa00  
 gr11:              0x440  
 gr12: 0x9fffffffffffe790  
 gr13: 0x9fffffffef6a94b0  
 gr14:              0x440  
 gr15:                  0  
 gr16: 0x600000010f2c3728  
 gr17:          0xf0fff20  
 gr18:        0x10f5c6558  
 gr19: 0x60000000001a0290  
 gr20:              0x441  
 gr21: 0x6000000110bbfff8  
 gr22:          0xf0fff00  
 gr23:          0xf0fff18  
 gr24:              0x440  
 gr25:                0x1  
 gr26:                  0  
 gr27:        0x100000001  
 gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32:                  0  
 gr33: 0x600000010d5518b8  
 gr34:                0x1  
  br0: 0xc0000000003a84c0  
  br1: 0xc000000000427f40  
  br2: 0xc00000000029d160  
  br3:                  0  
  br4:                  0  
  br5:                  0  
  br6: 0xc00000000042e870  
  br7: 0xc00000000013cd20  
  rsc:               0x1f  
  bsp: 0x9fffffffef7ffa60  
bspst: 0x9fffffffef7ff7d0  
 rnat:                  0  
  ccv:        0x100000001  
 unat:                  0  
 fpsr:    0x9804c9a74433f  
  pfs: 0xc000000000000a17  
       (sor:0, sol:20, sof:23)
   lc:               0x63  
   ec:                  0  
   ip: 0xc0000000003ad110:1
  cfm:                0x3  
       (sor:0, sol:0, sof:3)
  psr:       0x3a3a616c6c  
(gdb)

 

 

 

 

(gdb) disas $pc-16*12 $pc+16*4

Dump of assembler code from 0xc0000000003ad050:0 to 0xc0000000003ad150:0:
0xc0000000003ad050:0 <tree_delete+0x10>:        
          adds             ret3=16,r33                                       MMB,
0xc0000000003ad050:1 <tree_delete+0x11>:              nop.m            0x0
0xc0000000003ad050:2 <tree_delete+0x12>:        (p6)  br.ret.dpnt.many rp;;
0xc0000000003ad060:0 <tree_delete+0x20>:        
          ld8              r28=[ret2]                                        MMI  
0xc0000000003ad060:1 <tree_delete+0x21>:        
          ld8              r22=[ret3]


0xc0000000003ad060:2 <tree_delete+0x22>:        
          shl              ret1=ret1,8
0xc0000000003ad070:0 <tree_delete+0x30>:        
          addl             ret2=0xffffffffffffd968,gp                        MIB,
0xc0000000003ad070:1 <tree_delete+0x31>:              mov              r25=1
0xc0000000003ad070:2 <tree_delete+0x32>:              nop.b            0x0;;
0xc0000000003ad080:0 <tree_delete+0x40>:        
          ld8              ret2=[ret2]                                       MMI  
0xc0000000003ad080:1 <tree_delete+0x41>:        
          cmp.ne.unc       p7=r0,r28

0xc0000000003ad080:2 <tree_delete+0x42>:              mov              ret3=r22
0xc0000000003ad090:0 <tree_delete+0x50>:        
          adds             r17=32,r28                                        MMI,
0xc0000000003ad090:1 <tree_delete+0x51>:        
          adds             r15=32,r22
0xc0000000003ad090:2 <tree_delete+0x52>:        
          mov              r26=r22;;
0xc0000000003ad0a0:0 <tree_delete+0x60>:        
    (p7)  cmp.ne.unc       p6=r0,ret3                                        MMI  
0xc0000000003ad0a0:1 <tree_delete+0x61>:        
          add              ret2=ret1,ret2
0xc0000000003ad0a0:2 <tree_delete+0x62>:        
          cmp.ne.and       p7=r0,ret3
0xc0000000003ad0b0:0 <tree_delete+0x70>:        
          mov              ret1=r22                                          MMB,
0xc0000000003ad0b0:1 <tree_delete+0x71>:        
          adds             r23=24,r28
0xc0000000003ad0b0:2 <tree_delete+0x72>:        
    (p7)  br.cond.dpnt.many 0xc0000000003ad160;;
0xc0000000003ad0c0:0 <tree_delete+0x80>:
          or               r22=r28,r22                                       MII,
0xc0000000003ad0c0:1 <tree_delete+0x81>:              nop.i            0x0
0xc0000000003ad0c0:2 <tree_delete+0x82>:              nop.i            0x0;;
0xc0000000003ad0d0:0 <tree_delete+0x90>:
          ld8.s            r14=[r33]                                         MMI,
0xc0000000003ad0d0:1 <tree_delete+0x91>:
          ld8              r15=[ret2]
0xc0000000003ad0d0:2 <tree_delete+0x92>:
          cmp.ne.unc       p6=r0,r22;;
0xc0000000003ad0e0:0 <tree_delete+0xa0>:
    (p6)  ld8.a            ret3=[r33]                                        MII,
0xc0000000003ad0e0:1 <tree_delete+0xa1>:
          adds             ret1=8,r14
0xc0000000003ad0e0:2 <tree_delete+0xa2>:
          cmp.eq.unc       p7,p8=r33,r15;;
0xc0000000003ad0f0:0 <tree_delete+0xb0>:
    (p8)  ld8.s            r15=[ret1]                                        MMI,
0xc0000000003ad0f0:1 <tree_delete+0xb1>:
    (p7)  st8              [ret2]=r22
0xc0000000003ad0f0:2 <tree_delete+0xb2>:
    (p8)  chk.s.i          r14,0xc0000000003ad4f0;;
0xc0000000003ad100:0 <tree_delete+0xc0>:        
    (p8)  chk.s.m          r15,0xc0000000003ad510                            MI,I
0xc0000000003ad100:1 <tree_delete+0xc1>:        
    (p8)  cmp.eq.unc       p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>:        
    (p8)  adds             ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>:        
    (p7)  st8              [ret1]=r22;;                                      M,MI,
0xc0000000003ad110:1 <tree_delete+0xd1>:        
    (p8)  st8              [ret1]=r22
0xc0000000003ad110:2 <tree_delete+0xd2>:              nop.i            0x0;;
0xc0000000003ad120:0 <tree_delete+0xe0>:        
    (p6)  ld8.c.clr        ret3=[r33]                                        MMI,
0xc0000000003ad120:1 <tree_delete+0xe1>:        
    (p6)  st8              [r22]=ret3
0xc0000000003ad120:2 <tree_delete+0xe2>:              nop.i            0x0;;
0xc0000000003ad130:0 <tree_delete+0xf0>:        
          adds             ret1=48,ret2

0xc0000000003ad130:1 <tree_delete+0xf1>:        
          cmp4.ne.unc      p6=r0,r34;;
0xc0000000003ad130:2 <tree_delete+0xf2>:              nop.i            0x0
0xc0000000003ad140:0 <tree_delete+0x100>:       
    (p6)  ld8              ret2=[ret1];;                                     M,MI
0xc0000000003ad140:1 <tree_delete+0x101>:       
    (p6)  st8              [r33]=ret2
0xc0000000003ad140:2 <tree_delete+0x102>:             nop.i            0x0
End of assembler dump.
(gdb)

 

 

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>find the info below

 

This is where it is aborting:

0xc0000000003ad0d0:0 <tree_delete+0x90>:               ld8.s            r14=[r33]

 

0xc0000000003ad100:1 <tree_delete+0xc1>:       (p8)  cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>:       (p8)  adds             ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>:       (p7)  st8                [ret1]=r22;;
0xc0000000003ad110:1 <tree_delete+0xd1>:       (p8)  st8                [ret1]=r22   <<

 

gr9:              0x450

gr14:              0x440

r9 is bad, which comes from r14.  Which probably comes from r33's location.

What is "x /gx $r33"?
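To spell out the arithmetic behind this, here is a small sketch using the values from the frame 0 register dump. The function names and the looksLikeHeapPointer() test are hypothetical; the 0x6 prefix is only an assumption based on the heap-looking values posted in this thread, not a documented rule.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the frame 0 register dump in this thread.
const std::uint64_t kR14 = 0x440;  // loaded by: ld8.s r14=[r33]
const std::uint64_t kR9  = 0x450;  // ret1, the address used by the faulting st8

// Mirrors: (p8) adds ret1=16,r14
std::uint64_t storeAddress(std::uint64_t r14) {
    return r14 + 16;
}

// Hypothetical plausibility test (assumption): heap-looking values in this
// thread, such as gr33 in frame 0, start with 0x6 in the top nibble.
bool looksLikeHeapPointer(std::uint64_t value) {
    return (value >> 60) == 0x6;
}

// Self-check against the posted dump; aborts if the arithmetic does not match.
void checkAgainstDump() {
    assert(storeAddress(kR14) == kR9);
    assert(!looksLikeHeapPointer(kR9));
}
```

So the st8 is writing through 0x450, a value derived from whatever was loaded from the location r33 points at, which is why the contents at $r33 matter.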

 

Looks like part of the output is missing and some is badly formatted.  Try redirecting to a file and attaching it:

 

(gdb) set redirect-file gdb.out
(gdb) set redirect on

(gdb) set width 160
(gdb) bt
(gdb) frame 0

(gdb) info reg
(gdb) disas $pc-16*20 $pc+16*12

(gdb) frame 3

(gdb) info reg
(gdb) disas 0x4000000000ff0e70
(gdb) set redirect off

Then attach gdb.out.


prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Thanks for the reply.

 

The gdb file is attached along with this mail and the requested output is here.

(gdb) x /gx$gr33
0x60000000000537c0 <__gp>:      0x6000000000032b20
(gdb) x/gx $gr33

 

 

>gr9:              0x450
>gr14:              0x440
>r9 is bad, which comes from r14.  Which probably comes from r33's location.

[Prabhakar] Please can you tell me how to conclude here that it is a bad address? By using `x $gr9`?

 

I would like to know one more thing: could stack corruption have left invalid values and caused some issue? Below is the code:

 

 

PCA_DataBuffer_impl::~PCA_DataBuffer_impl()
{
   if ( !(flags & PCA_DataBuffer::DONT_DELETE) ) {
      allocator_->free(base_);
   }
}

 

void
PCA_DataBuffer_impl::decRef()
{
   if ( --ref_count == 0 ) {
      delete this;
   }
}
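For illustration, here is a simplified sketch of the invariant that decRef() depends on: every holder of the buffer does exactly one incRef() and one decRef(), so the buffer is destroyed exactly once. RefCountedBuffer, incRef() and the live counter are hypothetical and are not part of the real PCA_DataBuffer_impl.

```cpp
#include <cassert>

// Hypothetical simplified buffer: only the reference counting is modelled.
class RefCountedBuffer {
public:
    // liveCounter is a test aid that tracks how many buffers currently exist.
    explicit RefCountedBuffer(int* liveCounter)
        : ref_count(1), live_(liveCounter) { ++*live_; }

    // Each additional holder must call this before keeping the pointer.
    void incRef() { ++ref_count; }

    // Returns true when this call destroyed the object.
    bool decRef() {
        // Debug aid only: a live object never has a count of zero or less.
        assert(ref_count > 0);
        if (--ref_count == 0) {
            delete this;
            return true;
        }
        return false;
    }

private:
    // Private so that only decRef() can destroy the object.
    ~RefCountedBuffer() { --*live_; }

    int ref_count;
    int* live_;
};
```

If a second holder keeps the pointer without calling incRef(), its later decRef() runs on an already deleted object. That is undefined behaviour, and the assert is not guaranteed to catch it.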

 

 

 

 

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>(gdb) x /gx $gr33  => 0x60000000000537c0 <__gp>:      0x6000000000032b20

 

This is the value for frame 3, not frame 0.

 

>[Prabhakar] can you tell me how to conclude here that it is bad address, using $gr9?

 

ret1 == $r9

 

For frame 3, the original r32 parm was copied to r43 and 16 subtracted:

0x4000000000ff0cd0:0 <operator delete[](void*)+0x30>:      adds    r43=-16,r32      free parm

 gr43: 0x600000010d5518d0 free parm (delete parm-16)
 gr46: 0x9fffffffef6dab10 out0 (was changed in frame 2, in0 for free)

So r43 is the region being freed.

 

In frame 0, the value in r33 is 3*8 bytes before that:

 gr33: 0x600000010d5518b8 free parm - 8*3
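The offsets can be checked with a short sketch using the register values quoted above. The constant and function names are hypothetical; only the numbers come from the dumps.

```cpp
#include <cassert>
#include <cstdint>

// Register values copied from the frame 3 and frame 0 dumps in this thread.
const std::uint64_t kFreeParm  = 0x600000010d5518d0ULL;  // gr43 in frame 3
const std::uint64_t kFrame0R33 = 0x600000010d5518b8ULL;  // gr33 in frame 0

// operator delete[] passed (delete parm - 16) to free, so undo that offset.
std::uint64_t deleteParmFromFreeParm(std::uint64_t freeParm) {
    return freeParm + 16;
}

// Byte distance from the tree_delete pointer up to the region handed to free.
std::uint64_t gapBelowFreeParm(std::uint64_t freeParm, std::uint64_t r33) {
    assert(r33 <= freeParm);
    return freeParm - r33;
}
```

Here gapBelowFreeParm(kFreeParm, kFrame0R33) gives 24, which is the 8*3 bytes mentioned above, and deleteParmFromFreeParm(kFreeParm) recovers the pointer originally passed to operator delete[].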

 

So in frame 0:

(gdb) x /8gx $r33-8*2

(gdb) x /20s $r33-8*2

 

>Would like to know one more thing that the stack corruption may have invalid values and could have caused some issue?

 

Sure, anything is possible, but don't look for zebras just yet.