Operating System - HP-UX

Re: Core dump (heap corruption)

 
prabhakarbhatt
Advisor

Core dump

I am getting the following dump trace. Can someone please help identify what could be causing the heap corruption?

 

Program terminated with signal 11, Segmentation fault.
SEGV_ACCERR - Invalid Permissions for object
#0  0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
(gdb) #0  0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
#1  0xc0000000003a84c0:0 in real_free+0x600 () from /usr/lib/hpux64/libc.so.1
#2  0xc0000000003b3210:0 in free+0x170 () from /usr/lib/hpux64/libc.so.1
#3  0x4000000000ff0e70:0 in operator delete[] ()
   

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

You should be using gdb's heap checking to see where you are corrupting the heap:

set heap-check on

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Dennis,

 

Thanks for your quick reply.

 

I will try this, but we first need to reproduce the problem.

I need one more piece of help from you:

I suspect that 0x4000000000ff0e70:0 in operator delete[] ()
    at build/OCSCframeworks/code/MemoryHandler/MemHandler.C:538 is causing the process to dump. Can you please confirm whether my understanding is correct?

 

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>I feel operator delete[] is causing the process to dump.

 

Not likely, unless it is passing a non-heap address to free.

Most likely the heap has been corrupted before and operator delete[] is just the victim.

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Thanks, Dennis.

Agreed, I will check that. Could a double delete or double free also cause such issues? Could you please comment on this?

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>Just thinking double delete or free may also cause such issues?

 

Yes, those too.
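
A hedged sketch of one way to catch the double-free case mentioned above in a test build. `TrackedHeap` and its methods are hypothetical names, not part of the poster's code or of the HP-UX libc. It only records which pointers are live, so a second release of the same pointer is reported instead of being passed on to free().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_set>

// Hypothetical debugging aid (an assumption for illustration only).
// It remembers which pointers are currently live so that a second
// release of the same pointer never reaches free().
class TrackedHeap {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) {
            live_.insert(p);
        }
        return p;
    }

    // Returns true if p was live and has now been freed.
    // Returns false for a pointer that is unknown or already released.
    // Limitation: if malloc later hands out the same address again,
    // a stale release of the old pointer is not detected.
    bool release(void* p) {
        if (live_.erase(p) == 0) {
            return false;  // double free or foreign pointer: heap is left alone
        }
        std::free(p);
        return true;
    }

    std::size_t liveCount() const { return live_.size(); }

private:
    std::unordered_set<void*> live_;
};
```

Calling `release()` twice on the same pointer returns true the first time and false the second time, which gives a place to log or set a breakpoint long before the heap structures are damaged.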

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

I think it will be hard to run gdb on that platform, as it is a production system, and we don't know the exact scenario needed to reproduce the problem.

The only option left is to debug the core file. Can you please suggest some pointers to start with?

 

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>I think is it hard to put gdb on that platform as it is production.

>Only option left is to debug the core file. Please can you suggest some pointers to start with here.

 

It will be harder to explain how to look at machine code and raw memory than it is to use gdb.

See my comments here:

http://h30499.www3.hp.com/t5/Languages-and-Scripting/Bus-Error-on-HP-Ux-PARISC/m-p/5422881/

 

Do you have a packcore of the abort?  Or are you debugging on the customer's system?

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Thanks for your input. Yes, we have already requested the packcore. It may take a day or two to get it from the customer. We will need your valuable input once we start debugging the core. :)

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>Need your valuable inputs once we start into debugging the core.

 

The first thing to do is find the address of the memory being freed.  This should be in $r32 in frames 3, 2 and 1.  Check that the values match; it is possible that the register was reused in one of the frames.

 

Once you do get the memory address, you should look at the contents to see if it matches what is being freed and to see if the previous memory region overwrote it:

 

(gdb) x /80wx address - 8*8

(gdb) x /20s address - 8*8

 

Before each region is a header.  The header contains a pointer to the previous block and a length of the current block.  The low order bits are flags.  If this pointer or length is overwritten, then you have these aborts.

 

Also, dumping the registers and disassembling the instructions around the abort would help:

(gdb) bt

(gdb) info reg

(gdb) disas $pc-16*12 $pc+16*4

 


You might also want to try a small test program and corrupt the heap and see what you get.
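
As a hedged illustration of the header description above, here is a toy model that stays inside its own byte array, so nothing in the real heap is damaged. The two-word header (`prevOffset`, `length`) and the names `ToyArena`, `unsafeFill` and `headerLooksValid` are assumptions made for this sketch; the actual HP-UX libc layout is not documented in this thread and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy stand-in for a block header: a link to the previous block
// followed by the payload length. Purely illustrative.
struct BlockHeader {
    std::uint64_t prevOffset;  // offset of the previous block's header
    std::uint64_t length;      // payload size in bytes
};

class ToyArena {
public:
    explicit ToyArena(std::size_t bytes)
        : storage_(bytes, 0), used_(0), lastHeader_(0) {}

    // Carves header + payload out of the arena; returns the payload offset.
    std::size_t allocate(std::size_t payloadSize) {
        assert(used_ + sizeof(BlockHeader) + payloadSize <= storage_.size());
        BlockHeader h{lastHeader_, payloadSize};
        std::memcpy(storage_.data() + used_, &h, sizeof h);
        lastHeader_ = used_;
        used_ += sizeof h + payloadSize;
        return lastHeader_ + sizeof h;
    }

    // Simulates buggy application code: writes count bytes starting at
    // the payload without checking the block length. The write is only
    // bounded by the arena itself, so the demo has no undefined behavior.
    void unsafeFill(std::size_t payloadOffset, unsigned char value,
                    std::size_t count) {
        assert(payloadOffset + count <= storage_.size());
        std::memset(storage_.data() + payloadOffset, value, count);
    }

    BlockHeader headerOf(std::size_t payloadOffset) const {
        BlockHeader h;
        std::memcpy(&h, storage_.data() + payloadOffset - sizeof h, sizeof h);
        return h;
    }

    // A header is plausible if its link points at or before its own
    // position and its payload still fits inside the arena.
    bool headerLooksValid(std::size_t payloadOffset) const {
        const BlockHeader h = headerOf(payloadOffset);
        const std::size_t headerOffset = payloadOffset - sizeof h;
        return h.prevOffset <= headerOffset &&
               h.length <= storage_.size() - payloadOffset;
    }

private:
    std::vector<unsigned char> storage_;
    std::size_t used_;
    std::size_t lastHeader_;
};
```

Allocating two 32-byte blocks and then filling 40 bytes into the first one overwrites the link word of the second block's header. The second block is the one whose header fails the check, even though the bug was in the code that wrote to the first block, which is the "victim" situation described earlier in the thread.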

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Thanks Dennis.

 

I will start applying the steps you provided once we receive the packcore.

Meanwhile I was browsing through the code. What I found is:

In the .c file I have:

PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();

In the .h file:

 

private:
   static PCA_BufferAllocator* bufAllocator;

 

There are chunks of buffers created in pca_bufferallocator using new/malloc: char * L_mem; L_mem = NEW_NOTHROW char[P_size];  The delete on this buffer is causing the dump. Do you have any input on whether static/global corruption could be causing this? Your comments would help me a lot.

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();

 

Is there an explicit check that bufAllocator isn't NULL?

>there are chunks of buffers created in pca_bufferallocator using new/malloc

 

So the class PCA_BufferAllocator contains pointers to these?

 

>delete on this buffer is causing the dump. Have any inputs that static/global corruption may causing this.

 

What is being deleted?  bufAllocator or a field within it?  The latter would still be heap corruption.

The former could have the static area overwritten.  But you would notice it when you get the $r32 value.
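
A minimal sketch of the explicit NULL checks asked about above. `BufferAllocatorSketch`, `sharedAllocator` and `fillBuffer` are hypothetical stand-ins; the real PCA_BufferAllocator is not shown in this thread, so this only illustrates the pattern of checking every result of new(nothrow) before use.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the allocator class (assumption).
class BufferAllocatorSketch {
public:
    // Returns a buffer of size bytes, or nullptr if allocation fails.
    char* allocate(std::size_t size) {
        return new (std::nothrow) char[size];
    }

    // Must be paired with allocate(): array new needs array delete.
    void release(char* buffer) {
        delete[] buffer;  // delete[] on nullptr is a no-op
    }
};

// Returns the process-wide allocator, or nullptr if it could not be
// created. Because new(nothrow) reports failure by returning nullptr,
// every caller has to check the result.
BufferAllocatorSketch* sharedAllocator() {
    static BufferAllocatorSketch* instance =
        new (std::nothrow) BufferAllocatorSketch();
    return instance;
}

// Example caller that checks both the allocator and the buffer.
bool fillBuffer(std::size_t size, char value) {
    BufferAllocatorSketch* allocator = sharedAllocator();
    if (allocator == nullptr) {
        return false;  // static initialization failed
    }
    char* buffer = allocator->allocate(size);
    if (buffer == nullptr) {
        return false;  // out of memory
    }
    for (std::size_t i = 0; i < size; ++i) {
        buffer[i] = value;
    }
    const bool ok = (size == 0) || (buffer[size - 1] == value);
    allocator->release(buffer);
    return ok;
}
```

Without such checks, a failed new(nothrow) shows up later as an access through a null or near-null pointer rather than as an allocation error.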

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

 

I ran gdb on the core dump, and I think I found the corruption/double deletion in frame 4.

 

(gdb) frame 4

(gdb) info reg

gr32: 0xc00000000000048e

(gdb) x 0xc00000000000048e
0xc00000000000048e:     Cannot access memory at address 0xc00000000000048e
(gdb)

This memory is allocated in the heap, and multiple objects refer to this address. I suspect that one of the objects frees this memory and that the corruption happens when another object tries to free it again. I suspect a corner case.

Can you please comment on how to proceed? Thanks for your help.

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>I  did run gdb on the core dump. I could find the corruption/double deletion in frame 4.
>(gdb) frame 4
>gr32: 0xc00000000000048e

 

Why are you looking at frame 4?  Where is your whole stack trace?  I suggested looking at frames 3 .. 0.

>This memory is allocated in heap and multiple objects refer to this address location.

 

Why do you say that?  It can't be invalid if it is in the heap.  And the above typically isn't a heap address.

 

>Please can you comment how to go ahead.

 

Provide the output I requested.

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Dennis,

Full bt.

 

#0  0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
#1  0xc0000000003a84c0:0 in real_free+0x600 () from /usr/lib/hpux64/libc.so.1
#2  0xc0000000003b3210:0 in free+0x170 () from /usr/lib/hpux64/libc.so.1
#3  0x4000000000ff0e70:0 in operator delete[] ()
    at build/OCSCframeworks/code/MemoryHandler/MemHandler.C:538
#4  0x4000000001150f10:0 in PCA_BufferAllocator::free ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_BufferAllocator.C:73
#5  0x4000000001145ba0:0 in PCA_DataBuffer_impl::decRef ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer_impl.C:78
#6  0x400000000113d890:0 in PCA_DataBuffer::~PCA_DataBuffer ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer.C:75
#7  0x40000000011398f0:0 in PCA_ArrayBuffer::erase ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:211
#8  0x4000000001139ac0:0 in PCA_ArrayBuffer::~PCA_ArrayBuffer ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:81
#9  0x400000000114cfd0:0 in PCA_Message_impl::~PCA_Message_impl ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message_impl.C:93
#10 0x4000000001147640:0 in PCA_Message::~PCA_Message ()
    at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message.C:74
#11 0x40000000007863e0:0 in PIMessage::~PIMessage ()
    at build/OCSEPcore/code/IC32_Messages/PIMessage.C:125
#12 0x40000000007bb210:0 in SLEEMessage_ImplOf<PIMessage>::~SLEEMessage_ImplOf ()
    at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessage_ImplOf.C:31
#13 0x40000000007baa30:0 in Inter_PIMessage::~Inter_PIMessage ()
    at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessageOf.C:79
#14 0x40000000009475e0:0 in SlelMessageRef::setRef ()
    at build/OCSEPcore/code/IC32_SLEL_common/SlelMessageRefForSlee.C:147
#15 0x40000000007d6e90:0 in Inter::Run ()
    at build/OCSEPcore/code/IC32_SLEL/Interpreter.C:751
#16 0x40000000005ee820:0 in CCTX_CallContext::processMessage ()
    at build/OCSEPcore/code/IC32_CallContextMgt/CCTX_CallContext.C:1480
#17 0x40000000005dfb30:0 in CCTX_CallProcessor::processRequest ()
    at build/OCSEPcore/code/IC32_CallContextMgt/CallProcessor.C:733
#18 0x4000000000523560:0 in GS_AnyHandler2Task<Handler>::process ()

 

For frame 0:

 

gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32:                  0  
 gr33: 0x600000010d5518b8  
 gr34:                0x1

 

For frame 1:

 

gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32:                  0  
 gr33: 0x9fffffffef6e61b0  
 gr34: 0x600000010fae8f40  
 gr35: 0xc00000000000060f  
 gr36: 0xc0000000003b3210  
 gr37:             0x944d

 

for Frame 2

gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32: 0x9fffffffef6dab10  
 gr33: 0x9fffffffef6e0e98  
 gr34: 0xc000000000000713

 

for frame 3 :

 

gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32: 0x600000000004d7e8  
 gr33: 0x60000000000537c0  
 gr34: 0xc000000000000205  
 gr35: 0x4000000001150f10  
 gr36: 0x6000000000053298  
 gr37:              0x404 

 

Please let me know what other data I should capture for this.

 

Thanks :)

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

I don't see a common address in r32 in those frames.  So we would have to disassemble frame #3.

 

Also, where is this info for frame 0?

(gdb) info reg

(gdb) disas $pc-16*12 $pc+16*4

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Please find the information below. Please let me know if any more info is needed. Thanks for your great support.

 

info reg of frame 0:

 

pr63:                  0  
  gr0:                  0  
  gr1: 0x9fffffffef6e0e98  
  gr2: 0x9fffffff5ffe7c00  
  gr3: 0x9fffffff5ffe7c00  
  gr4:                  0 

gr5: 0xc000000000000408  
  gr6: 0xc000000000045890  
  gr7: 0x9fffffffef7f8de8  
  gr8:              0x440  
  gr9:              0x450  
 gr10: 0x600000000019fa00  
 gr11:              0x440  
 gr12: 0x9fffffffffffe790  
 gr13: 0x9fffffffef6a94b0  
 gr14:              0x440  
 gr15:                  0  
 gr16: 0x600000010f2c3728  
 gr17:          0xf0fff20  
 gr18:        0x10f5c6558  
 gr19: 0x60000000001a0290  
 gr20:              0x441  
 gr21: 0x6000000110bbfff8  
 gr22:          0xf0fff00  
 gr23:          0xf0fff18  
 gr24:              0x440  
 gr25:                0x1  
 gr26:                  0  
 gr27:        0x100000001

 gr28:          0xf0fff00  
 gr29: 0x9fffffffef6e7d0c  
 gr30: 0x6000000000056698  
 gr31:               0xfb  
 gr32:                  0  
 gr33: 0x600000010d5518b8  
 gr34:                0x1  
  br0: 0xc0000000003a84c0  
  br1: 0xc000000000427f40  
  br2: 0xc00000000029d160  
  br3:                  0  
  br4:                  0  
  br5:                  0  
  br6: 0xc00000000042e870  
  br7: 0xc00000000013cd20  
  rsc:               0x1f  
  bsp: 0x9fffffffef7ffa60  
bspst: 0x9fffffffef7ff7d0  
 rnat:                  0  
  ccv:        0x100000001  
 unat:                  0  
 fpsr:    0x9804c9a74433f  
  pfs: 0xc000000000000a17

       (sor:0, sol:20, sof:23)
   lc:               0x63  
   ec:                  0  
   ip: 0xc0000000003ad110:1
  cfm:                0x3  
       (sor:0, sol:0, sof:3)
  psr:       0x3a3a616c6c  
(gdb)

 

 

 

 

(gdb) disas $pc-16*12 $pc+16*4

Dump of assembler code from 0xc0000000003ad050:0 to 0xc0000000003ad150:0:
0xc0000000003ad050:0 <tree_delete+0x10>:        
          adds             ret3=16,r33                                       MMB,
0xc0000000003ad050:1 <tree_delete+0x11>:              nop.m            0x0
0xc0000000003ad050:2 <tree_delete+0x12>:        (p6)  br.ret.dpnt.many rp;;
0xc0000000003ad060:0 <tree_delete+0x20>:        
          ld8              r28=[ret2]                                        MMI  
0xc0000000003ad060:1 <tree_delete+0x21>:        
          ld8              r22=[ret3]


0xc0000000003ad060:2 <tree_delete+0x22>:        
          shl              ret1=ret1,8
0xc0000000003ad070:0 <tree_delete+0x30>:        
          addl             ret2=0xffffffffffffd968,gp                        MIB,
0xc0000000003ad070:1 <tree_delete+0x31>:              mov              r25=1
0xc0000000003ad070:2 <tree_delete+0x32>:              nop.b            0x0;;
0xc0000000003ad080:0 <tree_delete+0x40>:        
          ld8              ret2=[ret2]                                       MMI  
0xc0000000003ad080:1 <tree_delete+0x41>:        
          cmp.ne.unc       p7=r0,r28

0xc0000000003ad080:2 <tree_delete+0x42>:              mov              ret3=r22
0xc0000000003ad090:0 <tree_delete+0x50>:        
          adds             r17=32,r28                                        MMI,
0xc0000000003ad090:1 <tree_delete+0x51>:        
          adds             r15=32,r22
0xc0000000003ad090:2 <tree_delete+0x52>:        
          mov              r26=r22;;
0xc0000000003ad0a0:0 <tree_delete+0x60>:        
    (p7)  cmp.ne.unc       p6=r0,ret3                                        MMI  
0xc0000000003ad0a0:1 <tree_delete+0x61>:        
          add              ret2=ret1,ret2
0xc0000000003ad0a0:2 <tree_delete+0x62>:        
          cmp.ne.and       p7=r0,ret3
0xc0000000003ad0b0:0 <tree_delete+0x70>:        
          mov              ret1=r22                                          MMB,
0xc0000000003ad0b0:1 <tree_delete+0x71>:        
          adds             r23=24,r28
0xc0000000003ad0b0:2 <tree_delete+0x72>:        
    (p7)  br.cond.dpnt.many 0xc0000000003ad160;;
0xc0000000003ad0c0:0 <tree_delete+0x80>:        
          or               r22=r28,r22                                       MII,
0xc0000000003ad0c0:1 <tree_delete+0x81>:              nop.i            0x0
0xc0000000003ad0c0:2 <tree_delete+0x82>:              nop.i            0x0;;
0xc0000000003ad0d0:0 <tree_delete+0x90>:        
          ld8.s            r14=[r33]                                         MMI,
0xc0000000003ad0d0:1 <tree_delete+0x91>:        
          ld8              r15=[ret2]
0xc0000000003ad0d0:2 <tree_delete+0x92>:        
          cmp.ne.unc       p6=r0,r22;;
0xc0000000003ad0e0:0 <tree_delete+0xa0>:        
    (p6)  ld8.a            ret3=[r33]                                        MII,
0xc0000000003ad0e0:1 <tree_delete+0xa1>:        
          adds             ret1=8,r14
0xc0000000003ad0e0:2 <tree_delete+0xa2>:        
          cmp.eq.unc       p7,p8=r33,r15;;
0xc0000000003ad0f0:0 <tree_delete+0xb0>:        
    (p8)  ld8.s            r15=[ret1]                                        MMI,
0xc0000000003ad0f0:1 <tree_delete+0xb1>:        
    (p7)  st8              [ret2]=r22
0xc0000000003ad0f0:2 <tree_delete+0xb2>:        
    (p8)  chk.s.i          r14,0xc0000000003ad4f0;;
0xc0000000003ad100:0 <tree_delete+0xc0>:        
    (p8)  chk.s.m          r15,0xc0000000003ad510                            MI,I
0xc0000000003ad100:1 <tree_delete+0xc1>:        
    (p8)  cmp.eq.unc       p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>:        
    (p8)  adds             ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>:        
    (p7)  st8              [ret1]=r22;;                                      M,MI,
0xc0000000003ad110:1 <tree_delete+0xd1>:        
    (p8)  st8              [ret1]=r22
0xc0000000003ad110:2 <tree_delete+0xd2>:              nop.i            0x0;;
0xc0000000003ad120:0 <tree_delete+0xe0>:        
    (p6)  ld8.c.clr        ret3=[r33]                                        MMI,
0xc0000000003ad120:1 <tree_delete+0xe1>:        
    (p6)  st8              [r22]=ret3
0xc0000000003ad120:2 <tree_delete+0xe2>:              nop.i            0x0;;
0xc0000000003ad130:0 <tree_delete+0xf0>:        
          adds             ret1=48,ret2

0xc0000000003ad130:1 <tree_delete+0xf1>:        
          cmp4.ne.unc      p6=r0,r34;;
0xc0000000003ad130:2 <tree_delete+0xf2>:              nop.i            0x0
0xc0000000003ad140:0 <tree_delete+0x100>:       
    (p6)  ld8              ret2=[ret1];;                                     M,MI
0xc0000000003ad140:1 <tree_delete+0x101>:       
    (p6)  st8              [r33]=ret2
0xc0000000003ad140:2 <tree_delete+0x102>:             nop.i            0x0
End of assembler dump.
(gdb)

 

 

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>find the info below

 

This is where it is aborting:

0xc0000000003ad0d0:0 <tree_delete+0x90>:               ld8.s            r14=[r33]

 

0xc0000000003ad100:1 <tree_delete+0xc1>:       (p8)  cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>:       (p8)  adds             ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>:       (p7)  st8                [ret1]=r22;;
0xc0000000003ad110:1 <tree_delete+0xd1>:       (p8)  st8                [ret1]=r22   <<

 

gr9:              0x450

gr14:              0x440

r9 is bad, which comes from r14.  Which probably comes from r33's location.

What is "x /gx $r33"?

 

Looks like part of the output is missing and some is badly formatted.  Try redirecting to a file and attaching it:

 

(gdb) set redirect-file gdb.out
(gdb) set redirect on

(gdb) set width 160
(gdb) bt
(gdb) frame 0

(gdb) info reg
(gdb) disas $pc-16*20 $pc+16*12

(gdb) frame 3

(gdb) info reg
(gdb) disas 0x4000000000ff0e70
(gdb) set redirect off

Then attach gdb.out.

0x450

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Thanks for the reply.

 

The gdb output file is attached to this post, and the requested output is here:

(gdb) x /gx$gr33
0x60000000000537c0 <__gp>:      0x6000000000032b20
(gdb) x/gx $gr33

 

 

gr9:              0x450

gr14:              0x440

r9 is bad, which come from r14.  Which probably comes from r33's location.

[Prabhakar] Can you please tell me how to conclude here that it is a bad address? By using `x $gr9`?

One more question: could stack corruption have produced invalid values and caused this issue? Below is the code:

 

 

PCA_DataBuffer_impl::~PCA_DataBuffer_impl()
{
   if ( !(flags & PCA_DataBuffer::DONT_DELETE) ) {
      allocator_->free(base_);
   }
}

 

void
PCA_DataBuffer_impl::decRef()
{
   if ( --ref_count == 0 ) {
      delete this;
   }
}

 

 

 

 

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>(gdb) x /gx $gr33  => 0x60000000000537c0 <__gp>:      0x6000000000032b20

 

This is the value for frame 3, not frame 0.

 

>[Prabhakar] can you tell me how to conclude here that it is bad address, using $gr9?

 

ret1 == $r9

 

For frame 3, the original r32 parm was copied to r43 and 16 subtracted:

0x4000000000ff0cd0:0 <operator delete[](void*)+0x30>:      adds    r43=-16,r32      free parm

 gr43: 0x600000010d5518d0 free parm (delete parm-16)
 gr46: 0x9fffffffef6dab10 out0 (was changed in frame 2, in0 for free)

So r43 is the region being freed.

 

In frame 0, the value in r33 is 3*8 bytes before that:

 gr33: 0x600000010d5518b8 free parm - 8*3

 

So in frame 0:

(gdb) x /8gx $r33-8*2

(gdb) x /20s $r33-8*2

 

>Would like to know one more thing that the stack corruption may have invalid values and could have caused some issue?

 

Sure anything is possible but don't look for zebras just yet.
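
The address arithmetic in this reply can be checked mechanically. The sketch below uses only register values quoted in the thread (gr43 from frame 3, gr33 from frame 0); the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Register values copied from the gdb output quoted in this thread.
constexpr std::uint64_t kFreeParm  = 0x600000010d5518d0ULL;  // gr43, frame 3
constexpr std::uint64_t kFrame0R33 = 0x600000010d5518b8ULL;  // gr33, frame 0

// operator delete[] computed r43 = r32 - 16, so the pointer that the
// application passed to delete[] is 16 bytes above the free() parameter.
constexpr std::uint64_t deleteParm(std::uint64_t freeParm) {
    return freeParm + 16;
}

// Distance in bytes from the frame-0 r33 value up to the free() parameter.
constexpr std::uint64_t bytesBeforeFreeParm(std::uint64_t freeParm,
                                            std::uint64_t r33) {
    return freeParm - r33;
}

// First address shown by "x /8gx $r33-8*2".
constexpr std::uint64_t dumpStart(std::uint64_t r33) {
    return r33 - 8 * 2;
}

static_assert(bytesBeforeFreeParm(kFreeParm, kFrame0R33) == 3 * 8,
              "r33 in frame 0 is three 8-byte words below the free parm");
static_assert(deleteParm(kFreeParm) == 0x600000010d5518e0ULL,
              "delete[] parm is the free parm plus 16");
static_assert(dumpStart(kFrame0R33) == 0x600000010d5518a8ULL,
              "dump starts two words below r33");
```

Since the checks are `static_assert`s, a successful compile already confirms that r33 in frame 0 sits 3*8 bytes below the region being freed.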

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Dennis,

 

Please find below the data you asked for from frame 0:

 

(gdb) x /8gx $r33-8*2
0x600000010d5518a8:     0x0000000000000000      0x600000010f2c3728
0x600000010d5518b8:     0x0000000000000440      0x000000000f0fff00
0x600000010d5518c8:     0x0000000000000000      0x0000000000000000
0x600000010d5518d8:     0x0000000000000000      0x000100000000476d
(gdb) x /20s $r33-8*2
0x600000010d5518a8:      ""
0x600000010d5518a9:      ""
0x600000010d5518aa:      ""
0x600000010d5518ab:      ""
0x600000010d5518ac:      ""
0x600000010d5518ad:      ""
0x600000010d5518ae:      ""
0x600000010d5518af:      ""
0x600000010d5518b0:      "`"
0x600000010d5518b2:      ""
0x600000010d5518b3:      "\001\017,7("
0x600000010d5518b9:      ""
0x600000010d5518ba:      ""
0x600000010d5518bb:      ""
0x600000010d5518bc:      ""
0x600000010d5518bd:      ""
0x600000010d5518be:      "\004@"
0x600000010d5518c1:      ""
0x600000010d5518c2:      ""
0x600000010d5518c3:      ""
(gdb) x /gx $gr33
0x600000010d5518b8:     0x0000000000000440
(gdb)

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

(gdb) x /8gx $r33-8*2
0x600000010d5518a8:     0x0000000000000000      0x600000010f2c3728
0x600000010d5518b8:     0x0000000000000440<    0x000000000f0fff00
0x600000010d5518c8:     0x0000000000000000  f>  0x0000000000000000
0x600000010d5518d8:     0x0000000000000000 da> 0x000100000000476d

 

That 0440 is where that bad value comes from.  Not sure how it was overwritten.

Here is where you need a watchpoint and rerun it.

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Thanks for the reply. Can you please clarify the following queries?

 

 

1. How to decide that (gdb) x /8gx $r33-8*2
0x600000010d5518a8:     0x0000000000000000      0x600000010f2c3728
0x600000010d5518b8:     0x0000000000000440<    0x000000000f0fff00
0x600000010d5518c8:     0x0000000000000000      0x0000000000000000
0x600000010d5518d8:     0x0000000000000000      0x000100000000476d

 

How do you decide that 0x0000000000000440< is invalid? Is there any specific format?

2. Does this mean the value "0x0000000000000440" is what we are trying to free?

2. Does it mean that no double deletion is happening here?

3. Would this value have been overwritten by our application, or is there any chance that an outside source/shared library could also cause this? Can you please tell me what the likely causes would be?

4. "Here is where you need a watchpoint and rerun it." Does this mean rerunning gdb on the core file, or running on the production platform?

 

Thanks a lot for your help.

 

Dennis Handly
Acclaimed Contributor

Re: Core dump (heap corruption)

>1. How to decide that (gdb) x /8gx $r33-8*2

 

This dumps the memory at $r33 plus some words before and after it, to see whether something wrote beyond a heap block.

There is no string of ASCII bytes.

 

>2. means value of "0x0000000000000440", which we are trying to free?

 

No, this should be the link to the previous block.

 

>2. It means here no double deletion is happening?

 

Can't tell.

 

>3. This value would have overwritten by our application or any chances that from outside source/shlib can also cause this.

 

You have to set a watchpoint and run the program to see who is changing that address.  I thought that if a write to the previous block had gone beyond its end by just one byte, that 0x00 would overwrite the 0x60 of the heap address.

 

Strangely, when I try a simple heap allocation, I don't get that alignment, I get:

$ a.out

6000000000004010: 0000000000000000, 0000000000000000
6000000000004020: 0000000000000000, 0000000000000000
6000000000004030: 6000000000000070, 000000000f0fff01  (pointer to previous block and length words)
6000000000004040: >ffffffffffffffff   (heap block starts here)

So I only see 16 bytes of heap block overhead.
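
A hedged sketch of the one-byte overrun pattern mentioned above. HP-UX stores words big-endian, so the lowest-addressed byte of a pointer word is its most significant byte (the 0x60). The function names are hypothetical, and the model covers only a single trailing zero byte landing on the next word; it says nothing about how the word in this core was actually overwritten.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Word = std::array<unsigned char, 8>;

// Stores value most-significant byte first, as a big-endian machine does.
Word storeBigEndian(std::uint64_t value) {
    Word bytes{};
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        bytes[i] = static_cast<unsigned char>(value >> (8 * (7 - i)));
    }
    return bytes;
}

// Reassembles the 64-bit value from its big-endian byte sequence.
std::uint64_t loadBigEndian(const Word& bytes) {
    std::uint64_t value = 0;
    for (unsigned char b : bytes) {
        value = (value << 8) | b;
    }
    return value;
}

// Simulates a write that runs one byte past the previous block, for
// example a string terminator: the zero lands on the first
// (lowest-address) byte of the next word.
std::uint64_t afterOneByteOverrun(std::uint64_t original) {
    Word bytes = storeBigEndian(original);
    bytes[0] = 0x00;
    return loadBigEndian(bytes);
}
```

Under this model a link such as 0x6000000000000070 would read back as 0x0000000000000070, and a pointer like 0x600000010f2c3728 would become 0x000000010f2c3728. The word seen in the core is 0x440, which is not what this simple pattern produces from a heap pointer.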

prabhakarbhatt
Advisor

Re: Core dump (heap corruption)

Hi Dennis,

 

Thanks for the info.

 

1. 0x600000010d5518a8:     0x0000000000000000      0x600000010f2c3728
0x600000010d5518b8:     0x0000000000000440<    0x000000000f0fff00
0x600000010d5518c8:     0x0000000000000000      0x0000000000000000
0x600000010d5518d8:     0x0000000000000000      0x000100000000476d

 

[Prabhakar] What I understood from your explanation is that 0x0000000000000440 is the previous-block address and that it is not correct. Is my understanding correct?

 

 

2. You have to set a watchpoint and run the program to see who is changing that address.  I thought that if a write to the previous block had gone beyond its end by just one byte, that 0x00 would overwrite the 0x60 of the heap address.

[Prabhakar] Does this have to be done at run time, or can it be done using the core file?