07-31-2012 01:21 AM - edited 07-31-2012 10:37 PM
Core dump
I am getting the following dump trace. Could someone please help me work out what is causing the heap corruption?
Program terminated with signal 11, Segmentation fault.
SEGV_ACCERR - Invalid Permissions for object
#0 0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
(gdb) #0 0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
#1 0xc0000000003a84c0:0 in real_free+0x600 () from /usr/lib/hpux64/libc.so.1
#2 0xc0000000003b3210:0 in free+0x170 () from /usr/lib/hpux64/libc.so.1
#3 0x4000000000ff0e70:0 in operator delete[] ()
07-31-2012 01:30 AM
Re: Core dump (heap corruption)
You should be using gdb's heap checking to see where you are corrupting the heap:
set heap-check on
07-31-2012 01:37 AM
Re: Core dump (heap corruption)
Dennis,
Thanks for your quick reply.
I will try to get this done, as we first need to reproduce the problem.
I need one more piece of help from you:
I feel that frame 3, 0x4000000000ff0e70:0 in operator delete[] ()
at build/OCSCframeworks/code/MemoryHandler/MemHandler.C:538, is causing the process to dump. Could you please confirm whether my understanding is correct?
07-31-2012 01:47 AM - edited 08-01-2012 11:42 AM
Re: Core dump (heap corruption)
>I feel operator delete[] is causing the process to dump.
Not likely, unless it is passing a non-heap address to free.
Most likely the heap has been corrupted before and operator delete[] is just the victim.
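To illustrate the "victim" idea, here is a minimal sketch, not taken from the poster's code: the damage is done by an earlier careless write and is only noticed later, when the block is inspected or freed. GuardedBuffer, kGuard and careless_fill are hypothetical names, the guard bytes merely stand in for the allocator's own bookkeeping, and C++11 is assumed. The overrun stays inside memory the buffer owns, so the demo itself is well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical guard pattern placed just past the caller-visible bytes.
static const std::uint8_t kGuard[4] = {0xDE, 0xAD, 0xBE, 0xEF};

// A buffer with `size` usable bytes followed by a guard region.
struct GuardedBuffer {
    std::vector<std::uint8_t> storage;  // usable bytes, then the guard
    std::size_t size;                   // usable bytes only

    explicit GuardedBuffer(std::size_t n)
        : storage(n + sizeof(kGuard), 0), size(n) {
        std::memcpy(storage.data() + n, kGuard, sizeof(kGuard));
    }

    std::uint8_t* data() { return storage.data(); }

    // True if nothing has written past the usable bytes.
    bool guard_intact() const {
        return std::memcmp(storage.data() + size, kGuard, sizeof(kGuard)) == 0;
    }
};

// Simulates a buggy writer that fills `len` bytes without checking `size`.
inline void careless_fill(GuardedBuffer& buf, std::size_t len) {
    assert(len <= buf.storage.size());  // keep the demo inside owned memory
    std::memset(buf.data(), 'A', len);
}
```

Filling 16 bytes of a 16-byte buffer leaves the guard intact; filling 17 destroys it, yet nothing fails at the moment of the write. In a real heap the same kind of write would damage the allocator's data, and a later free or delete[] would be the one to crash.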
07-31-2012 01:53 AM
Re: Core dump (heap corruption)
Agreed, I will check that. I was also wondering whether a double delete or double free could cause such issues. Could you please comment on this?
07-31-2012 02:38 AM
Re: Core dump (heap corruption)
>Just thinking double delete or free may also cause such issues?
Yes, those too.
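One common way to rule out a double delete is to give each block a single owner that clears its pointer after releasing it. The sketch below assumes C++11; OwnedArray is a hypothetical class and not the poster's PCA_BufferAllocator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical single owner of a heap array.
class OwnedArray {
public:
    explicit OwnedArray(std::size_t n) : data_(new char[n]), releases_(0) {}
    ~OwnedArray() { release(); }

    // Copying would create two owners of one block, and with them a
    // second delete[] on the same address, so copying is disabled.
    OwnedArray(const OwnedArray&) = delete;
    OwnedArray& operator=(const OwnedArray&) = delete;

    // Safe to call more than once: only the first call frees the block.
    void release() {
        if (data_ != nullptr) {
            delete[] data_;
            data_ = nullptr;
            ++releases_;
        }
    }

    bool owns() const { return data_ != nullptr; }
    int releases() const { return releases_; }  // number of real frees

private:
    char* data_;
    int releases_;
};
```

Calling release() twice on the same object frees the array once; the second call is a no-op because the pointer was cleared.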
07-31-2012 10:28 PM
Re: Core dump (heap corruption)
Hi Dennis,
I think it is hard to run gdb on that platform as it is production, and we don't know the exact scenario needed to reproduce the problem.
The only option left is to debug the core file. Could you please suggest some pointers to start with?
08-01-2012 01:37 AM - edited 08-01-2012 01:37 AM
Re: Core dump (heap corruption)
>I think is it hard to put gdb on that platform as it is production.
>Only option left is to debug the core file. Please can you suggest some pointers to start with here.
It will be harder to explain how to look at machine code and raw memory than it is to use gdb.
See my comments here:
http://h30499.www3.hp.com/t5/Languages-and-Scripting/Bus-Error-on-HP-Ux-PARISC/m-p/5422881/
Do you have a packcore of the abort? Or are you debugging on the customer's system?
08-01-2012 02:33 AM
Re: Core dump (heap corruption)
Hi Dennis,
Thanks for your input. Yes, we have already requested the packcore, so it may take a day or two to get it from the customer. We will need your valuable inputs once we start debugging the core. :)
08-01-2012 11:41 AM - edited 08-01-2012 11:44 AM
Re: Core dump (heap corruption)
>Need your valuable inputs once we start into debugging the core.
The first thing to do is find the address of the memory being freed. This should be in $r32 in frames 3, 2 and 1. Check that the values match, since it is possible that the register was reused in one of the frames.
Once you do get the memory address, you should look at the contents to see if it matches what is being freed and to see if the previous memory region overwrote it:
(gdb) x /80wx address - 8*8
(gdb) x /20s address - 8*8
Before each region is a header. The header contains a pointer to the previous block and a length of the current block. The low order bits are flags. If this pointer or length is overwritten, then you have these aborts.
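The header described above can be pictured roughly as follows. This is only an illustrative sketch: the field order, the two-word size and especially the number of flag bits (kFlagMask) are assumptions made for the example, not documented libc internals, and which word carries the flags is also assumed.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative two-word block header; the layout is an assumption.
struct BlockHeader {
    std::uint64_t prev;              // link to the previous block
    std::uint64_t length_and_flags;  // block length, flags in the low bits
};

// Assumed for this sketch: the two lowest bits are flags.
const std::uint64_t kFlagMask = 0x3;

// Flag bits stored in the low-order end of the length word.
inline std::uint64_t block_flags(const BlockHeader& h) {
    return h.length_and_flags & kFlagMask;
}

// Length with the flag bits masked off.
inline std::uint64_t block_length(const BlockHeader& h) {
    return h.length_and_flags & ~kFlagMask;
}
```

With a length word of 0x41, this sketch reports flags of 1 and a length of 0x40. If either word is overwritten, both the link and the decoded length become meaningless, which is what makes free abort.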
Also, dumping the registers and disassembling the instructions around the abort would help:
(gdb) bt
(gdb) info reg
(gdb) disas $pc-16*12 $pc+16*4
You might also want to try a small test program and corrupt the heap and see what you get.
08-02-2012 01:36 AM
Re: Core dump (heap corruption)
Thanks Dennis.
I will start applying the steps you provided once we receive the packcore.
Meanwhile I was browsing through the code. What I found is:
In the .c file I have:
PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();
In the .h file:
private:
static PCA_BufferAllocator* bufAllocator;
Chunks of buffers are created in pca_bufferallocator using new/malloc (char * L_mem, L_mem = NEW_NOTHROW char[P_size];). The delete on this buffer is causing the dump. Do you have any input on whether static/global corruption may be causing this? Your comments will help me a lot.
08-02-2012 02:34 AM
Re: Core dump (heap corruption)
>PCA_BufferAllocator* PIMessage::bufAllocator = new(nothrow) PCA_BufferAllocator();
Is there an explicit check that bufAllocator isn't NULL?
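For reference, an explicit check of that kind might look like the sketch below. DemoAllocator, demo_allocator and checked_allocate are hypothetical stand-ins, not the poster's classes; the only point is that new(nothrow) reports failure by returning NULL, so each such pointer needs a test before use.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the poster's allocator class.
struct DemoAllocator {
    char* allocate(std::size_t n) { return new (std::nothrow) char[n]; }
    void free(char* p) { delete[] p; }
};

// Returns the process-wide allocator, or NULL if creating it failed.
inline DemoAllocator* demo_allocator() {
    static DemoAllocator* instance = new (std::nothrow) DemoAllocator();
    return instance;
}

// Allocates through the shared allocator, checking the allocator first.
inline char* checked_allocate(std::size_t n) {
    DemoAllocator* a = demo_allocator();
    if (a == NULL) {
        return NULL;         // new(nothrow) failed for the allocator itself
    }
    return a->allocate(n);   // may also be NULL; the caller must check
}
```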
>there are chunks of buffers created in pca_bufferallocator using new/malloc
So the class PCA_BufferAllocator contains pointers to these?
>delete on this buffer is causing the dump. Have any inputs that static/global corruption may causing this.
What is being deleted? bufAllocator or a field within it? The latter would still be heap corruption.
The former could have the static area overwritten. But you would notice it when you get the $r32 value.
08-10-2012 02:40 AM
Re: Core dump (heap corruption)
Hi Dennis,
I ran gdb on the core dump. I think I found the corruption/double deletion in frame 4.
(gdb) frame 4
(gdb) info reg
gr32: 0xc00000000000048e
(gdb) x 0xc00000000000048e
0xc00000000000048e: Cannot access memory at address 0xc00000000000048e
(gdb)
This memory is allocated on the heap and multiple objects refer to this address. I suspect that one of the objects frees this memory and that the corruption happens when another object tries to free it again. I suspect a corner case.
Could you please comment on how to go ahead? Thanks for your help.
08-10-2012 04:36 AM
Re: Core dump (heap corruption)
>I did run gdb on the core dump. I could find the corruption/double deletion in frame 4.
>(gdb) frame 4
>gr32: 0xc00000000000048e
Why are you looking at frame 4? Where is your whole stack trace? I suggested looking at frames 3 .. 0.
>This memory is allocated in heap and multiple objects refer to this address location.
Why do you say that? It can't be invalid if it is in the heap. And the above typically isn't a heap address.
>Please can you comment how to go ahead.
Provide the output I requested.
08-10-2012 05:01 AM
Re: Core dump (heap corruption)
Dennis,
Full bt.
#0 0xc0000000003ad110:1 in tree_delete+0xd1 () from /usr/lib/hpux64/libc.so.1
#1 0xc0000000003a84c0:0 in real_free+0x600 () from /usr/lib/hpux64/libc.so.1
#2 0xc0000000003b3210:0 in free+0x170 () from /usr/lib/hpux64/libc.so.1
#3 0x4000000000ff0e70:0 in operator delete[] ()
at build/OCSCframeworks/code/MemoryHandler/MemHandler.C:538
#4 0x4000000001150f10:0 in PCA_BufferAllocator::free ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_BufferAllocator.C:73
#5 0x4000000001145ba0:0 in PCA_DataBuffer_impl::decRef ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer_impl.C:78
#6 0x400000000113d890:0 in PCA_DataBuffer::~PCA_DataBuffer ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_DataBuffer.C:75
#7 0x40000000011398f0:0 in PCA_ArrayBuffer::erase ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:211
#8 0x4000000001139ac0:0 in PCA_ArrayBuffer::~PCA_ArrayBuffer ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_ArrayBuffer.C:81
#9 0x400000000114cfd0:0 in PCA_Message_impl::~PCA_Message_impl ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message_impl.C:93
#10 0x4000000001147640:0 in PCA_Message::~PCA_Message ()
at build/OCSCframeworks/code/PLUGIN_Message/PCA_Message.C:74
#11 0x40000000007863e0:0 in PIMessage::~PIMessage ()
at build/OCSEPcore/code/IC32_Messages/PIMessage.C:125
#12 0x40000000007bb210:0 in SLEEMessage_ImplOf<PIMessage>::~SLEEMessage_ImplOf ()
at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessage_ImplOf.C:31
#13 0x40000000007baa30:0 in Inter_PIMessage::~Inter_PIMessage ()
at build/OCSEPcore/code/IC32_SLEEMessage/SLEEMessageOf.C:79
#14 0x40000000009475e0:0 in SlelMessageRef::setRef ()
at build/OCSEPcore/code/IC32_SLEL_common/SlelMessageRefForSlee.C:147
#15 0x40000000007d6e90:0 in Inter::Run ()
at build/OCSEPcore/code/IC32_SLEL/Interpreter.C:751
#16 0x40000000005ee820:0 in CCTX_CallContext::processMessage ()
at build/OCSEPcore/code/IC32_CallContextMgt/CCTX_CallContext.C:1480
#17 0x40000000005dfb30:0 in CCTX_CallProcessor::processRequest ()
at build/OCSEPcore/code/IC32_CallContextMgt/CallProcessor.C:733
#18 0x4000000000523560:0 in GS_AnyHandler2Task<Handler>::process ()
For frame 0:
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0
gr33: 0x600000010d5518b8
gr34: 0x1
For frame 1:
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0
gr33: 0x9fffffffef6e61b0
gr34: 0x600000010fae8f40
gr35: 0xc00000000000060f
gr36: 0xc0000000003b3210
gr37: 0x944d
for Frame 2
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0x9fffffffef6dab10
gr33: 0x9fffffffef6e0e98
gr34: 0xc000000000000713
for frame 3 :
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0x600000000004d7e8
gr33: 0x60000000000537c0
gr34: 0xc000000000000205
gr35: 0x4000000001150f10
gr36: 0x6000000000053298
gr37: 0x404
Please let me know what other data I should capture for this.
Thanks :)
08-10-2012 05:10 AM
Re: Core dump (heap corruption)
I don't see a common address in r32 in those frames. So we would have to disassemble frame #3.
Also, where is this info for frame 0?
(gdb) info reg
(gdb) disas $pc-16*12 $pc+16*4
08-10-2012 06:45 AM
Re: Core dump (heap corruption)
Hi Dennis,
Please find the information below. Please let me know if any more info is needed. Thanks for your great support.
info reg of frame 0:
pr63: 0
gr0: 0
gr1: 0x9fffffffef6e0e98
gr2: 0x9fffffff5ffe7c00
gr3: 0x9fffffff5ffe7c00
gr4: 0
gr5: 0xc000000000000408
gr6: 0xc000000000045890
gr7: 0x9fffffffef7f8de8
gr8: 0x440
gr9: 0x450
gr10: 0x600000000019fa00
gr11: 0x440
gr12: 0x9fffffffffffe790
gr13: 0x9fffffffef6a94b0
gr14: 0x440
gr15: 0
gr16: 0x600000010f2c3728
gr17: 0xf0fff20
gr18: 0x10f5c6558
gr19: 0x60000000001a0290
gr20: 0x441
gr21: 0x6000000110bbfff8
gr22: 0xf0fff00
gr23: 0xf0fff18
gr24: 0x440
gr25: 0x1
gr26: 0
gr27: 0x100000001
gr28: 0xf0fff00
gr29: 0x9fffffffef6e7d0c
gr30: 0x6000000000056698
gr31: 0xfb
gr32: 0
gr33: 0x600000010d5518b8
gr34: 0x1
br0: 0xc0000000003a84c0
br1: 0xc000000000427f40
br2: 0xc00000000029d160
br3: 0
br4: 0
br5: 0
br6: 0xc00000000042e870
br7: 0xc00000000013cd20
rsc: 0x1f
bsp: 0x9fffffffef7ffa60
bspst: 0x9fffffffef7ff7d0
rnat: 0
ccv: 0x100000001
unat: 0
fpsr: 0x9804c9a74433f
pfs: 0xc000000000000a17
(sor:0, sol:20, sof:23)
lc: 0x63
ec: 0
ip: 0xc0000000003ad110:1
cfm: 0x3
(sor:0, sol:0, sof:3)
psr: 0x3a3a616c6c
(gdb)
(gdb) disas $pc-16*12 $pc+16*4
Dump of assembler code from 0xc0000000003ad050:0 to 0xc0000000003ad150:0:
0xc0000000003ad050:0 <tree_delete+0x10>:
adds ret3=16,r33 MMB,
0xc0000000003ad050:1 <tree_delete+0x11>: nop.m 0x0
0xc0000000003ad050:2 <tree_delete+0x12>: (p6) br.ret.dpnt.many rp;;
0xc0000000003ad060:0 <tree_delete+0x20>:
ld8 r28=[ret2] MMI
0xc0000000003ad060:1 <tree_delete+0x21>:
ld8 r22=[ret3]
0xc0000000003ad060:2 <tree_delete+0x22>:
shl ret1=ret1,8
0xc0000000003ad070:0 <tree_delete+0x30>:
addl ret2=0xffffffffffffd968,gp MIB,
0xc0000000003ad070:1 <tree_delete+0x31>: mov r25=1
0xc0000000003ad070:2 <tree_delete+0x32>: nop.b 0x0;;
0xc0000000003ad080:0 <tree_delete+0x40>:
ld8 ret2=[ret2] MMI
0xc0000000003ad080:1 <tree_delete+0x41>:
cmp.ne.unc p7=r0,r28
0xc0000000003ad080:2 <tree_delete+0x42>: mov ret3=r22
0xc0000000003ad090:0 <tree_delete+0x50>:
adds r17=32,r28 MMI,
0xc0000000003ad090:1 <tree_delete+0x51>:
adds r15=32,r22
0xc0000000003ad090:2 <tree_delete+0x52>:
mov r26=r22;;
0xc0000000003ad0a0:0 <tree_delete+0x60>:
(p7) cmp.ne.unc p6=r0,ret3 MMI
0xc0000000003ad0a0:1 <tree_delete+0x61>:
add ret2=ret1,ret2
0xc0000000003ad0a0:2 <tree_delete+0x62>:
cmp.ne.and p7=r0,ret3
0xc0000000003ad0b0:0 <tree_delete+0x70>:
mov ret1=r22 MMB,
0xc0000000003ad0b0:1 <tree_delete+0x71>:
adds r23=24,r28
0xc0000000003ad0b0:2 <tree_delete+0x72>:
(p7) br.cond.dpnt.many 0xc0000000003ad160;;
0xc0000000003ad0c0:0 <tree_delete+0x80>: or r22=r28,r22 MII,
0xc0000000003ad0c0:1 <tree_delete+0x81>: nop.i 0x0
0xc0000000003ad0c0:2 <tree_delete+0x82>: nop.i 0x0;;
0xc0000000003ad0d0:0 <tree_delete+0x90>:
ld8.s r14=[r33] MMI,
0xc0000000003ad0d0:1 <tree_delete+0x91>:
ld8 r15=[ret2]
0xc0000000003ad0d0:2 <tree_delete+0x92>:
cmp.ne.unc p6=r0,r22;;
0xc0000000003ad0e0:0 <tree_delete+0xa0>:
(p6) ld8.a ret3=[r33] MII,
0xc0000000003ad0e0:1 <tree_delete+0xa1>:
adds ret1=8,r14
0xc0000000003ad0e0:2 <tree_delete+0xa2>:
cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad0f0:0 <tree_delete+0xb0>:
(p8) ld8.s r15=[ret1] MMI,
0xc0000000003ad0f0:1 <tree_delete+0xb1>:
(p7) st8 [ret2]=r22
0xc0000000003ad0f0:2 <tree_delete+0xb2>:
(p8) chk.s.i r14,0xc0000000003ad4f0;;
0xc0000000003ad100:0 <tree_delete+0xc0>:
(p8) chk.s.m r15,0xc0000000003ad510 MI,I
0xc0000000003ad100:1 <tree_delete+0xc1>:
(p8) cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>:
(p8) adds ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>:
(p7) st8 [ret1]=r22;; M,MI,
0xc0000000003ad110:1 <tree_delete+0xd1>:
(p8) st8 [ret1]=r22
0xc0000000003ad110:2 <tree_delete+0xd2>: nop.i 0x0;;
0xc0000000003ad120:0 <tree_delete+0xe0>:
(p6) ld8.c.clr ret3=[r33] MMI,
0xc0000000003ad120:1 <tree_delete+0xe1>:
(p6) st8 [r22]=ret3
0xc0000000003ad120:2 <tree_delete+0xe2>: nop.i 0x0;;
0xc0000000003ad130:0 <tree_delete+0xf0>:
adds ret1=48,ret2
0xc0000000003ad130:1 <tree_delete+0xf1>:
cmp4.ne.unc p6=r0,r34;;
0xc0000000003ad130:2 <tree_delete+0xf2>: nop.i 0x0
0xc0000000003ad140:0 <tree_delete+0x100>:
(p6) ld8 ret2=[ret1];; M,MI
0xc0000000003ad140:1 <tree_delete+0x101>:
(p6) st8 [r33]=ret2
0xc0000000003ad140:2 <tree_delete+0x102>: nop.i 0x0
End of assembler dump.
(gdb)
08-10-2012 12:17 PM - edited 08-13-2012 01:37 AM
Re: Core dump (heap corruption)
>find the info below
This is where it is aborting:
0xc0000000003ad0d0:0 <tree_delete+0x90>: ld8.s r14=[r33]
0xc0000000003ad100:1 <tree_delete+0xc1>: (p8) cmp.eq.unc p7,p8=r33,r15;;
0xc0000000003ad100:2 <tree_delete+0xc2>: (p8) adds ret1=16,r14
0xc0000000003ad110:0 <tree_delete+0xd0>: (p7) st8 [ret1]=r22;;
0xc0000000003ad110:1 <tree_delete+0xd1>: (p8) st8 [ret1]=r22 <<
gr9: 0x450
gr14: 0x440
r9 is bad, which comes from r14. Which probably comes from r33's location.
What is "x /gx $r33"?
Looks like part of the output is missing and some is badly formatted. Try redirecting to a file and attaching it:
(gdb) set redirect-file gdb.out
(gdb) set redirect on
(gdb) set width 160
(gdb) bt
(gdb) frame 0
(gdb) info reg
(gdb) disas $pc-16*20 $pc+16*12
(gdb) frame 3
(gdb) info reg
(gdb) disas 0x4000000000ff0e70
(gdb) set redirect off
Then attach gdb.out.
0x450
08-12-2012 09:45 PM - edited 08-13-2012 02:13 AM
Re: Core dump (heap corruption)
Hi Dennis,
Thanks for the reply.
The gdb output file is attached to this post, and the requested output is below.
(gdb) x /gx$gr33
0x60000000000537c0 <__gp>: 0x6000000000032b20
(gdb) x/gx $gr33
>gr9: 0x450
>gr14: 0x440
>r9 is bad, which comes from r14. Which probably comes from r33's location.
[Prabhakar] Could you please tell me how to conclude here that it is a bad address? By using `x $gr9`?
One more thing I would like to know: could stack corruption have left invalid values and caused this issue? Below is the code:
PCA_DataBuffer_impl::~PCA_DataBuffer_impl()
{
    if ( !(flags & PCA_DataBuffer::DONT_DELETE) ) {
        allocator_->free(base_);
    }
}
void
PCA_DataBuffer_impl::decRef()
{
    if ( --ref_count == 0 ) {
        delete this;
    }
}
08-13-2012 02:54 AM - edited 08-13-2012 02:58 AM
Re: Core dump (heap corruption)
>(gdb) x /gx $gr33 => 0x60000000000537c0 <__gp>: 0x6000000000032b20
This is the value for frame 3, not frame 0.
>[Prabhakar] can you tell me how to conclude here that it is bad address, using $gr9?
ret1 == $r9
For frame 3, the original r32 parm was copied to r43 and 16 subtracted:
0x4000000000ff0cd0:0 <operator delete[](void*)+0x30>: adds r43=-16,r32 free parm
gr43: 0x600000010d5518d0 free parm (delete parm-16)
gr46: 0x9fffffffef6dab10 out0 (was changed in frame 2, in0 for free)
So r43 is the region being freed.
In frame 0, the value in r33 is 3*8 bytes before that:
gr33: 0x600000010d5518b8 free parm - 8*3
So in frame 0:
(gdb) x /8gx $r33-8*2
(gdb) x /20s $r33-8*2
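The address arithmetic above can be double-checked with a few constants. This is just a sketch using the register values quoted in this thread; the constant and function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Register values copied from the dumps in this thread.
const std::uint64_t kFrame3R43 = 0x600000010d5518d0ULL;  // parm passed to free
const std::uint64_t kFrame0R33 = 0x600000010d5518b8ULL;  // r33 in tree_delete

// operator delete[] subtracted 16 before calling free, so add it back
// to recover the pointer originally handed to delete[].
inline std::uint64_t delete_argument() { return kFrame3R43 + 16; }

// Distance in bytes from tree_delete's r33 up to the free parm.
inline std::uint64_t free_to_r33_gap() { return kFrame3R43 - kFrame0R33; }

// First address shown by "x /8gx $r33-8*2".
inline std::uint64_t dump_start() { return kFrame0R33 - 8 * 2; }
```

The gap works out to 24 bytes (3*8), the delete[] argument to 0x600000010d5518e0, and the dump starts at 0x600000010d5518a8.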
>Would like to know one more thing that the stack corruption may have invalid values and could have caused some issue?
Sure anything is possible but don't look for zebras just yet.
08-13-2012 03:32 AM
Re: Core dump (heap corruption)
Dennis,
Please find below the data you asked for from frame 0:
(gdb) x /8gx $r33-8*2
0x600000010d5518a8: 0x0000000000000000 0x600000010f2c3728
0x600000010d5518b8: 0x0000000000000440 0x000000000f0fff00
0x600000010d5518c8: 0x0000000000000000 0x0000000000000000
0x600000010d5518d8: 0x0000000000000000 0x000100000000476d
(gdb) x /20s $r33-8*2
0x600000010d5518a8: ""
0x600000010d5518a9: ""
0x600000010d5518aa: ""
0x600000010d5518ab: ""
0x600000010d5518ac: ""
0x600000010d5518ad: ""
0x600000010d5518ae: ""
0x600000010d5518af: ""
0x600000010d5518b0: "`"
0x600000010d5518b2: ""
0x600000010d5518b3: "\001\017,7("
0x600000010d5518b9: ""
0x600000010d5518ba: ""
0x600000010d5518bb: ""
0x600000010d5518bc: ""
0x600000010d5518bd: ""
0x600000010d5518be: "\004@"
0x600000010d5518c1: ""
0x600000010d5518c2: ""
0x600000010d5518c3: ""
(gdb) x /gx $gr33
0x600000010d5518b8: 0x0000000000000440
(gdb)
08-13-2012 01:27 PM - edited 08-13-2012 09:33 PM
Re: Core dump (heap corruption)
(gdb) x /8gx $r33-8*2
0x600000010d5518a8: 0x0000000000000000 0x600000010f2c3728
0x600000010d5518b8: 0x0000000000000440< 0x000000000f0fff00
0x600000010d5518c8: 0x0000000000000000 0x0000000000000000
0x600000010d5518d8: 0x0000000000000000 0x000100000000476d
That 0440 is where that bad value comes from. Not sure how it was overwritten.
Here is where you need a watchpoint and rerun it.
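One rough way to see why 0x440 stands out: every heap-looking value in the dumps of this thread starts with 0x6 in its top hex digit, while 0x440 does not. The check below is only a heuristic built on that observation, not a documented rule, and looks_like_data_address is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Heuristic, assumed from the values seen in this thread: data and heap
// addresses of this 64-bit process have 0x6 as their top hex digit.
inline bool looks_like_data_address(std::uint64_t v) {
    return (v >> 60) == 0x6;
}
```

With this check, 0x600000010f2c3728 passes, while 0x440 and the libc text address 0xc0000000003ad110 do not.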
08-13-2012 08:48 PM - edited 08-13-2012 10:04 PM
Re: Core dump (heap corruption)
Hi Dennis,
Thanks for the reply. Could you please clarify the following queries?
1. How to decide that (gdb) x /8gx $r33-8*2
0x600000010d5518a8: 0x0000000000000000 0x600000010f2c3728
0x600000010d5518b8: 0x0000000000000440< 0x000000000f0fff00
0x600000010d5518c8: 0x0000000000000000 0x0000000000000000
0x600000010d5518d8: 0x0000000000000000 0x000100000000476d
0x0000000000000440< is invalid? Is there any specific format?
2. Does this mean that "0x0000000000000440" is the value we are trying to free?
2. Does it mean that no double deletion is happening here?
3. Would this value have been overwritten by our application, or is there any chance that an outside source/shared library caused this? Could you please tell me what the possible causes are?
4. "Here is where you need a watchpoint and rerun it." -------> rerun gdb on the core, or run on the production platform?
Thanks a lot for your help.
08-13-2012 09:34 PM
Re: Core dump (heap corruption)
>1. How to decide that (gdb) x /8gx $r33-8*2
This dumps the $r33 location plus the memory before and after it, to see whether something wrote beyond a heap block.
There is no string of ASCII bytes.
>2. means value of "0x0000000000000440", which we are trying to free?
No, this should be the link to the previous block.
>2. It means here no double deletion is happening?
Can't tell.
>3. This value would have overwritten by our application or any chances that from outside source/shlib can also cause this.
You have to set a watchpoint and run the program to see who is changing that address. I thought that if the previous block had been overrun by just one byte, that 0x00 would overwrite the 0x60 of the heap address.
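The one-byte case can be worked through as a sketch. It relies on memory being big-endian here, which the x /20s output earlier in the thread agrees with (the 0x60 byte, shown as "`", comes first). The link value used below is hypothetical, borrowed from the neighbouring word in the dump, since the original contents of the damaged word are unknown.

```cpp
#include <cassert>
#include <cstdint>

// Stores `value` most significant byte first, as big-endian memory would.
inline void store_big_endian(std::uint64_t value, std::uint8_t out[8]) {
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<std::uint8_t>(value >> (56 - 8 * i));
    }
}

// Reads eight big-endian bytes back into a 64-bit value.
inline std::uint64_t load_big_endian(const std::uint8_t in[8]) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        v = (v << 8) | in[i];
    }
    return v;
}

// Link word after a single 0x00 byte (for example a string terminator)
// written one past the end of the preceding block.
inline std::uint64_t after_one_byte_overrun(std::uint64_t link) {
    std::uint8_t bytes[8];
    store_big_endian(link, bytes);
    bytes[0] = 0x00;  // the overrun lands on the first, most significant byte
    return load_big_endian(bytes);
}
```

For a link of 0x600000010f2c3728 the result is 0x000000010f2c3728: only the leading 0x60 is lost. The word in the core is 0x440 instead, which hints that whatever wrote there replaced more than a single byte.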
Strangely, when I try a simple heap allocation, I don't get that alignment, I get:
$ a.out
6000000000004010: 0000000000000000, 0000000000000000
6000000000004020: 0000000000000000, 0000000000000000
6000000000004030: 6000000000000070, 000000000f0fff01 (pointer to previous block and length words)
6000000000004040: >ffffffffffffffff (heap block starts here)
So I only see 16 bytes of heap block overhead.
08-14-2012 03:07 AM
Re: Core dump (heap corruption)
Hi Dennis,
Thanks for the info.
1. 0x600000010d5518a8: 0x0000000000000000 0x600000010f2c3728
0x600000010d5518b8: 0x0000000000000440< 0x000000000f0fff00
0x600000010d5518c8: 0x0000000000000000 0x0000000000000000
0x600000010d5518d8: 0x0000000000000000 0x000100000000476d
[Prabhakar] What I understood from your explanation is that 0x0000000000000440, the previous block address, is not correct. Is my understanding correct?
2. You have to set a watchpoint and run the program to see who is changing that address. I thought if the previous block was gone beyond the end by just one byte, that 0x00 would overwrite the 0x60 for the heap address.
[Prabhakar] Should this be done at run time, or can it be done using the core file?