nulptr dereferences trap

 
SOLVED
zorin09
New Member

nulptr dereferences trap

I have a COBOL program which calls a C routine. The C code is trying to dereference a NULL pointer. The "nulptr dereferences trap" is disabled in the executable. This program runs fine on an HP-UX machine "M4" (host name) with the configuration below:

OS info:
sysname = HP-UX
nodename = ---removed intentionally---
release = B.11.23
version = U (unlimited-user license)
machine = ia64
idnumber = ---removed intentionally---
vmunix _release_version:
@(#) $Revision: vmunix: B11.23_LR FLAVOR=perf Fri Aug 29 22:35:38 PDT 2003 $

If I run "chatr -z" on the exe (to enable the "nulptr dereferences trap"), then it crashes at run time on machine M4. This is fine and expected.


However, if I run the same exe (with "nulptr dereferences trap" disabled) on another HP-UX machine, M1 (host name), with the same OS/HW specs as M4, the exe crashes at run time with "Bus error". Below is the stack trace from the core:

(gdb) bt
#0 0x60000000cb0c87f0:0 in signal_handler_catch () at opsig.c:165
#1
#2 0x60000000cb0c87f0:0 in signal_handler_catch () at opsig.c:165
#3
#4 0x2be0790:1 in blbn_rankCopyBuf+0x251 ()
#5 0x110d720:0 in + 0xf6a0 ()
#6 0x1263e10:0 in + 0xcd0 ()
#7 0x1262910:0 in + 0x13a0 ()
#8 0x7c4d60:0 in + 0x3630 ()
#9 0x60000000cb19f850:0 in mFt_xe_xcall+0x130 ()
from /opt/cobol_exp/lib/libcobrts.so.2
#10 0x60000000cb152870:0 in _mFgmain () at rtgmain.c:252
#11 0x7c14e0:0 in main+0x60 ()



What I don't understand is why, even though the "nulptr dereferences trap" is disabled, the NULL pointer dereference is trapped on M1. Is there some kernel/environment-level setting that is enabling the "nulptr dereferences trap"?

Thanks for your help.



Dennis Handly
Acclaimed Contributor

Re: nulptr dereferences trap

>What I don't understand is that even though the "nulptr dereferences trap" is disabled why the NULL pointer dereferencing is trapped on M1?

Why do you think you have a null pointer? You could just have a bad pointer. And you can never store through a null pointer.

You would need to do:
(gdb) disas $pc-16*4 $pc+16
(gdb) info reg
zorin09
New Member

Re: nulptr dereferences trap

Thanks for your input, Dennis. I can see in the code that it is trying to access a NULL pointer. Below is the output of the commands you suggested.

(gdb) bt
#0 0x6443470:0 in blbn_rankCopyBuf+0x240 ()
#1 0x640ffa0:0 in + 0xf6a0 ()
#2 0x800f4d0:0 in + 0xcd0 ()
#3 0x800dfb0:0 in + 0x13a0 ()
#4 0xc99fc0:0 in + 0x3630 ()
#5 0x60000000cdbf6850:0 in mFt_xe_xcall+0x130 ()
from /opt/cobol_exp40/lib/libcobrts.so.2
#6 0x60000000cdba9870:0 in _mFgmain () at rtgmain.c:252
#7 0xc96720:0 in main+0x60 ()
(gdb) disas $pc-16*4 $pc+16
Dump of assembler code from 0x6443430:0 to 0x6443480:0:
0x6443430:0 : ld4 r8=[r8];;
0x6443430:1 : cmp4.ge p6=r40,r8
0x6443430:2 : nop.i 0x0
0x6443440:0 : nop.m 0x0
0x6443440:1 : nop.m 0x0
0x6443440:2 : (p6) br.cond.dptk.few blbn_rankCopyBuf+0x1070;;
0x6443450:0 : shladd r8=r40,2,r0
0x6443450:1 : addl r9=36,r1;;
0x6443450:2 : nop.i 0x0
0x6443460:0 : ld4 r9=[r9]
0x6443460:1 : nop.i 0x0;;
0x6443460:2 : addp4 r8=r8,r9;;
0x6443470:0 : ld4 r8=[r8]
0x6443470:1 : nop.i 0x0;;
0x6443470:2 : addp4 r8=0,r8;;
End of assembler dump.
(gdb) info reg
pr0: 0x1
pr1: 0
pr2: 0x1
pr3: 0
pr4: 0
pr5: 0
pr6: 0
pr7: 0
pr8: 0
pr9: 0
pr10: 0
pr11: 0x1
pr12: 0x1
pr13: 0x1
pr14: 0x1
pr15: 0
pr16: 0
pr17: 0
pr18: 0
pr19: 0
pr20: 0
pr21: 0
pr22: 0
pr23: 0
pr24: 0
pr25: 0
pr26: 0
pr27: 0
pr28: 0
pr29: 0
pr30: 0
pr31: 0
pr32: 0
pr33: 0
pr34: 0
pr35: 0
pr36: 0
pr37: 0
pr38: 0
pr39: 0
pr40: 0
pr41: 0
pr42: 0
pr43: 0
pr44: 0
pr45: 0
pr46: 0
pr47: 0
pr48: 0
pr49: 0
pr50: 0
pr51: 0
pr52: 0
pr53: 0
pr54: 0
pr55: 0
pr56: 0
pr57: 0
pr58: 0
pr59: 0
pr60: 0
pr61: 0
pr62: 0
pr63: 0
gr0: 0
gr1: 0x90e1e08
gr2: 0
gr3: 0x2001
gr4: 0
gr5: 0xc000000000000408
gr6: 0x60000000c004b4e0
gr7: 0x200000007fff357c
gr8: 0
gr9: 0
gr10: 0x200000007fff3230
gr11: 0
gr12: 0x200000007fff3270
gr13: 0x200000006744a080
gr14: 0x8
gr15: 0x200000007fff29e4
gr16: 0x20000000675f31a0
gr17: 0x28
gr18: 0x67449cd3
gr19: 0
gr20: 0x200000007fff29e5
gr21: 0x7e
gr22: 0x2000000067449d00
gr23: 0x200000007fff29e6
gr24: 0x2000000067449d01
gr25: 0x675f8560
gr26: 0x37
gr27: 0x34
gr28: 0x200000007fff29e7
gr29: 0xfb628ac
gr30: 0
gr31: 0x200000007fff3070
gr32: 0xfb36470
gr33: 0x910a530
gr34: 0xfb364a0
gr35: 0xfb36470
gr36: 0x910a530
gr37: 0xfb364a0
gr38: 0
gr39: 0
gr40: 0
gr41: 0x643f5bc
gr42: 0
gr43: 0x4e
gr44: 0x9c66e08
gr45: 0x42
gr46: 0x4d
gr47: 0xc000000000002c60
gr48: 0x90e1e08
gr49: 0x640ffa0
gr50: 0x4d4720
gr51: 0x2
gr52: 0x20000000675ef580
gr53: 0x200000007fff3224
gr54: 0x200000007fff3228
gr55: 0x20000000675f2c20
gr56: 0x20000000675ef588
gr57: 0x33
br0: 0x6443410
br1: 0
br2: 0
br3: 0
br4: 0
br5: 0
br6: 0x60000000c0333a10
br7: 0xe000000000711ab0
rsc: 0x1f
bsp: 0x20000000678ffe38
bspst: 0x20000000678ffe38
rnat: 0
ccv: 0
unat: 0
fpsr: 0x9804c9a74433f
pfs: 0xc00000000000091a
(sor:0, sol:18, sof:26)
lc: 0
ec: 0
ip: 0x6443470:0
cfm: 0x91a
(sor:0, sol:18, sof:26)
psr: 0x6261636b75
(gdb)
Dennis Handly
Acclaimed Contributor

Re: nulptr dereferences trap

>I can see in the code that it is trying to access a NULL pointer.

It sure looks like it:
0x6443470:0 : ld4 r8=[r8]
gr8: 0

Are you sure you used chatr(1) on the executable and not some shared lib?
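
For anyone reading the dump: the address that `ld4 r8=[r8]` faults on can be worked out from the three instructions before it (`shladd r8=r40,2,r0`, `ld4 r9=[r9]`, `addp4 r8=r8,r9`). Below is a small C sketch of that arithmetic, assuming the values shown by `info reg` (gr40 = 0, gr9 = 0) are the ones those instructions actually used. The helper names `shladd`, `addp4` and `faulting_address` are made up for illustration, and `addp4` follows my reading of the Itanium instruction description (32-bit add, upper bits cleared, bits 31:30 of the second source copied to bits 62:61).

```c
#include <assert.h>
#include <stdint.h>

/* shladd r1=r2,count,r3 : r1 = (r2 << count) + r3.
 * The ISA allows a shift count of 1..4. */
static uint64_t shladd(uint64_t r2, unsigned count, uint64_t r3)
{
    assert(count >= 1 && count <= 4);
    return (r2 << count) + r3;
}

/* addp4 r1=r2,r3 : add the low 32 bits, zero the upper 32 bits,
 * then copy bits 31:30 of r3 into bits 62:61 of the result
 * (this is how a 32-bit pointer is mapped into a 64-bit region). */
static uint64_t addp4(uint64_t r2, uint64_t r3)
{
    uint64_t sum = (uint32_t)(r2 + r3);
    uint64_t region = (r3 >> 30) & 0x3;
    return sum | (region << 61);
}

/* Address used by the final "ld4 r8=[r8]":
 *   r8 = r40 * 4            (shladd r8=r40,2,r0; r0 always reads as 0)
 *   r8 = addp4(r8, r9)      (r9 was loaded from gp+36)
 */
uint64_t faulting_address(uint64_t r40, uint64_t r9)
{
    uint64_t r8 = shladd(r40, 2, 0);
    return addp4(r8, r9);
}
```

With gr40 = 0 and gr9 = 0 the result is 0, which matches gr8 at the faulting instruction. Put differently, the index is 0 and the word loaded from gp+36 was 0. The dump alone does not say whether that word is a program pointer that was never set or a linkage entry that was read incorrectly.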
zorin09
New Member

Re: nulptr dereferences trap

Dennis, I just want to make sure that you understood my question.

What I don't understand is why, even though the "nulptr dereferences trap" is disabled, the NULL pointer dereference is trapped on M1. Is there some kernel/environment-level setting that is enabling the "nulptr dereferences trap"? If I run chatr on the exe, I see "nulptr dereferences trap disabled".
Dennis Handly
Acclaimed Contributor
Solution

Re: nulptr dereferences trap

>What I don't understand is that even though the "nulptr dereferences trap" is disabled why the NULL pointer dereferencing is trapped on M1?

This should be impossible. In order to implement the -Z default, a special null pointer page is used by the kernel. All processes with -Z share it. And I would think it is impossible for a normal user to use mprotect(2) to change the permissions.

>Is there some kernel/environment level setting that is enabling the "nulptr dereferences trap".

Not that I know of.
Can you write a small test case that fails on M1?
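
A small test case along these lines might look like the sketch below. It is only a sketch: `probe_null_read` is a hypothetical helper name, it reads through a volatile null pointer (formally undefined behaviour in C, which is exactly what is being probed), and it catches SIGSEGV and SIGBUS so the outcome can be reported instead of dumping core. Only POSIX calls are used (`sigaction`, `sigsetjmp`, `siglongjmp`).

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction/sigsetjmp in strict modes */

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf probe_env;
static volatile sig_atomic_t probe_signal;

/* Record which signal arrived and jump back out of the faulting read. */
static void on_fault(int sig)
{
    probe_signal = sig;
    siglongjmp(probe_env, 1);
}

/* Returns 0 if a 4-byte read from address 0 completed,
 * otherwise the number of the signal that was raised. */
int probe_null_read(void)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int *volatile p = NULL;  /* volatile so the load is really emitted */
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);

    rc = sigaction(SIGSEGV, &sa, &old_segv);
    assert(rc == 0);
    rc = sigaction(SIGBUS, &sa, &old_bus);
    assert(rc == 0);
    (void)rc;

    probe_signal = 0;
    if (sigsetjmp(probe_env, 1) == 0) {
        int value = *p;               /* the null read under test */
        (void)value;
    }

    /* Put the previous handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return (int)probe_signal;
}
```

Call it from main() and print the result. Going by this thread, the expectation would be 0 on HP-UX when the executable has the default -Z setting, and a non-zero signal number after "chatr -z"; a non-zero result with the trap disabled would reproduce the M1 behaviour.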
zorin09
New Member

Re: nulptr dereferences trap

I can't run anything on M1. M1 is a production box. I'm not allowed to touch it.
Dennis Handly
Acclaimed Contributor

Re: nulptr dereferences trap

>I can't run anything on M1. I'm not allowed to touch it.

I suppose you could contact the Response Center to see if they have heard about bogus errors with -Z?
Is this executable on an NFS mount?

gr1: 0x90e1e08

This isn't valid. GR1 must be a valid 64-bit pointer at all times, and it must be 16-byte aligned.
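
To make the remark about GR1 concrete, here is the arithmetic on the value from the dump as a small C sketch. The helper names `is_aligned` and `upper_32_bits` are made up for illustration and do not come from any HP-UX header; the sketch only inspects the number itself and does not add anything about what the runtime requires beyond what is stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of alignment (alignment must be a power of two). */
bool is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Upper half of a 64-bit register value. */
uint32_t upper_32_bits(uint64_t value)
{
    return (uint32_t)(value >> 32);
}
```

For gr1 = 0x90e1e08 the low four bits are 0x8, so `is_aligned(0x90e1e08, 16)` is false while `is_aligned(0x90e1e08, 8)` is true, and `upper_32_bits(0x90e1e08)` is 0.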