Operating System - HP-UX

Attempt to access item beyond bounds of memory (Signal 11)

 
SOLVED
Aneesh Mohan
Honored Contributor


Hi all,

We are getting a COBOL application error message on our application server (vpar2). It started after the application server was rebooted and the memory was upgraded on the database server (vpar1).

Server :- HP-UX 11i v3, rx8640
Application :- COBOL
Error Message :- Attempt to access item beyond bounds of memory (Signal 11)


The error started after the following tasks were carried out on the hardware side:

Database Server (Vpar 1) :- Memory upgraded
Application Server (Vpar 2) :- Rebooted only

Both Vpar 1 and Vpar 2 are in the same nPar.

The kernel parameter values on the application and database servers are the same before and after the reboot.

If anyone has faced this problem before, please let me know.

Aneesh
Dennis Handly
Acclaimed Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

This could just be a latent bug that you need to track down.
Aneesh Mohan
Honored Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

Dear Dennis,

Could you please throw more light on your previous comment?

Aneesh
Dennis Handly
Acclaimed Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

>Could you please throw more light on your previous comment?

Any change on the system could alter configuration files or the contents of uninitialized memory, and either of these could cause the application to abort.

You need to assume this is a bug in the application and get a stack trace to see where the bad value is coming from.
Do you have wdb/gdb installed?
Aneesh Mohan
Honored Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

Dear Dennis,

I have gdb installed on the affected server.

I have asked the application team to generate a stack trace. Meanwhile, I have traced the application job with tusc.

tusc output
========
hpus72^root:/fnsqa1 > grep brk bank.out | grep signal
1285769710.2588030 [28314]{812628} Received signal 18, SIGCLD, in brk(), [caught], no siginfo
1285769721.5492824 [29067]{814838} Received signal 18, SIGCLD, in brk(), [caught], no siginfo
1285769721.6267955 [29074]{814854} Received signal 18, SIGCLD, in brk(), [caught], no siginfo
1285769721.6460949 [29076]{814859} Received signal 18, SIGCLD, in brk(), [caught], no siginfo
hpus72^root:/fnsqa1 >

Could this be the reason for the memory error?

Aneesh
Dennis Handly
Acclaimed Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

>Received signal 18, SIGCLD, in brk(), [caught], no siginfo

This just says that a child process died/exited while you were busy working. This is normal.

>Could this be the reason for the memory error?

No, unrelated. Note: This Signal 11 is a software error, nothing to do with RAM failing.
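
To separate the routine SIGCLD noise from the signal that matters, the tusc output can be filtered by script rather than by grep alone. The following is a minimal Python sketch, assuming the line layout seen in the trace above; the function name `non_routine_signals` and the choice of which signals count as routine are assumptions made for illustration only.

```python
import re

# Matches tusc lines such as:
#   1285769710.2588030 [28314]{812628} Received signal 18, SIGCLD, in brk(), [caught], no siginfo
# The first bracketed number is the pid; the number in braces is kept as-is.
SIGNAL_RE = re.compile(r"\[(\d+)\]\{(\d+)\} Received signal (\d+), (\w+),")

# Signals treated as routine in this sketch (an assumption, adjust as needed).
ROUTINE = {"SIGCLD", "SIGCHLD"}

def non_routine_signals(lines):
    """Return (pid, brace_id, signo, name) for every signal line that is not routine."""
    hits = []
    for line in lines:
        m = SIGNAL_RE.search(line)
        if m and m.group(4) not in ROUTINE:
            hits.append((int(m.group(1)), int(m.group(2)), int(m.group(3)), m.group(4)))
    return hits

sample = [
    "1285769710.2588030 [28314]{812628} Received signal 18, SIGCLD, in brk(), [caught], no siginfo",
    "1285772484.1638263 [29245]{815292} Received signal 11, SIGSEGV, in user mode, [caught], partial siginfo",
]
print(non_routine_signals(sample))
```

With the two sample lines, only the SIGSEGV entry for pid 29245 is reported, which gives the process to attach the debugger to.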
Aneesh Mohan
Honored Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

Hi Dennis,

I have noticed that the segmentation error always occurs at the same faulting address.

faulting address: 0xffffffffffff7824


First run:-
===========

834898 1286007085.1875483 [16711]{1483418} brk(0x40016000) ..................................... [entry]
834899 1286007085.1875832 [16701]{1483391} sigprocmask(SIG_SETMASK, 0x200000007fffd380, NULL) .. [entry]
834900 1286007085.1876389 [16212]{1482166} stat("extfh.cfg", 0x9fffffffffff9940) ............... = 0
834901 1286007085.1877169 [16256]{1482285} brk(0x6000000005335000) ............................. = 0
834902 1286007085.1877989 [16305]{1482422} read(10, 0x600000000042bf36, 2064) .................. [entry]
834903 1286007085.1878552 [16416]{1482619} read(10, "0496\0\006\0\0\0\0\0060102\001L ".., 2064) = 1174
834904 1286007085.1879359 [16474]{1482768} semop(70, 0x9fffffffffffc470, 1) .................... [entry]
834905 1286007085.1879896 [16516]{1482885} semop(70, 0x9fffffffffffc470, 2) .................... [entry]
834906 1286007085.1880437 [16622]{1483174} write(10, 0x600000000042c766, 40) ................... [entry]
834907 1286007085.1881253 [16566]{1483027} read(10, "04ab\0\006\0\0\0\0\0060102\001L ".., 2064) = 1195
834908 1286007085.1882174 [16707]{1483406} getmount_cnt(0x9fffffffffff7010) .................... = 12
834909 1286007085.1882796 [16667]{1483298} ftruncate(28, 0) .................................... = 0
834910 1286007085.1883271 [16711]{1483418} brk(0x40016000) ..................................... = 0
834911 1286007085.1883590 [16701]{1483391} sigprocmask(SIG_SETMASK, 0x7fffd380, NULL) .......... = 0
834912 1286007085.1884248 [16212]{1482166} Received signal 11, SIGSEGV, in user mode, [caught], partial siginfo
834913 1286007085.1884298 [16212]{1482166} Siginfo: si_code: SEGV_ACCERR, faulting address: 0xffffffffffff7824, si_errno: 0
834914 1286007085.1884368 [16212]{1482166} PC: 00000001000000a0.0 break.m 0x16000
834915 1286007085.1886208 [16256]{1482285} brk(0x6000000005337000) ............................. [entry]
834916 1286007085.1886807 [16305]{1482422} read(10, "06` \0\006\0\0\0\0\0060102\0\0' ".., 2064) = 1632
834917 1286007085.1887719 [16622]{1483174} write(10, "\0( \0\006\0\0\0\0\003N ca\0\0\0".., 40) . = 40




Second run:-
==========

1285772484.1635832 [29623]{816256} fcntl(49, F_SETLKW, 0x9fffffffffff9f00) [entry]
1285772484.1636261 [29559]{816105} fcntl(46, F_SETLK, 0x9fffffffffff95b0) = 0
1285772484.1636761 [29596]{816194} stat(0x60000000000477e0, 0x9fffffffffff9610) [entry]
1285772484.1637118 [29091]{814897} fstat(49, 0x9fffffffffff9630) [entry]
1285772484.1637533 [29141]{815022} stat(0x60000000000477e0, 0x9fffffffffff9610) [entry]
1285772484.1637878 [29191]{815151} fstat(49, 0x9fffffffffff9e20) [entry]
1285772484.1638263 [29245]{815292} Received signal 11, SIGSEGV, in user mode, [caught], partial siginfo
1285772484.1638336 [29245]{815292} Siginfo: si_code: SEGV_ACCERR, faulting address: 0xffffffffffff7824, si_errno: 0
1285772484.1638398 [29245]{815292} PC: 00000001000000a0.0 break.m 0x16000
1285772484.1640003 [29352]{815571} close(48) ............. = 0
1285772484.1640376 [29303]{815440} stat("/fns/pd/r/data/file/ACCTCLSE_E", 0x9fffffffffff9550) = 0


I would like to add one more point here. I copied the complete application to one of my test servers and ran the jobs successfully (5 times). So I am wondering whether there is some issue specific to this server, since there has been no code change.

Aneesh

Could any kernel parameter be causing this restriction?

Aneesh
Dennis Handly
Acclaimed Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

>I have noticed that the segmentation error always occurs at the same faulting address: 0xffffffffffff7824

gdb is a better tool to track this down. You can find where this address is coming from.
You can also look at the registers and memory.
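
Before going into gdb, it can help to look at the faulting address as a number. The Python sketch below is only arithmetic on the value from the tusc output, not a diagnosis: it shows that 0xffffffffffff7824 reads as a small negative number when taken as a signed 64-bit value, and that it is also what a 32-bit value of 0xffff7824 becomes when sign-extended to 64 bits. Which of these (if either) applies to the application is an assumption that the stack trace would have to confirm; the helper names are hypothetical.

```python
def as_signed(value: int, bits: int) -> int:
    """Interpret an unsigned value of the given width as two's complement."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

def sign_extend_32_to_64(value32: int) -> int:
    """Sign-extend a 32-bit value and return its 64-bit unsigned representation."""
    return as_signed(value32, 32) & 0xFFFFFFFFFFFFFFFF

fault = 0xffffffffffff7824            # faulting address from the tusc output

# Reading 1: the address as a signed 64-bit integer (a small negative number).
print(as_signed(fault, 64))

# Reading 2: a 32-bit value whose sign extension equals the faulting address.
print(sign_extend_32_to_64(0xffff7824) == fault)
```

Either reading would be consistent with a bad index, offset, or truncated pointer being used as an address, which is the kind of value to look for in the registers at the point of the fault.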

>Could any kernel parameter be causing this restriction?

The size of maxssiz could affect the memory region layout and the relative positions of each shared lib data area.

Are the libc (etc) versions the same on both systems?
Aneesh Mohan
Honored Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

Hi Dennis,

TEST SERVER:
============
hpus102^root:/ > what /usr/lib/libc.sl
/usr/lib/libc.sl:
$ PATCH_11.31/PHCO_38658 Apr 7 2009 21:48:10 $
hpus102^root:/ > what /usr/lib/libc.0
/usr/lib/libc.0:
PATCH-PHCO_32718 for 10.20; for 10.30, 11.x compatibility libc.1_ID@@/main/r10dav/libc_dav/libc_dav_cpe//1
/ux/core/libs/libc/shared_pa1/libc.1_ID
Feb 4 2005 10:03:03
hpus102^root:/ > what /usr/lib/libc.1
/usr/lib/libc.1:
PATCH-PHCO_32718 for 10.20; for 10.30, 11.x compatibility libc.1_ID@@/main/r10dav/libc_dav/libc_dav_cpe//1
/ux/core/libs/libc/shared_pa1/libc.1_ID
Feb 4 2005 10:03:03
hpus102^root:/ >

hpus102^root:/ > kctune |grep ^max
max_acct_file_size 2560000 Default Immed
max_async_ports 4096 Default Immed
max_mem_window 0 Default Immed
max_thread_proc 1100 1100 Immed
maxdsiz 0x80000000 0x80000000 Immed
maxdsiz_64bit 0xf0000000 0xf0000000 Immed
maxfiles (now) 2048 Default
maxfiles_lim 60000 60000 Immed
maxrsessiz 8388608 Default
maxrsessiz_64bit 8388608 Default
maxssiz 134217728 134217728 Immed
maxssiz_64bit 1073741824 1073741824 Immed
maxtsiz 1073741824 1073741824 Immed
maxtsiz_64bit 0x40000000 0x40000000 Immed
maxuprc 256 Default Immed
hpus102^root:/ >


Production Server:-
==================

hpus72^root:/ > what /usr/lib/libc.sl
/usr/lib/libc.sl:
$ PATCH_11.31/PHCO_38658 Apr 7 2009 21:48:10 $
hpus72^root:/ > what /usr/lib/libc.0
/usr/lib/libc.0:
PATCH-PHCO_32718 for 10.20; for 10.30, 11.x compatibility libc.1_ID@@/main/r10dav/libc_dav/libc_dav_cpe//1
/ux/core/libs/libc/shared_pa1/libc.1_ID
Feb 4 2005 10:03:03
hpus72^root:/ > what /usr/lib/libc.1
/usr/lib/libc.1:
PATCH-PHCO_32718 for 10.20; for 10.30, 11.x compatibility libc.1_ID@@/main/r10dav/libc_dav/libc_dav_cpe//1
/ux/core/libs/libc/shared_pa1/libc.1_ID
Feb 4 2005 10:03:03
hpus72^root:/ >

hpus72^root:/ > kctune |grep ^max
max_acct_file_size 2560000 Default Immed
max_async_ports 4096 Default Immed
max_mem_window 0 Default Immed
max_thread_proc 1100 1100 Immed
maxdsiz 0x80000000 0x80000000 Immed
maxdsiz_64bit 0xf0000000 0xf0000000 Immed
maxfiles 4096 4096
maxfiles_lim 60000 60000 Immed
maxrsessiz 8388608 Default
maxrsessiz_64bit 8388608 Default
maxssiz 134217728 134217728 Immed
maxssiz_64bit 1073741824 1073741824 Immed
maxtsiz 1073741824 1073741824 Immed
maxtsiz_64bit 0x40000000 0x40000000 Immed
maxuprc 13324 13324 Immed
hpus72^root:/ >
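
The two kctune listings above can also be compared by script instead of by eye. This is a minimal Python sketch, assuming the whitespace-separated layout shown in the listings, with the tunable name first and the current value next; the function names are hypothetical, and only a few of the rows are pasted in as sample data.

```python
def parse_kctune(text):
    """Map tunable name -> current value from kctune listing lines."""
    values = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Skip a "(now)" marker if present so the value is the next field.
        rest = [f for f in fields[1:] if f != "(now)"]
        values[fields[0]] = rest[0]
    return values

def diff_tunables(a, b):
    """Return {name: (value_a, value_b)} for tunables whose values differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}

# A few rows copied from the listings above (test server, then production).
test_server = """
maxdsiz 0x80000000 0x80000000 Immed
maxfiles (now) 2048 Default
maxssiz 134217728 134217728 Immed
maxuprc 256 Default Immed
"""
production = """
maxdsiz 0x80000000 0x80000000 Immed
maxfiles 4096 4096
maxssiz 134217728 134217728 Immed
maxuprc 13324 13324 Immed
"""
print(diff_tunables(parse_kctune(test_server), parse_kctune(production)))
```

On these rows only maxfiles and maxuprc differ between the two servers. Values are compared as strings, so a value written in hex on one server and decimal on the other would show up as a difference and would need normalizing first.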
Dennis Handly
Acclaimed Contributor

Re: Attempt to access item beyond bounds of memory (Signal 11)

>what /usr/lib/libc.sl

Unless you are running Aries for your COBOL app, you should be looking at /usr/lib/hpux32/libc.so.1.