Memory Fault (coredump)

 
Shawn D. Givler
Occasional Contributor

I'm getting a "Memory fault" with a core dump when running an f90-compiled binary (I don't have access to the source :( ) on HP-UX ia64 B.11.23. The core file reports the following:

% file core
core: ELF-64 core file - IA64 from 'ammbatch' - received SIGSEGV

which indicates an invalid memory reference.

User limits appear to be adequate:
% ulimit -a
cputime unlimited
filesize unlimited
datasize 1048576 kbytes
stacksize 98252 kbytes
coredumpsize 0 kbytes
descriptors 2048 files
memoryuse unlimited

And hardware inventory is:
CPU Type: Intel(R) Itanium 2 (zx6000)
CPU Addressability: 64bit
CPU Count: 2
CPU Clock: 1500 MHz
CPU Cache: 256 KB (L2), 6144 KB (L3)
Physical Memory: 12788 MB
Swap Space: 32245 MB

Are there any kernel parameters (see attachment) which may not be set appropriately which could be causing this?
A. Clay Stephenson
Acclaimed Contributor

Re: Memory Fault (coredump)

Nothing in the tunables jumps out at me. All of your limits should be big enough, with one exception: you need to increase coredumpsize so that a complete core file is created and the debugger has something to work with. Your first task should be to examine the core file and executable with a debugger (gdb?) and do a stack trace. That will at least point you to the offending function call. The file command is all but useless for this.

I suspect that this has very little to do with tunables and is instead bad code, missing environment variables, or code that is not correct for your target environment. Do the stack trace and then contact the vendor.
If it ain't broke, I can fix that.
Bill Hassell
Honored Contributor

Re: Memory Fault (coredump)

SIGSEGV most likely means that the program kept adding items to the stack until it overflowed. There are two possible programming errors. The first is runaway code that loops and eventually exceeds the maximum stack size allowed by the kernel. The second is that the code actually requires a lot more than 98 megs of stack and the programmer forgot to document this unusual requirement. If the maxssiz_64 value (this is a 64-bit program) is a lot larger (say 200 or 400 megs), use ulimit -s to raise the limit in the current environment. If maxssiz_64 is small (i.e., around 98 megs), change this kernel parameter to a much larger value and try the program again.

Or you can ask the programmer what maximum stack size is needed... and if it's hundreds of megs, ask why...
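
To make the ulimit -s step concrete, here is a minimal sketch for a POSIX-style shell (sh, ksh, bash). It prints the soft and hard stack limits and runs a command with the soft limit raised to the hard limit. The helper name run_with_max_stack is made up for illustration, and the path ./ammbatch is an assumption based on the name in the core file. The hard limit itself is capped by maxssiz_64, so if it is already around 98 megs this will not help until the kernel parameter is raised.

```shell
#!/bin/sh
# Sketch only: show the soft and hard stack limits, then run a command
# with the soft limit raised to the hard limit.

soft=$(ulimit -Ss)   # limit currently in effect
hard=$(ulimit -Hs)   # ceiling the shell is allowed to raise it to
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"

# run_with_max_stack is a hypothetical helper, not an HP-UX command.
# The subshell keeps the raised limit from leaking into the calling shell.
run_with_max_stack() {
    (
        ulimit -Ss "$hard" || exit 1
        "$@"
    )
}

# Example (binary name taken from the core file; the path is assumed):
# run_with_max_stack ./ammbatch
```

If the program still faults with the soft limit at its ceiling, that points toward raising maxssiz_64 or toward a problem that is not stack-related at all.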


Bill Hassell, sysadmin
Gerhard Roets
Esteemed Contributor

Re: Memory Fault (coredump)

Hi Shawn

Please run the file command on the binary you executed as well and post the output here.

Regards
Gerhard
Muthukumar_5
Honored Contributor

Re: Memory Fault (coredump)

Your core file size limit is set to:
coredumpsize 0 kbytes

Change it with % ulimit -c unlimited
so that the size of the core file is no longer limited.

Then check with gdb:

gdb ammbatch core

(gdb) bt
This will print the backtrace.


Analyse the backtrace further, and look into the maxssiz and maxssiz_* kernel parameters.
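
The two steps above can be sketched as a small script for a POSIX-style shell. The helper names enable_core_dumps and print_backtrace are made up for illustration. The -batch and -ex options are GNU gdb options; whether the gdb build on a given HP-UX system accepts them is an assumption, and if it does not, the interactive bt works the same way. Note also that the % prompt in the original post suggests csh or tcsh, where the built-in would be limit coredumpsize unlimited rather than ulimit.

```shell
#!/bin/sh
# Sketch only: lift the core size limit, then print a backtrace
# without entering an interactive gdb session.

# Raise the soft core-size limit as far as the hard limit allows.
enable_core_dumps() {
    ulimit -Sc unlimited 2>/dev/null || ulimit -Sc "$(ulimit -Hc)"
}

# print_backtrace is a hypothetical helper: $1 is the executable,
# $2 is the core file. -batch and -ex are GNU gdb options (assumed
# to be available in the local gdb).
print_backtrace() {
    exe=$1
    core=$2
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found in PATH" >&2
        return 1
    fi
    gdb -batch -ex bt "$exe" "$core"
}

enable_core_dumps
echo "core size limit is now: $(ulimit -Sc)"

# Example: rerun the program so that a full core is written, then
# print_backtrace ./ammbatch core
```

The limit has to be raised in the same shell session that launches the program, before the run that crashes, otherwise the new core file is still cut off.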


HTH.
Easy to suggest when don't know about the problem!