Operating System - Linux

Re: LINUX: segfault error 4

 
SOLVED
Jojo Castro
Regular Advisor

LINUX: segfault error 4

Hi All,

I am currently looking into a problem with our application developer. Apparently, their application stops after about every two hours of running in the background. The process is started in the background via fork. When I checked the messages log, I found this:

May 5 05:39:34 URM01 kernel: logging_app[17789]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe460 error 4
May 6 07:14:43 URM01 kernel: logging_app[10965]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 07:14:43 URM01 kernel: logging_app[10964]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 10:00:46 URM01 kernel: logging_app[12754]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe420 error 4
May 6 13:17:46 URM01 kernel: logging_app[16639]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 13:18:10 URM01 kernel: logging_app[16638]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 16:14:44 URM01 kernel: logging_app[18255]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 16:14:53 URM01 kernel: logging_app[18256]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 19:11:16 URM01 kernel: logging_app[10646]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4
May 6 19:11:23 URM01 kernel: logging_app[10648]: segfault at 0000000000000000 rip 0000003bb7861dd1 rsp 0000007fbfffe430 error 4

I have already searched Google for the meaning of segfault error 4, and somebody said that it has something to do with SELinux.

Here are my queries:
1.) Can someone tell me exactly what the error (segfault) means?
2.) It is strange that the application stops every two hours, which corresponds to the times the segfault errors are logged in messages.
3.) PLEASE PLEASE PLEASE give us a resolution to this problem.

Thanks in advance!
4 REPLIES
Matti_Kurkela
Honored Contributor
Solution

Re: LINUX: segfault error 4

1.) Segfault = segmentation fault = the application tried to access memory it is not allowed to access, such as an address that is not mapped into its address space or memory reserved for the kernel. The memory management unit in the CPU stops the operation and triggers an exception. The kernel's default response is to send the program a SIGSEGV signal, which kills it.

As the message is "segfault at 0000000000000000", I'd guess the program tried to dereference a NULL pointer, for example one that was never set to a valid address. It is very likely that your application has a fairly serious bug in it.

The "rip" value is the Instruction Pointer: the program location the CPU was running at the time of the error. It seems it is always exactly the same, so the error is repeatable - that is good.

The "rsp" is the Stack Pointer. Its value seems to vary just a little. If your developer is good, s/he will know whether this is important or not.
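
The fields described above, plus the trailing "error" number, can be pulled apart with a short script. This is only a sketch: the function names are made up for illustration, and the bit meanings are the x86 page-fault error code bits, with the number assumed to be printed in hexadecimal. Under that reading, "error 4" means a user-mode read of a page that is not present, which fits a NULL pointer read.

```python
import re

# x86 page-fault error code bits, as reported in the "error N" field.
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access, 1: write access
PF_USER = 1 << 2   # 0: kernel mode, 1: user mode
PF_RSVD = 1 << 3   # reserved bit was set in a page-table entry
PF_INSTR = 1 << 4  # fault happened on an instruction fetch

# Matches the old x86-64 log format seen in this thread ("rip"/"rsp").
SEGFAULT_RE = re.compile(
    r"(?P<name>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_error(code):
    """Return a list of human-readable descriptions for an error code."""
    parts = [
        "protection violation" if code & PF_PROT else "page not present",
        "write" if code & PF_WRITE else "read",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return parts


def parse_segfault(line):
    """Parse one kernel segfault log line; return None if it does not match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "pid": int(match.group("pid")),
        "addr": int(match.group("addr"), 16),
        "rip": int(match.group("rip"), 16),
        "rsp": int(match.group("rsp"), 16),
        "error": decode_error(int(match.group("err"), 16)),
    }
```

Feeding it each line from the messages log makes it easy to confirm that the faulting address and "rip" are identical across all crashes.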

2.) Not strange at all. After receiving a segfault, the program cannot continue.

3.) Without having the program source code, this is impossible. Your application developer will have to fix it him/herself.

However, there are some things you can do:

If possible, have your application developer produce a version of the application that includes debug information. If the application is compiled using gcc, this is as simple as adding the "-g" option to the compilation commands.

Before starting the application, run "ulimit -c unlimited". This allows the segfault handler to produce a core dump file when the segfault handler is triggered. This file contains all the memory used by the application, so it might be very big.
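
If the application is started from a wrapper script rather than an interactive shell, the same limit can be raised programmatically. A minimal sketch, assuming a Python launcher (the function name is hypothetical); it raises the soft core-file limit only as far as the hard limit allows, which is what an unprivileged process is permitted to do.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Returns the (soft, hard) pair in effect afterwards. Child processes
    started after this call inherit the new limit.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

If the hard limit itself is 0, no core file will be written no matter what the soft limit is, and the hard limit then has to be raised by root (for example in the system's limits configuration).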

Then your application developer needs to run a debugger program on the application and the core file. If the application was compiled with debug information, the debugger can identify exactly on what line of the source code the error happened. The developer can also use the debugger to examine the values of any variables at the time of the error. The debugger has many other features which might be useful too. If your developer does not know how to use a debugger, he/she should definitely learn it.

For Linux, the most common debugger program is named "gdb" and it is available in most Linux distributions. It is usually in the "development tools" category of the distribution's package collection.
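
For a quick first look, gdb can print a backtrace from the core file without an interactive session. The sketch below wraps that in Python; the function names are hypothetical, and the program and core file paths are placeholders to be replaced with the real ones. It uses gdb's `--batch` and `-ex` options to run the `bt` command and exit.

```python
import shutil
import subprocess


def gdb_backtrace_command(program, core):
    """Build the gdb command line that prints a backtrace and exits."""
    return ["gdb", "--batch", "-ex", "bt", program, core]


def print_backtrace(program, core):
    """Run gdb on a program and its core file; return gdb's text output."""
    if shutil.which("gdb") is None:
        raise RuntimeError("gdb is not installed or not on PATH")
    result = subprocess.run(
        gdb_backtrace_command(program, core),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

With a binary built with "-g", the backtrace should include source file names and line numbers for the frames leading to the crash.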

MK
Jojo Castro
Regular Advisor

Re: LINUX: segfault error 4

Hi MK,

Thanks for the information regarding segfault.
I have already forwarded your recommendation to our developers, and they will look into whether a pointer bug in their application is the cause.

Currently, these are my ulimit values:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 1024
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 137215
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited


Actually, we have already tried adding ulimit -c 1000 to the .bash_profile of the account that runs the program.

Action points:

1.) Run the program in debug mode
2.) Since there are two applications running at the same time, we will try to run only one application at a time. This might be the "locking" issue you mentioned.
3.) How can we use gdb?

Another question though: would adjusting some kernel parameters on the OS side help?

Thanks!
Jojo Castro
Regular Advisor

Re: LINUX: segfault error 4

btw, here are also the findings from Metalink, as our DBA also opened a case for this...

KERNEL PARAMETER
--------------------------------
semmsl 250 / 250 / OK
semmns 32000 / 32000 / OK
semopm 100 / 32 / ----> TOO LOW
semmni 128 / 128 / OK
--> kernel.sem = 250 32000 32 128

shmall 2097152 / 2097152 / OK
shmmax - / 33554432 / OK
shmmni 4096 / 4096 / OK
file-max 65536
ip_local_port_range 1024 - 65000 / 32768 - 61000 / OK
rmem_default 262144 / 135168 / ----> TOO LOW
rmem_max 262144 / 135168 / ----> TOO LOW
wmem_default 262144 / 135168 / ----> TOO LOW
wmem_max 262144 / 135168 / ----> TOO LOW



SEMOPM
-------------
The SEMOPM kernel parameter is used to control the number of semaphore operations that can be performed per semop system call.

The semop system call (function) provides the ability to do operations for multiple semaphores with one semop system call. A semaphore set can have the maximum number of SEMMSL semaphores per semaphore set and is therefore recommended to set SEMOPM equal to SEMMSL.

Oracle recommends setting the SEMOPM to a value of no less than 100.


ACTION PLAN
===========

1. The following kernel parameters on the system where Client 10.2.0.1 is used to run the application
must be increased as shown below. After that, the machine has to be rebooted - see

Oracle® Database Installation Guide
10g Release 2 (10.2) for Linux x86-64
Part Number B15667-03

http://download.oracle.com/docs/cd/B19306_01/install.102/b15667/pre_install.htm#BABCHAED

--------------------------------
semopm 100
rmem_default 262144
rmem_max 262144
wmem_default 262144
wmem_max 262144
---------------------------------

2. Oracle does NOT support user-generated makefiles - only shipped ones contained in

$ORACLE_HOME/precomp/demo/proc
$ORACLE_HOME/precomp/lib/


Thanks.
Jojo Castro
Regular Advisor

Re: LINUX: segfault error 4

Hi MK,

Just to give you an update: we found out that the application is holding so many files open that it hits the limit of 1024 open files.
Our application developer is now looking at the part of the application where the looping happens.

Thanks again for the info!
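
A descriptor leak like the one found here can be watched from outside the process by counting the entries under /proc/<pid>/fd and comparing them with the open files limit. A minimal sketch, assuming a Linux /proc filesystem; the function names are hypothetical. One plausible link to the crash, stated as an assumption only: once the limit is reached, calls such as fopen() fail and return NULL, and code that uses the result without checking it would then read through a NULL pointer.

```python
import os
import resource


def open_fd_count(pid="self"):
    """Count open file descriptors of a process via /proc (Linux only).

    For the calling process the count includes the descriptor that
    os.listdir itself briefly opens on the directory.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_headroom():
    """Return how many more descriptors this process can open, or None
    if the soft limit is reported as unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - open_fd_count()
```

Calling open_fd_count() with the PID of the running application every few minutes should show the count climbing steadily toward 1024 if files are opened in a loop and never closed. Reading another user's /proc/<pid>/fd normally requires root.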