
Out of memory: kill process 4387 (gnome-session)

 
Maaz
Valued Contributor


OS: SLES11 i586
kernel version: 2.6.27.x

Hardware: a dual-core Xeon processor with 2 GB of physical memory and 4 GB of swap.

/ is mounted on /dev/sda2
swap is on /dev/sda1

/dev/sdb1 ext3 mounted on /devel_1
/dev/sdb2 ext3 mounted on /devel_2

/dev/sdc5 ntfs mounted on /mnt

After installation, I started copying 6 directories (with subdirectories and a large number of files) totalling 53 GB from /mnt to /devel_1. When 50 GB had been copied to /devel_1, I started copying the remaining directories, 34 GB in total, from /mnt to /devel_2.

When almost 52.2 GB had been copied to /devel_1 and 7 GB to /devel_2, the copy operation aborted on its own (I was copying and pasting in the X GUI with the mouse).

The following messages are in /var/log/messages:

Jul 7 20:40:38 linux-6021 kernel: Active:58526 inactive:224713 dirty:603 writeback:0 unstable:0
Jul 7 20:40:38 linux-6021 kernel: free:5583 slab:185922 mapped:8563 pagetables:503 bounce:0
Jul 7 20:40:38 linux-6021 kernel: Node 0 DMA free:3524kB min:64kB low:80kB high:96kB active:92kB inactive:0kB present:15852kB pages_scanned:1496 all_unreclaimable? yes
Jul 7 20:40:38 linux-6021 kernel: lowmem_reserve[]: 0 872 1990 1990
Jul 7 20:40:38 linux-6021 kernel: Node 0 Normal free:3840kB min:3744kB low:4680kB high:5616kB active:1340kB inactive:692kB present:893200kB pages_scanned:4329 all_unreclaimable? yes
Jul 7 20:40:39 linux-6021 kernel: lowmem_reserve[]: 0 0 8946 8946
Jul 7 20:40:39 linux-6021 kernel: Node 0 HighMem free:14968kB min:512kB low:1712kB high:2912kB active:232672kB inactive:898160kB present:1145176kB pages_scanned:0 all_unreclaimable? no
Jul 7 20:40:39 linux-6021 kernel: lowmem_reserve[]: 0 0 0 0
Jul 7 20:40:39 linux-6021 kernel: Node 0 DMA: 3*4kB 3*8kB 0*16kB 1*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3524kB
Jul 7 20:40:39 linux-6021 kernel: Node 0 Normal: 76*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3640kB
Jul 7 20:40:39 linux-6021 kernel: Node 0 HighMem: 610*4kB 1084*8kB 87*16kB 56*32kB 6*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15064kB
Jul 7 20:40:39 linux-6021 kernel: 242306 total pagecache pages
Jul 7 20:40:39 linux-6021 kernel: 4530 pages in swap cache
Jul 7 20:40:39 linux-6021 kernel: Swap cache stats: add 9237, delete 4707, find 1212/1779
Jul 7 20:40:39 linux-6021 kernel: Free swap = 4174136kB
Jul 7 20:40:39 linux-6021 kernel: Total swap = 4192956kB
Jul 7 20:40:39 linux-6021 kernel: 523264 pages RAM
Jul 7 20:40:39 linux-6021 kernel: 293888 pages HighMem
Jul 7 20:40:39 linux-6021 kernel: 44083 pages reserved
Jul 7 20:40:39 linux-6021 kernel: 259053 pages shared
Jul 7 20:40:39 linux-6021 kernel: 234509 pages non-shared
Jul 7 20:40:39 linux-6021 kernel: Out of memory: kill process 4387 (gnome-session) score 4815 or a child
Jul 7 20:40:39 linux-6021 kernel: Killed process 5067 (nautilus)
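
One way to read the dump above is to convert its page counts to megabytes. The sketch below assumes 4 kB pages (the smallest block size listed in the buddy lists in the log). The helper name pages_to_mib is made up for illustration. All figures are copied from the log lines above.

```shell
#!/bin/sh
# Convert a page count to whole MiB, assuming 4 kB pages (an assumption;
# it matches the "4kB" smallest block in the buddy lists in the log).
pages_to_mib() {
    echo $(( $1 * 4 / 1024 ))
}

# Values copied from the /var/log/messages excerpt.
slab_pages=185922          # "slab:185922"
pagecache_pages=242306     # "242306 total pagecache pages"
normal_present_kb=893200   # "Node 0 Normal ... present:893200kB"
free_swap_kb=4174136       # "Free swap = 4174136kB"
total_swap_kb=4192956      # "Total swap = 4192956kB"

echo "slab:             $(pages_to_mib "$slab_pages") MiB"
echo "page cache:       $(pages_to_mib "$pagecache_pages") MiB"
echo "Normal zone size: $(( normal_present_kb / 1024 )) MiB"
echo "swap in use:      $(( (total_swap_kb - free_swap_kb) / 1024 )) MiB"
```

Read this way, slab is large compared with the Normal (low memory) zone, which the log marks all_unreclaimable, while swap is almost untouched. That would be consistent with low memory filling up on a 32-bit kernel rather than RAM plus swap running short, though the log alone does not show which kernel cache grew.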

Is it a bug?
Or a limitation of the i586 Linux kernel?
Or a hardware limitation?

Please help me understand the reason.

Regards
Maaz
12 REPLIES
Steven E. Protter
Exalted Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Shalom Maaz,

Run the free command please.

Let's be certain that it is not an actual shortage of memory.

What I suspect based on the output is a bug, some kind of a memory leak that caused a process or process group to consume a great deal of memory.

Probably with gnome.

I would think you are not the first to see this and there might be updated gnome packages out there.

Or it could be something else, another mean ugly application eating all of the memory.

It is probably not a hardware limitation or an i586 kernel problem.

Is the system still running? There are a lot of things we can do if the system is still running.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Maaz
Valued Contributor

Re: Out of memory: kill process 4387 (gnome-session)

>Run the free command please.
# free -m
             total       used       free     shared    buffers     cached
Mem:          1871       1822         49          0        129       1586
-/+ buffers/cache:        106       1765
Swap:         4094          0       4094

The output above was taken after I rebooted the server. By mistake, I rebooted without first running any diagnostic tools or commands.

>What I suspect based on the output is a bug, some kind of a memory leak that
>caused a process or process group to consume a great deal of memory.
>Probably with gnome.
OK.

>I would think you are not the first to see this and there might be
>updated gnome packages out there.
I didn't find any updates for the gnome packages at
http://support.novell.com/linux/psdb/i386SLESERVER11.html

>Or it could be something else, another mean ugly application eating all of
>the memory.
No other application is installed beyond the default SLES 11 installation, and no additional service was running (later I configured Samba, and the machine is now in production as a file/Samba server).

>It is probably not a hardware limitation or a i586 kernel problem.
OK.

>Is the system still running? There are a lot of things we can do if the system is
>still running.
No, unfortunately we rebooted the system when the copy operation aborted.

Regards
Steven E. Protter
Exalted Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Shalom again Maaz,

I suggest at this point that you update the system from SUSE network and begin taking regular diagnostics.

Of particular interest is vmstat; we need to see if the system is paging.

If you are really hot to find the cause and not just solve the problem, which is probably base OS, then don't update the system, and use cron to regularly capture the output of vmstat, top (to a file) and free.
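
A minimal sketch of such a collector, assuming the standard free, vmstat and top tools are installed. The function name collect_snapshot and the default log path are placeholders chosen for this example, not something from the thread.

```shell
#!/bin/sh
# Append one timestamped snapshot of memory statistics to a log file.
# The default log path is a placeholder chosen for this sketch.
collect_snapshot() {
    out=${1:-/tmp/memwatch.log}
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        for cmd in "free -m" "vmstat 1 3" "top -b -n 1"; do
            echo "--- $cmd"
            # $cmd is left unquoted on purpose so it splits into words.
            $cmd 2>&1 | head -n 30
        done
    } >> "$out"
}

collect_snapshot "$@"
```

Saved as a script (say /usr/local/sbin/memwatch.sh, a hypothetical path), it could be run from cron with an entry such as `*/10 * * * * /usr/local/sbin/memwatch.sh /var/log/memwatch.log`. In the vmstat section, the si and so columns show whether the system is swapping in or out.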

The Memory leak detector will help spot the name of top memory processes if you set it to run periodically to collect data.

http://www.hpux.ws/?p=8

Once the process is identified it's pretty easy to track it back to a particular application.

Now the memory leak detector has never been tested on SUSE. If ps works the same as on other Linux systems, it will work. If not, modification is required. We'll actually need to speak on the telephone or Internet Messaging to work that out.

I will modify the leak detector for you if it is needed.

SEP
Court Campbell
Honored Contributor

Re: Out of memory: kill process 4387 (gnome-session)

You ran out of virtual memory, so the OOM killer chose a process to kill in order to free up virtual memory. This is not a bug. It just so happens that gnome-session won the OOM killer lottery and was the process that got killed. There are a few ways to alleviate this, but IMO the easiest may be to add more RAM. You could go to approx 4 GB of RAM. Since it's a 32-bit OS, anything more than that is somewhat useless. You can always go the route of a PAE-enabled kernel, etc. But I don't think you are buying yourself anything.
"The difference between me and you? I will read the man page." and "Respect the hat." and "You could just do a search on ITRC, you don't need to start a thread on a topic that's been answered 100 times already." Oh, and "What. no points???"
Maaz
Valued Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Hi SEP,
>I suggest at this point that you update the system from SUSE network
>and begin taking regular diagnostics.

>If you are really hot to find the cause and not just solve the problem, which is
>probably base OS, then don't update the system and...
Actually this SLES11 box is a file server for Windows clients (previously we ran MS Windows 2k3 Server as the file server). Just after the installation, before registering with Novell and before configuring Samba, I was copying data from the NTFS drive (/dev/sdc5) to the ext3 drives (/dev/sdb1 and /dev/sdb2). The issue occurred only when almost 50 GB (out of 53 GB) had been copied to /dev/sdb1 (/devel_1) and 7 GB (out of 33 GB) to /dev/sdb2 (/devel_2). No such error has occurred again (obviously I haven't copied that much data in the GUI again).

I have now registered this machine, and it is fully updated.

>Of particular interest is vmstat, we need to see if the system is paging.
I ran vmstat (vmstat 1) during busy hours, and 'si' and 'so' are almost always '0'.

>We'll actually need to speak on the telephone or Internet Messaging to work that out.
I really appreciate your willingness to help me. ;-)

Hi Court Campbell
>You ran out of virtual memory so the oom killer chose a process to kill in
>order to free up virtual memory. This is not a bug
I think (only think, because I am not an expert like you guys) that it's a bug, because a process (in my case nautilus, doing the copy/paste) should not demand so much memory that the kernel (Out of memory:) has to kill it.

I think the kernel or the process (nautilus copy/paste) must know how much memory is installed and available to the system, and should not exceed that limit.

I think it is better for the process (in my case nautilus copy/paste) to take more time than to eat all the memory and have the operation aborted or killed by the kernel.

Regards
Maaz

Court Campbell
Honored Contributor
Solution

Re: Out of memory: kill process 4387 (gnome-session)

I am telling you it is not a bug. The OOM killer has a standard of weights and measures that it uses to decide what to kill. It is not by any means perfect, but it keeps your system from running out of pages. You might want to Google "Linux OOM Killer" and read up on it.

If I had to guess, since you were doing the copy in the GUI, which would have used nautilus, it killed it since the copy operation was using a lot of pages for buffering.
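
To see what those weights look like on a running system, the kernel exposes each process's current score in /proc/<pid>/oom_score (the "score 4815" in the log is this kind of value). The sketch below lists the highest-scoring processes. The function name top_oom_candidates and the PROC_ROOT override are inventions for this example; the override exists only so the function can be tried against a fake directory.

```shell
#!/bin/sh
# Print "score pid name" for the processes the OOM killer currently ranks
# highest. PROC_ROOT is overridable only to make the sketch easy to test.
top_oom_candidates() {
    proc_root=${PROC_ROOT:-/proc}
    count=${1:-10}
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        # The first line of /proc/<pid>/status is "Name:<tab>command".
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$dir/status" 2>/dev/null) || name='?'
        echo "$score ${dir##*/} $name"
    done | sort -rn | head -n "$count"
}

top_oom_candidates 10
```

On kernels of this generation a process could also be exempted by writing -17 to /proc/<pid>/oom_adj; newer kernels use oom_score_adj instead, so check which file your kernel provides before relying on either.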
"The difference between me and you? I will read the man page." and "Respect the hat." and "You could just do a search on ITRC, you don't need to start a thread on a topic that's been answered 100 times already." Oh, and "What. no points???"
Maaz
Valued Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Hi Court Campbell

>I am telling you it is not a bug.
OK.

>The OOM killer has a standard of weights and measures that it uses to
>decide what to kill. It is not by any means perfect, but it keeps your system
>from running out of pages. You might want to Google "Linux OOM Killer" and
>read up on it.

Thanks for the very nice explanation.

>If I had to guess, since you were doing the copy in the GUI, which would
>have used nautilus, it killed it since the copy operation was using a lot
>of pages for buffering.
Yes. From /mnt (ntfs) I was copying some selected directories, 53 GB in size, to /devel_1 (ext3, /dev/sdb1) and the remaining directories from /mnt, 33 GB in size, to /devel_2 (ext3, /dev/sdb2). When 51 GB had been copied to /devel_1 and 10 GB to /devel_2, the issue appeared, i.e. the copy operation stopped on its own.

I am a person with almost zero knowledge, and I usually believe the opinions and suggestions of you gurus. But with due respect, I feel this is a very bad feature of Linux. If a process (in my case the copy operation) asks for too much memory, the kernel should prevent it from taking more rather than kill it. I mean the system should impose a maximum limit on every process (depending on the installed memory), so that no process can starve the others or cross the limit, and this kind of process killing never happens.

Again, thanks for the nice input, feedback and suggestions.

Regards
Maaz
Court Campbell
Honored Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Can you post the info from /proc/sys/vm/overcommit_memory? Just run

cat /proc/sys/vm/overcommit_memory

"The difference between me and you? I will read the man page." and "Respect the hat." and "You could just do a search on ITRC, you don't need to start a thread on a topic that's been answered 100 times already." Oh, and "What. no points???"
Steven E. Protter
Exalted Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Based on the recent data, especially the fact the system is not paging, it seems like an unresolved problem in nautilus.

If nautilus is crashing consistently with out of memory and the system has plenty of memory or is not paging, the finger should be pointed to either the failing application or general kernel memory management.

I would now enable core dumps to disk, reproduce the problem, and send the core dump to Novell.

SEP
Maaz
Valued Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Hi Court Campbell
>Can you post the info from /proc/sys/vm/overcommit_memory?

# cat /proc/sys/vm/swappiness
60
# cat /proc/sys/vm/overcommit_memory
0
# cat /proc/sys/vm/overcommit_ratio
50

Hi SEP
>Based on the recent data, especially the fact the system is not paging, it seems
>like an unresolved problem in nautilus.
OK.

>If nautilus is crashing consistently with out of memory and the system has
>plenty of memory or is not paging, the finger should be pointed to either the
>failing application or general kernel memory management.
a, It happened just the once.
b, nautilus did not crash (I mean no crash message appeared), but the small window that shows the copy progress closed on its own, just as it does when a copy completes (e.g. you start copying with the mouse, and once all the data is copied the progress dialog disappears).
c, This machine is now a file server (Samba) and is working fine, with no crashes or performance issues. I have never copied such big directories in X (GNOME) again.

>I would look now enable core dumps to disk, duplicate the problem again and
>send the core dump to Novell.
I don't have that opportunity, because I have no spare box and I can't try this on the machine in question (it is in production).

Regards
Maaz
Steven E. Protter
Exalted Contributor

Re: Out of memory: kill process 4387 (gnome-session)

So Maaz,

This rather involved thread is due to a problem that occurred once. We didn't get memory data off the system because it was production and rebooted, presumably to get it back into service.

Since the problem can not be easily duplicated, allow me to present the possibility:

The error message was accurate: due to some process with a memory leak, the system really was out of memory, and nautilus crashed for exactly the reason it gave.

I therefore recommend that you run the memory leak detector an hour a week to identify processes that may be leaky. I further recommend you enable core dumps to disk and use the system normally until it happens again.

Good Luck with this.

It was a fun thread.

SEP
Court Campbell
Honored Contributor

Re: Out of memory: kill process 4387 (gnome-session)

Since you don't like overcommitting, you can echo the number 2 to /proc/sys/vm/overcommit_memory. I am sure this can be set permanently, but I am not sure where to do that on SUSE. This will disable overcommitting memory. Now if a process can't claim any pages it will just fail.
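
As a rough sketch of what mode 2 would mean on this box: in that mode the kernel caps committed address space at swap plus overcommit_ratio percent of RAM. The helper name commit_limit_mb is made up here, and the inputs are the `free -m` totals and the overcommit_ratio of 50 posted earlier in the thread. The commented commands are the usual ways to change the setting; whether SLES 11 applies /etc/sysctl.conf at boot is an assumption, not something verified in this thread.

```shell
#!/bin/sh
# Estimate the commit limit that strict overcommit (mode 2) would enforce:
#   CommitLimit = swap + RAM * overcommit_ratio / 100
commit_limit_mb() {
    ram_mb=$1; swap_mb=$2; ratio=$3
    echo $(( swap_mb + ram_mb * ratio / 100 ))
}

# Figures from the `free -m` output and overcommit_ratio posted earlier.
commit_limit_mb 1871 4094 50

# To switch modes at runtime (as root), either of these does the same thing:
#   echo 2 > /proc/sys/vm/overcommit_memory
#   sysctl -w vm.overcommit_memory=2
# To keep the setting across reboots, the usual place is /etc/sysctl.conf:
#   vm.overcommit_memory = 2
# and `sysctl -p` reloads that file without a reboot.
```

Keep in mind that overcommit accounting covers the address space processes ask for, not kernel caches such as slab or page cache, so this setting may not change the outcome seen in the log.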

My only other suggestion would be to use a shell. Forget about window managers. They just add bloat. I'll take a simple cp command, or an rsync command over the extra bloat added by doing a copy through nautilus.
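
A minimal sketch of what that could look like, assuming rsync may or may not be installed. The function name copy_tree and the directory name in the example call are hypothetical; only the mount points come from the thread.

```shell
#!/bin/sh
# Copy a directory tree from the shell instead of the file manager.
# Prefers rsync (can be re-run after an interruption) and falls back to
# cp -a if rsync is not installed.
copy_tree() {
    src=$1; dst=$2
    mkdir -p "$dst" || return 1
    if command -v rsync >/dev/null 2>&1; then
        # Trailing slash on the source copies its contents into $dst.
        rsync -a "$src"/ "$dst"/
    else
        # "/." copies the contents, including dot files, into $dst.
        cp -a "$src"/. "$dst"/
    fi
}

# Example call; the directory name "projects" is hypothetical.
# copy_tree /mnt/projects /devel_1/projects
```

If an rsync run is interrupted, running the same command again skips files that already match in size and modification time. Whether a shell copy would have avoided the OOM kill here is not established by the thread.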
"The difference between me and you? I will read the man page." and "Respect the hat." and "You could just do a search on ITRC, you don't need to start a thread on a topic that's been answered 100 times already." Oh, and "What. no points???"