
Re: System hanging

Manoj Sivan
Regular Advisor

System hanging

Hi,

We have a ProLiant ML370 server running Red Hat Enterprise Linux AS release 3 (Taroon Update 6).

I am worried about system hangs. When one happens, I cannot log in through SSH; processes die one by one, and finally the OS hangs completely.

The server is mainly an NIS master with automount. The NIS processes are among the last to die. During this time VNC stays active, so I can still manage the server, but in the end it dies too.

While investigating, I saw something irregular in the top output: system memory is almost fully consumed, yet swap is not used at all. Please help me with this.

Below is the output of top and /proc/swaps. Is anything else needed?

[root@fwsvr4 tmp]# more /proc/swaps
Filename Type Size Used Priority
/dev/sda2 partition 4016240 0 -1
/dev/sdb2 partition 4016240 0 -2
============================================
Thanks in advance

Manoj
5 REPLIES
Manoj Sivan
Regular Advisor

Re: System hanging

17:36:58 up 2 days, 10:43, 9 users, load average: 10.02, 10.05, 9.97
163 processes: 161 sleeping, 1 running, 1 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 0.1% 0.0% 0.4% 0.0% 0.0% 0.2% 98.9%
cpu00 0.1% 0.0% 0.3% 0.0% 0.1% 0.0% 99.2%
cpu01 0.3% 0.0% 0.0% 0.0% 0.0% 0.0% 99.6%
cpu02 0.0% 0.0% 1.3% 0.0% 0.0% 0.5% 98.0%
cpu03 0.1% 0.0% 0.1% 0.0% 0.0% 0.5% 99.0%
Mem: 5064172k av, 5037604k used, 26568k free, 0k shrd, 281712k buff
1042140k actv, 2848104k in_d, 112840k in_c
Swap: 8032480k av, 0k used, 8032480k free 4374156k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
5414 root 15 0 1252 1252 904 S 0.1 0.0 2:56 2 cmaeventd
6916 root 15 0 9904 9900 2024 S 0.1 0.1 1:14 1 Xvnc
1 root 15 0 496 496 436 S 0.0 0.0 0:06 0 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
4 root RT 0 0 0 0 SW 0.0 0.0 0:00 2 migration/2
5 root RT 0 0 0 0 SW 0.0 0.0 0:00 3 migration/3
6 root 15 0 0 0 0 SW 0.0 0.0 0:00 2 keventd
7 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
8 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
9 root 34 19 0 0 0 SWN 0.0 0.0 0:00 2 ksoftirqd/2
10 root 34 19 0 0 0 SWN 0.0 0.0 0:00 3 ksoftirqd/3
13 root 25 0 0 0 0 SW 0.0 0.0 0:00 1 bdflush
11 root 15 0 0 0 0 SW 0.0 0.0 0:03 0 kswapd
12 root 15 0 0 0 0 SW 0.0 0.0 0:18 1 kscand
14 root 15 0 0 0 0 SW 0.0 0.0 0:18 1 kupdated
Vitaly Karasik_1
Honored Contributor

Re: System hanging

Regarding memory/swap usage: that is OK. The Linux kernel simply uses all available RAM for caching/buffering.
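
As a rough illustration of that point, here is a minimal Python sketch that adds up the free, buffer, and cache figures from the top output posted above. The function and variable names are made up for this example, and treating all buffers and cache as reclaimable is a simplifying assumption (some cached pages may be dirty or pinned), so read the result as an estimate only.

```python
def reclaimable_kb(free_kb, buffers_kb, cached_kb):
    """Estimate memory the kernel could hand back to applications, in kB.

    Assumption: buffers and page cache are fully reclaimable.
    """
    return free_kb + buffers_kb + cached_kb


# Figures copied from the top output posted in this thread (all in kB).
total_kb = 5064172
free_kb = 26568
buffers_kb = 281712
cached_kb = 4374156

available_kb = reclaimable_kb(free_kb, buffers_kb, cached_kb)
used_by_apps_kb = total_kb - available_kb

print("Estimated available: %d kB" % available_kb)
print("Estimated used by applications: %d kB" % used_by_apps_kb)
```

On these numbers, most of the "used" memory is cache and buffers rather than application memory, which would be consistent with swap staying at 0k.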

As for system hangs - do you see something interesting in /var/log/messages?

Manoj Sivan
Regular Advisor

Re: System hanging

Hi Vitaly,

Thanks for the reply. /var/log/messages shows no errors of any kind.

Is the following message in dmesg useful?

scsi singledevice 7 4 0 4
scsi singledevice 7 4 0 5
scsi singledevice 7 4 0 6
scsi singledevice 7 4 0 7
Fusion MPT misc device (ioctl) driver 2.05.16.02
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
mtrr: type mismatch for fc000000,800000 old: uncachable new: write-combining
mtrr: type mismatch for fc000000,800000 old: uncachable new: write-combining
audit_intercept: error 14, killing task

Awaiting your reply.

Manoj
Vitaly Karasik_1
Honored Contributor
Solution

Re: System hanging

After some googling I found this thread: http://linux.derkeiler.com/Mailing-Lists/RedHat/2005-10/109index.html. It reports audit killing programs.
So if you don't need auditing, stop it.
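
To confirm how often this is happening, one option is to scan saved dmesg output for the "audit_intercept: error ..., killing task" message quoted earlier in the thread. The following is only a sketch: the function name and the sample text are illustrative, and the pattern is based solely on the single line posted here, so other kernel versions may word the message differently.

```python
import re

# Pattern taken from the one dmesg line quoted in this thread (assumption:
# the message format is the same for every occurrence).
AUDIT_KILL = re.compile(r"audit_intercept: error (\d+), killing task")


def audit_kill_errors(log_text):
    """Return the error codes of all audit-kill messages found in log_text."""
    return [int(m.group(1)) for m in AUDIT_KILL.finditer(log_text)]


# Sample input built from lines posted in this thread.
sample = """mptctl: Registered with Fusion MPT base driver
mtrr: type mismatch for fc000000,800000 old: uncachable new: write-combining
audit_intercept: error 14, killing task
"""

errors = audit_kill_errors(sample)
print("audit kill events: %d, error codes: %s" % (len(errors), errors))
```

In practice the input would be the text of dmesg saved to a file and read in on any machine with Python available. A growing count of such lines would suggest the audit subsystem is the one killing the tasks.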

Rgds,
Vitaly
Manoj Sivan
Regular Advisor

Re: System hanging

Hi Vitaly,

Your solution worked, and to date the server has not hung. Much appreciated.

Thanks and Regards

Manoj