Secure OS Software for Linux

Re: RedHat ES3 server crashing

 
tyreman
New Member

RedHat ES3 server crashing

We have a new HP ML370 G5 Quad Core Xeon server running RedHat ES3 which randomly crashes every few days.

The messages that appeared on my screen were:

-bash: [: too many arguments
-bash: [: : integer expression expected

Whenever we tried any command such as "uptime", "top", "who" or "ps", the system responded with the message "Killed".

Upon startup we had the following message:
smartd: smartd startup failed

Any help would be greatly appreciated!!
Alexander Chuzhoy
Honored Contributor

Re: RedHat ES3 server crashing

As for smartd, you can simply disable it with "chkconfig smartd off".
It fails to start in this version of RedHat when there are SAS or other relatively new hard drives in the system.

Could you post the content of the /etc/bashrc, ~/.bash_profile and ~/.bashrc files? It seems like someone edited at least one of these files.

Also, what do you mean by "server crashes"? Does it actually reboot or go down, or does it just print the above messages?
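
For reference, the two bash messages in the original post are what the `[` builtin prints when it is handed bad operands. The sketch below is only an illustration: the helper names and the threshold of 50 are hypothetical and are not taken from the poster's startup files. It assumes a profile script that compares the output of some command numerically, and shows how an empty or multi-word value (for example when the command was killed) could produce each message.

```shell
#!/bin/bash
# Illustrative only: these helpers are hypothetical, not from the poster's files.
# "$1" stands in for command output such as $(who | wc -l).

# Unquoted: a value like "Killed process" splits into two words, so bash
# reports "[: too many arguments".
unquoted_test() { [ $1 -gt 50 ]; }

# Quoted: an empty value reaches [ as a single empty operand, so bash
# reports "[: : integer expression expected".
quoted_test() { [ "$1" -gt 50 ]; }

# Guarded: reject empty or non-numeric input before the numeric comparison.
guarded_test() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 50 ]
}
```

If the startup files turn out to contain tests of this kind, the guarded form would keep the login quiet, but the messages would then be a symptom of commands failing rather than the cause of the problem.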
Steven E. Protter
Exalted Contributor

Re: RedHat ES3 server crashing

Shalom,

Which update of ES3 is this?

Updates below Update 4 will crash due to hardware issues, though it should not have installed at all.

Check /var/log/messages for clues time-stamped around the time of the crash.
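
One way to narrow that search, as a rough sketch: the two helper functions below are hypothetical, and the pattern assumes the kernel logs kills in the form "Out of Memory: Killed process PID (name)." as in the excerpt posted later in this thread.

```shell
#!/bin/bash
# oom_kills [FILE]: print every OOM-killer line from a syslog file.
# Defaults to /var/log/messages when no file is given.
oom_kills() {
    grep 'Out of Memory: Killed process' "${1:-/var/log/messages}"
}

# oom_victims [FILE]: print just the name of each killed process.
oom_victims() {
    oom_kills "$1" | sed -n 's/.*Killed process [0-9]* (\([^)]*\)).*/\1/p'
}
```

Piping `oom_victims` through `sort | uniq -c` gives a quick count of which programs are being killed, and the timestamps from `oom_kills` show when each episode starts.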

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
tyreman
New Member

Re: RedHat ES3 server crashing

Thanks for the advice so far. The update is Update 8.
tyreman
New Member

Re: RedHat ES3 server crashing

We're using ES3 U8.
/var/log/messages shows the following data, which repeats over and over. When this happens, users are kicked off the server. New logons are incredibly slow, if they can log on at all, and the only cure appears to be rebooting the server. It is then fine for 2 or 3 more days, until the whole process repeats itself.

Data from messages follows:

Feb 12 16:30:21 bushlinux kernel: Free swap: 0kB
Feb 12 16:30:22 bushlinux kernel: 523871 pages of RAM
Feb 12 16:30:22 bushlinux kernel: 294487 pages of HIGHMEM
Feb 12 16:30:22 bushlinux kernel: 10418 reserved pages
Feb 12 16:30:22 bushlinux kernel: 2689813 pages shared
Feb 12 16:30:22 bushlinux kernel: 73 pages swap cached
Feb 12 16:30:22 bushlinux kernel: Out of Memory: Killed process 2434 (bash).
Feb 12 16:30:24 bushlinux kernel: Mem-info:
Feb 12 16:30:24 bushlinux kernel: Zone:DMA freepages: 2922 min: 0 low: 0 high: 0
Feb 12 16:30:24 bushlinux kernel: Zone:Normal freepages: 1276 min: 1278 low: 4543 high: 6303
Feb 12 16:30:24 bushlinux kernel: Zone:HighMem freepages: 254 min: 255 low: 4600 high: 6900
Feb 12 16:30:24 bushlinux kernel: Free pages: 4452 ( 254 HighMem)
Feb 12 16:30:24 bushlinux kernel: ( Active: 325089/3145, inactive_laundry: 0, inactive_clean: 1, free: 4452 )
Feb 12 16:30:24 bushlinux kernel: aa:0 ac:0 id:0 il:0 ic:0 fr:2922
Feb 12 16:30:24 bushlinux kernel: aa:138933 ac:1389 id:3095 il:0 ic:1 fr:1276
Feb 12 16:30:24 bushlinux kernel: aa:182965 ac:1789 id:63 il:0 ic:0 fr:254
Feb 12 16:30:24 bushlinux kernel: 2*4kB 4*8kB 4*16kB 2*32kB 4*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11688kB)
Feb 12 16:30:24 bushlinux kernel: 6*4kB 3*8kB 8*16kB 22*32kB 4*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 5104kB)
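
To put the page counts above into more familiar units, here is a small conversion sketch. It assumes a 4 kB page size, which is consistent with the log itself (the smallest free block listed is 4kB, and the DMA zone's 2922 free pages match the 11688kB total on its free-list line). The function names are made up for this example.

```shell
#!/bin/bash
PAGE_KB=4   # assumed page size in kB

# Convert a page count to kB or to whole MB (rounded down).
pages_to_kb() { echo $(( $1 * PAGE_KB )); }
pages_to_mb() { echo $(( $1 * PAGE_KB / 1024 )); }

pages_to_mb 523871   # total RAM:   2046 MB
pages_to_mb 294487   # HIGHMEM:     1150 MB
pages_to_mb 4452     # free pages:    17 MB
```

So at the time of the kill the box had roughly 2 GB of RAM with about 17 MB free and no free swap, and the Normal and HighMem zones were both just under their "min" watermarks (1276 vs 1278, 254 vs 255).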
skt_skt
Honored Contributor

Re: RedHat ES3 server crashing

Feb 12 16:30:21 bushlinux kernel: Free swap: 0kB
Feb 12 16:30:22 bushlinux kernel: Out of Memory: Killed process 2434 (bash).

How is the swap/memory usage during the problematic window?
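
One possible way to capture that, sketched under the assumption that /proc/meminfo on this kernel has the usual "MemFree:" and "SwapFree:" lines. The function name and the log path in the comment are hypothetical.

```shell
#!/bin/bash
# mem_snapshot [FILE]: print free memory and free swap (in kB) from a
# meminfo-style file. Defaults to /proc/meminfo.
mem_snapshot() {
    awk '/^(MemFree|SwapFree):/ { printf "%s %s kB\n", $1, $2 }' "${1:-/proc/meminfo}"
}

# Example of logging one line per minute (the path is only a placeholder):
#   while true; do
#       echo "$(date) $(mem_snapshot | tr '\n' ' ')"
#       sleep 60
#   done >> /tmp/memwatch.log
```

Left running for a couple of days, a log like this would show whether free memory and swap drain steadily before the kills start, which would point towards a leaking process rather than a sudden spike.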


As a test, you can simulate a crash with the SysRq facility. Enable sysrq and follow this article:
http://kbase.redhat.com/faq/FAQ_80_5559.shtm

The 'c' character will simulate a crash. Time how long the core takes to be written, so you have a guideline if the server crashes again. Do not manually reboot until AFTER the core is created.