
Rodrigo_33
New Member

Performance problems on ML350G3

Hello there,

I'm having a strange and serious performance problem with Red Hat 7.3
running on an HP ML350 G3 dual Xeon server with 1.5 GB of RAM and three
RAID disks on a 641 controller.

The server seems to work all right for three, four, even six days. Then
it becomes absolutely unresponsive, but it does not crash. It will not
even display a console screen.

I decided to run the free command from cron every 10 minutes, log the
output, and plot line graphs to analyze the problem. After rebooting
the server I can see:
1. It takes 2.5 hours for free memory to drop from 1.2 GB to 11 MB.
2. Cached memory grows to take almost all free memory and then starts a
steady (45 degree) decline that ends at around 75 MB, at which point
the server becomes unresponsive.
3. Between 12 and 24 hours before the server becomes unresponsive,
memory buffers drop sharply and then decline steadily, reaching values
of 1120K.
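
For anyone who wants to reproduce this kind of logging, here is a minimal sketch in Python. It is not the script used above (that was plain free from cron); it reads /proc/meminfo instead, and the function names, the CSV column order, and the log path in the cron line further down are all assumptions made for illustration. It is written for a current Python, not the interpreter that ships with Red Hat 7.3.

```python
import os
import time

# Fields to log, in CSV column order (an arbitrary choice for this sketch).
FIELDS = ("MemFree", "Buffers", "Cached", "SwapTotal", "SwapFree")


def parse_meminfo(text):
    """Return {field: value_in_kB} for lines such as 'MemFree:  11264 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not 'Name: <number> ...', e.g. header lines.
        if not sep or not parts or not parts[0].isdigit():
            continue
        values[name.strip()] = int(parts[0])
    return values


def log_line(values, now=None):
    """Build one CSV line: timestamp, FIELDS..., swap used (all in kB)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    cols = [str(values.get(field, "")) for field in FIELDS]
    swap_used = values.get("SwapTotal", 0) - values.get("SwapFree", 0)
    return ",".join([stamp] + cols + [str(swap_used)])


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        print(log_line(parse_meminfo(fh.read())))
```

Saved as, say, memlog.py (hypothetical name), it could be run with a cron entry like "*/10 * * * * python3 /path/to/memlog.py >> /var/log/memlog.csv". Writing swap used as its own column avoids having to derive it later from the total and free columns.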

This server runs Apache, Tomcat, Postgres, AWStats and TapeWare backup
software.
Any help will be greatly appreciated, since I really do not know where
to start.
Could it be Postgres memory settings?
Kernel memory settings?

The graphs and the table of memory values are in the attached Excel file.

thank you all
KristofH
Frequent Advisor

Re: Performance problems on ML350G3

Can you have a look at the output of 'top'?
Maybe you can find the process that is taking up the resources.

I had a similar problem with my swap file being filled up completely, which made the system unresponsive because disk access is so slow compared to RAM access.
Steven E. Protter
Exalted Contributor

Re: Performance problems on ML350G3

Based on the hardware, I do not think you should be having any problem with that software set.

Checklist:
* Make sure patches are installed.
* Run up2date or yum and be on the latest patch set and kernel

I do believe that there could be an interaction between the postgres database and the kernel.

I'm wondering if usage is heavy. In this circumstance, I would lean toward thinking an application has a memory leak. The key to isolating it is shutting down the various components one at a time and seeing if memory stops leaking.

Also, check your httpd logs for strange entries that could indicate a denial of service attack. I've recently been forced to harden my servers again due to the latest attack.

This attack is an attempt to force a buffer overflow in Apache and gain root access. It has the effect of cluttering up the logs with HTTP GETs and POSTs for sites you don't host.

I've seen it crash the kernel on Linux machines.

I've developed a protection setup against it that works. Let me know what is in the access_log for the httpd server.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Rodrigo_33
New Member

Re: Performance problems on ML350G3

Steven and Kristof,

First of all thanks for your replies.

Kristof, after rebooting the server all swap seems to be in use, but that usage then decreases steadily to 0 while buffer/cache memory grows to use almost all available memory. So it is not a problem of swap being filled up. It seems swap is not being used and only RAM is used.

Steven:
Starting from your answer I decided to focus on the Postgres - Red Hat interaction. I found some tuning and performance documents and compared them with my installation. In /etc/sysctl.conf I found:

kernel.shmmax=268435456
kernel.shmall=268435456

And

vm.bdflush = 100 1200 128 0 500 5000 80 50 0

These settings are not present in another Red Hat 7.3 installation I have, and the values there are certainly not these. Could this be the problem?
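
A note on reading the two shared memory values: kernel.shmmax is expressed in bytes, while kernel.shmall is expressed in pages, so the same number means very different things in each. The sketch below just does the arithmetic. The 4096-byte page size is an assumption (the usual value on 32-bit x86), and the helper names are made up for this example. Neither setting allocates memory by itself; they are only upper limits, so this is a way to sanity-check the values, not evidence that they cause the problem.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual size on 32-bit x86


def bytes_to_mib(n_bytes):
    """Convert a byte count (kernel.shmmax) to whole MiB."""
    return n_bytes // 2**20


def pages_to_gib(n_pages, page_size=PAGE_SIZE):
    """Convert a page count (kernel.shmall) to whole GiB."""
    return n_pages * page_size // 2**30


shmmax = 268435456  # kernel.shmmax from the post: largest single segment, bytes
shmall = 268435456  # kernel.shmall from the post: system-wide total, pages

print(bytes_to_mib(shmmax), "MiB per segment")       # prints: 256 MiB per segment
print(pages_to_gib(shmall), "GiB system-wide cap")   # prints: 1024 GiB system-wide cap
```

So with these values a single segment may be up to 256 MiB, and the system-wide cap is far above the 1.5 GB of RAM in the machine, which means it is effectively no cap at all.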
Rodrigo_33
New Member

Re: Performance problems on ML350G3

Kristof,

You are right! Swap is full, 1 GB full, and so is all the available RAM.
My mistake: in the Excel table I sent, the free and used column names for swap were swapped, so the graphs I made were lying. I'm attaching the file again.
More and more I think the problem may be in the virtual memory management.

thank you
KristofH
Frequent Advisor
Solution

Re: Performance problems on ML350G3

Well, your swap could be filling up for different reasons. My guess is that some application is going crazy and eating up all the memory. Your best bet is to run 'top' regularly to see which application is eating up the swap (and thus the CPU, which makes your system slow as hell).

Also, a very good replacement for top is htop; you can find it at: http://htop.sourceforge.net/

good luck, and keep us posted.
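
If running top by hand at the right moment is awkward, roughly the same ranking can be logged from a script. The following is only a sketch under some assumptions: it reads the Name, VmSize and VmRSS fields from /proc/<pid>/status, the function names are invented for this example, and it targets a current Python. It ranks by resident memory, not swap, so a process that has been mostly swapped out will show up through a large VmSize with a small VmRSS.

```python
import os


def parse_status(text):
    """Extract Name, VmSize and VmRSS (kB) from one /proc/<pid>/status text."""
    info = {"Name": "?", "VmSize": 0, "VmRSS": 0}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or key not in info:
            continue
        parts = rest.split()
        if key == "Name":
            info[key] = rest.strip() or "?"
        elif parts and parts[0].isdigit():
            info[key] = int(parts[0])
    return info  # kernel threads have no Vm* lines and keep the 0 defaults


def top_by_rss(status_texts, limit=5):
    """Return the `limit` processes with the largest resident set size."""
    procs = [parse_status(text) for text in status_texts]
    procs.sort(key=lambda p: p["VmRSS"], reverse=True)
    return procs[:limit]


def read_all_status(proc_dir="/proc"):
    """Read every /proc/<pid>/status that is readable right now."""
    texts = []
    if not os.path.isdir(proc_dir):
        return texts
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue
        try:
            path = os.path.join(proc_dir, entry, "status")
            with open(path, errors="replace") as fh:
                texts.append(fh.read())
        except OSError:
            pass  # the process exited between listdir() and open()
    return texts


if __name__ == "__main__":
    for p in top_by_rss(read_all_status()):
        print("%-16s rss=%8d kB  vsz=%8d kB" % (p["Name"], p["VmRSS"], p["VmSize"]))
```

Run from cron next to the memory log, this gives a record of which processes were largest in the hours before the machine stops responding. Keep in mind that a server which forks many children, such as Apache, appears as many separate entries, and that summing their VmRSS counts shared pages more than once.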
Rodrigo_33
New Member

Re: Performance problems on ML350G3

Kristof, Steven,

I have found some answers to my problem. Following your advice I found that Apache is not freeing memory. I decided to take services down one by one (Steven's suggestion), betting that Postgres would be the problem, but it was not.
When I stop Apache, almost all swap is freed, and RAM too. I wonder why that is. The best explanation I've heard so far is a problem in the Tomcat<->Apache interaction: connections between them live forever. I'm upgrading Apache from 1.3.31 to 2.0.53 and Tomcat 4 to Tomcat 5.5; we'll see.

thanks
Steven E. Protter
Exalted Contributor

Re: Performance problems on ML350G3

I would say focus on Apache and get it patched.

There has to be a patch for this, even if you have to compile Apache anew.

http://httpd.apache.org

SEP