
IPCS Help

 
MARK OWEN
Occasional Contributor

IPCS Help

I need your help troubleshooting a vendor's application. The vendor claims that we need to reboot the server once every two weeks for their software to remain stable. That seems silly to me on a Unix host, so I decided not to take their advice. After about three weeks the application started slowing down. I checked the system (using top) and found neither CPU nor memory load to be particularly high. My next idea is to check shared memory, but I have to admit I don't know how to track down my problem using ipcs. Any help with developing a strategy for this would be appreciated.
6 REPLIES
Chris Wilshaw
Honored Contributor

Re: IPCS Help

The question of reboots comes up quite often.

As far as HP-UX is concerned, you should only need to reboot for hardware failures/upgrades, power failures/outages, and patching/software installation that changes the kernel.

On most of our systems, we find that just restarting the application itself is sufficient to clear out any memory locks etc that it has caused.

Use Glance or a similar monitoring tool to help you check the system out. The trial version can be found on the application pack CDs.
A. Clay Stephenson
Acclaimed Contributor

Re: IPCS Help

If you don't have it already, I would load the 30-day trial version of Glance. That is going to be your best weapon.

ipcs -ma will reveal the shared memory segments, but that may not be of much use unless you know something about the underlying software. Top is not that good a tool for judging memory usage because it only knows about memory related to processes.

Having said all this, the correct approach to this problem is to ask the vendor a very simple question: "Why?". I can assure you that a vendor would only make that request of me once because that indicates very sloppy coding practices. The one exception to this would be if you or your users have been trained to get rid of those pesky processes via kill -9. There is no way cleanup can be done in that case but that is a problem for a different baseball bat.

If it ain't broke, I can fix that.
John Poff
Honored Contributor

Re: IPCS Help

Hi Mark,

Good for you! I'm glad to see somebody resisting the silly notion of rebooting just "because". Those silly ideas are part of what I call "voodoo system administration".

It sounds like your vendor has a program with a memory leak, and instead of fixing it they just make everybody reboot their systems on a regular basis. You say you've checked your memory usage and it looks ok. You might try monitoring the memory usage of any processes that the application has running all the time. I'd suggest doing a 'ps -el' on the application processes at regular intervals (30 minutes or 60 minutes) and writing the output to a log file. Then you can pull up the values for individual processes and see if the SZ column is growing, which could indicate a memory leak.
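
The logging idea above could be sketched roughly as follows. This is only an illustration, not a tested HP-UX tool: the function name sz_growth, the log path and the interval are placeholders, and it assumes the fields up to and including SZ in 'ps -el' are never blank, so that the header position of PID and SZ matches the data rows. It reads a log made of repeated 'ps -el' snapshots and prints each PID whose SZ was larger in the last snapshot than in the first one it appeared in.

```shell
# Collect snapshots first, for example (path and interval are placeholders):
#   while :; do date; ps -el; sleep 1800; done >> /var/tmp/ps_sz.log

# sz_growth: read concatenated `ps -el` snapshots on stdin and list the
# processes whose SZ grew between their first and last appearance.
sz_growth() {
    awk '
    # A header line contains the literal column name PID; remember where
    # PID and SZ sit so the script does not hard-code column numbers.
    {
        is_header = 0
        for (i = 1; i <= NF; i++) if ($i == "PID") is_header = 1
    }
    is_header {
        pidcol = 0; szcol = 0
        for (i = 1; i <= NF; i++) {
            if ($i == "PID") pidcol = i
            if ($i == "SZ")  szcol = i
        }
        next
    }
    # Data row: keep the first and the most recent SZ seen for each PID.
    pidcol && szcol && $pidcol ~ /^[0-9]+$/ && $szcol ~ /^[0-9]+$/ {
        pid = $pidcol
        if (!(pid in first)) first[pid] = $szcol + 0
        last[pid] = $szcol + 0
        cmd[pid] = $NF
    }
    END {
        for (pid in first)
            if (last[pid] > first[pid])
                printf "%s %s %d -> %d\n", pid, cmd[pid], first[pid], last[pid]
    }' | sort -n
}
```

Usage would be something like 'sz_growth < /var/tmp/ps_sz.log'. Keep in mind that PIDs get reused over long periods, so a hit here is a hint to look closer, not proof of a leak.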

As for the ipcs, you can use it to look at semaphores, shared memory, and message queues. You probably won't run into too many problems there, unless they are using up resources without releasing them. I'd try doing an 'ipcs -ma' first, which will show you all the shared memory segments. The OWNER and CREATOR columns will show you which user made them, and the NATTCH column will show you how many processes are attached to each segment. You could run into a problem where the application is trying to release a shared memory segment but a process is holding it open. In that case, you would see an uppercase D (for delete) at the beginning of the MODE column, and the NATTCH column would not be zero. That means the shared memory segment has been flagged for deletion, but one or more processes are holding it open. The only way out of it is to figure out which processes have the segment and kill them.
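
To pick out the "flagged for deletion but still attached" case described above, a small filter along these lines might help. It is a sketch under assumptions: the function name pending_delete_segments is made up, and it relies on the 'ipcs -ma' header line starting with T and naming the ID, MODE, OWNER and NATTCH columns, with shared memory rows starting with m. Check that against the output on your own system before trusting it.

```shell
# pending_delete_segments: read `ipcs -ma` output on stdin and print the
# shared memory segments whose MODE starts with D while NATTCH is non-zero.
pending_delete_segments() {
    awk '
    # Header line: locate the columns by name instead of by position.
    $1 == "T" {
        for (i = 1; i <= NF; i++) {
            if ($i == "ID")     idcol = i
            if ($i == "MODE")   modecol = i
            if ($i == "OWNER")  ownercol = i
            if ($i == "NATTCH") nattchcol = i
        }
        next
    }
    # Shared memory rows start with "m"; report the ones marked for
    # deletion (leading D in MODE) that still have processes attached.
    $1 == "m" && modecol && nattchcol &&
    substr($modecol, 1, 1) == "D" && $nattchcol + 0 > 0 {
        printf "id=%s owner=%s mode=%s nattch=%d\n",
               $idcol, $ownercol, $modecol, $nattchcol
    }'
}
```

You would run it as 'ipcs -ma | pending_delete_segments'. It only tells you which segments are stuck; finding the processes that hold them still takes Glance or some knowledge of the application.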

JP


Sridhar Bhaskarla
Honored Contributor

Re: IPCS Help

I would change any vendor who asks me to reboot my servers every two weeks. It implies that their software is not stable and they need to work on it. It seems to me that they know about the problem but are not willing to fix it, and get away with having the server rebooted instead.

There are different ways to track down memory leaks. One way is to run 'UNIX95= ps -e -o "sz args"' and keep track of the first column (process size) for the processes of interest. If it keeps growing over a period of time, that process is one of the suspects.

To find out about the shared memory segments, do an "ipcs -mob" and note down the NATTCH column. It is the number of processes attached to that segment. A value of 0 may (but does not always) mean the segment is a candidate for scrutiny.
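
As a rough sketch of the NATTCH check, the filter below lists segments with no attached processes and adds up their sizes. The function name unattached_segments is hypothetical, and it assumes the 'ipcs -mob' header line starts with T and names the ID, OWNER, NATTCH and SEGSZ columns. As noted above, a zero NATTCH does not by itself mean the segment is leaked, so treat the output as a list of things to look at.

```shell
# unattached_segments: read `ipcs -mob` output on stdin, print every shared
# memory segment with NATTCH equal to 0, then a count and a byte total.
unattached_segments() {
    awk '
    # Header line: find the columns by name.
    $1 == "T" {
        for (i = 1; i <= NF; i++) {
            if ($i == "ID")     idcol = i
            if ($i == "OWNER")  ownercol = i
            if ($i == "NATTCH") nattchcol = i
            if ($i == "SEGSZ")  szcol = i
        }
        next
    }
    # Shared memory rows start with "m"; keep those with nothing attached.
    $1 == "m" && nattchcol && szcol &&
    $nattchcol ~ /^[0-9]+$/ && $nattchcol + 0 == 0 {
        printf "id=%s owner=%s bytes=%d\n", $idcol, $ownercol, $szcol
        total += $szcol
        count++
    }
    END {
        printf "unattached segments: %d, total bytes: %d\n", count, total
    }'
}
```

Run it as 'ipcs -mob | unattached_segments' at intervals; a count or byte total that keeps climbing while the application is running is what would be worth showing the vendor.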

-Sri

You may be disappointed if you fail, but you are doomed if you don't try
MARK OWEN
Occasional Contributor

Re: IPCS Help

Thanks for the great responses. This vendor has always been a real sore spot within our organization, but we have so much data archived using their software that it is very difficult to get out from under them. Hopefully what you have given me will help substantiate my claim.
Wodisch_1
Honored Contributor

Re: IPCS Help

Hi,

get yourself a copy of the HP Response Center tool "shminfo" at:
ftp://contrib:9unsupp8@hprc.external.hp.com/sysadmin/programs/shminfo/

That will help you detect fragmentation of your shared memory.

HTH,
Wodisch