How to determine what is causing a memory leak?

 
Gino Castoldi_2
Honored Contributor

Hi,

Mgmt Svr: HP-UX 11.0, VPO 6.14, NNM 6.2, Oracle 8.1.7.2, MC/SG 11.09. Two-node cluster running OVO 6.14. Each L2000 server has 4GB of memory.

We believe we have a memory leak on our system.

We added 2GB of additional memory 9 days ago.

A 'vmstat 5' consistently showed 800MB in the "free" column.

Now (9 days later) a 'vmstat 5' shows only 80MB of "free" memory.

Question: how do we determine what is causing the memory leak?

10 points to any good answer.
Thank you
Gino
7 REPLIES
John Poff
Honored Contributor
Solution

Re: How to determine what is causing a memory leak?

Hi Gino,

I've used the output from the 'ps -el' command to track the size of a program in memory. The tricky part is first to identify the program with the potential memory leak, but if your memory is mostly gone now it will probably be one of the processes using the most memory right now.

You can do this:

ps -el | sort -nr -k10 | head

to see the processes using the most memory. Once you have a likely candidate, you can track it with 'ps -lp #####' where ##### is the PID of the suspect process. I just put that command in a little shell script loop with a sleep and dumped the output to a file. I used this method once to find a memory leak in an Oracle web program.
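
The loop described above might look something like the following. This is only a sketch, not the original script: the function name track_pid, its arguments, and the log path in the usage comment are all made up for illustration.

```shell
#!/bin/sh
# Sketch: sample one suspect process at a fixed interval and append
# each sample to a log file. All names here are hypothetical.
track_pid() {
    pid=$1        # PID of the suspect process
    interval=$2   # seconds to wait between samples
    count=$3      # number of samples to take
    log=$4        # file the samples are appended to

    i=0
    while [ "$i" -lt "$count" ]; do
        date >> "$log"
        # The SZ column of 'ps -l' is the one to watch.
        # Stop early if ps reports failure, e.g. the process has exited.
        ps -l -p "$pid" >> "$log" || break
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example: one sample every 5 minutes for 24 hours
# track_pid 12345 300 288 /var/tmp/pid12345.log
```

If the SZ value in the log keeps climbing from sample to sample and never levels off, that process is a good suspect.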

Have fun!

JP
Pete Randall
Outstanding Contributor

Re: How to determine what is causing a memory leak?

Chris Wilshaw
Honored Contributor

Re: How to determine what is causing a memory leak?

By using

UNIX95= ps -eopid,vsz,user,args |sort -rnk2

You get the PID, the memory use in KB of a process, the owner and the command (with available arguments), sorted to display the processes using the most memory first.

Running this at intervals will allow you to see which processes are consuming memory. You can then look for patches (from HP or your application vendor) to address the problem.
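
To compare two such runs, something along these lines may help. It is a rough sketch that assumes each run was redirected to a file; the function name vsz_growth and the file names before.txt and after.txt are invented for the example.

```shell
#!/bin/sh
# Sketch: compare two saved snapshots taken with
#   UNIX95= ps -eopid,vsz,user,args > before.txt   (and later > after.txt)
# and list the processes whose VSZ grew, largest growth first.
# Output columns: growth in KB, PID, first word of the command.
vsz_growth() {
    before=$1
    after=$2
    awk '
        # First file: remember the VSZ of every PID (header line is skipped
        # because its first field is not numeric).
        NR == FNR { if ($1 ~ /^[0-9]+$/) old[$1] = $2; next }
        # Second file: report PIDs seen in both snapshots that got bigger.
        ($1 in old) && ($2 + 0 > old[$1] + 0) {
            print $2 - old[$1], $1, $4
        }
    ' "$before" "$after" | sort -rn
}

# Example:
# vsz_growth before.txt after.txt | head
```

Processes that started or exited between the two snapshots are ignored, since there is nothing to compare them against.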
Stefan Farrelly
Honored Contributor

Re: How to determine what is causing a memory leak?

Ways to track down a memory leak:

1. Ensure it's not the OS. Do this by ensuring that dbc_max_pct is set to a fixed value (not too big; a buffer cache of around 400MB is sufficient) and that vx_ninode is not set to 0. Set it to an explicit value.

2. Reboot the server. Don't start any apps. Check how much memory is free using vmstat (the free column is free memory in pages; multiply by 4096 for the size in bytes).

3. Start an app. Check free memory so you can see how much the app is using. Shut down the app; the free memory total should go back to what it was before you started the app.

4. Continue with each app in turn, only starting and stopping one at a time. If free memory does not return to the same level after you've shut down an app, then that app has a memory leak.
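
The before/after comparison in steps 2 to 4 can be scripted roughly as follows. This is a sketch only: the function names are made up, and it assumes the HP-UX vmstat layout, where "free" is the fifth column, and the 4096-byte page size mentioned in step 2.

```shell
#!/bin/sh
# Sketch: read free memory from vmstat and convert it to megabytes.
# Function names are hypothetical.

# Free memory in pages, taken from the last line vmstat prints
# (the first data line is an average since boot, so two samples are taken).
# Assumes "free" is the 5th column, as in HP-UX vmstat output.
free_pages() {
    vmstat 1 2 | awk '{ f = $5 } END { print f }'
}

# Pages to MB, assuming 4096-byte pages: 4 KB per page, 1024 KB per MB.
pages_to_mb() {
    echo $(( $1 * 4 / 1024 ))
}

# Example, for one application:
# before=$(free_pages)
# ... start the app, let it run, shut it down ...
# after=$(free_pages)
# echo "before: $(pages_to_mb "$before") MB  after: $(pages_to_mb "$after") MB"
```

As a check on the arithmetic, 20480 free pages works out to 80MB and 204800 pages to 800MB.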
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Kevin Wright
Honored Contributor

Re: How to determine what is causing a memory leak?

Check the RSS (resident set size) of your processes. If this value keeps growing over time for a process, that process could be your culprit.
A. Clay Stephenson
Acclaimed Contributor

Re: How to determine what is causing a memory leak?

You need to set up a cron job that runs periodically (every 20 mins or so) and gathers statistics:

e.g.
date >> /var/tmp/ps.log
UNIX95= ps -e -o vsz,pid,comm >> /var/tmp/ps.log
echo "" >> /var/tmp/ps.log


The first column is the size in KB of the process. You then write a script to extract the entries by pid and note the ones that grow over time; those will be the 'leakers'. Note: It is perfectly normal for some processes to grow over time as more and more data is cached. It all depends upon how the program was written and what the programmer's intent was. You should also run ipcs -m over time and see if the number/size of shared memory segments grows.
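
A script of that kind could be sketched as below. It assumes the log was written by the three commands above (a date line, the ps output with VSZ, PID and COMMAND columns, then a blank line); the function name leak_candidates is made up for the example.

```shell
#!/bin/sh
# Sketch: scan a log of repeated "ps -e -o vsz,pid,comm" samples and list
# the processes whose size was larger in the last sample than in the first.
# Output columns: growth in KB, first size, last size, PID, command.
leak_candidates() {
    awk '
        # Data rows have a numeric VSZ and a numeric PID; date lines,
        # the ps header and blank separator lines do not match.
        NF >= 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            # Key on PID plus command so a reused PID is not mistaken
            # for the same process.
            key = $2 " " $3
            if (!(key in first)) first[key] = $1
            last[key] = $1
        }
        END {
            for (key in first)
                if (last[key] + 0 > first[key] + 0)
                    print last[key] - first[key], first[key], last[key], key
        }
    ' "$1" | sort -rn
}

# Example:
# leak_candidates /var/tmp/ps.log | head
```

This only compares the first and last samples for each process, so a process near the top of the list is a candidate, not a verdict. Check the log itself to see whether the growth is steady or levels off.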


If it ain't broke, I can fix that.
Alzhy
Honored Contributor

Re: How to determine what is causing a memory leak?

On servers where dynamic buffer cache allocation is not disabled AND your Oracle instance is using cooked filesystems (AND your instance is heavy DSS/DW AND the VxFS mount options are not set to force direct I/O), you will see a semblance of a "memory leak". This happens when, at some point in the life of the OS, it has experienced very heavy filesystem I/O, which is perhaps the biggest user of memory. Check your Glance output and you will see that the memory allocated to the buffer cache is way too high.

Solution: If your instance leans towards the DSS/DW type, employ direct I/O on your VxFS filesystems (mount options mincache=direct,delaylog,convosync=direct). Also disable dynamic buffer caching and fix the cache at between 5% and 10% of total memory (dbc_max_pct = dbc_min_pct, somewhere from 5 to 10).

Hakuna Matata.