Operating System - HP-UX

Memory 99% full only 10% used by processes according ps

 
SOLVED
Franky Leeuwerck_1
Regular Advisor

Hi,

I have a performance problem on an HP-UX 11 box.

- Total RAM : 512 MB
- Machine has not been rebooted for 6 months.
- Occasionally a kill -9 command has been issued.
- dbc_max_pct is 20
- vhand process takes most of disk I/O (Glance)

I understand there is not enough RAM available for this system, but I am puzzled by this:

a) Glance shows that memory usage is 99% all the time.
b) The sum of the SZ column in the ps -efl output multiplied by 4096 bytes (the page size) is only 50 MB ( = total size of memory occupied?).
So, where has all the rest of the memory gone?

Thanks in advance

Franky
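
The SZ arithmetic in b) can be sketched as follows. This is a minimal illustration, assuming SZ is reported in 4096-byte pages as the post implies. The sample listing is made up for the example and is not output from this machine. It is meant to be run on any host with Python 3 against saved ps -efl output.

```python
PAGE_SIZE_BYTES = 4096  # assumption: 4 KB pages, as used in the post

# Made-up sample in the usual `ps -efl` column layout, for illustration only.
SAMPLE_PS_EFL = """\
  F S      UID   PID  PPID  C PRI NI     ADDR    SZ    WCHAN    STIME TTY       TIME COMD
  3 S     root     0     0  0 128 20   7a1b20     0        -   Jul 20 ?         0:12 swapper
141 S     root     1     0  0 168 20  4092a40   100   400003   Jul 20 ?         0:01 init
  1 S   ingres  2001     1  0 154 20  41f3c80  2460   7ffe60 11:51:04 ?         1:02 iidbms
"""


def sum_sz_megabytes(ps_output: str, page_size: int = PAGE_SIZE_BYTES) -> float:
    """Sum the SZ column (in pages) and convert the total to megabytes."""
    lines = ps_output.strip().splitlines()
    # Locate SZ by name in the header; every column before it is non-empty.
    sz_index = lines[0].split().index("SZ")
    total_pages = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > sz_index and fields[sz_index].isdigit():
            total_pages += int(fields[sz_index])
    return total_pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    print(f"{sum_sz_megabytes(SAMPLE_PS_EFL):.1f} MB")  # 2560 pages -> 10.0 MB
```

By the same arithmetic, a 50 MB total corresponds to 12800 pages of 4096 bytes.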
12 REPLIES
ian Dennison
Regular Advisor

Re: Memory 99% full only 10% used by processes according ps

try ipcs -ma, for shared memory areas.
Lets do it to them before they do it to us! www.fred.net.nz
Stefan Farrelly
Honored Contributor

Re: Memory 99% full only 10% used by processes according ps

vhand is using most of the disk I/O because it's swapping memory to and from disk, i.e. you've run out of memory.

swapinfo -mt

shows your memory usage. If you have device swap in use then this is how much short of memory you are.

You have glance installed, so run the gpm (GUI) version and go to memory report - this shows a better summary of memory usage by kernel, users, shared memory, etc. And gpm has a process list window which you can sort by memory usage which is a much better way to show large processes in memory.
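
The rule of thumb above (device swap in use is roughly how much memory you are short) can be sketched like this. It is only an illustration: the function name is hypothetical, and the sample figures are the swapinfo -tm values posted later in this thread.

```python
# swapinfo -tm figures as posted later in this thread (values in MB).
SAMPLE_SWAPINFO = """\
TYPE AVAIL USED FREE USED LIMIT RESERVE PRI NAME
dev 600 173 427 29% 0 - 1 /dev/vg00/lvol2
dev 600 178 422 30% 0 - 1 /dev/vg00/lvswap2
reserve - 147 -147
memory 364 364 0 100%
total 1564 862 702 55% - 0 -
"""


def device_swap_used_mb(swapinfo_output: str) -> int:
    """Add up the USED column of every 'dev' line."""
    total = 0
    for line in swapinfo_output.splitlines():
        fields = line.split()
        # Columns: TYPE AVAIL USED ...; only device swap lines are counted.
        if len(fields) >= 3 and fields[0] == "dev" and fields[2].isdigit():
            total += int(fields[2])
    return total


if __name__ == "__main__":
    print(device_swap_used_mb(SAMPLE_SWAPINFO), "MB of device swap in use")  # 351
```

With those figures, 173 + 178 = 351 MB is sitting on device swap.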
Im from Palmerston North, New Zealand, but somehow ended up in London...
Franky Leeuwerck_1
Regular Advisor

Re: Memory 99% full only 10% used by processes according ps

Thanks for helping me towards a solution.

This is the output from swapinfo and ipcs:

swapinfo -tm
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev         600     173     427   29%       0       -    1  /dev/vg00/lvol2
dev         600     178     422   30%       0       -    1  /dev/vg00/lvswap2
reserve       -     147    -147
memory      364     364       0  100%
total      1564     862     702   55%       -       0    -


ipcs -ma
IPC status from /dev/kmem as of Tue Jan 20 12:45:20 2004
T    ID KEY        MODE        OWNER  GROUP CREATOR CGROUP NATTCH    SEGSZ  CPID  LPID    ATIME    DTIME    CTIME
Shared Memory:
m     0 0x4118061a --rw-rw-rw- root   root  root    root        0      348   391   391 13:05:03 13:05:03 13:04:56
m     1 0x4e0c0002 --rw-rw-rw- root   root  root    root        2    31040   391   393 18:17:53 13:05:03 13:04:56
m     2 0x411c0514 --rw-rw-rw- root   root  root    root        2     8192   391   393 18:17:53 13:04:56 13:04:56
m   403 0x411c4c13 --rw-rw-rw- root   sys   root    sys         1     8785   769   769 14:20:38 no-entry 14:20:38
m  2004 0x0c6629c9 --rw-r----- root   sys   root    sys         3 11944188   626 20055 12:08:49 12:15:43 14:17:31
m     5 0x06347849 --rw-rw-rw- root   root  root    root        1    77384  1156   630 14:17:45 14:17:31 13:05:53
m   406 0xffffffff --rw-r--rw- root   root  root    root        0    22908  1125  1125 13:05:55 13:05:55 13:05:55
m  8207 0x411c545e --rw-rw-rw- root   sys   root    sys         1     8785   768 20713 23:00:21 no-entry 14:20:36
m   208 0xd9901620 --rw------- ingres sys   ingres  sys        76   155648 28605 28431  7:30:36 no-entry 11:50:42
m   209 0xd9901626 --rw------- ingres sys   ingres  sys        76 26976256 28614 28431  7:30:36 no-entry 11:50:45
m   210 0xd9901627 --rw------- ingres sys   ingres  sys        25    49152 28614 28638 11:50:47 no-entry 11:50:46
m   211 0xd9900482 --rw------- ingres sys   ingres  sys        25   925696 28661 28686 11:51:06 no-entry 11:51:04
m   212 0xd9900c29 --rw------- ingres sys   ingres  sys        50 29958144 28661 28728 12:00:55 no-entry 11:51:13
m   213 0xd9900ca5 --rw------- ingres sys   ingres  sys         2     8192 28661 28704 11:51:36 no-entry 11:51:14
m   214 0xd9900ca6 --rw------- ingres sys   ingres  sys         2   212992 28661 28704 11:51:37 no-entry 11:51:17
m   215 0xd9900ca7 --rw------- ingres sys   ingres  sys        25   466944 28704 28711 11:51:27 no-entry 11:51:25
m  2016 0x2f100002 --rw------- root   sys   root    sys         0  1286144 25410 29298  0:31:15 no-entry  0:31:15
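
The shared memory total can be checked by adding up the SEGSZ column (bytes) of the ipcs -ma listing above. The following is a rough sketch, not a general ipcs parser: it assumes every segment has exactly the 15 fields shown in the header, and it reads the listing as a stream of tokens so it does not matter whether each entry is on one line or wrapped onto two.

```python
FIELDS_PER_SEGMENT = 15  # T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME
SEGSZ_FIELD = 9          # zero-based position of SEGSZ within a segment record

# The ipcs -ma listing from this post, one segment per line.
SAMPLE_IPCS = """\
T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME
Shared Memory:
m 0 0x4118061a --rw-rw-rw- root root root root 0 348 391 391 13:05:03 13:05:03 13:04:56
m 1 0x4e0c0002 --rw-rw-rw- root root root root 2 31040 391 393 18:17:53 13:05:03 13:04:56
m 2 0x411c0514 --rw-rw-rw- root root root root 2 8192 391 393 18:17:53 13:04:56 13:04:56
m 403 0x411c4c13 --rw-rw-rw- root sys root sys 1 8785 769 769 14:20:38 no-entry 14:20:38
m 2004 0x0c6629c9 --rw-r----- root sys root sys 3 11944188 626 20055 12:08:49 12:15:43 14:17:31
m 5 0x06347849 --rw-rw-rw- root root root root 1 77384 1156 630 14:17:45 14:17:31 13:05:53
m 406 0xffffffff --rw-r--rw- root root root root 0 22908 1125 1125 13:05:55 13:05:55 13:05:55
m 8207 0x411c545e --rw-rw-rw- root sys root sys 1 8785 768 20713 23:00:21 no-entry 14:20:36
m 208 0xd9901620 --rw------- ingres sys ingres sys 76 155648 28605 28431 7:30:36 no-entry 11:50:42
m 209 0xd9901626 --rw------- ingres sys ingres sys 76 26976256 28614 28431 7:30:36 no-entry 11:50:45
m 210 0xd9901627 --rw------- ingres sys ingres sys 25 49152 28614 28638 11:50:47 no-entry 11:50:46
m 211 0xd9900482 --rw------- ingres sys ingres sys 25 925696 28661 28686 11:51:06 no-entry 11:51:04
m 212 0xd9900c29 --rw------- ingres sys ingres sys 50 29958144 28661 28728 12:00:55 no-entry 11:51:13
m 213 0xd9900ca5 --rw------- ingres sys ingres sys 2 8192 28661 28704 11:51:36 no-entry 11:51:14
m 214 0xd9900ca6 --rw------- ingres sys ingres sys 2 212992 28661 28704 11:51:37 no-entry 11:51:17
m 215 0xd9900ca7 --rw------- ingres sys ingres sys 25 466944 28704 28711 11:51:27 no-entry 11:51:25
m 2016 0x2f100002 --rw------- root sys root sys 0 1286144 25410 29298 0:31:15 no-entry 0:31:15
"""


def total_segsz_bytes(ipcs_output: str) -> int:
    """Sum SEGSZ over all segments listed after the 'Shared Memory:' marker."""
    marker, found, body = ipcs_output.partition("Shared Memory:")
    if not found:
        raise ValueError("no 'Shared Memory:' section in input")
    tokens = body.split()
    if len(tokens) % FIELDS_PER_SEGMENT != 0:
        raise ValueError("unexpected number of fields in listing")
    total = 0
    for start in range(0, len(tokens), FIELDS_PER_SEGMENT):
        record = tokens[start:start + FIELDS_PER_SEGMENT]
        if record[0] != "m":
            raise ValueError(f"record does not start with 'm': {record}")
        total += int(record[SEGSZ_FIELD])
    return total


if __name__ == "__main__":
    total = total_segsz_bytes(SAMPLE_IPCS)
    print(total, "bytes =", round(total / (1024 * 1024), 1), "MB")  # 72140798 bytes = 68.8 MB
```

That is where the figure of about 70 MB of shared memory comes from; the three largest segments (IDs 2004, 209 and 212) account for nearly all of it.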

Roughly calculated:
 512 MB
- 25 MB Buff Cache (Glance)
- 70 MB Shared Memory
- 50 MB Processes (Sum : ps -efl SZ x 4096 )
=======
 367 MB
This leaves me with about 367 MB of RAM that is not used, although Glance tells me that memory is 99% occupied. What is this 367 MB used for then ?


Franky


Bill Hassell
Honored Contributor
Solution

Re: Memory 99% full only 10% used by processes according ps

In Glance, the S_____S area in the bar graph shows you the kernel memory, probably a couple of hundred megs. Then there are memory mapped files and other inter-process communication areas. Look at the 'm' screen in Glance. At the bottom, you'll see the System memory which is for the kernel. Glance does a much better job in assigning memory than standard ps. Compare ps -efl output, with the following:

UNIX95= ps -eo vsz,pid,args | sort -rn

The first set of numbers is Kbytes. Add all of those together and see how it compares to ps -efl (the XPG4 behavior from UNIX95= is much easier to parse) and then compare to Glance's summary values for memory. If you read the help screen for this page, you'll see that some structures for dynamically allocated memory are not counted with the system. With so little memory, these values will make more of a difference. Glance will assign these structures to the GBL_MEM_USER (or U____U section of the graph).
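
Adding up the first column can be sketched as below. This is only an illustration, assuming the vsz values are in Kbytes as stated above. The two sample lines are the iidbms entries quoted later in this thread, the header line is an assumption about what the listing prints, and the function name is hypothetical.

```python
# Two entries quoted later in this thread, in `vsz pid args` order.
SAMPLE_PS_VSZ = """\
114984 28661 /disk1/ingres/bin/iidbms
56264 28704 /disk1/ingres/bin/iidbms
VSZ PID COMMAND
"""


def total_vsz_mb(ps_output: str) -> float:
    """Sum the leading vsz column (Kbytes) and return megabytes."""
    total_kb = 0
    for line in ps_output.splitlines():
        fields = line.split(None, 2)  # vsz, pid, rest of the command line
        # Skip the header and blank lines: only numeric vsz values count.
        if fields and fields[0].isdigit():
            total_kb += int(fields[0])
    return total_kb / 1024


if __name__ == "__main__":
    print(f"{total_vsz_mb(SAMPLE_PS_VSZ):.1f} MB")  # 171248 KB -> 167.2 MB
```

Keep in mind that vsz is virtual size, so the total can exceed what is actually resident in RAM.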


Bill Hassell, sysadmin
Bill Hassell
Honored Contributor

Re: Memory 99% full only 10% used by processes according ps

One additional note: your swapinfo -tm values show that swapping (actually, paging) is occurring. To see how bad it is, use vmstat during the busiest periods of the day and look at the po (page out) column. This column shows how much process memory is being swapped out. Ignore pi (page in) as this metric covers two very different values: process starts and return pages from memory. There is no way to separate these two values.

The po values are OK as single digits, not good at double digits and terrible performance when po reaches 3 digits for lengthy periods. You can figure about 50:1 to 100:1 performance penalties with 3 digit po rates.
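
The thresholds above can be written down as a small helper. This is just a sketch of the rule of thumb as stated (single, double and triple digits); the function name and labels are made up for the example, and it judges one sample at a time, whereas the rule is really about sustained rates.

```python
def classify_page_out_rate(po: int) -> str:
    """Classify one vmstat 'po' sample by its number of digits."""
    if po < 0:
        raise ValueError("po cannot be negative")
    if po < 10:
        return "ok"          # single digits
    if po < 100:
        return "not good"    # double digits
    return "terrible"        # three digits or more


if __name__ == "__main__":
    for sample in (7, 42, 250):
        print(sample, classify_page_out_rate(sample))
```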


Bill Hassell, sysadmin
Stefan Farrelly
Honored Contributor

Re: Memory 99% full only 10% used by processes according ps

Your total in b) is not quite right as processes often share memory regions, so this total is inflated. gpm gives a much better indication but again you need to take into account shared areas, which can only be done by looking at each process in detail - a long job.

You could have a memory leak where a process over time uses more and more memory without freeing it up. This usually happens more often than you would think and needs a process of elimination to find the culprit.

What you do is reboot your server. Don't start any applications and see how much memory is being used. Note it. Then start your applications and note the new memory used total. Let users on, etc. Watch the free total over a couple of days and see if it continues to shrink, meaning users connecting and disconnecting is not freeing up all memory and thus you have memory creep, or a memory leak.
Shut down your application and see if the free memory total returns to what it was when you rebooted your server and before you started your application. If it doesn't, then your app has a memory leak: it doesn't free up all the memory it was using when it's shut down.
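
The two checks described above can be sketched as follows. This is an illustration only: the function names, the 5 MB tolerance and the sample numbers are all assumptions, and the free-memory figures would have to be noted by hand (from Glance, for example).

```python
def leak_suspected(free_before_start_mb: float,
                   free_after_stop_mb: float,
                   tolerance_mb: float = 5.0) -> bool:
    """True if clearly less memory is free after shutdown than before startup.

    tolerance_mb is an arbitrary allowance for normal fluctuation.
    """
    return (free_before_start_mb - free_after_stop_mb) > tolerance_mb


def keeps_shrinking(free_samples_mb: list) -> bool:
    """True if every free-memory sample is lower than the one before it."""
    pairs = zip(free_samples_mb, free_samples_mb[1:])
    return len(free_samples_mb) >= 2 and all(later < earlier for earlier, later in pairs)


if __name__ == "__main__":
    # Hypothetical readings in MB, not taken from this thread.
    print(leak_suspected(350.0, 340.0))          # 10 MB never came back
    print(keeps_shrinking([120.0, 95.0, 80.0]))  # free memory falls every day
```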
Im from Palmerston North, New Zealand, but somehow ended up in London...
Franky Leeuwerck_1
Regular Advisor

Re: Memory 99% full only 10% used by processes according ps

Hello Bill,

Thanks for your input.

The S...S bar in Glance indicates about 50 Mb.

SysMem in glance shows about 49 MB.

Summing the first column in 'UNIX95= ps -eo vsz,pid,args | sort -rn' gives me 370 MB

Glance shows me :
Total VM : 283.7mb
Sys Mem : 49.0mb
User Mem: 433.4mb
Phys Mem: 512.0mb
Active VM: 211.9mb
Buf Cache: 25.6mb
Free Mem: 4.0mb

vmstat shows most of the time 2-digit values, and now and then 3-digit values.

I am not sure if I completely understood your message, but here is a new rough calculation of memory usage:
512 MB
- 25 MB Buff Cache (Glance)
- 70 MB Shared Memory (ipcs)
-370 MB Processes (UNIX95=ps -eo..)
- 50 MB SysMem (Glance)
=======
-3 MB

Now we are getting somewhere.
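
Both rough calculations in this thread are the same subtraction, which can be sketched like this. The figures are the rounded values from the posts; the function name is hypothetical. A negative remainder is not an error in the arithmetic: it suggests the categories overlap, for example vsz is virtual size and can include shared or paged-out memory.

```python
PHYSICAL_MB = 512

# First estimate (processes taken from ps -efl SZ).
FIRST_ESTIMATE = {
    "buffer cache (Glance)": 25,
    "shared memory (ipcs)": 70,
    "processes (ps -efl SZ)": 50,
}

# Second estimate (processes taken from UNIX95= ps -eo vsz, plus SysMem).
SECOND_ESTIMATE = {
    "buffer cache (Glance)": 25,
    "shared memory (ipcs)": 70,
    "processes (UNIX95= ps vsz)": 370,
    "SysMem (Glance)": 50,
}


def unaccounted_mb(physical_mb: int, consumers_mb: dict) -> int:
    """Physical memory minus everything that has been attributed so far."""
    return physical_mb - sum(consumers_mb.values())


if __name__ == "__main__":
    print(unaccounted_mb(PHYSICAL_MB, FIRST_ESTIMATE))   # 367
    print(unaccounted_mb(PHYSICAL_MB, SECOND_ESTIMATE))  # -3
```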

In the UNIX95 list, the top process is 114984 28661 /disk1/ingres/bin/iidbms which is an Ingres DBMS server, sharing its dbms cache with the second largest process 56264 28704 /disk1/ingres/bin/iidbms .
Is the shared cache used by these processes counted twice in my calculation (ipcs) ?

Another maybe more important question.
How can we see which processes are responsible for the large po figures in vmstat ?


Franky
Bill Hassell
Honored Contributor

Re: Memory 99% full only 10% used by processes according ps

>> Is the shared cache used by these processes counted twice in my calculation (ipcs) ?

Shared memory is always tricky. One process may request the shared memory segment(s), then communicate this information to other processes and terminate. Now it's tricky to figure out which process should be assigned the shared memory area. In the ipcs listing, these segments are listed as they occur and with the CPID (creator PID) and LPID (last access PID) so there is no double calculation.


>> How can we see which processes are responsible for the large po figures in vmstat ?

There is no easy way to see this type of information, and it wouldn't be useful. The processes that are swapped out are the least important, not necessarily the biggest. In fact, swapping is not done anymore. When memory is almost completely used, all the processes are evaluated as to their priority and their current share of the CPU. Processes that have been very busy for a long time are candidates for deactivation to give resources to other programs. Once deactivated, their process memory is now available to be paged out. When enough pages have been freed, a competing process is loaded into the free pages. This competing process may be a new one or one that had also been deactivated and is now returning.

And your buffer cache (25Mb) is very small, most likely because it has been reduced to the kernel value dbc_min_pct because of process requirements. It should be about 10x larger.
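
To put numbers on the buffer cache limits, here is a small sketch. It assumes dbc_min_pct and dbc_max_pct are percentages of physical RAM. dbc_max_pct = 20 comes from the first post; dbc_min_pct = 5 is an assumption, since the thread never states the configured value.

```python
def buffer_cache_bounds_mb(physical_mb: float,
                           dbc_min_pct: float,
                           dbc_max_pct: float) -> tuple:
    """Return (minimum, maximum) dynamic buffer cache size in MB."""
    low = physical_mb * dbc_min_pct / 100
    high = physical_mb * dbc_max_pct / 100
    return low, high


if __name__ == "__main__":
    # 512 MB RAM and dbc_max_pct=20 are from this thread; dbc_min_pct=5 is assumed.
    low, high = buffer_cache_bounds_mb(512, dbc_min_pct=5, dbc_max_pct=20)
    print(f"min {low:.1f} MB, max {high:.1f} MB")
```

Under that assumption the floor works out to 25.6 MB, which would be consistent with the Buf Cache figure Glance reports.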

So there is nothing good or bad about processes that are being paged. The system is doing the best it can in a bad situation (not enough RAM). You can ask the DBAs to reduce the size of their shared memory and local memory so that less paging occurs but then the database programs will have limited space to accomplish their tasks and it will still be slow. There is no substitute for memory when it comes to database performance.


Bill Hassell, sysadmin
Chris Wilshaw
Honored Contributor

Re: Memory 99% full only 10% used by processes according ps

Ingres is VERY memory hungry, and in most installations is subject to memory leaks.

One thing you need to watch out for is the value of the maxdsiz kernel parameter.

The iidbms processes can (and will) try to use up to the value of this parameter.

You say that the server has not been rebooted in 6 months - this shouldn't matter. HP-UX itself is very unlikely to be the cause of your troubles.

How often do you restart the dbms processes?
Franky Leeuwerck_1
Regular Advisor

Re: Memory 99% full only 10% used by processes according ps

Bill,

Thanks for your explanation which helps me to understand the system better.

I reduced some Ingres resource parameters and we will reboot the machine.
I can only regain about 50 MB with this action .. we'll see.

Franky
Franky Leeuwerck_1
Regular Advisor

Re: Memory 99% full only 10% used by processes according ps

Chris,

We reboot Ingres once a week for the full OS backup or in case of problems.

Franky
Franky Leeuwerck_1
Regular Advisor

Re: Memory 99% full only 10% used by processes according ps

Some feedback after solving this issue.

After stopping the DBMS the memory was still 93% occupied.

After rebooting HP-UX, memory was 30% full before starting up the DBMS.
Starting up the DBMS, with lowered resource parameters, increased memory usage up to 70% . After 1 day with user sessions getting reconnected, usage was back to 97% .. BUT this time very little swapping going on.

Thanks all for your inputs.

Franky