Operating System - HP-UX

Re: Unexplained swapping.

 
Tom Bies
Occasional Advisor

Unexplained swapping.

Hello,

I have an L2000 2-way 440MHz system with 4GB of physical RAM. I am seeing slow performance and unexplained swapping on this system. The application that runs on it is memory hungry, but I see plenty of free physical RAM, yet a good percentage of swap is in use. Could a kernel parameter be at fault? Here's what I see.

From top:
Memory: 2501112K (1297360K) real, 3161668K (1585196K) virtual, 466968K free  Page# 1/18

From swapinfo -mt:
             Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev        1024     445     579   43%       0       -    1  /dev/vg00/lvol2
dev        1024     447     577   44%       0       -    1  /dev/vg00/lvol14
dev        1024     446     578   44%       0       -    1  /dev/vg00/lvol15
dev        1024     444     580   43%       0       -    1  /dev/vg00/lvol16
reserve       -    1726   -1726
memory     3070    1915    1155   62%
total      7166    5423    1743   76%       -       0    -

From a script which shows the processes that consume the most RAM:
RUSER      Kbytes   PID Command-Line
--------- ------- ----- --------------------------------------------------
root      1367692  3598 /var/process/exec/GDS-DSACargill1 -t /var/opt/gds/cfg/D
root      1071260 26465 /var/process/exec/GDS-DSACargill1L -t /var/opt/gds/cfg/
root       152380 29556 /var/process/exec/GDS-DSACargill1E -t /var/opt/gds/cfg/
root        51680 20676 /var/process/exec/DUA-DSACargill1protected -rc /usr/var
root        47360 24391 /var/process/exec/DUA-DSACargill1Lprotected -rc /usr/va
root        47280 24204 /var/process/exec/DUA-DSACargill1Eprotected -rc /usr/va
root        46688 24719 /var/process/exec/DUA-DSACargill1Lpublic -rc /var/opt/g
root        46688 24555 /var/process/exec/DUA-DSACargill1Epublic -rc /var/opt/g
root        46688 20333 /var/process/exec/DUA-DSACargill1public -rc /var/opt/gd
root        38972 25143 /opt/mailhub/java/jre1.4.1/bin/PA_RISC2.0/java -Xmx64m
webadmin    30568 20071 ns-httpd -d /opt/web/https-www.ds.cargill.com/config
webadmin    28520 17128 ns-httpd -d /opt/web/https-ssl/config
root        26684 22354 /usr/sbin/osi/java/jre/bin/../bin/PA_RISC2.0/native_thr
root        20124 17411 /var/process/exec/LDAPCD-DSACargill1 -t /usr/var/mhs/pr
root        12772 27044 /opt/perf/bin/scopeux
root         8028  1205 /opt/lde/perl5.005/bin/perl /cargill/ds/bin/DS_to_Other
root         7020 27026 /opt/perf/bin/midaemon
root         6424 27672 /opt/perf/bin/rep_server -t SCOPE /var/opt/perf/datafil
root         6348  1008 /opt/dce/sbin/rpcd
root         5176 22696 coda -redirect

And from /stand/system:

* Tunable parameters

STRMSGSZ 65535
dbc_max_pct 10
max_thread_proc 256
maxdsiz (2*1024*1024*1024)
maxdsiz_64bit (2*1024*1024*1024)
maxfiles 4096
maxfiles_lim 8192
maxssiz (80*1024*1024)
maxssiz_64bit (80*1024*1024)
maxswapchunks 3600
maxtsiz (1024*1024*1024)
maxtsiz_64bit (1024*1024*1024)
maxuprc 1500
maxusers 1500
msgmax 65535
msgmnb 65535
msgseg 8192
msgssz 32
msgtql 1030
ncallout 24592
nfile 200000
nflocks 24400
ninode 52450
nkthread 2048
nproc 24576
nstrpty 60
semmns 2048
semmnu 2200
semume 500
shmmax 1073741824
shmmni 1024

Any ideas appreciated.

Tom
Tomek Gryszkiewicz
Trusted Contributor

Re: Unexplained swapping.

I don't think your system is swapping too much; the swap is only reserved for processes.
Try using a tool like glance to measure system performance.

-Tomek
A. Clay Stephenson
Acclaimed Contributor

Re: Unexplained swapping.

Just a quick perusal of your processes indicates > 3GB of memory use BUT I strongly suspect that this does not include shared memory segments. You are definitely swapping so you need to reduce resources used and/or increase memory. This is not the fault of any kernel tunables.
If it ain't broke, I can fix that.
A. Clay Stephenson
Acclaimed Contributor

Re: Unexplained swapping.

I would also run vmstat and look at the pageout (po) column -- it's really the only vmstat metric that is very useful -- or better yet use Glance to examine the pageout rate. The other possible "hidden" use of memory is memory-mapped files.
If it ain't broke, I can fix that.
I.Delic
Super Advisor

Re: Unexplained swapping.

Tom,
Is your vhand process active? This process does the PI and PO (swap).
One tool to use (if you don't have Glance) is vmstat. The only column of any real value is "po" (pageout). That should be very nearly zero or you will have a dog.
Use vmstat 5 100 to check swap.
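
As a rough sketch of that check, the filter below picks out vmstat samples with a nonzero pageout value. It is only an illustration, not a tested HP-UX tool: the function name flag_pageouts is made up, and it assumes the vmstat header line labels the pageout column "po" (the column is looked up by that label instead of being hard-coded).

```shell
# Print vmstat sample lines whose "po" (pageout) value is above zero.
# Hypothetical helper; assumes a header line containing the label "po".
flag_pageouts() {
    awk '
        # Until the header is found, look for the field labelled "po".
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "po") col = i
            next
        }
        # Data lines start with a number; report those with po above zero.
        $1 ~ /^[0-9]+$/ && $col + 0 > 0 { print "pageout:", $0 }
    '
}
```

On the live system it would be fed from the command above, for example vmstat 5 100 | flag_pageouts. No output means no pageouts were seen during the sampling window.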

Increasing kernel params means that in many cases you are reserving RAM (memory) for system usage. This reduces the amount of memory that the application or users have access to.
I'd analyse each kernel param, and cut them to the bone.

Idriz
Michael Steele_2
Honored Contributor

Re: Unexplained swapping.

Please attach the output of:

sar -v 5 5
sar -u 5 5
sar -d 5 5
vmstat 5 5
swapinfo -tam
Support Fatherhood - Stop Family Law
Tom Bies
Occasional Advisor

Re: Unexplained swapping.

Thanks for the replies, all. See the attached info containing sar, vmstat and swapinfo data.

Tom
Stefan Farrelly
Honored Contributor

Re: Unexplained swapping.


swapinfo output is NOT realtime. Yours shows device swap actually used, but this DOES NOT MEAN it's currently swapping, only that at some point since your last reboot that much was swapped out.

To get a good realtime swap usage measurement use glance/gpm.

The fact that vmstat shows you have 161223 pages free (*4096=660MB free) and po rate is 0 means you are not swapping at the moment, you have lots of free memory.
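
The arithmetic behind that figure can be sketched as below. The helper name pages_to_mb is hypothetical, and the 4096-byte page size is simply the one used in the calculation above; pass a different size as the second argument if the system uses another.

```shell
# Convert a free-page count from vmstat into megabytes.
# Hypothetical helper; page size defaults to the 4096 bytes assumed above.
pages_to_mb() {
    pages=$1
    page_size=${2:-4096}
    awk -v p="$pages" -v s="$page_size" 'BEGIN {
        bytes = p * s
        printf "%d MB (decimal), %d MiB (binary)\n", bytes / 1000000, bytes / 1048576
    }'
}
```

pages_to_mb 161223 prints "660 MB (decimal), 629 MiB (binary)", so the 660MB quoted here is the decimal figure.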

The only way to reset swapinfo device usage back to 0% is a reboot. Once you do this you should run sar and vmstat regularly via cron and log the output somewhere (if you haven't paid for the licensed measureware/perfview historical logger) so that next time you notice device swap actually being used (i.e. you're swapping) you can track down why.
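
A minimal sketch of such a logger follows. The function name log_sample, the log path /var/adm/perfsnap.log, the script path and the 15-minute interval are all assumptions chosen for illustration; the commands being logged are the ones already requested in this thread.

```shell
# Append one timestamped sample of a command's output to a log file.
# Hypothetical helper: first argument is the log file, the rest is the command.
log_sample() {
    logfile=$1
    shift
    {
        printf '=== %s  %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@" 2>&1
    } >> "$logfile"
}

# Hypothetical wrapper script body, e.g. saved as /usr/local/bin/perfsnap.sh:
#   log_sample /var/adm/perfsnap.log vmstat 5 5
#   log_sample /var/adm/perfsnap.log sar -u 5 5
#   log_sample /var/adm/perfsnap.log swapinfo -tam
#
# Hypothetical crontab entry running it every 15 minutes:
#   0,15,30,45 * * * * /usr/local/bin/perfsnap.sh
```

Each sample is preceded by a line with the time and the command, so a slow period reported later can be matched against the pageout and swap figures logged around it.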
Im from Palmerston North, New Zealand, but somehow ended up in London...
Michael Steele_2
Honored Contributor

Re: Unexplained swapping.

At this point in time you had some issues. Was this during a backup? Set up a cron to periodically sample the server and collect more data, say, every 15 minutes.

a) sar -v

Reduce the kernel parameters 'nproc' and 'nfile'. Here you're using less than 1%. 179/24576.
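
The utilisation figure can be checked with a one-line calculation. This is only a sketch; table_pct is a hypothetical helper that takes a "used/size" pair in the form sar -v reports it.

```shell
# Percentage of a kernel table in use, given a "used/size" pair such as 179/24576.
# Hypothetical helper for illustration.
table_pct() {
    echo "$1" | awk -F/ '{ printf "%.1f\n", 100 * $1 / $2 }'
}
```

table_pct 179/24576 prints 0.7, i.e. under 1% of the proc table.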

b) sar -u

07:25:35    %usr    %sys    %wio   %idle
07:25:50      65      10      18       7

Indicates a tape or disk bottleneck. Was this during a backup?

c) sar -d

What's on c1t2d0 & c2t2d0? The O/S? These disks are I/O bound.

d) vmstat

N/A

e) swapinfo

All your swap is in vg00. Is this c1t2d0 and c2t2d0? Use other disks and other controllers if possible and extend them, or add a little more, because at 79% total you're about ready for some more. (* Or add RAM *)

/dev/vg00/lvol2
/dev/vg00/lvol14
/dev/vg00/lvol15
/dev/vg00/lvol16
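
To see how much of that total is device swap actually in use, the dev lines of swapinfo -mt can be summed on their own. A small sketch: the function name dev_swap_summary is hypothetical, and it assumes the column order of the swapinfo output posted at the top of the thread (TYPE, AVAIL, USED, ...).

```shell
# Sum the AVAIL and USED columns (MB) of the "dev" lines of swapinfo -mt output.
# Hypothetical helper; reads swapinfo output on stdin.
dev_swap_summary() {
    awk '
        $1 == "dev" { avail += $2; used += $3 }
        END {
            if (avail > 0)
                printf "device swap: %d of %d MB used (%.1f%%)\n", used, avail, 100 * used / avail
        }
    '
}
# Intended use on the live system: swapinfo -mt | dev_swap_summary
```

With the four dev lines from the original post this prints "device swap: 1782 of 4096 MB used (43.5%)".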
Support Fatherhood - Stop Family Law
Tom Bies
Occasional Advisor

Re: Unexplained swapping.

After reading the replies and watching glance, I was able to deduce the following.

* At any given time the system is consuming just over 3.4GB of physical memory.
* The primary application on this server is spawning another process (randomly?) which attempts to allocate an additional 1.2GB of memory. Since this exceeds the available physical RAM (4GB), the system pages out to disk, which creates an I/O bottleneck, high load, and slow application response.
* A cp/tar/gzip cronjob is running via the application's cron every 30 minutes.
* If the system is paging out and happens to be running a cp/tar/gzip at the same time, the system will be waiting on I/O, which will most likely make the application unresponsive or extremely slow during this time. A high queue depth will also be seen.
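
The sizing argument in the list above comes down to simple arithmetic. As a hedged sketch (ram_shortfall is a hypothetical helper, and the 3.4GB, 1.2GB and 4GB inputs are the figures quoted above, not new measurements):

```shell
# Estimate how far an extra allocation pushes memory demand past physical RAM.
# Hypothetical helper; all three arguments are in GB: in use, extra, installed.
ram_shortfall() {
    awk -v used="$1" -v extra="$2" -v ram="$3" 'BEGIN {
        deficit = used + extra - ram
        if (deficit < 0) deficit = 0
        printf "%.1f\n", deficit
    }'
}
```

ram_shortfall 3.4 1.2 4 prints 0.6, i.e. roughly 0.6GB that has to be paged out while the extra process is running.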

My solution (do you agree?): add more physical RAM or reduce the number of active processes.
A. Clay Stephenson
Acclaimed Contributor

Re: Unexplained swapping.

Your analysis is correct. During those periods when the secondary process is forked and exec'ed, your pageout rate has to be very high. It's time for more memory unless you can configure the application to use less memory, or learn to live with it.
If it ain't broke, I can fix that.
Michael Steele_2
Honored Contributor

Re: Unexplained swapping.

Gee I hate to go against someone I have great admiration for, (* Clay *), but I don't see any paging. This is the 'vmstat' 'po' metric and it's 0.

I do see a very high risk of system memory fragmentation. Refer to 'nfile' and 'nproc' using less than 1% of the file and proc tables. And I do see c1t2d0 and c2t2d0 I/O bound. So what logical volumes and file systems reside on these disks?

If gzip then what is the source and destination? c1t2d0 and c2t2d0?

Can you:

pvdisplay -v | more (* id the log. vol.s *)

lvdisplay -v (* id the boot disks *)

If c1t2d0 and c2t2d0 are O/S, then your issue is with all your swap being in vg00. And swap is currently at 79% total utilized. That's a lot of activity.

Is swap on c1t2d0 and c2t2d0?
Support Fatherhood - Stop Family Law
Dietmar Konermann
Honored Contributor

Re: Unexplained swapping.

The key is your statement "I am seeing slow performance". Take care to answer yourself these questions:

- What especially is slow?
- Compared to what "base line"?
- When is it slow?
- Does it correspond with other observations (high CPU usage, disk I/O hotspots, pageout activity)?

What I want to say is that you don't have a memory bottleneck at all if you see bad performance while freemem is about 600MB.

Best regards...
Dietmar.
"Logic is the beginning of wisdom; not the end." -- Spock (Star Trek VI: The Undiscovered Country)