Operating System - HP-UX

Interpreting vmstat output

 
Chuck Muraski
New Member


I'm using vmstat and top for performance monitoring on a remote system running
HP-UX 10.20. Users of the system are complaining of poor response times from
their client applications that access the system and I'm attempting to get an
idea of what's causing the sluggishness. I'm looking at the vmstat data for
clues about paging activity, CPU utilization, etc.

I've spent a little time looking around for some guidelines for interpreting
the vmstat output, but I haven't found much. What I'm wondering about is the
significance of the values in the "r", "b", "in", "sy", and "cs" columns.
What ranges of values in these columns indicate problems?

I'm also wondering about the significance of cases where both the "r" column
and "id" column are non-zero, which I interpret to mean that there are runnable
processes, but at the same time the CPU is idle. Is this an indication that
there are processes waiting on I/O?

If you have some rules of thumb to offer, or if you know of some reference
that goes into more detail than the man page, I'd appreciate your help.
6 REPLIES
Michael Tully
Honored Contributor

Re: Interpreting vmstat output

Hi,

Suggest that you install at least
the trial copy of glance. It is much
easier to use and understand than
the outputs associated with vmstat.

You could use sar....

sar -M -u 5 5 (multiple CPUs)
sar -d 5 5 (disks)
sar -b 5 5 (buffer)

Michael
Anyone for a Mutiny ?
Chuck Muraski
New Member

Re: Interpreting vmstat output

Thanks for the feedback, Michael.

I've used Glance before, and I agree that it would be a great choice for the job, but in this case I'm not able to use it. I'm constrained to using tools that can output something to a log file, which eventually gets into my hands via email.

Thanks again,
Chuck.
Bill Hassell
Honored Contributor

Re: Interpreting vmstat output

If the system is sluggish, start with a system with half the users...better? If so, you probably have a capacity issue, not a kernel problem. It's easy to blame the kernel when there are just too many programs running at the same time.

Start with memory: vmstat is good for about one simple stat: po or page-out. If this number is high (more than 10-20) for long periods (many minutes), then paging is way too high and the only fix is to double or triple the amount of RAM (or limit the number of processes). Be sure the kernel parameter dbc_max_pct is less than 10, or small enough to keep the buffer cache under 200-400 megs of RAM. And make sure mib2agt is patched or not running (it can leak memory, growing to 200-300 megs).

Then look at CPU usage: if it is consistently above 80% most of the time, you need more CPUs (if possible) or faster CPUs. Otherwise, run fewer programs.

Finally, look at disk usage. This is the most complicated to measure as there may be dozens of causes for high disk rates. More channels and physical disks can help if the hotspots can be moved around.
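For a log file that arrives by email, the memory and CPU checks above can be roughed out in a few lines of Python. This is only a sketch: the function names are made up for illustration, and the cutoffs for "many minutes" (300 seconds) and "most of the time" (more than half the samples) are assumptions, not figures from this thread. It expects the samples already split into `po` and `id` values.

```python
def longest_run(flags):
    """Length of the longest consecutive run of True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def check_samples(samples, interval_s=5, po_limit=10,
                  sustained_s=300, busy_limit=80, busy_fraction=0.5):
    """Apply the page-out and CPU rules of thumb to vmstat samples.

    samples: list of dicts with integer 'po' and 'id' values,
    one per vmstat line, taken every interval_s seconds.
    The sustained_s and busy_fraction defaults are assumptions.
    """
    findings = []
    if not samples:
        return findings
    # Memory: page-outs above the limit for a long unbroken stretch.
    po_run = longest_run(s["po"] > po_limit for s in samples)
    if po_run * interval_s >= sustained_s:
        findings.append("memory")
    # CPU: busy (100 - idle) above the limit in most samples.
    busy = sum(1 for s in samples if 100 - s["id"] > busy_limit)
    if busy / len(samples) > busy_fraction:
        findings.append("cpu")
    return findings
```

Calling `check_samples(samples)` returns a list containing "memory", "cpu", both, or nothing. Disk usage is left out on purpose, since vmstat does not report it.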


Bill Hassell, sysadmin
steven Burgess_2
Honored Contributor

Re: Interpreting vmstat output

I have found a nifty little programme called sarcheck which utilises the sar utility and creates an html report giving advice on which parameters require tuning, where the bottlenecks are, etc.

An evaluation copy can be found on

http://www.sarcheck.com/orderform.htm

As far as vmstat here is info with regard to the output

The virtual memory part of the output is divided into 3 parts: memory, page and faults

avm = active virtual memory. These are the pages that have been assigned to some processes

free = free pages

Under the page sub heading

re = page reclaims. A large number shows a memory shortage

at = Address translation faults

pi = pages paged in

po = pages paged out

fr = pages freed per second

de = Anticipated short term memory shortfall

sr = pages scanned by clock algorithm, per second

FAULTS show trap and interrupt rate averages per sec over the last 5 secs

in = device interrupts per second

sy = system calls per second

cs = CPU context switch rate

The rest of the output is divided into 2 parts, cpu and procs

cpu shows utilisation

us = User time for normal and low-priority processes

sy = system time

id = idle cpu time

PROCS subheading is as follows

r = processes in the run queue

b = number of processes blocked waiting for resource

w = processes that are runnable but swapped out from main memory
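If the vmstat output ends up in a log file, the columns listed above can be pulled into named fields with a short Python script. This is a rough sketch, not a tested tool: `parse_vmstat` is a made-up name, and it assumes the log contains the usual column-name line starting with "r b" followed by purely numeric lines. Since "sy" appears under both faults and cpu, the second one is stored as "cpu_sy".

```python
def parse_vmstat(text):
    """Turn vmstat log text into a list of {column: int} dicts.

    Assumes a column-name line beginning with "r b" and data lines
    made up only of whole numbers, one value per column.
    """
    names = None
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[:2] == ["r", "b"]:
            # Column-name line. "sy" appears under both faults and cpu,
            # so a repeated name gets a "cpu_" prefix to keep keys unique.
            names = []
            for tok in tokens:
                names.append("cpu_" + tok if tok in names else tok)
            continue
        if names and len(tokens) == len(names) and all(t.isdigit() for t in tokens):
            rows.append(dict(zip(names, map(int, tokens))))
    return rows
```

Each returned row can then be read by name, for example `row["po"]` for page-outs or `row["cs"]` for context switches. Lines that do not match the column count (group headings, blank lines, shell noise) are skipped.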

I have attached a copy of the vmstat docs also giving you various options depending on what you want to monitor

Regards, steve
take your time and think things through
Chuck Muraski
New Member

Re: Interpreting vmstat output

Bill:

Thanks for the feedback. You've given me a couple new things to look at.

I take it from your remark about vmstat and page-outs that there's not much more to be discerned by looking at the numbers in the columns I mentioned in my original post. What I was hoping for was some rule of thumb based on the numbers in some combination of those columns that would be a general indication of some sort of capacity issue.

Am I asking too much of vmstat in this regard?

Thanks again, Chuck.
Chuck Muraski
New Member

Re: Interpreting vmstat output

Steve:

Thanks for the tip about sarcheck. I'll make sure to check it out.

Regards,
Chuck.