Operating System - HP-UX
Glenn Couture
Occasional Advisor

Basic OS performance question

Very basic stuff here. I have 64-bit HP-UX 11 with Oracle 8.1.6. I am the DBA and I am very happy with the performance of the system. The sysadmin says that the machine is getting too busy. When I look, I don't see that. When I run uptime, the load is maybe 1.5. I still see 500 MB of free memory when I check. When I run top, three of the four CPUs average only 30-50%, and the fourth is maybe 90%. The machine seems very responsive and fast at the Unix prompt with a ps -ef. The question is: what does a busy machine look like? When can one say a machine is busy? I think they are just used to an idle box. Any info would be a big help. Thanks.
harry d brown jr
Honored Contributor

Re: Basic OS performance question

Glenn,

I personally don't use "uptime" for anything other than a quick look at the system. To me the queue depth is meaningless without a lot of other information.

The tools of choice are "glance/measureware" and "perfview" to monitor, capture, and display the performance of the system. They are great for trend analysis.

One of the first things I look for is bottlenecks, which can be CPU, memory, or I/O (disk or network). Most "performance" issues are actually crappy applications doing stupid things. We usually just throw hardware at the problem hoping to "mask" the issue, but sometimes adding another CPU just makes the "performance" issue happen faster.

If you have a current support contract, I'd suggest having your local HP help you "LOOK AT" the system. They usually are more than happy to assist.

live free or die
harry
Glenn Couture
Occasional Advisor

Re: Basic OS performance question

Thanks Harry. I guess I would like to step back further. The machine seems very responsive. All apps are running as desired. What CPU levels would show a busy box? On your own machines, what makes you say, "Hey, the machine is too CPU-busy"? 0% idle in top? 30% idle? Does top show good information?
Thanks again.
Ceesjan van Hattum
Esteemed Contributor

Re: Basic OS performance question

Hi Glenn,

Have you tried talking with your sysadmin?
The world of a sysadmin is slightly different from that of a DBA (speaking from my own experience).
What is the sysadmin looking for? What is his understanding of 'normal' and 'busy'? Maybe he is right, maybe not. Please do not guess, but talk to these people. Let him (or her) show, using statistics from 'sar' or 'measureware', what the problem is.

Regards,
Ceesjan
harry d brown jr
Honored Contributor

Re: Basic OS performance question

Glenn,

If I don't have idle time, i.e. the CPUs are 100% busy for long stretches of time, and there are no other bottlenecks (memory or I/O), and performance (which is subjective: what does response time mean to an individual?) is degrading, then it's probably time to consider more or faster CPUs. But if this "pegging" of the CPUs is only due to a special processing situation, e.g. month end, then it can usually be tolerated.

But in your case, 3 of the 4 CPUs running idle 30% of the time and 1 at 90% usually doesn't mean you are underpowered, especially with a queue depth of 1.5.
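One way to put a number on "no idle time for long stretches" is to count how many sar -u samples show the %idle column at or near zero. The sketch below is only an illustration: the function name count_busy_samples, the 5% cutoff, and the sample figures are all made up here, and it assumes the usual sar -u layout with a timestamp in the first column and %idle in the last.

```shell
#!/bin/sh
# Count sar -u samples whose %idle is at or below a limit.
# usage: count_busy_samples MAX_IDLE   (sar -u text on stdin)
# Prints "BUSY TOTAL". Name and threshold are illustrative assumptions.
count_busy_samples() {
    awk -v max_idle="$1" '
        # Data lines start with hh:mm:ss and end with a numeric %idle;
        # the header line ends in "%idle" and the Average line has no time.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $NF ~ /^[0-9.]+$/ {
            total++
            if ($NF + 0 <= max_idle + 0) busy++
        }
        END { printf "%d %d\n", busy, total }
    '
}

# Invented sample in the usual sar -u column layout.
sample='10:00:00    %usr    %sys    %wio   %idle
10:00:05      88       9       3       0
10:00:10      90       8       2       0
10:00:15      45       5       2      48
Average       74       7       2      16'

printf '%s\n' "$sample" | count_busy_samples 5   # prints: 2 3
```

On a live system the input would come from something like "sar -u 5 12 | count_busy_samples 5". A few pegged samples mean little; it is the long stretches, combined with degrading response time, that matter.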

Now, if your SA has collected performance statistics that show a TREND of consuming more and more resources, then you could easily map out an "upgrade" based upon when processes will consume more of the system than is available.
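As a rough illustration of mapping out an upgrade from a trend, the sketch below fits a straight line to (period, average CPU busy %) pairs and estimates the period at which the line reaches 100%. The least-squares fit is my own choice of method, not something the collection tools do for you, and the function name project_full and the sample numbers are invented. Real growth is rarely this linear, so treat the result as a planning hint only.

```shell
#!/bin/sh
# Estimate when average CPU busy % reaches 100 from "period busy%" pairs
# on stdin, using an ordinary least-squares straight-line fit.
# Function name and sample data are illustrative assumptions.
project_full() {
    awk '
        NF >= 2 { n++; sx += $1; sy += $2; sxx += $1 * $1; sxy += $1 * $2 }
        END {
            denom = n * sxx - sx * sx
            if (n < 2 || denom == 0) { print "not enough data"; exit }
            slope = (n * sxy - sx * sy) / denom
            if (slope <= 0) { print "no upward trend"; exit }
            intercept = (sy - slope * sx) / n
            # Solve intercept + slope * x = 100 for x.
            printf "%.1f\n", (100 - intercept) / slope
        }
    '
}

# Invented monthly averages: month number, average CPU busy %.
printf '1 40\n2 45\n3 50\n4 55\n' | project_full   # prints: 13.0
```

With these made-up figures the line grows 5 points a month from a base of 35, so it would cross 100% around month 13.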

live free or die
harry
Glenn Couture
Occasional Advisor

Re: Basic OS performance question

Our SAs are great, and I plan on working very closely with them. But I would also like to learn more myself. I could not find any "white papers" on HP OS performance. Can anyone recommend a resource on this website, or elsewhere online, that could be helpful in learning more about HP-UX performance?
S.K. Chan
Honored Contributor

Re: Basic OS performance question

If all you want to know is "is the machine busy or not", then uptime and top are good enough to tell you that. The uptime command gives the 1, 5 and 15 minute average number of jobs in the run queue. This is a good gauge of how much activity there is on your system. Typically, if the 1 minute load average is more than 1.5, I would say the machine is "doing something".

Top pulls the current system status from /dev/kmem and provides enough detail to perhaps isolate a system problem, if any. Top is fine to use as a diagnostic tool. Glance also pulls some of its data from /dev/kmem, but in addition a special kernel interface was written to retrieve performance information from the time the daemon has been running, so it gives more "real-time" information.

That simply means you can use top, which is designed to provide a quick look at the major CPU users, but no one tool provides all the performance data. You can cross-check top against what you see in Glance, not number for number, but rather by whether Glance shows the same processes as the highest CPU users.
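The 1.5 rule of thumb above can be scripted as a quick check. This is a minimal sketch, not a monitoring tool: the function names one_minute_load and is_doing_something are invented here, the sample uptime line is made up, and the sed pattern assumes the line contains the usual "load average: a, b, c" text.

```shell
#!/bin/sh
# Print the 1-minute load average from an uptime-style line on stdin.
# Assumes the usual "load average: a, b, c" text is present.
one_minute_load() {
    sed -n 's/.*load average[s]*: *\([0-9][0-9.]*\).*/\1/p'
}

# Succeed if LOAD is above THRESHOLD (default 1.5, the rule of thumb
# quoted in this thread). usage: is_doing_something LOAD [THRESHOLD]
is_doing_something() {
    awk -v load="$1" -v limit="${2:-1.5}" \
        'BEGIN { exit !(load + 0 > limit + 0) }'
}

# Invented sample line; on a live system use: load=$(uptime | one_minute_load)
sample=' 10:15am  up 12 days,  3:02,  5 users,  load average: 1.62, 1.40, 1.31'
load=$(printf '%s\n' "$sample" | one_minute_load)

if is_doing_something "$load"; then
    echo "load $load: the machine is doing something"
else
    echo "load $load: the machine is mostly idle"
fi
```

The comparison is done in awk because the POSIX shell has no floating-point arithmetic of its own.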

S.K. Chan
Honored Contributor

Re: Basic OS performance question

harry d brown jr
Honored Contributor
Solution

Re: Basic OS performance question

Glenn Couture
Occasional Advisor

Re: Basic OS performance question

Great. Thanks. I'm a happy camper.
Bill Hassell
Honored Contributor

Re: Basic OS performance question

All that measurement stuff is good for the management reports but from a sysadmin perspective (DBA's opinion notwithstanding), performance is in the eye of the beholder (the user in this case). Your machine is showing no signs of getting bogged down since just one of the processors is really busy.

NOTE: There is nothing you (the sysadmin) can do to improve processor usage. If the program(s) all want to eat CPU cycles, then so be it. That's why you have a multi-processor system. To reduce the percentage of CPU usage:

1. Rewrite the program(s) to be more efficient, or
2. Replace the computer with faster processors

Now number 2 assumes that programs don't just waste CPU cycles, 'cause a faster computer will only burn useless cycles even faster. Example:

while :
do
:
done

Anyone can type this in at a shell prompt and immediately burn 100% of one CPU. Efficient? Not really, since it doesn't accomplish anything. But start 10 of these and you should see a significant reduction in the performance of the machine.

You (your company) spent a lot on the CPU and RAM, so ideally it will mostly be used. Requiring that computers be only half used is an artificial boundary that dates back to punched cards and paper tape, where computers would 'lock up' if loaded too heavily. That's virtually unknown in HP-UX (assuming proper patching).
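To try the "start 10 of these" experiment without leaving runaway loops behind, something like the sketch below starts a given number of background busy loops, prints their PIDs, and then kills them. The function name start_burners and the short demo run are my own illustration. Only try this on a box where burning CPU for a while is acceptable.

```shell
#!/bin/sh
# Start N background busy loops and print one PID per line.
# Each loop burns one CPU until killed. Function name is illustrative.
start_burners() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        # Redirect output so a caller capturing the PIDs is not held open.
        ( while :; do :; done ) >/dev/null 2>&1 &
        echo "$!"
        i=$((i + 1))
    done
}

# Short demo: two burners for about one second, then clean up.
pids=$(start_burners 2)
echo "burning CPU with PIDs:" $pids
sleep 1          # watch top or glance from another window during this time
kill $pids       # unquoted on purpose: one argument per PID
```

For the full experiment, use "start_burners 10" and a longer sleep, and compare what top and uptime report before and during the run.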

Now, that being said, there are a few things you can do once you discover I/O limitations (the infamous bottleneck). If it's LAN (and the LAN traffic is legitimate), then look at Auto Port Aggregation (gigabit isn't quite the solution, APA works really well). If it's disk limits, then there are lots of choices, starting with the application. For Oracle, make sure the indexes are reasonably balanced (lots of row insertions may unbalance an index). Indexes also sometimes get corrupted and are quietly ignored until rebuilt (which means lots of serial reads). Make sure sorts are in RAM and not in a temp sort area (SGA may need lots more RAM). Move hot spots such as rollback/archive logs and sort areas to physically different disks, and so on.

Once those areas have been examined, you can look at disk striping and perhaps changing the block size in the filesystems (default 8 KB, might try 16 KB if data records are large). The basic read-ahead size for HP-UX is 64 KB, but LVM will aggregate 8 KB I/Os into 64 KB if the I/Os are sequential. And prefetch for the buffer cache is typically 4 x 64 KB, or 256 KB. But as always, your mileage may vary.


Bill Hassell, sysadmin