CPU Load on a single-CPU superdome
08-02-2004 01:20 AM
I am running the first batch of reports from Measureware on CPU performance. They show the CPU running at 90% plus, with PRI_QUEUE at 0.50 and RUN_QUEUE at 12 to 24. Context Switch CPU util is low (less than 1%).
My conclusion from this initial data: the system is busy, but not thrashing. Low PRI_QUEUE and low CSwitchCPU indicate it is handling the load from large numbers of processes and not bottlenecking. However, it would be prudent to look at the other queues to see if there are excessive waits (page faults?). I have checked 'timeslice' and it's normal (value of 10).
Check my logic anyone? - does all this sound sane?
Share and Enjoy! Ian
08-02-2004 01:37 AM
Solution
I would want some data during peak use and batch runs.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
08-02-2004 01:52 AM
Re: CPU Load on a single-CPU superdome
But most programs spend most of their time in localized areas in RAM so all is still OK. Only when a large program bounces all over the data area will a high page fault rate occur. High TLB misses (a needed page does not have a TLB entry) will cause a specific process to slow down as the hardware registers are being reloaded. Now this is sub-microsecond overhead but it's possible that this could impact a very compute-bound program (think massive mathematical array manipulations in simulations, Fourier transforms, etc). There is a fix for these programs: large page size. By simply using the chatr command, an executable can be changed to use a page size larger than 4k (megs if necessary) and reduce the number of unique TLB entries. Here's a link: http://docs.hp.com/cgi-bin/fsearch/framedisplay?top=/hpux/onlinedocs/B3782-90716/B3782-90716_top.html&con=/hpux/onlinedocs/B3782-90716/00/00/24-con.html&toc=/hpux/onlinedocs/B3782-90716/00/00/24-toc.html&searchterms=page%7csize%7chp-ux%7clarge&queryid=20040802-075047
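To put rough numbers on why a larger page size cuts the count of unique TLB entries, here is a minimal Python sketch of the arithmetic. The 256 MB data area and the list of page sizes are illustrative assumptions, not figures from this thread, and the function name is made up for the example.

```python
KB = 1024
MB = 1024 * KB

def tlb_entries_needed(data_bytes: int, page_bytes: int) -> int:
    """Distinct page translations needed to map data_bytes (ceiling division)."""
    return -(-data_bytes // page_bytes)

# Hypothetical data area of a compute-bound program (assumption for illustration).
working_set = 256 * MB

for page in (4 * KB, 64 * KB, 4 * MB, 64 * MB):
    entries = tlb_entries_needed(working_set, page)
    print(f"page size {page // KB:>6} KB -> {entries:>6} translations")
```

With 4 KB pages the 256 MB area needs 65536 distinct translations; with 64 MB pages it needs 4, which is far easier to keep resident in the TLB.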
Bill Hassell, sysadmin
08-02-2004 02:06 AM
Re: CPU Load on a single-CPU superdome
Thanks for the script. I had a quick run of it, and it pointed me towards some of the "sar" utility metrics. I will run this during peak times and post what de-sensitised data I can (our data security policy has recently been updated!)
Thanks for your assistance and the loan of the script.
Share and Enjoy! Ian
08-02-2004 02:48 AM
Re: CPU Load on a single-CPU superdome
What is the motivation behind lots of single-CPU V-pars? It sounds like such a big missed opportunity for spreading some load, sharing resources, and making multiple CPUs available to some applications while they need them, assuming not all applications will be at peak load at the same time.
Is this setup driven by bean-counters? Political/Power games? Just uncertainty about co-hosting applications that did not used to share a host?
Just curious...
Hein.
08-02-2004 07:00 AM
Re: CPU Load on a single-CPU superdome
My conclusion from the above is that you may have an IO bottleneck. The only reason I say this is that the pri_Q is relatively low at 0.5 BUT you have 12 to 24 processes running or runnable (RUN_Q). Therefore 11.5 to 23.5 are in the runnable state and waiting for something (like IO). If I'm right (I'm human, so I may not be), then when you relieve the IO bottleneck (say disks) your CPU will jump to 100% and the pri_Q will leap up too. So in some ways you may not want to fix this!!
If you do, try looking at network & disk stats to see if they are busy. Try
extract -xp -n -r
and
extract -xp -d -r
Regards
Tim
08-02-2004 09:34 AM
Re: CPU Load on a single-CPU superdome
So if this is what you expect (lots of short-run processes) then all may be well. If you have a lot of short-run threads (Java, Oracle, middleware, etc) then this might be just fine. A more interesting metric would be the %SYS value. Something over 30-40% indicates a lot of system calls, still not a bad thing if this is expected. However, you could artificially create a high run queue and high system overhead with a swarm of programs that perform lots of kernel calls. The problem is that such programs would look normal except for the high system overhead and long run queue. The context switch rate will be high, along with the system call stats.
Bill Hassell, sysadmin
08-04-2004 06:45 AM
Re: CPU Load on a single-CPU superdome
I think I did not make my point very well...
With a run_q of 12-24 and cpu% ~ 90% I would have expected a pri_q of 2-5, certainly greater than 1. This would indicate to me that processes are fighting for CPU time (after all you are doing 90% CPU so this is not unreasonable) and using the HP-UX scheduling to sort out who runs when.
With such a low pri_q I'm guessing that very few processes are actually fighting for CPU, so I conclude an IO bottleneck may be the cause. Usually IO bottlenecks are network & disk.
If we saw a system with say
cpu% ~ 20%
run_q ~ 12
pri_q ~ 0.5
it would be easier to say IO bottleneck, especially if, say, peak dsk% was 70%+. All that differs between this hypothetical easy-to-diagnose system and yours is the CPU util.
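The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the reasoning in this post, not a Measureware feature: the function names are made up, the pri_q floor of 1 and the 70% disk figure are taken from the text, and the requirement that at least one process be waiting off-CPU is an added assumption.

```python
def waiting_not_on_cpu(run_q: float, pri_q: float) -> float:
    """Processes counted in run_q that are not queued for CPU (run_q minus pri_q)."""
    return max(run_q - pri_q, 0.0)

def suggests_io_bottleneck(run_q: float, pri_q: float, peak_disk_util: float,
                           pri_q_floor: float = 1.0, disk_busy: float = 70.0) -> bool:
    """Heuristic only: few processes fighting for CPU, many waiting, disks busy."""
    few_fighting_for_cpu = pri_q < pri_q_floor
    many_waiting = waiting_not_on_cpu(run_q, pri_q) >= 1.0  # assumption, not from the thread
    disks_busy = peak_disk_util >= disk_busy
    return few_fighting_for_cpu and many_waiting and disks_busy

# The hypothetical system described above: run_q ~ 12, pri_q ~ 0.5, peak dsk% 70.
print(waiting_not_on_cpu(12, 0.5))            # 11.5
print(suggests_io_bottleneck(12, 0.5, 70.0))  # True
```

CPU utilisation is deliberately left out of the check, since the point being made is that a low pri_q against a long run_q is suspicious whether the CPU is at 20% or 90%.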
That said, the sys cpu% would be an interesting metric to look at. If it were high (say 40%, as suggested previously) then lots of sys calls are going on (say network, disk, memory etc). Again this may point to
o IO bottleneck
o poor code resulting in IO bottleneck
On a different tack, what does "sar -u" show?
Regards
Tim
08-05-2004 03:39 AM
Re: CPU Load on a single-CPU superdome
OK, now what does this all mean? I looked through the metrics for MWA 3.72 and there is no global Wait IO metric (I seem to remember there was one previously). I am guessing that with Backups / Processing of multiple databases, we are getting IO bound on the chassis or the SAN. Possibly the disk cache is experiencing hits.
I will look at splitting up the applications into tighter definitions (other_user_root and OmniBack separate) and will monitor IO wait by process and application during the "hot points". This should tell me who is the source of the IO load.
Thanks everyone for their help so far.
Share and Enjoy! Ian
CPU
date|time|day of year|epoch (s)| tot| sys| user
07/26/04|22:00|208|1090879200| 77.82| 16.07| 61.74
07/26/04|22:15|208|1090880100| 99.99| 72.16| 27.83
07/26/04|22:30|208|1090881000| 99.91| 69.39| 30.52
07/26/04|22:45|208|1090881900| 88.16| 59.89| 28.27
07/26/04|23:00|208|1090882800| 66.01| 41.46| 24.55
07/26/04|23:15|208|1090883700| 38.57| 18.60| 19.96
DISK
date|time|day of year| phys io rate| peak util| logl io rate| effic.|
07/26/04|22:00|208| 479.8| 28.05| 773| 0.12|
07/26/04|22:15|208| 3302.0| 72.33| 1789| 0.34|
07/26/04|22:30|208| 3033.2| 94.99| 1847| 0.23|
07/26/04|22:45|208| 2932.7| 96.01| 1607| 0.13|
07/26/04|23:00|208| 2272.0| 95.59| 1167| 0.09|
07/26/04|23:15|208| 742.8| 31.48| 808| 0.02|
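As a rough cross-check of the two dumps above, the following Python sketch lines up the CPU and DISK rows by time and flags the intervals where both sys CPU and peak disk utilisation are high. The rows are copied from the dumps; the 40% and 70% thresholds are the rule-of-thumb figures mentioned earlier in this thread, and the function and variable names are made up for the example.

```python
CPU_ROWS = """\
07/26/04|22:00|208|1090879200| 77.82| 16.07| 61.74
07/26/04|22:15|208|1090880100| 99.99| 72.16| 27.83
07/26/04|22:30|208|1090881000| 99.91| 69.39| 30.52
07/26/04|22:45|208|1090881900| 88.16| 59.89| 28.27
07/26/04|23:00|208|1090882800| 66.01| 41.46| 24.55
07/26/04|23:15|208|1090883700| 38.57| 18.60| 19.96
"""

DISK_ROWS = """\
07/26/04|22:00|208| 479.8| 28.05| 773| 0.12|
07/26/04|22:15|208| 3302.0| 72.33| 1789| 0.34|
07/26/04|22:30|208| 3033.2| 94.99| 1847| 0.23|
07/26/04|22:45|208| 2932.7| 96.01| 1607| 0.13|
07/26/04|23:00|208| 2272.0| 95.59| 1167| 0.09|
07/26/04|23:15|208| 742.8| 31.48| 808| 0.02|
"""

def parse_rows(text, columns):
    """Map time -> {name: value} for pipe-delimited rows; columns maps name -> field index."""
    table = {}
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split("|")]
        table[fields[1]] = {name: float(fields[i]) for name, i in columns.items()}
    return table

cpu = parse_rows(CPU_ROWS, {"tot": 4, "sys": 5, "user": 6})
disk = parse_rows(DISK_ROWS, {"phys_io": 3, "peak_util": 4, "logl_io": 5})

SYS_HIGH = 40.0   # sys CPU % rule of thumb from earlier replies
DISK_BUSY = 70.0  # peak disk util % rule of thumb from earlier replies

# Intervals where sys CPU and peak disk utilisation are both above the thresholds.
flagged = [t for t in cpu
           if cpu[t]["sys"] >= SYS_HIGH and disk[t]["peak_util"] >= DISK_BUSY]
print(flagged)  # ['22:15', '22:30', '22:45', '23:00']
```

Under those thresholds the four intervals from 22:15 to 23:00 stand out, which is the window to examine per disk and per application.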
08-05-2004 05:58 AM
Re: CPU Load on a single-CPU superdome
OK, you need BY_DSK metrics
extract -xp -d -b
This will dump a line for every disk giving disk utilisation etc. Basically, figure out what disk does what!!! I do have a script that splits out the xfrdDISK.asc into individual disks (dsk01, dsk02 etc), but not to hand!!!
Regards
Tim
08-08-2004 09:52 AM