
Re: Monitor CPU level

 
smsc_1
Regular Advisor

Monitor CPU level


Hi,
I just ran a check on an ES40 OpenVMS system with the following command:

$ monitor cluster /average /summary

Result on cluster was:
CPU Busy     0    25   50   75  100
             +----+----+----+----+
NODE11   45  |aaaaaaaaa           |
NODE12   41  |aaaaaaaa            |
NODE13   35  |aaaaaaa             |

The ES40 has 4 CPUs on board...

Our support tells us that the 50% mark applies to ONLY ONE CPU.

Is this really the value for ONE CPU??

Thanks
./ Lucas
11 REPLIES
Wim Van den Wyngaert
Honored Contributor

Re: Monitor CPU level

It's the average value for all CPUs together. So with two CPUs, for example, 50% can be 25% + 75% or 50% + 50%.

Wim
Volker Halle
Honored Contributor

Re: Monitor CPU level

smsc,

if you want to look at the utilisation of individual CPUs in your ES40 nodes, you can use MONITOR MODE/CPU

Note that when an SMP system shows high CPU usage, and especially if one CPU is always at 100%, this could indicate a looping process, i.e. a process in a tight loop just consuming CPU cycles.

Volker.
Jess Goodman
Esteemed Contributor

Re: Monitor CPU level

If you only want to monitor one ES40 you would probably be better served with one of these two commands:

$ MONITOR SYSTEM /AVERAGE/SUMMARY
or if you want to see idle % instead:
$ MONITOR MODES /AVERAGE /SUMMARY

Add /NODE=NODEnn to the end of the command if you are not on that node currently.

The CPU display for this output will be for all four CPUs so the maximum display value would be 400%. If it shows 50% that could be 50% of one CPU with the other 3 idle, or 25% of two CPUs with 2 idle, etc.
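As a rough illustration of that arithmetic (a Python sketch, not MONITOR output; the per-CPU figures below are made-up values and `combined_busy` is a hypothetical helper), several different per-CPU loads add up to the same 50% on the 0-400 scale:

```python
def combined_busy(per_cpu_busy):
    """Sum per-CPU busy percentages, as on a 0-400 scale for 4 CPUs."""
    return sum(per_cpu_busy)


# Hypothetical per-CPU loads on a 4-CPU system (illustrative values only).
one_cpu_half_busy = [50, 0, 0, 0]
two_cpus_quarter_busy = [25, 25, 0, 0]
evenly_spread = [12.5, 12.5, 12.5, 12.5]

for loads in (one_cpu_half_busy, two_cpus_quarter_busy, evenly_spread):
    # Each distribution gives the same combined figure of 50.
    print(loads, "->", combined_busy(loads))
```

The combined figure alone cannot tell these cases apart, which is why a per-CPU view is needed to spot one saturated CPU.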
I have one, but it's personal.
Andy Bustamante
Honored Contributor

Re: Monitor CPU level


Besides the monitor notes, consider using T4 from HP http://h71000.www7.hp.com/openvms/products/t4/index.html to drill down into system performance. You also get the ability to log and review trends.

For a very nice write-up of free performance monitoring tools see: http://www.parsec.com/general/FreePerformanceTools.pdf . This was a web seminar presented by the Parsec group.

Andy Bustamante
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
smsc_1
Regular Advisor

Re: Monitor CPU level


Thanks to all for the replies! ^_^

I just want to clarify my question.
Over the XMAS holidays our AlphaServer gets very heavy TCP/SS7 traffic.

My goal was to find out whether one ES40 can handle that traffic.

According to your replies, if I run the following command:
MONITOR SYSTEM /AVERAGE/SUMMARY
I should see the value for ONE CPU (the first). That's clear, but why do I get 100% as the limit (and not 400%)??

How can I see the level of the other CPUs (in case the first CPU reaches 100%)??


Anyway! Happy XMAS to all!

./ Lucas
Volker Halle
Honored Contributor

Re: Monitor CPU level

smsc,

if the MONITOR display you're using shows 400% on the x-axis, it is showing the combined utilization of 4 CPUs, and one fully loaded CPU contributes 100%. Some of the other displays may only show 100% on the x-axis; on a system with 4 CPUs, one completely busy CPU then contributes only 25% to the overall value.
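To make the two scales concrete, here is a small Python sketch of the conversion. It assumes, as described in this thread, that the 0-100 displays show an average across CPUs and the 0-400 display shows their sum; the function names are hypothetical and this is not how MONITOR itself is implemented.

```python
def to_average_scale(combined_percent, cpu_count):
    """Convert a combined figure (0..100*cpu_count) to a 0..100 average."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return combined_percent / cpu_count


def to_combined_scale(average_percent, cpu_count):
    """Convert a 0..100 average back to the combined 0..100*cpu_count scale."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return average_percent * cpu_count


CPU_COUNT = 4  # the ES40 in this thread has 4 CPUs

# One fully busy CPU (100 on the combined scale) shows as 25.0 on the average scale.
print(to_average_scale(100, CPU_COUNT))   # 25.0

# A node shown at 45 on a 0-100 display corresponds to 180 on the 0-400 scale.
print(to_combined_scale(45, CPU_COUNT))   # 180
```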

If you are interested in the CPU load of the individual CPUs in one system, use MONITOR MODE/CPU; as far as I remember, the display will cycle through the individual CPUs and show you their usage.

Volker.
Hoff
Honored Contributor

Re: Monitor CPU level

So why not ask your support staff for help? Seriously. One obvious potential inference here is that there are some trust issues here, and these are a class of issues that tend to get worse through use of ITRC. Not better.

From the other side of this equation -- from the view from your support staff -- finding users running MONITOR has often been a red flag for latent system and performance issues. Both for actual performance issues, and for performance perceptions. And finding users over here in ITRC asking these sorts of questions is another red flag for site issues.

I'd encourage you to work directly with your support staff, and to get TDC and T4 loaded on this system, and to perform a systematic approach toward performance. Toward long-term collection of performance data. And toward building up trust.

Sure, knowing about spikes can be useful.

Knowing longer-term patterns, particularly around increasing load (whether it's pre-holiday telephony spikes year to year or a longer-term increase in baseline performance), is invaluable. A one-shot MONITOR doesn't tell you much. What's more interesting is year over year and peak to peak; a trend.

System trends are not easily visible with isolated and with rogue MONITOR passes. And sometimes an unexpected MONITOR itself can stuff up the data collection processes; one site I dealt with had a performance problem that (and I kid you not) turned out to be eight or ten folks all running gonzo MONITOR passes. And FWIW, TDC and T4 are far better at displaying data than is an isolated MONITOR. And support is in a better place to help with long-term performance monitoring.

Stephen Hoffman
HoffmanLabs LLC
comarow
Trusted Contributor

Re: Monitor CPU level

Actually, there's another way to look at how well the CPU is being used.

Look at the CPU queue length. If on average there is more than one job waiting for the CPU, then by definition you have a CPU bottleneck.

$ MONITOR SYSTEM
You will see a column called COM

If that sits above 1 for any length of time, it means that jobs are waiting for CPU time.
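The rule of thumb above can be sketched in Python as follows. The sample values are invented for illustration, `cpu_queue_bottleneck` is a hypothetical helper, and the threshold of 1 is simply the figure given in this post rather than a documented limit:

```python
def cpu_queue_bottleneck(com_samples, threshold=1.0):
    """Return True if the average COM queue length exceeds the threshold."""
    if not com_samples:
        raise ValueError("need at least one sample")
    average = sum(com_samples) / len(com_samples)
    return average > threshold


# Hypothetical COM readings noted down at regular intervals.
quiet_period = [0, 1, 0, 2]   # average 0.75: no sustained queue
busy_period = [2, 3, 1, 4]    # average 2.5: jobs are waiting for CPU

print(cpu_queue_bottleneck(quiet_period))
print(cpu_queue_bottleneck(busy_period))
```

Averaging over several samples matters here, since a single reading above 1 may only be a momentary spike.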

smsc_1
Regular Advisor

Re: Monitor CPU level


Thanks to all,
with:
MONITOR SYSTEM /AVERAGE/SUMMARY
I got exactly what I want!

The CPU Busy MAX value is 400, which means that with this command I can monitor ALL CPUs.

Only one field was not clear:

 + Buffered I/O Rate (1538)-+
 |aaaaaaaaaaaaaaaaaaaaaaaaaa|
0+--------------------------+ 500
 |aaaaaaaaaaaaaaaaaaaaaaaaaa|
 +--------------------------+
 Cur Top: VIQ_NTP_V351 (1451)


As you can see, the scale goes from 0 to 500 and the bar was at 500... What does "Buffered I/O Rate (1538)" mean?
Is it a DANGEROUS value??

Happy holidays to all! ^_^
./ Lucas
Volker Halle
Honored Contributor

Re: Monitor CPU level

smsc,

a high BUFIO rate just indicates lots of buffered IOs (e.g. DECnet, LAT or TCPIP traffic or File System IOs) and is not 'dangerous'. The value 500 is not some kind of maximum; it is just an arbitrary value chosen by MONITOR as the x-axis maximum in the display.
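A rough Python sketch of why the bar looks "full": a fixed-width bar simply saturates once the value passes the axis maximum. This is only an illustration of the clipping effect, not MONITOR's actual drawing code; `render_bar`, the 26-character width and the default axis maximum are assumptions taken from the display pasted above.

```python
def render_bar(value, axis_max=500, width=26):
    """Draw a fixed-width bar; values above axis_max fill the whole bar."""
    filled = int(width * value / axis_max)
    filled = max(0, min(width, filled))  # clip to the drawable range
    return "|" + "a" * filled + " " * (width - filled) + "|"


print(render_bar(250))    # half-filled bar
print(render_bar(500))    # completely filled bar
print(render_bar(1538))   # same full bar: the value is off the scale
```

The number in parentheses in the display title (1538 here) is the figure to read; the bar only shows that the value is at or beyond the right-hand edge.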

Merry Christmas,

Volker.
Hoff
Honored Contributor

Re: Monitor CPU level

What's your goal here? System-level issues? Performance of a specific application? Confirming what information your support folks are providing? Looking for a job tuning applications or operating systems?

400 or 500 buffered I/Os could be anything from operating at little more than an idle to a full-blown performance crisis, depending on application requirements and application timing.

Buffered I/O and Direct I/O differ in the use of a buffer allocated from within a pool of memory maintained by OpenVMS. Buffered I/O involves memory copies, while Direct I/O involves setting up DMA transfers. The former are for slower devices, the latter are for devices that offer DMA. In practical terms, they're all I/O operations. And some of these are slow, and some I/O operations are fast.

There's no simple answer to a performance question, and there's no chance that isolated monitoring will produce a meaningful result. I've found "hit-and-run tuning" can lead to issues with system performance; if you optimize for an unusual or boundary case, you might not get the intended benefits. Or you might miss a big win.

The answer usually arises with the trends and in the averages, and in factors such as the application response timing.

Consider: is the 400 or 500 BIO count really a spike, a trough, or an average? Does it tie in with the observed performance? What's the trend?

And as for application-level issues, I've regularly been surprised at the performance-limiting factor(s) within a typical non-trivial application. I might think "I/O" and end up looking at a one-page compute-bound critical loop buried in some corner of the code. DECset PCA, the SDA PC monitoring tools, and other such instruments are invaluable here.

Go systematic and go wide with your system monitoring. Go TDC and T4. Don't enter into the whole of the tuning discussion with preconceptions or assumptions. Monitor it all, and zero in from there. And work with your system staff here.