Re: How to read this

Douglas Austin · ‎11-01-2010

Hi all,
I'm trying to figure out how to read this output. I'm not a VMS engineer, just a Backup admin. But we're running into some thru-put issues, and the way I read this I/O statement is that we're maxing out a disk. Can you verify?

# open files remaining 296/300 (98%)
Direct I/O count/limit 300/300 (100%)
Buffered I/O count/limit 4999/5000 (99%)
BUFIO byte count/limit 294880/294880 (100%)
ASTs remaining 2047/2048 (99%)
Timer entries remaining (99%)
PGFL quota count/limit 31049/32000 (97%)
ENQ quota count/limit 2977/3000 (99%)

The way I read this is that we are about maxing out this disk. Can someone verify the percentages are either Free or used?

RBrown_1 · ‎11-01-2010

How did you get this report?

These are per-process quotas and not system wide, and they all show free resources.

For overall system throughput, start with MONITOR SYSTEM, and also read OpenVMS Performance Management (http://h71000.www7.hp.com/doc/73final/6491/6491pro.html) in the VMS Document Set.

What version of VMS? What are you running it on? What are "thru-put issues"?

Robert Gezelter · ‎11-01-2010

Douglas,

Welcome to the HP ITRC OpenVMS Forum.

Most important initial question: How were these numbers produced? What command? Also, at what point was the command issued?

- Bob Gezelter, http://www.rlgsc.com

abrsvc · ‎11-01-2010

There really is not much to go on here. From the looks of the values/limits shown, the process was relatively idle when these were recorded.

Rather than chase numbers, please describe the actual problem that you are seeing. For example, you mention throughput issues. What exactly is the issue?

Please include:

1) Source for backup (assuming backup time is the issue)

2) Destination device (disk vs. tape) including device model.

3) Actual backup command used.

This info will be a good start. From there we may ask for additional info, but let's start here.

Thanks,
Dan

Douglas Austin · ‎11-01-2010

The backup app is NetBackup 7.0

Source is VMAX disk
Destination is LTO4 tape

Problem is inconsistent speeds. One day we move at 30 MB/sec, the next day the same disk is 12 MB/sec. I'm trying to get information on how busy the system/disks are each day.

Robert Gezelter · ‎11-01-2010

Douglas,

There can be many explanations for inconsistent performance. Some of them are within OpenVMS; others are often traced to network anomalies or traffic.

On the OpenVMS side, there are no commands that will directly produce useful performance information. My recommendation to most clients in this situation is to install the T4 kit (which gathers data using MONITOR/RECORD) and then chart the results to get aggregate information.

Similarly, at the same time, I might recommend using Wireshark to monitor the LAN, so that there is a clear dataset of the backup operation from both perspectives. With the complete data, it is possible to identify the bottleneck.

An instantaneous display of quotas does not provide any substantive information; it is only an instant in time (and from these statistics, a relatively idle one at that).

One can do the analysis oneself, albeit there is a learning curve, or one can retain outside assistance [Disclosure: As do other frequent contributors to this forum, we offer such services.].

- Bob Gezelter, http://www.rlgsc.com

John Gillings · ‎11-01-2010

Douglas,
Answering your specific question...

> can someone verify the percentages are
> either Free or used?

The output you've posted looks like it's in the form:

<name> count/limit <count>/<limit> (<percentage>)

In the VMS world "count" is the remaining quota and "limit" is the maximum the process is allowed to consume.

The quotas in question:

"open files remaining" - obvious?
"Direct I/O" - roughly speaking, simultaneous outstanding I/Os to disk
"Buffered I/O" - roughly speaking, simultaneous outstanding I/Os to terminal, network or other I/O devices
"BUFIO byte" - non-paged pool used for buffered I/Os, and a few other things
"ASTs" - Asynchronous System Traps; roughly speaking, asynchronous interrupts (signals?) queued to your process
"Timer Entries" - pending timer events
"PGFL" - page file quota measured in 512-byte pages
"ENQ" - number of outstanding locks against system resources (can be in different states).

The snapshot looks more like an idle process than one which is maxing out anything.
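To make the "count is what remains" reading concrete, here is a minimal Python sketch that parses the lines posted at the top of the thread and recomputes each percentage. The function and field names are made up for illustration. The assumption that the display truncates the percentage (rather than rounding it) is inferred from these sample numbers only, not taken from any documentation.

```python
import re

# Lines copied from the display posted at the top of this thread.
SAMPLE = """\
# open files remaining 296/300 (98%)
Direct I/O count/limit 300/300 (100%)
Buffered I/O count/limit 4999/5000 (99%)
BUFIO byte count/limit 294880/294880 (100%)
ASTs remaining 2047/2048 (99%)
Timer entries remaining (99%)
PGFL quota count/limit 31049/32000 (97%)
ENQ quota count/limit 2977/3000 (99%)
"""

# name, then count/limit, then the percentage in parentheses
LINE_RE = re.compile(
    r"^(?P<name>.*?)\s+(?P<count>\d+)/(?P<limit>\d+)\s+\((?P<pct>\d+)%\)\s*$"
)

def parse_quotas(text):
    """Return one dict per line that carries a count/limit pair."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. the timer line, which was posted without numbers
        count, limit = int(m["count"]), int(m["limit"])
        rows.append({
            "name": m["name"].lstrip("# "),
            "remaining": count,            # "count" is the quota still free
            "limit": limit,
            "in_use": limit - count,
            # Assumption: truncated percentage of the quota still free
            "pct_remaining": 100 * count // limit,
            "pct_shown": int(m["pct"]),
        })
    return rows

if __name__ == "__main__":
    for row in parse_quotas(SAMPLE):
        print(f'{row["name"]:<28} in use {row["in_use"]:>5} of {row["limit"]:>7}'
              f'  ({row["pct_remaining"]}% free)')
```

Read this way, the process in the snapshot is using 4 of its 300 open-file slots and none of its direct I/O quota, which fits the "idle process" reading.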

A crucible of informative mistakes

Andy Bustamante · ‎11-01-2010

As John points out, this looks like a snapshot of quotas for one process. Depending on which process, this may or may not have anything to do with your backup.

>>>Problem is inconsistent speeds. One day we move at 30 MB/sec, the next day the same disk is 12 MB/sec. I'm trying to get information on how busy the system/disks are each day.

To get a quick idea of how busy a disk is, use:

$ monitor disk/item=queue

to see what the I/O backlog is for each device. You'll need to describe what the hardware is for each disk and what version of OpenVMS you're running, but that will provide some sense of load.

For more detailed options in tracking performance, review T4, http://h71000.www7.hp.com/openvms/products/t4/index.html which is one of the best ways to get a handle on what your system is doing.

If you're feeling swamped, ask for a consultation with one of the fine forum members here who provide that service. Or invest in some course work.

Why is inconsistent speed an issue? If you push the backup, will that impact users or production? With NetBackup, you're pushing data over the network. Is the backup server in a similar state each time, that is, with the same or no other jobs running? What's the network load from other sources on the backup server and the VMS system?

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net

Hein van den Heuvel · ‎11-02-2010

>> I'm not a VMS engineer, just a Backup admin.

ok, we'll try to go easy :-)
You are likely to have to venture outside your comfort zone and into OpenVMS. So you may want to request some internal or external help beyond the help offered here.

>> The way I read this is that we are about maxing out this disk.

No, but those settings would potentially allow the process to max out the disk. But that's goodness, is it not?

The actual disk busyness can be seen with MONI DISK ... or T4, as replied by others.

The Symantec NetBackup OpenVMS 7.0.1 release notes give some indications. They can be seen at
http://unix.derkeiler.com/Newsgroups/comp.os.vms/2010-09/msg00050.html

Specifically:
-- Support for the /PROGRESS qualifier
And
-- This release shows the ASTUSED usage (as shown above)
:
%NBU-I-NETWORK, 30.3 GB (59293184 blocks) in 0 00:11:20 secs
(44644.28 KB/sec)
%NBU-I-ASTUSED, used 800 of 800 biolm quota (astlm 1600)
%NBU-S-NETBACK, saved 87500 files (59293184 blocks)

Try to read/understand any paragraph with the word AST in it.
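As a cross-check on the NETWORK line above, the following Python sketch reproduces its KB/sec figure from the block count and the elapsed time. It assumes 512-byte blocks and decimal kilobytes (1 KB = 1000 bytes); the decimal unit is inferred from the fact that it reproduces the logged number, not from the release notes. The function names are illustrative only.

```python
BLOCK_BYTES = 512  # size of an OpenVMS disk block

def elapsed_seconds(delta):
    """Convert a 'D HH:MM:SS' delta time, as in the log line, to seconds."""
    days, clock = delta.split()
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((int(days) * 24 + hours) * 60 + minutes) * 60 + seconds

def kb_per_sec(blocks, seconds):
    """Throughput in decimal KB/sec (assumption: 1 KB = 1000 bytes)."""
    return blocks * BLOCK_BYTES / 1000 / seconds

if __name__ == "__main__":
    secs = elapsed_seconds("0 00:11:20")
    rate = kb_per_sec(59293184, secs)
    print(f"{secs} s, {rate:.2f} KB/sec, {rate / 1000:.1f} MB/sec")
```

The same arithmetic applied to the block counts and elapsed times of a "30 MB/sec day" and a "12 MB/sec day" gives directly comparable numbers.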

RBrown> How did you get this report?
Bob G> How were these numbers produced?

Looks like a simple SHOW PROC/CONT/ID=xxx and selecting the 'q' (for quota) option.
Clearly Douglas has a clue, but not all info.

>> Problem is inconsistent speeds. One day we move at 30 MB/sec, the next day the same disk is 12 MB/sec. I'm trying to get information on how busy the system/disks are each day.

There is a wide range of potential explanations:
- the Server does not have the power, or the feed, to keep the tape streaming.
- the network has concurrent activity slowing down the feed below streaming thresholds
- the client system is busier and/or has more activity on the source disks (T4 or monitor!)
- The data is different: a few large files may skew the average up, many small files may skew it down... Is the same data always targeted, or could it change based on what changed on the disks?
- Variations in file fragmentation
- Files being 'hot' in caches still versus needing to be read.

Good luck!
Hein

Paul Gotkin · ‎11-02-2010

Hi Douglas,

All suggestions you have received are very good but require examination of monitor/T4/network data. Not difficult, but if you are new to this it may be somewhat confusing.

I have what I think is a simpler place to start. If you are running your backups in batch (hopefully), simply compare two batch logs: one where you get good thruput vs. one that you perceive to run more slowly. The batch logs contain job rundown information at the bottom of each batch logfile. Compare the elapsed time, CPU time, and buffered and direct I/O counts between the "fast" backup and the "slow" one. If the number of I/O operations is comparable and the CPU usage is roughly the same, the problem is most likely network bandwidth or a bottleneck at the tape drive/controller, all other things being equal. T4/monitor will tell you if your process is stalling due to high system usage or a local disk queue (bottleneck at the source drive), but I have found that simply comparing two batch logs offers a wealth of information that often gets overlooked and is readily available if you keep your batch logs around for a while.
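The comparison described above can be sketched in a few lines of Python. Everything here is illustrative: the dictionary keys, the 10% tolerance, and the sample figures are assumptions, and the numbers would have to be copied by hand from the rundown section of two real batch logs.

```python
def compare_runs(fast, slow, tolerance=0.10):
    """Give a rough hint from the rundown figures of two backup runs.

    Each run is a dict with elapsed_seconds, cpu_seconds, buffered_io and
    direct_io. The tolerance is an arbitrary choice, not a known threshold.
    """
    def close(a, b):
        return abs(a - b) <= tolerance * max(a, b, 1)

    same_work = all(
        close(fast[key], slow[key])
        for key in ("cpu_seconds", "buffered_io", "direct_io")
    )
    slower = slow["elapsed_seconds"] > fast["elapsed_seconds"] * (1 + tolerance)

    if same_work and slower:
        return "same work, longer elapsed time: suspect network or tape path"
    if not same_work:
        return "work differs: compare the data set and system load first"
    return "runs are comparable"

if __name__ == "__main__":
    # Hypothetical figures, not taken from any real log.
    fast_run = {"elapsed_seconds": 3600, "cpu_seconds": 410,
                "buffered_io": 1_200_000, "direct_io": 850_000}
    slow_run = {"elapsed_seconds": 9000, "cpu_seconds": 405,
                "buffered_io": 1_210_000, "direct_io": 845_000}
    print(compare_runs(fast_run, slow_run))
```

The second branch is an addition for completeness: when the I/O counts or CPU time differ noticeably, the two runs did different work, so the elapsed times are not directly comparable.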
