Operating System - OpenVMS

How to read this

 
Douglas Austin
Occasional Visitor

Hi all,
I'm trying to figure out how to read this output. I'm not a VMS engineer, just a backup admin, but we're running into some throughput issues, and the way I read this I/O statement is that we're maxing out a disk. Can you verify?

# open files remaining 296/300 (98%)
Direct I/O count/limit 300/300 (100%)
Buffered I/O count/limit 4999/5000 (99%)
BUFIO byte count/limit 294880/294880 (100%)
ASTs remaining 2047/2048 (99%)
Timer entries remaining (99%)
PGFL quota count/limit 31049/32000 (97%)
ENQ quota count/limit 2977/3000 (99%)

The way I read this is that we are about maxing out this disk. Can someone verify whether the percentages are free or used?
10 REPLIES
RBrown_1
Trusted Contributor

Re: How to read this

How did you get this report?

These are per-process quotas and not system wide, and they all show free resources.

For overall system throughput, start with MONITOR SYSTEM, and also read OpenVMS Performance Management (http://h71000.www7.hp.com/doc/73final/6491/6491pro.html) in the VMS Document Set.

What version of VMS? What are you running it on? What are "thru-put issues"?
Robert Gezelter
Honored Contributor

Re: How to read this

Douglas,

Welcome to the HP ITRC OpenVMS Forum.

Most important initial question: How were these numbers produced? What command? Also, at what point was the command issued?

- Bob Gezelter, http://www.rlgsc.com
abrsvc
Respected Contributor

Re: How to read this

There really is not much to go on here. From the looks of the values/limits shown, the process was relatively idle when these were recorded.

Rather than chase numbers, please describe the actual problem that you are seeing. For example, you mention throughput issues. What exactly is the issue?

Please include:

1) Source for backup (assuming backup time is the issue)

2) Destination device (disk vs. tape) including device model.

3) Actual backup command used.


This info will be a good start. From there we may ask for additional info, but let's start here.

Thanks,
Dan
Douglas Austin
Occasional Visitor

Re: How to read this

The backup app is NetBackup 7.0.

Source is VMAX disk.
Destination is LTO-4 tape.

The problem is inconsistent speeds. One day we move at 30 MB/sec, the next day the same disk is at 12 MB/sec. I'm trying to get information on how busy the system/disks are each day.
Robert Gezelter
Honored Contributor

Re: How to read this

Douglas,

There can be many explanations for inconsistent performance. Some of them are within OpenVMS; others are often traced to network anomalies or traffic.

On the OpenVMS side, no single command directly produces useful performance information. My recommendation to most clients in this situation is to install the T4 kit (which gathers data using MONITOR/RECORD) and then chart the results to get aggregate information.

Similarly, at the same time, I might recommend using Wireshark to monitor the LAN, so that there is a clear dataset of the backup operation from both perspectives. With the complete data, it is possible to identify the bottleneck.

An instantaneous display of quotas does not provide any substantive information; it is only an instant in time (and, from these statistics, a relatively idle one at that).

One can do the analysis oneself, albeit there is a learning curve, or one can retain outside assistance [Disclosure: As do other frequent contributors to this forum, we offer such services.].

- Bob Gezelter, http://www.rlgsc.com
John Gillings
Honored Contributor

Re: How to read this

Douglas,
Answering your specific question...

> can someone verify the percentages are
> either Free or used?

The output you've posted looks like it's in the form:

<name> count/limit <count>/<limit> (<percentage>)

In the VMS world "count" is the remaining quota and "limit" is the maximum the process is allowed to consume.

The quotas in question:

"open files remaining" - obvious?
"Direct I/O" - roughly speaking, simultaneous outstanding I/Os to disk
"Buffered I/O" - roughly speaking, simultaneous outstanding I/Os to terminal, network or other I/O devices
"BUFIO byte" - non-paged pool used for buffered I/Os, and a few other things
"ASTs" - Asynchronous System Traps - roughly speaking, asynchronous interrupts (signals?) queued to your process
"Timer Entries" - pending timer events
"PGFL" - page file quota measured in 512-byte pages
"ENQ" - number of outstanding locks against system resources (can be in different states).

The snapshot looks more like an idle process than one which is maxing out anything.
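
To make the "count is what remains" reading concrete, here is a minimal Python sketch (not a VMS tool, just arithmetic on the lines as posted). The function names are made up for illustration, and the truncating division is an assumption about how the display rounds; it happens to reproduce every percentage in the post.

```python
import re

# Quota lines as posted in the original question (name, then count/limit).
# The "Timer entries" line is left out because its values were not posted.
REPORT = """\
open files remaining 296/300
Direct I/O count/limit 300/300
Buffered I/O count/limit 4999/5000
BUFIO byte count/limit 294880/294880
ASTs remaining 2047/2048
PGFL quota count/limit 31049/32000
ENQ quota count/limit 2977/3000
"""

LINE = re.compile(r"^(?P<name>.+?)\s+(?P<count>\d+)/(?P<limit>\d+)$")


def parse_quotas(text):
    """Return (name, remaining, limit) for each 'name count/limit' line."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            rows.append((m["name"], int(m["count"]), int(m["limit"])))
    return rows


def percent_remaining(remaining, limit):
    # Integer division truncates (assumed rounding rule, see lead-in).
    return remaining * 100 // limit


for name, remaining, limit in parse_quotas(REPORT):
    free = percent_remaining(remaining, limit)
    print(f"{name:<28} {free:3d}% free, {limit - remaining} in use")
```

Read this way, the process has 4 files open, 1 buffered I/O outstanding and no direct I/O in flight, which is what an idle process looks like.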
A crucible of informative mistakes
Andy Bustamante
Honored Contributor

Re: How to read this

As John points out, this looks like a snapshot of quotas for one process. Depending on which process it is, this may or may not have anything to do with your backup.

>>> The problem is inconsistent speeds. One day we move at 30 MB/sec, the next day the same disk is at 12 MB/sec. I'm trying to get information on how busy the system/disks are each day.

To get a quick idea of how busy a disk is, use:

$ monitor disk/item=queue

to see what the I/O backlog is for each device. You'll need to describe what the hardware is for each disk and what version of OpenVMS you're running, but that will provide some sense of load.

For more detailed options in tracking performance, review T4, http://h71000.www7.hp.com/openvms/products/t4/index.html which is one of the best ways to get a handle on what your system is doing.

If you're feeling swamped, ask for a consultation with one of the fine forum members here who provide that service. Or invest in some course work.

Why is inconsistent speed an issue? If you push the backup, will that impact users or production? With NetBackup, you're pushing data over the network. Is the backup server in a similar state each time, that is, with the same or no other jobs running? What's the network load from other sources on the backup server and the VMS system?
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Hein van den Heuvel
Honored Contributor

Re: How to read this

>> I'm not a VMS engineer, just a backup admin.

ok, we'll try to go easy :-)
You are likely to have to venture outside your comfort zone and into OpenVMS. So you may want to request some internal or external help beyond the help offered here.

>> The way I read this is that we are about maxing out this disk.

No, but those settings would potentially allow the process to max out the disk. But that's goodness, is it not?

The actual disk busyness can be seen with MONI DISK ... or T4, as others have replied.

The Symantec NetBackup OpenVMS 7.0.1 release notes give some indications. They can be seen at
http://unix.derkeiler.com/Newsgroups/comp.os.vms/2010-09/msg00050.html

Specifically:
-- Support for the /PROGRESS qualifier
And
-- This release shows the ASTUSED usage (as shown above)
:
%NBU-I-NETWORK, 30.3 GB (59293184 blocks) in 0 00:11:20 secs
(44644.28 KB/sec)
%NBU-I-ASTUSED, used 800 of 800 biolm quota (astlm 1600)
%NBU-S-NETBACK, saved 87500 files (59293184 blocks)

Try to read/understand any paragraph with the word AST in it.
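
As a sanity check on the sample log lines, the reported rate can be reproduced from the block count and the elapsed time. This is a minimal Python sketch, assuming 512-byte disk blocks and assuming that "KB" in the NetBackup message means 1000 bytes (that is the interpretation that matches the printed figure). The function names and the 1000 GB volume are hypothetical, chosen only for illustration.

```python
BLOCK_BYTES = 512  # OpenVMS disk block size


def kb_per_sec(blocks, seconds, kb=1000):
    """Throughput in KB/sec for a given block count and elapsed time."""
    return blocks * BLOCK_BYTES / kb / seconds


def hours_to_move(gigabytes, mb_per_sec):
    """Hours needed to move a volume at a steady rate (decimal GB and MB)."""
    return gigabytes * 1000 / mb_per_sec / 3600


blocks = 59293184        # from the %NBU-I-NETWORK line
elapsed = 11 * 60 + 20   # 0 00:11:20 expressed in seconds

print(round(kb_per_sec(blocks, elapsed), 2))  # 44644.28, as in the log

# Hypothetical 1000 GB backup at the two rates mentioned in the question.
for rate in (30, 12):
    print(f"{rate} MB/sec: {hours_to_move(1000, rate):.1f} hours")
```

The ratio is what matters for a backup window: at 12 MB/sec the same data takes 2.5 times as long as at 30 MB/sec.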

RBrown> How did you get this report?
Bob G> How were these numbers produced?

Looks like a simple SHOW PROC/CONT/ID=xxx and selecting the 'q' (for quota) option.
Clearly Douglas has a clue, but not all info.

>> The problem is inconsistent speeds. One day we move at 30 MB/sec, the next day the same disk is at 12 MB/sec. I'm trying to get information on how busy the system/disks are each day.

There is a wide range of potential explanations:
- The server does not have the power, or the feed, to keep the tape streaming.
- The network has concurrent activity slowing down the feed below streaming thresholds.
- The client system is busier and/or has more activity on the source disks (T4 or monitor!).
- The data is different: a few large files may skew the average up, many small files may skew it down. Is the same data always targeted, or could it change based on what changed on the disks?
- Variations in file fragmentation.
- Files still being 'hot' in caches versus needing to be read.

Good luck!
Hein

Paul Gotkin
Occasional Visitor

Re: How to read this

Hi Douglas,

All the suggestions you have received are very good but require examination of monitor/T4/network data. Not difficult, but if you are new to this it may be somewhat confusing.

I have what I think is a simpler place to start. If you are running your backups in batch (hopefully), simply compare two batch logs: one where you get good throughput vs. one that you perceive to run more slowly. The batch logs contain job rundown information at the bottom of each batch logfile. Compare the elapsed time, CPU time, and buffered and direct I/O counts between the "fast" backup and the "slow" one. If the number of I/O operations is comparable and the CPU usage is roughly the same, the problem is most likely network bandwidth or a bottleneck at the tape drive/controller, all other things being equal.

T4/monitor will tell you if your process is stalling due to high system usage or a local disk queue (bottleneck at the source drive), but I have found that simply comparing two batch logs offers a wealth of information that often gets overlooked and is readily available if you keep your batch logs around for a while.
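
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a tool that reads batch logs: the figures are made up, the field names are my own, and the 10% tolerance is an arbitrary assumption. You would copy the real numbers by hand from the job rundown at the end of each log.

```python
from dataclasses import dataclass


@dataclass
class Rundown:
    """Figures copied by hand from the job rundown at the end of a batch log."""
    elapsed_s: float
    cpu_s: float
    buffered_io: int
    direct_io: int


def ratio(slow, fast):
    return slow / fast if fast else float("inf")


def compare(fast, slow, tolerance=0.10):
    """Return slow/fast ratios per field and a rough hint about where to look."""
    ratios = {
        "elapsed": ratio(slow.elapsed_s, fast.elapsed_s),
        "cpu": ratio(slow.cpu_s, fast.cpu_s),
        "buffered_io": ratio(slow.buffered_io, fast.buffered_io),
        "direct_io": ratio(slow.direct_io, fast.direct_io),
    }
    # "Same work" means CPU and I/O counts agree within the tolerance.
    same_work = all(
        abs(ratios[key] - 1.0) <= tolerance
        for key in ("cpu", "buffered_io", "direct_io")
    )
    if same_work and ratios["elapsed"] > 1.0 + tolerance:
        hint = "same work, longer wait: look at network or tape path"
    elif not same_work:
        hint = "different amount of work: compare what was backed up"
    else:
        hint = "runs look comparable"
    return ratios, hint


# Made-up numbers for illustration only.
fast = Rundown(elapsed_s=3600, cpu_s=410, buffered_io=905_000, direct_io=1_210_000)
slow = Rundown(elapsed_s=9000, cpu_s=425, buffered_io=911_000, direct_io=1_198_000)

ratios, hint = compare(fast, slow)
for name, value in ratios.items():
    print(f"{name:<12} slow/fast = {value:.2f}")
print(hint)
```

The hint is deliberately coarse. It only separates "the job did the same work but waited longer" from "the job did different work", which is the first split the batch logs can give you.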
Bob Blunt
Respected Contributor

Re: How to read this

Douglas, let's step back a bit... All the answers you've gotten have been valid and valuable. However all the basics have been skipped and you may have been handed a bag of snakes. Good for you that you're interested because you can build your knowledge using the investigation. There are some points of information that would make it easier for your research and general knowledge and for us to help you.

OpenVMS Version?
Hardware type (VAX/Alpha/Integrity/emulated)?
What type disk controllers?
How are they connected to the system?
How are the tape units connected?

You might ask "Why are these important to solve our problem?"

No operating system that receives regular updates is static, and utilities are changed to (hopefully) make them work more efficiently and correctly. So knowing your version of OpenVMS is the most critical. Hardware configuration can be important because, to use a car analogy, a Chevette can't usually run as fast as a CORvette or haul as much as a tractor-trailer. So if your "problem" system is a workstation-class system you get one performance expectation, and a large system should run much differently. Configuration of your peripherals can also affect the operation of your system, so knowing how it goes together helps you, and us, know where to look for bottlenecks.

All the previous replies have good suggestions. You'll need to combine all that into a concise solution to evaluate what factors are combining to cause the differences in the ways you're seeing your NetBackup jobs run.

Some general things you can check on OpenVMS *when* you see slow performance are:

SHOW USERS
SHOW SYSTEM
SHOW QUEUE/FULL/ALL
SHOW ERROR
SHOW MEMORY

I'm not familiar enough with NetBackup to tell you that there's a potential for slowdown due to, for example, network activity. If other jobs are running and your OpenVMS system is sharing its disk and tape connection over a SAN fabric of some sort, then the OpenVMS NetBackup jobs could easily be affected by workload induced by or on other systems sharing the SAN. If, however, your system is using directly connected disk (via SCSI, for example) and tapes, then the factors that are affecting the performance of the NetBackup jobs *should* be limited to activity on the VMS system(s) and possibly the network (if NetBackup depends on community Ethernet, for instance).

There's a LOT to be learned but it can be a fascinating process...

bob