Operating System - OpenVMS

How to determine if a process has been in resource wait.

 
SOLVED
Roger Tucker
Occasional Advisor

I have a process that runs with resource wait mode enabled, and I cannot change that. This process does a lot of asynchronous RMS disk I/O in soft real time and is having some very random latency problems. We are using AXP OpenVMS 7.3-2. I'm trying to determine whether this process has ever had quota problems, that is, whether it has ever had to wait while queuing an I/O until an existing I/O completed.

As background: when sys$qio() queues an I/O and the process has insufficient quota (direct or buffered, depending upon the I/O type), but there are outstanding I/Os that will complete and give back the quota, and the process has resource wait enabled, sys$qio() will eventually call exe$snglquota(), which will eventually call the sch$... routines to put the process into a wait. I think the process enters the MWAIT state, but I'm not sure.

So my question is: is there a counter I can get to from analy/sys, or some other method of determining whether a process has ever waited for quota? Catching it at just the right time would be difficult. A related question: can I determine, by a counter or something similar, how much time a process spends in the RWSCS state while waiting for a response from the lock manager on a remote system?

I appreciate any help in this area, but read carefully and try to stay on the subject as normal.
Thanks,
9 REPLIES
Antoniov.
Honored Contributor

Re: How to determine if a process has been in resource wait.

Roger,
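I can only answer with DCL tips; you can easily apply them in a C program or use them in some other way.
You can monitor some process quotas, for example BIOLM and DIOLM. Note that BIOCNT and DIOCNT return the remaining quota, not the amount in use:
$ BIOLIM=F$GETJPI("","BIOLM")     ! buffered I/O limit
$ DIOLIM=F$GETJPI("","DIOLM")     ! direct I/O limit
$ BIOAVA=F$GETJPI("","BIOCNT")    ! remaining buffered I/O quota
$ DIOAVA=F$GETJPI("","DIOCNT")    ! remaining direct I/O quota
$ BIOUSED=BIOLIM-BIOAVA           ! buffered I/Os currently outstanding
$ DIOUSED=DIOLIM-DIOAVA           ! direct I/Os currently outstanding
There are some other values in the $getjpi service you can check for.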
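
As a rough illustration of those other values, here is a minimal sketch that prints the remaining count and the limit for several quota pairs. The procedure name and the choice of pairs are only assumptions for the example; the item names are standard F$GETJPI items. Looking at a process other than your own normally needs GROUP or WORLD privilege.

```dcl
$! QUOTA_HEADROOM.COM (hypothetical name) - show remaining/limit per quota
$! P1 = PID of the process to inspect (blank = current process)
$ pairs = "ASTCNT/ASTLM,BIOCNT/BIOLM,BYTCNT/BYTLM,DIOCNT/DIOLM," + -
          "ENQCNT/ENQLM,FILCNT/FILLM,TQCNT/TQLM"
$ i = 0
$loop:
$ pair = f$element(i, ",", pairs)
$ if pair .eqs. "," then goto done        ! f$element returns the delimiter past the end
$ cnt = f$element(0, "/", pair)           ! item giving the remaining quota
$ lim = f$element(1, "/", pair)           ! item giving the limit
$ write sys$output f$fao("!8AS remaining !10UL of !10UL", -
        cnt, f$getjpi(p1, cnt), f$getjpi(p1, lim))
$ i = i + 1
$ goto loop
$done:
$ exit
```

Invoke it as @QUOTA_HEADROOM followed by the PID of the process you are interested in. It is only a snapshot, so a quota that is exhausted for a few milliseconds will not necessarily show up.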

Antonio Vigliotti
Antonio Maria Vigliotti
Volker Halle
Honored Contributor
Solution

Re: How to determine if a process has been in resource wait.

Roger,

Sorry, there are no such counters available. Even the PERF_COUNTER macro, which provides some performance counters (CNT_*) in the _MON execlets (loaded with SYSTEM_CHECK=1), does not provide counters for these types of events.

If a process runs out of quota (DIOLM, BIOLM, BYTLM), it's typically put in RWAST.

If a process seems to be in RWSCS for a 'long time', it usually means that it is not hung in this state, but is getting into and out of RWSCS so quickly that you don't see the non-RWSCS state (just check for an increasing IO count or CPU cycles to confirm that the process is not hung).

Do you know the IO$SDA IO trace SDA extension? You could turn on IO tracing in the system dynamically and stop the trace 'if your problem happens'. See the (minimal) help in SDA> IO

To watch your process, you could also write some DCL code (as suggested by Antonio) to see whether the process ever gets near zero xxxCNT. Or you can just bump xxxLM...
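
A minimal sketch of such a watcher, run from a separate process, might look like the following. The procedure name, the default threshold of 2, the log file name and the one-second polling interval are all assumptions made for illustration. With one-second sampling it can easily miss a short quota exhaustion, so treat a clean log as weak evidence only.

```dcl
$! WATCH_QUOTA.COM (hypothetical name) - log when a process runs low on I/O quota
$! P1 = PID to watch, P2 = threshold on remaining count, P3 = log file
$ if p2 .eqs. "" then p2 = "2"
$ if p3 .eqs. "" then p3 = "SYS$SCRATCH:WATCH_QUOTA.LOG"
$ threshold = f$integer(p2)
$ open/write log 'p3'
$ on control_y then goto finish
$ on warning then goto finish             ! e.g. the target process has gone away
$loop:
$ dio = f$getjpi(p1, "DIOCNT")            ! remaining direct I/O quota
$ bio = f$getjpi(p1, "BIOCNT")            ! remaining buffered I/O quota
$ byt = f$getjpi(p1, "BYTCNT")            ! remaining buffered byte count quota
$ if dio .le. threshold .or. bio .le. threshold
$ then
$   write log f$time(), " DIOCNT=", dio, " BIOCNT=", bio, " BYTCNT=", byt, -
        " STATE=", f$getjpi(p1, "STATE")
$ endif
$ wait 00:00:01
$ goto loop
$finish:
$ close log
$ exit
```

Because every record is timestamped, the log can be lined up afterwards with the times at which the application saw its latency spikes.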

Volker.
Robert Gezelter
Honored Contributor

Re: How to determine if a process has been in resource wait.

Roger,

I agree with Volker, with one exception. I would not stop the process if this event occurs.

What I would do is to create some form of event log when this situation does happen. I might even extend this concept to "when the event NEARLY happens".

- Bob Gezelter, http://www.rlgsc.com
Volker Halle
Honored Contributor

Re: How to determine if a process has been in resource wait.

Bob,


... I would not stop the process if this event occurs.


I suggested stopping the IO TRACE, not the process, once the problem has happened, provided you can detect the problem from the application itself or from an external 'watcher' process.

Volker.
Robert Gezelter
Honored Contributor

Re: How to determine if a process has been in resource wait.

Volker,

My apologies, I apparently wrote somewhat in haste.

I would, however, like to know with what frequency the problem is occurring.

- Bob Gezelter, http://www.rlgsc.com
Lawrence Czlapinski
Trusted Contributor

Re: How to determine if a process has been in resource wait.

Roger, you could also monitor process and/or job quotas for a process with Availability Manager. This way you could see if one or more values is close to its limit. You could also check Wait States quotas to see if the process has waited for quotas.
Lawrence
John Gillings
Honored Contributor

Re: How to determine if a process has been in resource wait.

Roger,

There are no counters. Some "MWAIT" states can be classified as completely normal, but most times are so transient that you never notice them. In particular, RWSCS. If you make a lock request that goes off node, you WILL be put into RWSCS until the remote node responds.

I don't think there's a statistic for time spent in RWSCS, but you can MONITOR DLOCK and watch for high levels of outgoing requests. Compare with other cluster nodes to work out if you need to modify your weightings to try to localise locking.
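
For example, something along these lines shows the distributed lock activity on the local node; the five-second interval is just an assumption, pick whatever suits you:

```dcl
$ MONITOR DLOCK /INTERVAL=5
```

Run the same command on the other cluster members to compare outgoing against incoming rates.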

In terms of quotas... remember that all quotas do is protect you from accidental or malicious (over) consumption of resources. There is no penalty for allocating a high quota to a "trusted" application.

DIOLM and BIOLM are less likely to be a resource bottleneck than BYTLM. If you suspect a process is suffering because it's hitting one of these process quotas, try increasing the quota and watching DIOCNT, BIOCNT and BYTCNT over time. Make sure it's not leaking.

BYTLM is measured in BYTES. Keep in mind that memory is MUCH cheaper now than in the past, so don't be afraid to give a process a good healthy allocation.

As for SYS$SETRWM, please forget it exists. It doesn't do what you think it does, or what you want, so please don't go there!
A crucible of informative mistakes
Ian Miller.
Honored Contributor

Re: How to determine if a process has been in resource wait.

Monitoring the system using AMDS or Availability Manager may catch the process going into an MWAIT state.
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

Re: How to determine if a process has been in resource wait.

I monitor 100+ VMS systems for MWAIT every 5 minutes. As John said, MWAIT is a perfectly good state to be in. For a short time.

So, I check for MWAIT every second for 1 minute, but I still find processes that are regularly in MWAIT that long. So, I give an "informational" alarm after 1 minute and an "error" alarm after 10 minutes.

I also monitor all quotas and channel usage because at our site, that's the reason for the MWAIT most of the time.
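
A minimal sketch of one pass of such a check is shown below. It is not my actual monitoring procedure: the procedure name is hypothetical, the repeat-every-second logic and the alarm escalation are left out, and it assumes that F$GETJPI reports the resource wait states as "MWAIT". Without WORLD privilege F$PID only returns your own processes.

```dcl
$! MWAIT_SCAN.COM (hypothetical name) - one pass over all visible processes
$ set noon                                ! keep going if a process disappears
$ ctx = ""
$loop:
$ pid = f$pid(ctx)
$ if pid .eqs. "" then goto done
$ state = ""
$ state = f$getjpi(pid, "STATE")          ! stays "" if the process has just exited
$ if f$edit(state, "TRIM") .eqs. "MWAIT"
$ then
$   write sys$output f$time(), " ", pid, " ", f$getjpi(pid, "PRCNAM"), " is in MWAIT"
$ endif
$ goto loop
$done:
$ exit
```

To get from this to alarms you would repeat the scan and only report a PID that shows MWAIT on every pass for the whole period.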

Wim