vms problems

 
tim lloyd_1
Frequent Advisor

vms problems

I maintain a legacy system which relies on QIOs to communicate with the outside world. The system runs on HP Itanium under VMS 8.2. The language is Pascal, and the code makes heavy use of shared memory.

Action is initiated by logging in as a specific VMS user. The login command file associated with this user runs through some setup commands and then starts a number of detached processes. Each process has its quotas throttled at this point.

This is a transaction processing system which receives requests via ethernet from a number of external processing devices. The messages are processed sequentially and responses sent back to the originator via the same processing device.

There are also a number of unsolicited messages sent out from the Itanium to the processing devices. Ultimately, we are looking at fielding and responding to approx 100 requests/second.

The system has been in place since 1992 with minimal problems. Recently we added more processing devices to the setup, hence more load.

We have been experiencing problems when issuing QIOs: process quota exceeded. This system is running at a customer's site, so I cannot monitor which quota is being exceeded. I increased the AST limit for the process experiencing problems. This has caused the error to shift, and I now get an error on a call to lib$remqhi. Unfortunately the code does not report the exact error; it simply crashes the program with a non-recoverable error. The program is restarted and does not experience the problem again during the trading day.

So, it looks like raising the AST limit addressed one problem, but a number of other quotas should probably be increased in line with this change. A suggestion I received was to raise BIOLM, DIOLM and BYTLM.

I guess where I am coming from here is: does the explanation above make any sense? If so, can anyone comment on the quotas I have raised and those I would like to raise? Are there any other quotas which should rise in line with those mentioned? Unfortunately I have inherited a system which seems to have architectural problems and my experience of VMS is limited in this field.
12 REPLIES
Steven Schweda
Honored Contributor

Re: vms problems

I know nothing, but ...

> [...] can anyone comment on the quotas [...]

You could. I don't know what they were. I
don't know what they are. Actual values
might be interesting. Something could be
absurdly low, but my psychic powers are too
weak to let me discover anything from here.

If you're worried about the process quotas,
why not just try raising them? What could go
wrong? I might point a shotgun toward the
whole mess, and see if a factor of two
helped. If so, then a curious person could
try a set of more controlled increases to
see what actually mattered. If not, look
elsewhere.

Some basic SHOW PROCESS /QUOTA commands
scattered throughout a typical day might
reveal a pattern as something or other
shrinks toward zero.
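
A rough sketch of that idea, in Python rather than DCL: collect the remaining-quota figures from each sample and flag any quota whose lowest observed value came close to zero. Every number here, the 20% threshold, and the function names are invented for illustration; none of them come from the system under discussion.

```python
def low_water_marks(samples):
    """Return the smallest remaining value seen for each quota across all samples."""
    lows = {}
    for sample in samples:
        for name, remaining in sample.items():
            lows[name] = min(remaining, lows.get(name, remaining))
    return lows


def quotas_at_risk(samples, limits, threshold=0.2):
    """List quotas whose low-water mark fell below `threshold` of their limit."""
    lows = low_water_marks(samples)
    return sorted(
        name for name, low in lows.items()
        if name in limits and low < threshold * limits[name]
    )


# Hypothetical remaining-quota readings taken at three points in the day.
samples = [
    {"ASTLM": 299, "BIOLM": 100, "DIOLM": 100},
    {"ASTLM": 250, "BIOLM": 12, "DIOLM": 100},
    {"ASTLM": 280, "BIOLM": 60, "DIOLM": 99},
]
# Hypothetical limits for the same quotas.
limits = {"ASTLM": 300, "BIOLM": 100, "DIOLM": 100}

print(quotas_at_risk(samples, limits))
```

Samples taken by hand can easily miss a short burst, so a clean result from a check like this does not rule a quota out.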
tim lloyd_1
Frequent Advisor

Re: vms problems

Hi Steven, point taken about the lack of info. I have been monitoring the situation today and have seen no problems at all. The quotas have remained constant over a 4 hr sample. The attachment has the last figures I recorded.

I have previously changed ASTs from 100 to 300 and I am planning to do the same with Buffered I/O limit and Direct I/O limit. Does that make sense? How do the other quotas look? TIA
Jan van den Ende
Honored Contributor

Re: vms problems

Tim,

to begin with:
WELCOME to the VMS forum!

Secondly:
Please rename any attachment to the extension .txt before posting. That way they should be more acceptable to most browsers.

Steven's suggestion about doubling is very reasonable. Without info on the exact quota it is a hard guess otherwise.
You might do a SET PROCESS/DUMP in the routine that kicks off the process, and the dump then generated MIGHT reveal more precise info (unless the program explicitly obscures it, which, alas, is none too uncommon).

Next to BIOLM and DIOLM you might also raise ENQLM at those transaction numbers. In today's memory-rich systems the impact on resource use is VERY minor.
For BYTLM I would start at 4-fold (or maybe 10-fold).

Happy hunting!

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Robert Gezelter
Honored Contributor

Re: vms problems

Tim

Welcome to the OpenVMS forum!

Raising the quotas is one stop-gap. If the problem is a short-term burst of activity, this may resolve the problem entirely.

However, it might be more productive in the long term to also enable and then analyze process dumps to actually understand what is happening. This would also allow guidance to be developed as to future increases in resource requirements as workload further increases. I may have some gray hair, but I would rather increase quotas according to a formula based on workload than just keep increasing them (e.g., I do get worried when I need to add oil to the car between scheduled changes; the same with coolant -- if fluids are disappearing, they must be going somewhere, at best there is a loose connection, at worst it is something serious, possibly fatal).

Applications of the general class described in the OP often have resource requirements denominated in circuits and transactions. A surge in resource usage beyond that formula often indicates something amiss.

Applications using shared memory are also very demanding of correctness. Higher performance systems often change the timing and uncover latent problems. Are there any other anomalies besides quotas being exceeded?

- Bob Gezelter, http://www.rlgsc.com
tim lloyd_1
Frequent Advisor

Re: vms problems

Hi Folks, thanks for the responses so far. I appreciate the thoughts. I have tried to dump the contents of the sh proc/quota, the presentation is not the best but I think the salient details are there.

If you have any comments based on the below I would be grateful. I have to duck out for the night.

Noted about tuning the solution. First step is to get the customer off my back so I can work without people breathing down my neck :)

Process Quotas:

CPU limit:                      Infinite  Direct I/O limit:       100
Buffered I/O byte count quota:    535648  Buffered I/O limit:     100
Timer queue entry quota:             472  Open file quota:       1167
Paging file quota:               2242800  Subprocess quota:        13
Default page fault cluster:           64  AST quota:              299
Enqueue quota:                      3997  Shared file limit:        0
Max detached processes:                0  Max active jobs:          0
Robert Gezelter
Honored Contributor

Re: vms problems

Tim,

Using ANALYZE/SYSTEM might be more useful than the results of SHOW PROCESS/QUOTA. The SHOW PROCESS command in ANALYZE will display BOTH the limits and the currently in use numbers for the quotas.

- Bob Gezelter, http://www.rlgsc.com
Steven Schweda
Honored Contributor

Re: vms problems

> Please rename any attachment to the
> extension .txt [...]

Changing the file name won't turn a Microsoft
Word document into a plain-text file. Some
VMS users don't have a convenient way to work
with MS Word documents. If all you have to
exhibit is plain text, then it would be
helpful to provide it as plain text.
Jan van den Ende
Honored Contributor

Re: vms problems

Steven,

totally agreed!

But a .COM file that is presented as such is not exactly informative either...

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Richard J Maher
Trusted Contributor

Re: vms problems

Hi Tim,

What errors from lib$remqhi does your code deem acceptable? I had a quick look at the ref. manual and there didn't appear to be a whole lot of quota related statii returned. Have we moved from the quota problem to a logic error or race (empty-queue) condition?

Does the error status get reported as process termination status?

As well as the quota suggestions you were given, I would also have checked PGFLQUOTA, but yours does not look unreasonable. What is reasonable for your system, though, is an unknown. You could try what others have suggested and hit the quotas with a hammer; if the system grinds to a halt then you know you've gone too far :-)

VMS "Pascal" yikes! I thought this was a European-only phenomenon inflicted on sites such as LIFFE and Reuters by those that left DEC Europe's internal systems. Was this developed in Oz?

Cheers Richard Maher

PS. Do you have a presence in Perth?
John Gillings
Honored Contributor

Re: vms problems

Tim,
Quotas have only two purposes:

1) To protect the system from malicious intent by (over) consumption of resources

2) To protect the system from accidental over consumption of resources ("leaks").

In your case you presumably trust the application, and it's not in direct control of anyone who is likely to be suspected of malicious intent, so that knocks out 1.

If the application has been running since 1992 (though obviously NOT all that time on Itanium), we can probably assume all the potential leaks are plugged, so that knocks out 2. You can therefore arguably raise all your quotas to effectively infinite with no ill effects.

First thing to consider is you've moved to Itanium which can potentially consume more memory. It is almost certainly faster and has significantly more resources than the 1992 platform (Alpha?) for which the application was written.

Looking at your quotas, BYTLM is only 500K. That's almost laughable for a process that's intended to process network messages. I'd be raising that AT LEAST 10 fold. 5MB of BYTLM is peanuts on a modern system. Go further if you think it's an issue. If the application doesn't need extra it won't cost to over allocate, but if you hit the wall you die, or hang in MUTEX state. DIOLM probably doesn't matter (you're probably not doing much in the way of asynch disk I/O), but it won't hurt to raise the quota if you suspect it's a problem. BIOLM might be of interest, but if all your samples say "100" maybe you're not doing any asynch network I/O either? I'd probably raise them too, and set ASTLM to 150%-200% of BIOLM, on the assumption that each BIO will consume one AST, and you may need ASTs for other things.
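
The arithmetic behind those suggestions can be sketched in a few lines of Python. This is only an illustration: the 10-fold BYTLM factor and the 150%-200% ASTLM ratio are the rules of thumb above, the 3x BIOLM factor is Tim's stated plan of going from 100 to 300, and the starting figures are the ones from his SHOW PROCESS/QUOTA output taken at face value. The function name is made up.

```python
def suggest_quotas(current, bytlm_factor=10, biolm_factor=3, ast_ratio=(1.5, 2.0)):
    """Apply the rule-of-thumb multipliers discussed above to the current quotas."""
    bytlm = current["BYTLM"] * bytlm_factor
    biolm = current["BIOLM"] * biolm_factor
    # ASTLM is sized relative to the NEW BIOLM: roughly one AST per buffered I/O,
    # plus headroom for ASTs used for other purposes.
    ast_low = int(biolm * ast_ratio[0])
    ast_high = int(biolm * ast_ratio[1])
    return {"BYTLM": bytlm, "BIOLM": biolm, "ASTLM_range": (ast_low, ast_high)}


# Figures taken from the quota listing posted earlier in the thread.
current = {"BYTLM": 535648, "BIOLM": 100}

print(suggest_quotas(current))
```

With those inputs the sketch gives a BYTLM of about 5.4 million bytes, a BIOLM of 300, and an ASTLM somewhere between 450 and 600, so the ASTLM of 300 already in place would sit at the low end once BIOLM is tripled.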

Pagefile quota looks fine, you've got about 1GB headroom, which should be plenty.

Another thing to check is the SYSGEN parameter MAXBUF. It's possible your previous system had a non-default value which was not copied to the new system.

One other area to check (very long shot), you may be filling P0 space. Check F$GETJPI(pid,"FREP0VA"). It will return the hex address of the next available P0 page. You only need be concerned if the top digit is 3, or you see it growing over time (limit is 40000000).
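
For anyone logging that value over time, the check amounts to the following Python sketch. It assumes the value arrives as a hex string, as described above, and uses the 40000000 hex limit quoted there; the sample addresses and function names are hypothetical.

```python
P0_LIMIT = 0x40000000  # top of P0 space as quoted above (1 GB)
P0_WARN = 0x30000000   # "top digit is 3" threshold


def p0_usage(frep0va_hex):
    """Fraction of P0 space that lies below the first free P0 address."""
    return int(frep0va_hex, 16) / P0_LIMIT


def p0_is_worrying(frep0va_hex):
    """True once the first free P0 address has reached 30000000 hex."""
    return int(frep0va_hex, 16) >= P0_WARN


# Hypothetical readings; real values would come from F$GETJPI(pid,"FREP0VA").
for reading in ("0C000000", "31A40000"):
    print(reading, round(p0_usage(reading), 3), p0_is_worrying(reading))
```

A single reading says little on its own; the growth between readings is the more useful signal.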

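If you want to instrument your code for better diagnostics, put calls to $GETJPI before and after suspect $QIOs, sampling what remains of BIOLM, BYTLM, ASTLM and TQELM (the limits themselves do not change; the remaining counts should be the $GETJPI items BIOCNT, BYTCNT, ASTCNT and TQCNT). You should be able to get an idea of the cost of each I/O, and also see what, if anything, is being depleted.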
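
Once the before and after snapshots exist, the analysis is simple subtraction. The Python sketch below shows the idea with made-up numbers. It assumes the snapshots hold remaining counts keyed by the names BIOCNT, BYTCNT, ASTCNT and TQCNT, which is my reading of the $GETJPI item codes and worth checking against the System Services manual. The "after" values and both function names are hypothetical.

```python
def quota_cost(before, after):
    """Units of each quota consumed between two snapshots of remaining counts."""
    return {name: before[name] - after[name] for name in before if name in after}


def first_to_run_out(remaining, cost):
    """Return (quota, count): the quota that is exhausted first and how many
    identical outstanding I/Os it allows, or None if nothing is consumed."""
    candidates = [
        (remaining[name] // used, name) for name, used in cost.items() if used > 0
    ]
    if not candidates:
        return None
    count, name = min(candidates)
    return name, count


# Snapshot taken just before a $QIO (figures from the quota listing in the thread).
before = {"BIOCNT": 100, "BYTCNT": 535648, "ASTCNT": 299, "TQCNT": 472}
# Hypothetical snapshot taken while that $QIO is still outstanding.
after = {"BIOCNT": 99, "BYTCNT": 533600, "ASTCNT": 298, "TQCNT": 472}

cost = quota_cost(before, after)
print(cost)
print(first_to_run_out(before, cost))
```

In this invented example each outstanding I/O costs one BIO, one AST and 2048 bytes of BYTLM, so the buffered I/O limit would be hit first, after 100 outstanding requests.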
A crucible of informative mistakes
tim lloyd_1
Frequent Advisor

Re: vms problems

Thanks, everyone, for the responses, especially John for explaining quotas in more depth.

I will have a look at the quotas and adjust them. Next time I visit the site I can do deeper analysis based on the knowledge gathered here.

BTW, Richard, no Perth presence, Sydney only. And, Pascal for about 30 years!

Cheers

Tim

Re: vms problems

> PS. Do you have a presence in Perth?

Richard, we have a presence in Perth, if that is of any interest.

lester dot dreckow at denver dot com dot au