Re: Did some testing of pagefile usage under 7.3

Wim Van den Wyngaert · ‎06-06-2005

FWIW: I just found out that malloc only adjusts your quotas and doesn't use the physical memory directly. A malloc of 200 MB increased physical memory usage by only 0.5 MB.

Wim

Antoniov. · ‎06-06-2005

I guess that with prio = 20 your process became a real-time process, so it is no longer subject to round-robin scheduling.

Antonio Vigliotti

Antoniov. · ‎06-06-2005

"A process can request memory against its own pagefile quota. I have never heard that the system checks against what is still free in the pagefile(s)."

Agreed.
The 4th process tried to allocate more memory than is available, so the OS suspended it.
Wim killed 2 processes to get the system to react. I thought there was a threshold on the pagefile that reactivates MWAIT processes.

Antonio Vigliotti

Marc Van den Broeck · ‎06-06-2005

Antonio,

I don't think the process is suspended. As long as the program does not actually use the memory, VMS does not know about it (as Wim's observations prove: only 0.5 MB added).
What does happen is that the system gets into a near hang when memory is used beyond the pagefile limit.

Rgds
Marc

Uwe Zessin · ‎06-06-2005

That's right. SUSP is a voluntary wait state (well, not if it is forced from another process ;-), but the only process I am aware of that SUSPs other processes is AUDIT_SERVER.

If you have used up your pagefile quota, then the next request will fail with an error status (you do check the return value from malloc()?), but the process is not put into a wait state.
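A minimal sketch of the kind of check meant here, in plain C. The helper name `checked_alloc` is hypothetical and not from the thread; the only assumption is standard C behaviour, where malloc() returns NULL when a request cannot be satisfied.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate nbytes and report a failure instead of
 * silently handing a NULL pointer back to the caller. */
void *checked_alloc(size_t nbytes)
{
    void *p;

    assert(nbytes > 0);   /* malloc(0) may legitimately return NULL */

    p = malloc(nbytes);
    if (p == NULL) {
        /* An exhausted quota would surface here as a failed request;
         * the process keeps running and must decide what to do next. */
        fprintf(stderr, "malloc of %lu bytes failed\n",
                (unsigned long)nbytes);
    }
    return p;
}
```

The caller still has to test the returned pointer; the helper only makes sure the failure is reported at the point where it happens.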


Wim Van den Wyngaert · ‎06-06-2005

I share Uwe's experience. That's why I do "set audit/excl" for e.g. my monitoring process. A watchdog is of no use if it gets suspended when there are problems.

Wim

John Gillings · ‎06-07-2005

re: Wim, "Not a single message ..."

The system almost certainly DID write messages, but they're directed to OPA0, not to OPERATOR.LOG. Since you're running this on a workstation, OPA0 messages are usually lost. OPA0 I/Os don't cost any extra memory because the console driver is resident, but I/Os to the log file do. You don't want to exacerbate the problem by trying to report it!

Page file allocation is done exponentially: we attempt to allocate one page file cluster (PFC); if that fails, we try half, then half again, and again, until we get down to single blocks. On the next allocation attempt we start back at PFC again. So, when the page file starts to get full (and fragmented), allocations take a LOOONG time because the attempts to get large blocks have to fail on each request. (You got a problem with that? Here's a nickel, go buy yourself a few more GB of page file space!)

When the pagefile reaches a point where allocations are taking longer than "normal", the message:

PAGEFRAG, page file is badly fragmented, system continuing

is written to the console, OPA0. If the allocation attempts reach single blocks, the message:

PAGECRIT, page file space critical, system trying to continue

is written.

If you've missed the messages, you can test whether they've ever been issued by examining the system global cell EXE$GL_FLAGS in SDA (or in a crash dump). Bit 20 is set when the PAGEFRAG message is issued and bit 21 when PAGECRIT is issued. Writing some DCL code to test the bits is left as an exercise. EXE$GL_FLAGS can be read from user mode.
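The bit test itself is simple. The sketch below is in C rather than DCL and only covers the decoding: it assumes you already have the 32-bit value of EXE$GL_FLAGS (for example, read off an SDA display), and it uses the bit numbers given above. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGEFRAG_BIT 20u   /* bit number as stated in the post above */
#define PAGECRIT_BIT 21u   /* bit number as stated in the post above */

/* Return 1 if the given bit is set in flags, otherwise 0. */
int bit_is_set(uint32_t flags, unsigned bit)
{
    assert(bit < 32);
    return (int)((flags >> bit) & 1u);
}

/* Has PAGEFRAG been issued, according to the flags value? */
int pagefrag_seen(uint32_t flags)
{
    return bit_is_set(flags, PAGEFRAG_BIT);
}

/* Has PAGECRIT been issued, according to the flags value? */
int pagecrit_seen(uint32_t flags)
{
    return bit_is_set(flags, PAGECRIT_BIT);
}
```

How the value is obtained on a running system is deliberately left out; only the bit positions come from the thread.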

"Put my prio on 20 while pagefile was full. Reacted faster. "

Things get more interesting at priority 20. First, you're higher priority than the modified page writer (SWAPPER runs at 16), and because you're real time, you don't get any working set adjustment.

The bottom line here is that NO OpenVMS system should EVER see PAGEFRAG or PAGECRIT errors. It's simply bad economics to NOT give your system enough page file space that it never becomes a problem. Consider the cost of downtime, and the cost of the system manager's time - even the cost of the time of the people reading and writing this thread. You'd spend all that to save a few cents worth of disk space? This is a rare case where, even in a world run by accountants, common sense will prevail.


Marc Van den Broeck · ‎06-07-2005

Well, my messages were far from exact, but I am glad someone confirms they do occur.

Rgds
Marc

Wim Van den Wyngaert · ‎06-07-2005

John,

Thanks for the info. I never had the message on my servers because I get an alarm at 40% used. And even 40% is never reached.
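A usage alarm of this kind boils down to one comparison. The following C sketch assumes the total and free page file sizes are already known in blocks; the function name and parameters are hypothetical, and how the numbers are collected on a real system is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 when the used share of the page file has reached
 * threshold_percent, otherwise 0. Sizes are in blocks. */
int pagefile_alarm(uint32_t total_blocks, uint32_t free_blocks,
                   unsigned threshold_percent)
{
    uint64_t used;

    assert(total_blocks > 0 && free_blocks <= total_blocks);
    assert(threshold_percent <= 100);

    used = (uint64_t)total_blocks - free_blocks;

    /* Compare used/total against threshold/100 without division. */
    return used * 100u >= (uint64_t)threshold_percent * total_blocks;
}
```

For a 1000-block file, 600 free blocks trips a 40% threshold and 601 free blocks does not.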

But I wanted to know how VMS reacts in case something goes wrong in the application (already happened in 98).

Wim

Wim Van den Wyngaert · ‎06-07-2005

But I also tested on 6.2. That machine is connected to a console manager.

Not a single message in operator.log, and none in the console manager extract either (and it is connected!).

???

Wim
