
Excessive Hard Faulting

 
sartur
Advisor

Excessive Hard Faulting

Hello

My AlphaServer has excessive hard faulting, probably caused by a page cache that is too small.
Last time, I increased the secondary page cache by raising the values of MPW_HILIMIT, MPW_THRESH, and MPW_WAITLIMIT.

See the attachment.

But I still have the same problem... excessive hard faulting.

A rough guideline is to provide between 4 and 12 percent of the memory usable by processes in the page cache, the smaller figure being for large memory configurations.
How can I obtain that value, or the best value for my system (an AlphaServer 8400 with OpenVMS 7.3-2, 6 CPUs, and 12 GB RAM)?
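For reference, my own rough arithmetic for that guideline, assuming 8 KB Alpha pages and (optimistically) treating all 12 GB as process-usable:

total pages:  12 GB / 8 KB = 1,572,864 pages
4 percent:    1,572,864 x 0.04 = about 62,915 pages
12 percent:   1,572,864 x 0.12 = about 188,744 pages

Since 12 GB is a large configuration, the guideline would point to the low end, around 4 percent. As I understand it, the secondary page cache is the free-page list plus the modified-page list, so that target is spread across FREEGOAL and the MPW_* parameters rather than set by a single one.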

Thanks

16 REPLIES
Peter Quodling
Trusted Contributor

Re: Excessive Hard Faulting

Autogen?
Leave the Money on the Fridge.
sartur
Advisor

Re: Excessive Hard Faulting

Peter

AUTOGEN was the method I used.
AUTOGEN adjusted the values, and I then increased them by a further 25%.
But the problem remained.
What I want to know is how to calculate the best values for these parameters.
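For concreteness: is the right way to pin such increases via minima in SYS$SYSTEM:MODPARAMS.DAT, so that AUTOGEN keeps them across feedback runs? Something like this (the values are purely illustrative, not my real ones):

MIN_MPW_HILIMIT = 120000     ! floor that AUTOGEN must honor
MIN_MPW_THRESH = 12000
MIN_MPW_WAITLIMIT = 121000   ! should be >= MPW_HILIMIT

followed by a feedback pass:

$ @SYS$UPDATE:AUTOGEN SAVPARAMS SETPARAMS FEEDBACK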
Andy Bustamante
Honored Contributor

Re: Excessive Hard Faulting

How much free memory do you have during these hard faults? How much memory is allocated to XFC? If you don't have a memory shortfall, then hard page faults could indicate that your process working sets are too small. See

@sys$examples:working_set

for one method of monitoring this.
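Spelled out, one way to run those checks on 7.3-2 (all standard DCL):

$ SHOW MEMORY/PHYSICAL        ! free pages versus pages in use
$ SHOW MEMORY/CACHE           ! XFC size and hit statistics
$ MONITOR PAGE                ! Page Read I/O Rate is the hard-fault rate
$ @SYS$EXAMPLES:WORKING_SET   ! per-process working set usage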

Best value for your system depends on what the users or application is doing.

Andy

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Hein van den Heuvel
Honored Contributor

Re: Excessive Hard Faulting


Excessive hard page faults suggest either a severe physical memory shortage, or excessive image activation without the benefit of (installed) shared images.

If one fiddles with the MPW values, then more, or fewer, pages get flushed out and end up on the free list... from where they will soft-fault back in. Not hard.

Of course, it could also be application design. For example, if one maps a 10 GB file on a 4 GB system and then walks that file, clearly hard page faults will happen, as requested.

btw... I failed to see the attachment you mentioned. Try that again?

hth,

Hein.
Wim Van den Wyngaert
Honored Contributor

Re: Excessive Hard Faulting

I would first find which processes are doing the hard faults and check why they are doing them. E.g.
Are they starting big exe's every second? Are the exe's installed?
Can't the process stay in the exe?
Do SHOW MEM/CAC=FILE=dev:<*>* and check whether the exe's are well cached.
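For example (device spec is a placeholder):

$ MONITOR PROCESSES/TOPFAULT         ! which processes fault the most
$ INSTALL LIST                       ! which images are currently installed
$ SHOW MEM/CAC=FILE=DKA0:[*...]*.*   ! per-file cache statistics for one disk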

Wim
Willem Grooters
Honored Contributor

Re: Excessive Hard Faulting

It can of course be a matter of data size, program activation, process creation, file mapping...
But it can also be a matter of program design or coding. If you're using Java programs, you need LOTS of memory available (for each user), so that might eventually cause heavy hard paging. HP recommends Unix settings (on a VMS system!) in http://h71000.www7.hp.com/ebusiness/optimizingsdkguide/optimizingsdkguide.html
Willem Grooters
OpenVMS Developer & System Manager
comarow
Trusted Contributor

Re: Excessive Hard Faulting

It would be nice to know what kind of applications you are using.

For example, if you are using XFC and Oracle, set the Oracle database files to /nocache.

Are you sharing as many images as possible?
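For reference, on V7.3 and later that per-file no-cache marking is done with the caching attribute; the file spec below is only a placeholder:

$ SET FILE/CACHING_ATTRIBUTE=NO_CACHING DKA100:[ORACLE.DB]SYSTEM.DBF
$ DIRECTORY/FULL DKA100:[ORACLE.DB]SYSTEM.DBF   ! the "Caching attribute" line confirms the change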
sartur
Advisor

Re: Excessive Hard Faulting


comarow

We used ACMS applications and Rdb database

Thanks
Hein van den Heuvel
Honored Contributor

Re: Excessive Hard Faulting


Hi Arturo,

That is a very specific environment. I would of course welcome suggestions from readers here, but expect you will need dedicated support.

It is a fun environment, and potentially a very well-performing one, as a small number of (server) images can process many user requests. In fact, I would expect fewer pagefault issues in that environment than in 'normal' ones.

Those hard faults are (by definition) going to a file on disk. Your most critical mission is to find out which file(s) they are going to. You'll need a 'hot file' monitoring tool, an I/O trace, or something like that. A first drill-down could be MONI CLUS to spot the hot disk(s) and SHOW DEV /FILES for those disks.
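Concretely, a first pass at that drill-down (device name illustrative):

$ MONI CLUS                        ! cluster-wide CPU, memory, and disk rates
$ MONITOR DISK/ITEM=QUEUE_LENGTH   ! the busiest disks on this node
$ SHOW DEV/FILES $1$DGA100:        ! files open on a hot disk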

hope this helps a little,
Hein.

comarow
Trusted Contributor

Re: Excessive Hard Faulting

Thanks for the input.

Rdb can do row caching. Remember to set the Rdb files to /nocache if you use XFC.

In general, the obvious fix: to reduce hard faulting, add memory. Working sets grow larger, caches grow larger, and modified pages are flushed less often.

That will ensure hard faults are reduced.

Shared images will reduce hard faults. One way to identify images that should be shared is SHOW DEVICE/FILES: look for files open by multiple users. If they are not installed shared, each user gets a private copy.
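A sketch of that check-and-fix cycle (device and image name are placeholders):

$ SHOW DEVICE/FILES DKA0:             ! spot images opened by many processes
$ INSTALL ADD DKA0:[APP]SERVER.EXE /OPEN/SHARED/HEADER_RESIDENT
$ INSTALL LIST DKA0:[APP]SERVER.EXE   ! verify the Open, Hdr, Shar flags

Remember to put the INSTALL ADD in your startup procedure so it survives a reboot.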

When you do MONITOR PAGE, where are most of your faults coming from?
Lawrence Czlapinski
Trusted Contributor

Re: Excessive Hard Faulting

Arturo S.: You need to look at which processes are pagefaulting a lot and figure out why they are pagefaulting. If you have images that are used by a number of users, it may help to install the images /SHARE. Image activations cause a lot of hard page faults. As others have stated, your working set defaults and working set quotas may be too small for some processes. Attached is workset.txt, which can be renamed to workset.com on the Alpha. It will show the total page faults by process. If you can run it and attach the output, we may be able to give more specific advice.
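The attachment itself is not reproduced here, but a minimal sketch of the same idea, stepping through all processes and printing cumulative page faults (WORLD privilege is needed to see other users' processes), could look like this:

$ ctx = ""
$ WRITE SYS$OUTPUT "Process           PageFaults  WSsize"
$ loop:
$ pid = F$PID(ctx)                   ! step through all visible processes
$ IF pid .EQS. "" THEN GOTO done
$ name = F$GETJPI(pid, "PRCNAM")     ! process name
$ IF name .EQS. "" THEN GOTO loop    ! skip processes we cannot read
$ flts = F$GETJPI(pid, "PAGEFLTS")   ! cumulative page faults
$ wss = F$GETJPI(pid, "WSSIZE")      ! current working set size
$ WRITE SYS$OUTPUT F$FAO("!16AS !10UL !7UL", name, flts, wss)
$ GOTO loop
$ done:
$ EXIT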
Lawrence
sartur
Advisor

Re: Excessive Hard Faulting


Hi Lawrence

Attached is the run log output from the worksets script.

Karl Rohwedder
Honored Contributor

Re: Excessive Hard Faulting

At first glance, the WSDEF of 26000 pages seems very high.

The SQLSRV has 4 RMUEXEC71 processes started with a 128000-page WSsize. Do you really need 4 of them prestarted? (MC SQLSRV_MANAGE71 to remove them.)


regards Kalle
Lawrence Czlapinski
Trusted Contributor

Re: Excessive Hard Faulting

Arturo S:
1. Measure and save your page faulting rates before and after. Save your WORKSET.COM output. You can then check whether a user's processes have fewer page faults with the new values. This is a crude measurement: it won't mean much for a user with many image activations, but some users may use a single application continuously, or you might have application processes.
$MONITOR PAGE
I would monitor for a while and cut and paste several final screens into a word-processing or mail application, so that you have some idea of your current page faulting.
2. Whether your processes stay around or come and go makes a big difference in page faults. When processes start, you will get a lot of page faults.
3. $MONITOR PROC/TOPFAULT may or may not be helpful. It will tell you which processes are currently doing the most page faulting.
4. For processes where the page faults are high and Pages in Working Set is less than WSQUOTA, consider increasing WSDEF and WSQUOTA where possible. This increases the chances that shared global pages are already in memory. It will only help with hard page faults due to too-small working sets, and works best for applications that stay around and run the same image continuously.
For interactive processes, you would modify the UAF (authorization file). For batch processes, check whether your batch queues have limits. For detached processes, look at PQL_DWSDEF, PQL_MWSDEF, PQL_DWSQUO, and PQL_MWSQUO, and check whether WSDEF and WSQUOTA are hard-coded. (A sketch of these adjustments follows this list.)
5. If there are applications that are used by multiple users, you want them installed shared, so that it is more likely the pages are already in memory.
6. Some images get a lot of page faults that are unrelated to their working set size. You probably won't be able to do anything about them. DCL procedures often do a lot of paging because of image activations; compiled programs are often more efficient. Java applications, if you have them, often do a lot of paging. DECwindows does a lot of page faulting too.
7. Having a high WSDEF and WSQUOTA doesn't usually hurt unless you are tight on memory. Still, you will have to use your judgment as to how much and how quickly you change things. I typically look at how much memory the process is getting now as a consideration. Also consider the process's priority: if it's a high-priority process, by all means be generous; you want it to have the resources it needs and not be paging a lot. Some of your processes have quite large page counts in their working sets and could benefit from increased WSDEF and WSQUOTA.
8. It may take a while for enough processes that could benefit from the increased quotas to be running with the new quotas.
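As a sketch of the adjustments in items 4 and 7 (username and values are placeholders; the UAF working-set fields are in 512-byte pagelets, and changes take effect at the user's next login):

$ RUN SYS$SYSTEM:AUTHORIZE
UAF> MODIFY SMITH /WSDEFAULT=4096 /WSQUOTA=16384 /WSEXTENT=32768
UAF> EXIT
$ MCR SYSGEN                  ! detached-process minimums are SYSGEN parameters
SYSGEN> SHOW PQL_DWSDEFAULT
SYSGEN> SHOW PQL_DWSQUOTA
SYSGEN> EXIT

Any permanent PQL changes should go into MODPARAMS.DAT and be applied through AUTOGEN rather than set directly with SYSGEN.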
Lawrence
Ian Miller.
Honored Contributor

Re: Excessive Hard Faulting

I see you are running PSDC - Is that where you saw the report of excessive hard faulting?

For the processes that are doing the faulting: do they run many images, or just one?
____________________
Purely Personal Opinion
sartur
Advisor

Re: Excessive Hard Faulting


Ian

Yes, it is in PSDC that I saw the report of excessive hard faulting... and also in Availability Manager.