Operating System - OpenVMS

Pool expansion failed

 
Toine_1
Regular Advisor

Pool expansion failed

Hi,

I got this error on an I64 server:
SYSTEM-W-POOLEXPF, Pool expansion failed -- insufficient NPAGEVIR.

I'm running VMS 8.3-1H1 and I'm using these settings:

SYS_NVJ$ mc sysgen sho npag
Parameter Name      Current   Default    Min.       Max.  Unit       Dynamic
--------------      -------   ------- -------    -------  ----       -------
NPAGEDYN          778313728   4194304  163840 1879048192  Bytes
NPAGEVIR         1006632960  16777216  163840 1879048192  Bytes
NPAG_BAP_MIN          40960         0       0         -1  Bytes
NPAG_BAP_MAX         131072         0       0         -1  Bytes
NPAG_BAP_MIN_PA           0         0       0         -1  Mbytes
NPAG_BAP_MAX_PA  2147483647        -1       0         -1  Mbytes
NPAG_RING_SIZE         2048      2048       0         -1  Entries
NPAGECALC                 0         1       0          2  Coded-valu
NPAGERAD                  0         0       0         -1  Bytes
NPAG_INTERVAL            30        30       0         -1  Seconds    D
NPAG_GENTLE             100       100       1        100  Percent    D
NPAG_AGGRESSIVE         100       100       1        100  Percent    D

When I look at the pool I don't understand why I have this problem.

SYS_NVJ$ sho mem/pool
System Memory Resources on 3-AUG-2010 23:21:39.93

Dynamic Memory Usage:             Total      Free    In Use   Largest
Nonpaged Dynamic Memory (MB)     960.00    899.25     60.75      0.31
Bus Addressable Memory (KB)      128.00     94.87     33.12     88.00
Paged Dynamic Memory (MB)         95.36     74.35     21.01     74.25
Lock Manager Dyn Memory (MB)      93.20     28.76     64.44
SYS_NVJ$ sho mem/pool/full
System Memory Resources on 3-AUG-2010 23:21:48.44

Nonpaged Dynamic Memory (Lists + Variable)
Current Size (MB) 960.00 Current Size (Pagelets) 1966096
Initial Size (MB) 742.25 Initial Size (Pagelets) 1520144
Maximum Size (MB) 960.00 Maximum Size (Pagelets) 1966080
Free Space (MB) 899.22 Space in Use (MB) 60.78
Largest Var Block (KB) 319.68 Smallest Var Block (bytes) 64
Number of Free Blocks 100253 Free Blocks LEQU 64 bytes 1206
Free Blocks on Lookasides 98456 Lookaside Space (MB) 897.10

Bus Addressable Memory (Lists + Variable)
Current Size (KB) 128.00 Current Size (Pagelets) 256
Initial Size (KB) 128.00 Initial Size (Pagelets) 256
Free Space (KB) 94.87 Space in Use (KB) 33.12
Largest Var Block (KB) 88.00 Smallest Var Block (KB) 6.87
Number of Free Blocks 2 Free Blocks LEQU 64 bytes 0
Free Blocks on Lookasides 0 Lookaside Space (bytes) 0

Paged Dynamic Memory
Current Size (MB) 95.36 Current Size (Pagelets) 195312
Free Space (MB) 74.34 Space in Use (MB) 21.01
Largest Var Block (MB) 74.25 Smallest Var Block (bytes) 16
Number of Free Blocks 1387 Free Blocks LEQU 64 bytes 1274

Lock Manager Dynamic Memory
Current Size (MB) 93.20 Current Size (Pages) 11930
Free Space (MB) 27.90 Hits 6606796
Space in Use (MB) 65.29 Misses 692
Number of Empty Pages 3546 Expansions 11930
Number of Free Packets 94133 Packet Size (bytes) 0

CLUE MEM/STAT
Memory Management Statistics:
-----------------------------
Pagefaults:                           Non-Paged Pool:
Total Page Faults          18450334   Successful Expansions        1742
Total Page Reads            3348526   Unsuccessful Expansions       935
I/O's to read Pages         1933679   Failed Pages Accumulator      606
Modified Pages Written         2800   Total Alloc Requests       300161
I/O's to write Mod Pages         78   Failed Alloc Requests         501
Demand Zero Faults          8879499
Global Valid Faults         4156906   Paged Pool:
Modified Faults             1145626   Total Failures                  0
Read Faults                       0   Failed Pages Accumulator        0
Execute Faults                    0   Total Alloc Requests       177031
                                      Failed Alloc Requests           0

Direct I/O                 57407951   Cur Mapped Gbl Sections      1613
Buffered I/O               66606063   Max Mapped Gbl Sections      1615
Split I/O                    272566   Cur Mapped Gbl Pages       483829
Hits                       13161984   Max Mapped Gbl Pages       485702
Logical Name Transl        51815822   Maximum Processes             978
Dead Page Table Scans             0   Sched Zero Pages Created        0
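
For a rough sense of scale, the nonpaged pool counters in the CLUE MEM/STAT output can be turned into failure rates. This is only a minimal Python sketch, not a VMS tool: the numbers are copied by hand from the output above, and the variable and function names are made up for illustration.

```python
# Nonpaged pool counters copied by hand from the CLUE MEM/STAT output.
successful_expansions = 1742
unsuccessful_expansions = 935
total_alloc_requests = 300161
failed_alloc_requests = 501


def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


# Every expansion attempt either succeeded or failed.
expansion_attempts = successful_expansions + unsuccessful_expansions
expansion_failure_pct = percent(unsuccessful_expansions, expansion_attempts)
alloc_failure_pct = percent(failed_alloc_requests, total_alloc_requests)

print(f"Expansion attempts: {expansion_attempts}")
print(f"Failed expansions:  {expansion_failure_pct:.1f}%")
print(f"Failed allocations: {alloc_failure_pct:.2f}%")
```

On these figures roughly a third of the expansion attempts failed, while well under 1% of the allocation requests did.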

Two questions:
1) Which sysgen parameter should I change to avoid pool expansion failures? (Only NPAGEVIR?)
2) Is Bus Addressable Memory still needed on I64 servers, or can I keep the BAP sysgen parameters at their default values?

/Toine
John Gillings
Honored Contributor

Re: Pool expansion failed

/Toine,

>I got this error on I64 Server.
>SYSTEM-W-POOLEXPF, Pool expansion failed -- insufficient NPAGEVIR.

Where did the message appear, and under what circumstances?

If you're sure it's not a random value being incorrectly interpreted as an error, start by running:

$ @SYS$UPDATE:AUTOGEN SAVPARAMS SAVPARAMS

and inspect the report. You may need to repeat this near the time you get the pool expansion failure. This should include the highwater marks for each memory pool.

Executing

$ @SYS$UPDATE:AUTOGEN GETDATA GENPARAMS

should give a report which shows AUTOGEN's recommendations, without actually setting them. See if there are any significant increases in any of the pool parameters.

> 1) Which sysgen parameter should I change

You should only ever make changes in MODPARAMS.DAT, then let AUTOGEN make any compensatory changes to other parameters.
A crucible of informative mistakes
Hoff
Honored Contributor

Re: Pool expansion failed

Um, did I miss it, or was the quantity of physical memory in the box not posted?

$ SHOW MEMORY /PHYSICAL

The usual path for correcting this setting is via AUTOGEN, but it can be reasonable to suspect the box lacks sufficient physical memory. AUTOGEN can and should bump the settings up to a threshold percentage of your physical memory. And yes, NPAGEVIR is the primary target.

Whether BAP is needed depends on what controllers are kicking around. The doc here was murky, but I have a vague recollection there were a few drivers around that wanted it. One or two of the QLogic SCSI controllers, IIRC?

If you have a dumpfile from the crash around, dig out the following information:

$ ANALYZE/CRASH SYS$SYSTEM:SY
SDA> CLUE MEMORY/STATISTICS
SDA> SHOW POOL/NONPAGED/SUMMARY
SDA> SHOW MEMORY/POOL/FULL

This sequence gets you various details on the memory usage and some idea of what's filling non-paged pool.

Check the old AUTOGEN logs and see if there's mention of BAP in there, and whether AUTOGEN indicated a rationale for the setting.

In several of the cases I've encountered, the box just doesn't have enough physical memory configured.

And yes; in what context is the POOLEXPF showing?
Toine_1
Regular Advisor

Re: Pool expansion failed

Hi,

The RX6600 has 32 GB of RAM.
The system didn't crash; it just reported this error on the console.

$ sho mem/phys
System Memory Resources on 4-AUG-2010 00:18:26.08

Physical Memory Usage (pages): Total Free In Use Modified
Main Memory (32.00GB) 4194304 463862 3699438 31004

Of the physical pages in use, 655400 pages are permanently allocated to OpenVMS.

I will run autogen and see what is changed.

/Toine
Toine_1
Regular Advisor

Re: Pool expansion failed


I did an AUTOGEN getdata setparams feedback.

It only changed npagedyn.
NPAGEDYN was increased to the same value as npagevir.

Is this normal?
Shouldn't npagevir be higher than npagedyn?

SYSGEN> SHO NPAG
Parameter Name      Current   Default    Min.       Max.  Unit       Dynamic
--------------      -------   ------- -------    -------  ----       -------
NPAGEDYN         1006649344   4194304  163840 1879048192  Bytes
NPAGEVIR         1006649344  16777216  163840 1879048192  Bytes

/Toine
John Gillings
Honored Contributor

Re: Pool expansion failed

/Toine

> I did an AUTOGEN getdata setparams feedback.

You really need to do SAVPARAMS to store the statistics gathered for feedback. Look at the values with:

$ SEARCH SYS$SYSTEM:PARAMS.DAT PAGEDYN

Also look in SYS$SYSTEM:AGEN$PARAMS.REPORT. The report should give explanations for values chosen.

A crucible of informative mistakes
Hoff
Honored Contributor

Re: Pool expansion failed

What's in your MODPARAMS.DAT, and what shows up (of relevance) in your AUTOGEN log?

IIRC, the limit was 50% of available memory, at least via a default AUTOGEN pass. Which implies something might have overridden the calculation. Though a quick look at the AUTOGEN DCL shows this little gem:

$IF (npagevir .GT. 1024*1024*1024) THEN npagevir = 1024*1024*1024

Which is 1073741824.

Which is pretty close.

Which implies you've hit the proverbial wall here.

(I don't have source listings handy to take a look at the rationale in the comments...)

And then there's the question of what's filling the pool.

I really have to build myself a calculator for VMS nerds; this whole bytes to gigabytes to pages to pagelets stuff gets old.
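
A calculator like that could be sketched in a few lines of Python. This is a hedged illustration, not an existing tool: it assumes 512-byte pagelets and an 8 KB page size (which matches the SHOW MEMORY/PHYSICAL figure of 4194304 pages for 32 GB elsewhere in this thread), and the `describe` helper is a made-up name.

```python
PAGELET_BYTES = 512     # one pagelet is 512 bytes
PAGE_BYTES = 8192       # assumed 8 KB page size on this I64 system
MB = 1024 * 1024


def describe(nbytes):
    """Convert a byte count into MB, pagelets and pages."""
    return {
        "bytes": nbytes,
        "mb": nbytes / MB,
        "pagelets": nbytes // PAGELET_BYTES,
        "pages": nbytes // PAGE_BYTES,
    }


# NPAGEVIR from the first SYSGEN listing in the thread.
npagevir = describe(1006632960)
# Ceiling from the AUTOGEN line quoted above.
autogen_cap = describe(1024 * 1024 * 1024)

for name, d in (("NPAGEVIR", npagevir), ("AUTOGEN cap", autogen_cap)):
    print(f"{name}: {d['bytes']} bytes = {d['mb']:.2f} MB = "
          f"{d['pagelets']} pagelets = {d['pages']} pages")
```

With these inputs NPAGEVIR works out to 960 MB, or 1966080 pagelets, which agrees with the "Maximum Size" line in the SHOW MEMORY/POOL/FULL output.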
Toine_1
Regular Advisor

Re: Pool expansion failed

John and Hoff,

First I ran
$ @SYS$UPDATE:AUTOGEN SAVPARAMS SAVPARAMS

and then

$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS FEEDBACK

It is not so easy to find out which application is filling the pool.

I think Hoff is correct: some application is filling the pool.

I use TCPIP 5.6 ECO 4 and have many active BG devices.
Also, someone advised me to disable reuse of sockets (sysconfig -r socket soinp_resue=0).


Below is the requested info:

$ sea sys$system:params.dat pagedyn
PAGEDYN_INUSE = 22043552
PAGEDYN_CUR = 99999744
PAGEDYN_ALLOCFAIL = 0
PAGEDYN_ALLOCFAILPAGES = 0
PAGEDYN_REQUESTS = 182035
NPAGEDYN_CUR = 778313728
NPAGEDYN_PEAK = 1006641152
NPAGEDYN_ALLOCFAIL = 501
NPAGEDYN_ALLOCFAILPAGES = 606
NPAGEDYN_REQUESTS = 301870
DW_MOTIF$MIN_NPAGEDYN = 4194304 !SET BY DW-MOTIF
DW_MOTIF$ADD_NPAGEDYN = 300000 !SET BY DW-MOTIF
DW_MOTIF$MIN_PAGEDYN = 4194304 !SET BY DW-MOTIF
DW_MOTIF$ADD_PAGEDYN = 180000 !SET BY DW-MOTIF
DECNET_PLUS$ADD_NPAGEDYN = 3800000 !SET BY DECNET-PLUS
MIN_NPAGEDYN = 300000000
MIN_PAGEDYN = 100000000

AGEN$PARAMS.REPORT

NPAGEDYN parameter information:
Feedback information.
Old value was 778313728, New value is 1006649344
Maximum observed non-paged pool size: 1006641152 bytes.
Non-paged pool request rate: 43 requests per 10 sec.
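
To see how close these values sit to the limits mentioned in this thread, the differences can be worked out directly. The sketch below is plain Python arithmetic on numbers copied from the listings; the 8 KB page size is an assumption (consistent with 32 GB showing as 4194304 pages), and the variable names are illustrative only.

```python
PAGE_BYTES = 8192   # assumed 8 KB page size on this I64 system
MB = 1024 * 1024

npagevir_before = 1006632960        # NPAGEVIR in the first SYSGEN listing
npagedyn_peak = 1006641152          # NPAGEDYN_PEAK from PARAMS.DAT
npagedyn_new = 1006649344           # new NPAGEDYN from AGEN$PARAMS.REPORT
autogen_cap = 1024 * 1024 * 1024    # ceiling in the AUTOGEN line Hoff quoted

# How far the observed peak sits above the old NPAGEVIR setting.
peak_over_limit = npagedyn_peak - npagevir_before
# How much AUTOGEN added on top of the observed peak.
new_over_peak = npagedyn_new - npagedyn_peak
# How much room is left below the AUTOGEN ceiling.
room_below_cap = autogen_cap - npagedyn_new

print(f"Peak above old NPAGEVIR: {peak_over_limit} bytes "
      f"({peak_over_limit // PAGE_BYTES} page)")
print(f"New value above peak:    {new_over_peak} bytes")
print(f"Room below AUTOGEN cap:  {room_below_cap / MB:.2f} MB")
```

The observed peak is within one page of the old NPAGEVIR value, which suggests the pool had grown right up to its ceiling, and the new setting leaves only about 64 MB below the 1 GB cap in the AUTOGEN procedure.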

/Toine
Hoff
Honored Contributor

Re: Pool expansion failed

That was a request for the MODPARAMS.DAT file.

That file tends to be where "creative" settings are secreted.

The AUTOGEN report can be useful for figuring out where a setting or an override or a limit came from.

PARAMS.DAT is somewhat less interesting; yeah, that's what you're going to get, but not really how you got to the values there.
Toine_1
Regular Advisor

Re: Pool expansion failed

Hoff,

The modparams.dat file is in the attachment.

Toine
Shruthi K Prakashan
Occasional Visitor
Solution

Re: Pool expansion failed

Toine,

You can collect the information that Hoffman suggested on your running system itself. You don't need a dump.

$ anal/system
SDA> CLUE MEMORY/STATISTICS
SDA> SHOW POOL/NONPAGED/SUMMARY (What is the % of 'Total space utilization' here?)
SDA> SHOW MEMORY/POOL/FULL

Additionally, also do a

SDA> CLUE MEMORY/LOOKASIDE

Do you see any lookaside list of a particular size being heavily populated?

Also, a large part of the pool seems to be free (899.25 MB of 960 MB).

Dynamic Memory Usage:             Total      Free    In Use   Largest
Nonpaged Dynamic Memory (MB)     960.00    899.25     60.75      0.31

It may be useful to figure out what might have triggered pool expansion in the first place.
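
One possible reading of the SHOW MEMORY/POOL/FULL figures is sketched below in Python. The numbers are copied from the earlier output; the interpretation rests on an assumption not stated in this thread, namely that packets sitting on lookaside lists only serve requests of their own size, so they do not help a request that needs variable pool.

```python
# Figures copied from the SHOW MEMORY/POOL/FULL output earlier in the thread.
current_size_mb = 960.00
free_space_mb = 899.22
lookaside_space_mb = 897.10
largest_var_block_kb = 319.68

# Share of the whole pool that is reported as free.
free_pct = 100.0 * free_space_mb / current_size_mb
# Share of that free space held on lookaside lists.
lookaside_pct_of_free = 100.0 * lookaside_space_mb / free_space_mb
# Free space that is NOT on a lookaside list (assumed to be variable pool).
variable_free_mb = free_space_mb - lookaside_space_mb

print(f"Free pool:                {free_pct:.1f}% of total")
print(f"Free space on lookasides: {lookaside_pct_of_free:.1f}% of free")
print(f"Free outside lookasides:  {variable_free_mb:.2f} MB")
print(f"Largest variable block:   {largest_var_block_kb:.2f} KB")
```

Under that assumption, nearly all of the free space is tied up on lookaside lists and only about 2 MB is left elsewhere, which would be one way a pool that looks mostly empty could still need to expand.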

Can you also check:

SDA> SHOW POOL/NONPAGED/STATISTICS

How many packets do you see on the variable list?

This could give you an idea of the amount of pool fragmentation.

You can enable POOLCHECK using:

$ mc sysgen
SYSGEN> USE ACTIVE
SYSGEN> SET POOLCHECK %X61640000
SYSGEN> WRITE ACTIVE
SYSGEN> exit

And then, examine the content of the ring buffer periodically to track the allocations and deallocations.

$ anal/sys
SDA> SHOW POOL/RING

- Shruthi
SDIH1
Frequent Advisor

Re: Pool expansion failed

>I think Hoff is correct some application is filling the pool.

>I use TCPIP 5.6 ECO 4 and have many active BG devices.

We had similar problems where a TCP client on a remote machine kept writing to the TCP socket on an IA64 OpenVMS system running TCPIP 5.6, but on the VMS side the connection was never read.

On TCPIP 5.4 this was never a problem: once the buffer of the TCPIP socket filled up (to 512 bytes?), it stopped accepting data from the remote system.

In TCPIP 5.6 these buffers keep filling up until all of NPAGEDYN is used.

We found the devices that were filling up with:

$ pipe ucx show device /full | search sys$input "alloc"

Some devices showed this:

QLIMIT 0 Total buffer alloc 32143360 0

We found two solutions: read the device on the VMS side, or stop writing on the client side.
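
If the output of that command is captured to a file, the large allocations could be picked out with a small script. The following Python is a sketch under assumptions: the only line format known here is the single sample quoted above, the function name and the 1 MB threshold are hypothetical, and only the first number after "Total buffer alloc" is examined.

```python
import re

# Matches the first number after "Total buffer alloc" on a line.
# The line format is assumed from the single sample quoted above.
ALLOC_RE = re.compile(r"Total buffer alloc\s+(\d+)")


def large_allocations(lines, threshold_bytes):
    """Return buffer allocation values at or above threshold_bytes."""
    hits = []
    for line in lines:
        match = ALLOC_RE.search(line)
        if match and int(match.group(1)) >= threshold_bytes:
            hits.append(int(match.group(1)))
    return hits


sample = [
    "QLIMIT 0 Total buffer alloc 32143360 0",   # line quoted above
    "QLIMIT 0 Total buffer alloc 0 0",          # hypothetical idle device
]

# Hypothetical threshold of 1 MB; pick whatever is unusual for the system.
flagged = large_allocations(sample, threshold_bytes=1024 * 1024)
print(flagged)
```

Because the search output does not carry the device name on the same line, a real script would also need to track which BG device each line belongs to.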

Toine_1
Regular Advisor

Re: Pool expansion failed

Hi,

The problem was caused by the network interface. There were many packets on the lookaside list (LAL), and all of them were VCR packets. I think another server sent a lot of data to this RX6600 that couldn't be handled in time.

After rebooting the server, nonpaged pool usage returned to normal.
In OpenVMS 8.4 there will be an enhancement that limits an application to a maximum of 5000 outstanding LAN packets, to avoid pool expansion problems.

Thank you all for the help.

/Toine