Operating System - OpenVMS

Re: Different VMS V8.3 I64 crashes

 
Volker Halle
Honored Contributor

Dario,

If you look in SYSGEN, you'll see that the maximum allowed value for NPAGEDYN (and NPAGEVIR) is 1879048192. On both Alpha and Itanium, try $ MC SYSGEN SHOW NPAG

You should have a readable system dumpfile for both of your Integrity rx2620 servers. Try to have a look at the dumps:

$ ANAL/CRASH SYS$SYSTEM:
SDA> CLUE MEM/STAT

Do you see non-zero nonpaged pool expansion counters on the first page?

SDA> SHOW POOL/NONP/SUMM

Which packets consume a large amount of nonpaged pool (e.g. more than 10%)?

SDA> SHOW MEM/POOL/FULL

What values are returned for nonpaged pool in the dump?

Try to find out how much nonpaged pool your systems are using during 'normal' operations. AUTOGEN with FEEDBACK should be able to do this for you. Please note that there could also be software problems (e.g. a nonpaged pool leak) which cause your pool to be consumed quickly when a certain event happens. If this is the cause of your INSF_NONPAGED crashes, tuning won't help!

Volker.
Dario Karlen
Frequent Advisor

Result of
$ ANAL/CRASH SYS$SYSTEM:
SDA> CLUE MEM/STAT

Memory Management Statistics:
-----------------------------
Pagefaults:                                 Non-Paged Pool:
  Total Page Faults          912339467        Successful Expansions            725
  Total Page Reads           115969375        Unsuccessful Expansions         3812
  I/O's to read Pages         79915087        Failed Pages Accumulator      535703
  Modified Pages Written             0        Total Alloc Requests        11071129
  I/O's to write Mod Pages           0        Failed Alloc Requests           3537
  Demand Zero Faults         365961673
  Global Valid Faults        275523740      Paged Pool:
  Modified Faults             30812568        Total Failures                     0
  Read Faults                        0        Failed Pages Accumulator           0
  Execute Faults                     0        Total Alloc Requests       213561776
                                              Failed Alloc Requests              0

Direct I/O                  3675816600      Cur Mapped Gbl Sections           1083
Buffered I/O                2587534610      Max Mapped Gbl Sections           1089
Split I/O                         2557      Cur Mapped Gbl Pages             17018
Hits                        3526914412      Max Mapped Gbl Pages             17097
Logical Name Transl         2690695599      Maximum Processes                  210
Dead Page Table Scans                0      Sched Zero Pages Created             0


Result of
SDA> SHOW POOL/NONP/SUMM
Non-Paged Dynamic Storage Pool
------------------------------

NPOOL address: 8F018480
Pool map address: 8802C650
Number of lookaside lists: 160.
Granularity size: 64.
Ring buffer address: 8FC00000
Most recent ring buffer entry: 8FC01160

LSTHDS(s)
---------

LSTHDS Variable Lookaside
address listhead listheads
----------------- ----------------- -----------------
FFFFFFFF.8F017A28 FFFFFFFF.8F017A34 FFFFFFFF.8F017A60

Segment(s)
----------

Start End Length
-------- -------- --------
8801E000 8941DFFF 01400000
8942A000 8EC21FFF 057F8000

Non-Paged total: 06BF8000

Non-Paged Dynamic Storage Pool
------------------------------

Summary of Non-Paged Pool contents
----------------------------------

Packet type/subtype Packet count Packet bytes Percent
--------------------------- ---------------- ---------------- --------
Unknown 00001447 02E35440 (43.2%)
ADP 0000000D 00000AC0 (0.0%)
ACB 00001EE6 0007B980 (0.5%)
AQB 00000005 00000140 (0.0%)
CEB 00000003 00000180 (0.0%)
CRB 00000037 00003C40 (0.0%)
DDB 00000026 00001300 (0.0%)
FCB 00001177 001A3280 (1.5%)
FRK 00000061 000D8B40 (0.8%)
IDB 00000035 00002040 (0.0%)
IRP 00000085 00012B40 (0.1%)
PCB 000000CD 00073500 (0.4%)
RVT 0000003A 0004A040 (0.3%)
TQE 00000086 00002280 (0.0%)
UCB 0000012B 00048700 (0.3%)
VCB 00000004 00000700 (0.0%)
WCB 00000344 00030A40 (0.2%)
BUFIO 00000056 00001E80 (0.0%)
TYPAHD 00000022 00005000 (0.0%)
MVL 00000018 00004300 (0.0%)
NET 00000035 0001B180 (0.1%)
CXB 000005CE 01B16300 (25.3%)
NDB 00000001 00000280 (0.0%)
PFL 0000001E 00016B40 (0.1%)
PTR 0000000D 00003740 (0.0%)
JIB 00000041 000030C0 (0.0%)
TWP 0000000D 00002240 (0.0%)
VCA 0000037F 00CD1080 (12.0%)
CDRP 00000018 000018C0 (0.0%)
CIDG 00000012 00004DC0 (0.0%)
CIMSG 00000048 00004600 (0.0%)
ACL 00000002 00000100 (0.0%)
PMB 00000003 00000180 (0.0%)
ORB 00000165 0000F9C0 (0.1%)
FKB 00001F6D 00180C80 (1.4%)
DCB 00000001 00000080 (0.0%)
VCRP 00000366 00761B80 (6.9%)
......
Total space used: 06B14480 (112280704.) bytes out of 06BF8000 (113213440.) bytes
in 0000B23D (45629.) packets

Total space utilization: 99.2%

Result of SDA> SHOW MEM/POOL/FULL
System Memory Resources from Crashdump on 6-MAR-2008 04:55:28.61
-----------------------------------------------------------------

Nonpaged Dynamic Memory (Lists + Variable)
Current Size (MB) 107.96 Current Size (Pagelets) 221120
Initial Size (MB) 20.00 Initial Size (Pagelets) 40960
Maximum Size (MB) 108.01 Maximum Size (Pagelets) 221216
Free Space (MB) 0.88 Space in Use (MB) 107.07
Largest Var Block (By) 448.00 Smallest Var Block (By) 64.00
Number of Free Blocks 5569 Free Blocks LEQU 64 bytes 1231
Free Blocks on Lookasides 0 Lookaside Space (By) 0.00

Bus Addressable Memory (Lists + Variable)
Current Size (By) 0.00 Current Size (Pagelets) 0
Initial Size (By) 0.00 Initial Size (Pagelets) 0
Free Space (By) 0.00 Space in Use (By) 0.00
Largest Var Block (By) 0.00 Smallest Var Block (By) 0.00
Number of Free Blocks 0 Free Blocks LEQU 64 bytes 0
Free Blocks on Lookasides 0 Lookaside Space (By) 0.00
(Not all BAP data accessible)

Paged Dynamic Memory
Current Size (MB) 9.82 Current Size (Pagelets) 20112
Free Space (MB) 5.18 Space in Use (MB) 4.63
Largest Var Block (MB) 5.16 Smallest Var Block (By) 16.00
Number of Free Blocks 416 Free Blocks LEQU 64 bytes 349

Lock Manager Dynamic Memory
Current Size (MB) 7.12 Current Size (Pages) 912
Free Space (MB) 1.15 Hits 348613
Space in Use (MB) 5.96 Misses 1083
Number of Empty Pages 0 Expansions 1152
Number of Free Packets 4193

What kind of software could cause this problem? We have a lot of self-written software running on the server. How would it be possible?

Volker Halle
Honored Contributor

Dario,

Thanks for providing the requested detailed data.

There have been: Unsuccessful Expansions 3812

Nonpaged pool has expanded to its virtual maximum limit:

Current Size (MB) 107.96
Initial Size (MB) 20.00
Maximum Size (MB) 108.01

Then, when SYS$SHDRIVER asked for a pool packet and pool could not be expanded any further, the system crashed with INSF_NONPAGED.
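As a rough cross-check of how little room was left, the pagelet counts from the SHOW MEM/POOL/FULL output of the dump can be converted to bytes. This is only a sketch: the function name is made up for illustration, and the only fact assumed beyond the posted output is that an OpenVMS pagelet is 512 bytes.

```python
# Sketch: convert the pagelet counts reported by SHOW MEM/POOL/FULL to bytes.
# pagelets_to_bytes is an illustrative helper, not an OpenVMS API.
PAGELET_BYTES = 512  # one OpenVMS pagelet is 512 bytes


def pagelets_to_bytes(pagelets: int) -> int:
    """Return the size in bytes of the given number of pagelets."""
    return pagelets * PAGELET_BYTES


# Values copied from the crash dump output in this thread.
current = pagelets_to_bytes(221120)   # Current Size (Pagelets)
maximum = pagelets_to_bytes(221216)   # Maximum Size (Pagelets)
headroom = maximum - current          # what could still be added by expansion

print(f"current  = {current} bytes ({current / 2**20:.2f} MB)")
print(f"maximum  = {maximum} bytes ({maximum / 2**20:.2f} MB)")
print(f"headroom = {headroom} bytes ({headroom // 1024} KB)")
```

The current size works out to hex 06BF8000 bytes, the same figure SHOW POOL/NONP/SUMM reports as "Non-Paged total", so the two displays agree.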

The heaviest consumers of pool are:

Unknown 00001447 02E35440 (43.2%)
CXB 000005CE 01B16300 (25.3%)
VCA 0000037F 00CD1080 (12.0%)
VCRP 00000366 00761B80 (6.9%)

These packets together consume 98 out of 112 million bytes of nonpaged pool.
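The "98 out of 112 million bytes" figure can be reproduced from the hexadecimal "Packet bytes" column of the SHOW POOL/NONP/SUMM output. A minimal sketch, with the values copied from the dump above and variable names chosen only for illustration:

```python
# Sketch: add up the four largest packet types from SHOW POOL/NONP/SUMM.
# The hex byte counts are copied from the posted dump output.
top_consumers = {
    "Unknown": 0x02E35440,
    "CXB": 0x01B16300,
    "VCA": 0x00CD1080,
    "VCRP": 0x00761B80,
}
pool_used = 0x06B14480  # "Total space used" from the same display

top_total = sum(top_consumers.values())
share = top_total / pool_used

for name, size in top_consumers.items():
    print(f"{name:8s} {size:>12d} bytes  {100 * size / pool_used:5.1f}%")
print(f"top four {top_total:>12d} bytes  {100 * share:5.1f}% of {pool_used} bytes in use")
```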

These types of packets all seem to be related to some network protocol and LAN operations. Let me guess: you're using TCPIP? Do you have the most recent patches installed?

I'm assuming that your 'self-written software' is not privileged (kernel mode) code. Did something change before the first INSF_NONPAGED crashes? Were there any unusual system load conditions immediately preceding the crashes?

To find out which software, driver, etc. has allocated all those packets, one would need to analyse the contents of those pool packets.

Volker.
Hakan Zanderau ( Anders
Trusted Contributor

I want to comment on Vladimir's suggestion about setting parameters directly in SYSGEN (it was a 10p answer).

It's OK to set parameters directly in SYSGEN, but DON'T FORGET to add the changes to MODPARAMS.DAT (don't think of it as optional).

If you don't, they will be gone the next time you run AUTOGEN. AUTOGEN executes the command "USE DEFAULT" (resetting all values) and then applies the parameters in MODPARAMS.DAT.

I have been bitten by this more than once, because it's faster to use SYSGEN than AUTOGEN.

regards,

Hakan Zanderau
HA-solutions
Don't make it worse by guessing.........
Dario Karlen
Frequent Advisor

Thanks Volker for your answer.
YES, you're right, we are using TCPIP. How can I check whether the latest patches are installed? I can only see:
ina > tcpip sh ver

HP TCP/IP Services for OpenVMS Industry Standard 64 Version V5.6
on an HP rx2620 (1.60GHz/6.0MB) running OpenVMS V8.3

Do you have a hint on how to find out which software/driver allocates those packets?

@Hakan:
Thanks for your comment. I made some changes on our I64 in the office, which resulted in a crash during startup. The server is repaired now and I will test it with AUTOGEN feedback.
labadie_1
Honored Contributor

For me, tcpip sh version shows:
V5.4 - ECO 5

So it appears that you do not have any TCPIP patch installed.

Get it from ftp.itrc.hp.com and apply it.
Dario Karlen
Frequent Advisor

I installed the latest patch, ECO 2, on our test server. I will do the same on the production one.

Do you think the "insufficient nonpaged pool" problem will be solved with this action?

At the moment, SHOW MEM/POOL/FULL shows:
ina > sh mem/poo/fu
System Memory Resources on 16-APR-2008 14:32:11.96

Nonpaged Dynamic Memory (Lists + Variable)
Current Size (MB) 23.50 Current Size (Pagelets) 48128
Initial Size (MB) 20.00 Initial Size (Pagelets) 40960
Maximum Size (MB) 108.01 Maximum Size (Pagelets) 221216
Free Space (MB) 8.69 Space in Use (MB) 14.80
Largest Var Block (KB) 587.93 Smallest Var Block (bytes) 64
Number of Free Blocks 8281 Free Blocks LEQU 64 bytes 1
Free Blocks on Lookasides 8230 Lookaside Space (MB) 3.00

(Minimum Bus Addressable Memory allocated from Nonpaged Dynamic--run Autogen)

Bus Addressable Memory (Lists + Variable)
Current Size (bytes) 0.00 Current Size (Pagelets) 0
Initial Size (bytes) 0.00 Initial Size (Pagelets) 0
Free Space (bytes) 0.00 Space in Use (bytes) 0.00
Largest Var Block (bytes) 0 Smallest Var Block (bytes) 0
Number of Free Blocks 0 Free Blocks LEQU 64 bytes 0
Free Blocks on Lookasides 0 Lookaside Space (bytes) 0

Paged Dynamic Memory
Current Size (MB) 9.82 Current Size (Pagelets) 20112
Free Space (MB) 5.20 Space in Use (MB) 4.61
Largest Var Block (MB) 5.18 Smallest Var Block (bytes) 16
Number of Free Blocks 425 Free Blocks LEQU 64 bytes 356

Lock Manager Dynamic Memory
Current Size (MB) 4.62 Current Size (Pages) 592
Free Space (MB) 0.81 Hits 36459
Space in Use (MB) 3.81 Misses 533
Number of Empty Pages 0 Expansions 602
Number of Free Packets 3002 Packet Size (bytes) 0
labadie_1
Honored Contributor

>>>Do you think the "insufficient nonpaged pool" problem will be solved with this action?

No

What specific software do you start on this node?
Dario Karlen
Frequent Advisor

ina > ss
OpenVMS V8.3 on node ALESA1 16-APR-2008 15:01:58.40 Uptime 6 03:22:13
Pid Process Name State Pri I/O CPU Page flts Pages
20200401 SWAPPER HIB 16 0 0 00:01:12.74 0 0
20200407 CLUSTER_SERVER HIB 14 11 0 00:00:00.05 127 164
20200408 SHADOW_SERVER HIB 5 1120008 0 00:00:29.15 111 151
20200409 CONFIGURE HIB 9 36 0 00:00:00.01 101 105
2020040A USB$UCM_SERVER HIB 5 155 0 00:00:00.09 205 434
2020040B LANACP HIB 14 78 0 00:00:03.12 157 207
2020040D FASTPATH_SERVER HIB 10 8 0 00:00:00.00 108 134
2020040E IPCACP HIB 10 8 0 00:00:00.09 78 109
2020040F ERRFMT HIB 8 33789 0 00:00:02.56 159 196
20200410 CACHE_SERVER HIB 16 5 0 00:00:00.00 66 87
20200411 OPCOM HIB 7 6708 0 00:00:00.59 211 101
20200412 AUDIT_SERVER HIB 10 867 0 00:00:00.09 169 213
20200413 JOB_CONTROL HIB 9 1991450 0 00:00:57.24 120 186
20200417 SECURITY_SERVER HIB 10 1100562 0 00:01:06.58 449 594
20200418 ACME_SERVER HIB 10 79 0 00:00:02.95 381 522 M
20200419 QUEUE_MANAGER HIB 10 2714 0 00:00:00.74 197 271
2020041B SMISERVER HIB 9 43 0 00:00:00.05 238 272
2020041C TP_SERVER HIB 9 35471 0 00:00:05.88 426136 112
2020041D NETACP HIB 10 4858 0 00:00:00.16 132 231
2020041E EVL HIB 6 9697 0 00:00:00.65 229 206 N
2020041F REMACP HIB 8 8 0 00:00:00.00 70 72
20200426 TCPIP$INETACP HIB 10 8600 0 00:02:06.78 395 362
2028A027 RS4 LEF 4 3083 0 00:00:00.38 539 447
2020142B A_bb_Restart HIB 6 57251 0 00:00:00.64 532 332
2020142C A_bb_b_r LEF 4 27593494 0 00:06:19.21 291 304 S
20284C38 RO1 LEF 6 799639 0 00:00:44.82 580 529
2020043D INAL02 LEF 9 1298 0 00:00:00.19 3390 147
20276449 RS1 LEF 4 498477 0 00:00:24.99 593 557
20201453 A_bb_sup HIB 4 583944070 0 00:58:49.99 403 411 S
20201454 A_bb_dl01 LEF 6 102 0 00:00:00.04 219 219 S
20201455 A_bb_dl02 LEF 6 102 0 00:00:00.07 219 219 S
20201456 A_bb_dl03 LEF 6 102 0 00:00:00.04 219 219 S
20201457 A_bb_dl04 LEF 5 102 0 00:00:00.01 219 219 S
20201458 A_bb_dl05 LEF 6 102 0 00:00:00.07 219 219 S
20201459 A_bb_dl06 LEF 6 102 0 00:00:00.04 219 219 S
2020145A A_bb_dl07 LEF 6 102 0 00:00:00.02 219 219 S
2020145B A_bb_dl08 LEF 6 102 0 00:00:00.08 219 219 S
2020145C A_bb_dl09 LEF 6 102 0 00:00:00.02 219 219 S
2020145D A_bb_dl10 LEF 6 102 0 00:00:00.05 219 219 S
2020145E A_bb_dl11 LEF 6 102 0 00:00:00.04 219 219 S
2020145F A_bb_dl12 LEF 6 102 0 00:00:00.04 219 219 S
20201460 A_bb_dl13 LEF 6 102 0 00:00:00.05 219 219 S
20201461 A_bb_dl14 LEF 6 102 0 00:00:00.07 219 219 S
2028A475 RS5 LEF 4 163310 0 00:00:10.78 585 541
2028AC9D RS2 LEF 4 613259 0 00:00:32.09 620 577
2028B4DD TCPIP$FTPC1C730 LEF 8 220 0 00:00:00.06 477 427 N
2028FCDF TCPIP$FTPC1C732 LEF 9 3335 0 00:00:00.28 477 427 N
202904E0 TCPIP$FTPC1C733 LEF 8 221 0 00:00:00.05 477 427 N
202904E1 TCPIP$FTPC1C734 HIB 8 111 0 00:00:00.01 414 372 N
2026B8E2 TCPIP$FTPC1C735 HIB 9 97 0 00:00:00.01 415 373 N
202610E3 TCPIP$FTPC1C736 LEF 10 98 0 00:00:00.04 414 372 N
2028E0E4 TCPIP$FTPC1C737 LEF 8 104 0 00:00:00.00 375 333 N
2028FCE8 _TNA717: LEF 4 21567 0 00:00:01.80 547 468
2028F0FE RS3 LEF 8 186546 0 00:00:09.92 568 498
2026CD11 _TNA532: LEF 4 1729346 0 00:02:30.14 597 568
2027AD17 _TNA627: LEF 4 3422286 0 00:02:35.45 605 571
2028BD52 _TNA718: LEF 4 18135 0 00:00:01.88 547 468
2028D96A _TNA711: LEF 4 141926 0 00:00:06.29 577 519
2027AD92 _TNA628: LEF 4 790455 0 00:01:07.73 537 451
202721B5 _TNA565: LEF 4 3244064 0 00:03:18.07 643 639
202879BE _TNA681: LEF 6 538629 0 00:00:32.95 603 570
2028EDD5 INA CUR 0 4 166693 0 00:00:11.45 2191 216
2028C610 RO2 LEF 9 1464392 0 00:00:51.18 558 481
20204A11 CPU LEF 4 2265191 0 00:03:29.44 537 451
2027C657 Ina_Process_Log HIB 5 28652479 0 00:11:51.26 1031 854
2027C658 Ina_Restart HIB 6 228102 0 00:00:11.27 518 326
20279E59 Ina_Params HIB 5 652854 0 00:00:09.92 150 190 S
2027525A Ina_Nets HIB 4 85977 0 00:00:06.19 221 248 S
2027D25B Ina_Meas HIB 4 478363 0 00:00:52.06 159 199 S
20275A5C Alarms_html_log HIB 5 38 0 00:00:00.01 148 185 S
2027C65D Wago_Log HIB 6 38 0 00:00:00.02 141 178 S
2027C65E Ina_Monitor LEF 4 3386115 0 00:04:01.82 157 197 S
2027C65F WAGO01 HIB 4 2038 0 00:00:00.15 219 241 S
2027C660 WAGO02 HIB 5 353013 0 00:00:05.87 211 233 S
2027C661 WAGO03 HIB 5 631687 0 00:00:12.55 211 233 S
2027C662 WAGO04 HIB 4 352338 0 00:00:05.59 218 240 S
2027C663 WAGO05 HIB 4 2041 0 00:00:00.14 219 241 S
2027C664 WAGO06 HIB 4 2051 0 00:00:00.14 219 241 S
2027C665 ALARMS_HTML_01 HIB 5 2594835 0 00:07:42.03 178 218 S
2027C666 ALARMS_HTML_02 HIB 4 2587009 0 00:09:06.60 178 232 S
2027C667 ALARMS_HTML_03 HIB 4 2559123 0 00:11:53.76 178 218 S
2027C668 ALARMS_HTML_04 HIB 4 2550258 0 00:13:20.80 185 239 S
2027C269 ALARMS_HTML_05 HIB 4 2543430 0 00:16:39.42 178 232 S
20276E6A ALARMS_HTML_06 HIB 4 2544772 0 00:17:18.26 178 232 S
2028C286 _TNA690: LEF 5 79657 0 00:00:07.20 555 483
2028FA92 INFO LEF 4 84204 0 00:00:04.75 593 560
202612AE TCPIP$FTP_1 LEF 8 14199212 0 00:06:26.11 1536 1358 N
202826B0 _TNA663: LEF 5 272821 0 00:00:20.07 578 527
20283AB8 _TNA649: LEF 4 1640098 0 00:01:30.83 596 555

TCPIP with some FTP connections, TELNET for user logins, some WAGO TCPIP connections, some TCPIP BITBUS connections, and some processes which generate HTML files (Alarms_html).

How can I monitor the nonpaged pool? What are the critical limits?
Thanks for your help and time.
Volker Halle
Honored Contributor

Dario,

These are the three numbers to watch for nonpaged pool:

Current Size (MB) 23.50
Initial Size (MB) 20.00
Maximum Size (MB) 108.01

In this case, the initial value of NPAGEDYN was a little too small, as nonpaged pool has already been extended. Go for 25 MB if this data was obtained after typical usage of the system.

It becomes critical when the current size is about to reach the maximum size (NPAGEVIR); then you are likely to see crashes.

Volker.
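The rule of thumb above (compare current size against initial and maximum size) could be sketched as a small check like the one below. It is an illustration only: the class, the function and the 80% warning threshold are assumptions made for this example and are not taken from the thread or from any OpenVMS tool. The numbers come from the two SHOW MEM/POOL/FULL outputs posted earlier.

```python
# Sketch: classify nonpaged pool state from the three sizes shown by
# SHOW MEM/POOL/FULL. Names and the 0.8 threshold are hypothetical.
from dataclasses import dataclass


@dataclass
class PoolSizes:
    current_mb: float   # Current Size (MB)
    initial_mb: float   # Initial Size (MB), i.e. NPAGEDYN
    maximum_mb: float   # Maximum Size (MB), i.e. NPAGEVIR


def assess(pool: PoolSizes, warn_fraction: float = 0.8) -> tuple[str, float]:
    """Return a status string and the fraction of the maximum in use."""
    fraction = pool.current_mb / pool.maximum_mb
    if fraction >= warn_fraction:
        return "critical", fraction   # close to NPAGEVIR, crashes are likely
    if pool.current_mb > pool.initial_mb:
        return "expanded", fraction   # NPAGEDYN was too small, pool has grown
    return "ok", fraction


live = PoolSizes(current_mb=23.50, initial_mb=20.00, maximum_mb=108.01)
dump = PoolSizes(current_mb=107.96, initial_mb=20.00, maximum_mb=108.01)

for label, pool in (("16-APR-2008 live system", live), ("6-MAR-2008 crash dump", dump)):
    status, fraction = assess(pool)
    print(f"{label}: {status} ({fraction:.1%} of maximum)")

# NPAGEDYN is specified in bytes, so the suggested 25 MB corresponds to:
suggested_npagedyn = 25 * 1024 * 1024
print(f"NPAGEDYN for 25 MB = {suggested_npagedyn}")
```

With these inputs the live system is reported as expanded and the crash dump as critical, which matches the description above.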