
Re: CRASH ANALYSIS of VAX/VMS server

 
Volker Halle
Honored Contributor

Re: CRASH ANALYSIS of VAX/VMS server

Let's see if we can find useful information in the RPB (Restart Parameter Block):

$ ANAL/CRASH SYS$SYSTEM:
SDA> READ SYS$SYSTEM:SYSDEF
SDA> FORM @EXE$GL_RPB/TYPE=RPB

Volker.
Sk Noorul  Hassan
Regular Advisor

Re: CRASH ANALYSIS of VAX/VMS server

Volker,
Please find the RPB report
Volker Halle
Honored Contributor

Re: CRASH ANALYSIS of VAX/VMS server

The data in the RPB is inconclusive.

A short test with a program-induced HALT restart crash under OpenVMS VAX V7.3 on a CHARON-VAX 4000-108A shows that the only reliable information about the real HALT PC is on the console terminal!

CHARON $ run halt

?06 HLT INST
PC = 00000218
Restarting system software.

**** Fatal BUG CHECK, version = V7.3 HALT, Halt instruction restart
...

The data in the crash and in the CLUE file produced afterwards do not provide the actual HALT PC. They look similar to your case, but at least the HALT code in AP (and on the INT stack) is correctly stored as 00000006 (HALT instruction executed in kernel mode).

SDA> SHOW CALL
SDA> SHOW CALL/NEXT
SDA> SHOW CALL/NEXT

These commands at least allowed me to traverse the call stack. If the problem is indeed a software-induced HALT, this may also work in your case. Please provide the SHOW CALL data (repeat SHOW CALL/NEXT until the Return PC is in P0 space).
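To make the "Return PC is in P0 space" stopping condition concrete: on VAX the top two bits of a 32-bit virtual address select the region (P0, P1, S0, reserved). The following is a minimal sketch of that check. The function names and the system-space return PCs in the example are made up for illustration; only 00000218 is taken from the console output above.

```python
# VAX virtual address regions are selected by the top two address bits:
# 00 = P0 (program region), 01 = P1 (control region, process stacks),
# 10 = S0 (system space), 11 = reserved.
REGIONS = {0: "P0", 1: "P1", 2: "S0", 3: "reserved"}


def vax_region(addr: int) -> str:
    """Return the VAX address-space region for a 32-bit virtual address."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return REGIONS[addr >> 30]


def first_p0_frame(return_pcs):
    """Index of the first call frame whose return PC lies in P0 space, or None."""
    for index, pc in enumerate(return_pcs):
        if vax_region(pc) == "P0":
            return index
    return None


if __name__ == "__main__":
    # Hypothetical return PCs as they might be read off successive
    # SHOW CALL/NEXT displays; the last one is the HALT PC from the console.
    frames = [0x80001234, 0x80005678, 0x00000218]
    for pc in frames:
        print(f"{pc:08X} -> {vax_region(pc)}")
    print("stop at frame", first_p0_frame(frames))
```

In practice you simply read the Return PC from each SHOW CALL/NEXT display and stop once it starts with a hex digit between 0 and 3.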

Consider connecting a hardcopy terminal or a console-manager type application to the console terminal to capture the real HALT PC and reason code.

Volker.
Sk Noorul  Hassan
Regular Advisor

Re: CRASH ANALYSIS of VAX/VMS server

Volker, Please find the Application Error of the process which crashed the application & VMS.
Volker Halle
Honored Contributor

Re: CRASH ANALYSIS of VAX/VMS server

This error seems to correlate with the non-fatal HALT bugcheck immediately preceding the system crash.

There should be two more such application error messages, from processes DMEDIS and PPTTBF, within the last 5 minutes before the crash (they can be seen in the ANAL/ERR output). Can you find them?

This traceback data indicates that the instruction at PC=867C9EC0 tried to access a P1 space address (7FF89D70, most likely a stack address; this address is also in R9) and failed. The failing PC is in MESSAGE_ROUTINES+22C0.

Could this be some kind of stack overflow or underflow?

The crash-PC reported in the crash (on the kernel stack) is 195FEB, quite near to 195D37 on top of the traceback call stack.

Can you do the following in SDA (in the dump):

SDA> READ/EXEC
SDA> EXA/INS 867C9EC0
SDA> EXA/INS 195D37
SDA> EXA/INS 195FEB

SDA> SHOW CALL
SDA> SHOW CALL/NEXT
SDA> SHOW CALL/NEXT
SDA> SHOW CALL/NEXT

You may also want to run your application processes with process dump enabled. Put a SET PROC/DUMP in their login procedures or run them with RUN/DUMP. This would cause process dump files to be written on unhandled exceptions.

Volker.
Sk Noorul  Hassan
Regular Advisor

Re: CRASH ANALYSIS of VAX/VMS server

Hi,

Please find the SDA output you asked for.
Volker Halle
Honored Contributor

Re: CRASH ANALYSIS of VAX/VMS server

The ACCVIO happened at

EXE$PUTMSG+23: CMPZV #10,#0C,(R9),#01

with R9 pointing to an inaccessible address in P1 space (stack?), which should contain the message code to be reported.
Your application seems to have called SYS$PUTMSG (from PC=005E8ED3). Saved R3 = 0001827A = %RMS-E-EOF - this could make sense.
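For reference, a VMS condition value is a packed longword: severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27 and control bits in 28-31. The sketch below decodes the saved R3 value under that layout. It also mirrors what the CMPZV above appears to do: SDA prints operands in hex, so #10,#0C selects a 12-bit field starting at bit 16, which is the facility field, and compares it with 1 (the RMS facility). The helper names are illustrative, not part of any VMS interface.

```python
# Severity codes stored in the low three bits of a condition value.
SEVERITY = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}


def extract_field(value: int, pos: int, size: int) -> int:
    """Zero-extended bit field, like the field a VAX CMPZV/EXTZV operates on."""
    return (value >> pos) & ((1 << size) - 1)


def decode_condition(value: int) -> dict:
    """Split a 32-bit VMS condition value into its standard fields."""
    return {
        "severity": extract_field(value, 0, 3),
        "message_number": extract_field(value, 3, 13),
        "facility": extract_field(value, 16, 12),
        "control": extract_field(value, 28, 4),
    }


if __name__ == "__main__":
    saved_r3 = 0x0001827A  # value reported for saved R3 (%RMS-E-EOF)
    fields = decode_condition(saved_r3)
    print(fields, "severity letter:", SEVERITY.get(fields["severity"], "?"))
    # Same field that CMPZV #10,#0C,(R9),#01 would compare with 1:
    print("facility == 1:", extract_field(saved_r3, 0x10, 0x0C) == 1)
```

Facility 1 with severity 2 (error) is consistent with the %RMS-E-EOF reading of 0001827A.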

The ACCVIO in EXE$PUTMSG seems to have triggered a call to an application condition handler (?).

00195D37: BLBS R0,00195D4D looks like a typical, valid return address. Where does the call go?

Please try:

SDA> EXA/INS 00195D37-n;20

Start with n=10 and vary n, until you get a valid instruction stream.

SDA> exa/ins 195feb
00195FEB: XFC - looks like the entry mask, please also show:

SDA> EXA/INS 195FEB;50

Volker.
Volker Halle
Honored Contributor

Re: CRASH ANALYSIS of VAX/VMS server

Could you also please provide the stack ranges:

SDA> EXA CTL$AL_STACK;10 ! bottom of stacks
SDA> EXA CTL$AL_STACKLIM;10 ! top of stacks
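Once both arrays are available, the check is simple: each holds four longwords, one per access mode (kernel, executive, supervisor, user), and the failing address 7FF89D70 should fall between the limit and the base of one of the stacks if it is a legitimate stack address. A rough sketch follows. The base and limit values are placeholders invented for the example, not taken from the dump; substitute the real SDA output.

```python
# Access modes in the order the four longwords appear in each array.
MODES = ("kernel", "executive", "supervisor", "user")


def stack_containing(addr, bases, limits):
    """Return the access mode whose stack range [limit, base) holds addr, else None.

    VAX stacks grow toward lower addresses, so the base (CTL$AL_STACK entry)
    is the high end and the limit (CTL$AL_STACKLIM entry) is the low end.
    """
    for mode, base, limit in zip(MODES, bases, limits):
        if limit <= addr < base:
            return mode
    return None


if __name__ == "__main__":
    # PLACEHOLDER values for illustration only; replace with the longwords
    # shown by EXA CTL$AL_STACK;10 and EXA CTL$AL_STACKLIM;10.
    bases = (0x7FFE7800, 0x7FFE9C00, 0x7FFEDE00, 0x7FF8A000)
    limits = (0x7FFE6000, 0x7FFE7800, 0x7FFE9C00, 0x7FF80000)
    failing_address = 0x7FF89D70  # address that caused the ACCVIO
    print(stack_containing(failing_address, bases, limits))
```

If the function returns None for the real values, the address lies outside every stack range, which would support the overflow/underflow suspicion.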

Volker.
Sk Noorul  Hassan
Regular Advisor

Re: CRASH ANALYSIS of VAX/VMS server

Volker,
Please find attached the report you asked for.
Volker Halle
Honored Contributor

Re: CRASH ANALYSIS of VAX/VMS server

195D37 is a valid return PC from a CALLS:

SDA> exa/ins 00195d37-10;20
00195D27: BRW 00195DEF
00195D2A: PUSHL #10000002
00195D30: CALLS #01,@0019C250
00195D37: BLBS R0,00195D4D < Return PC

Where does that call go?

SDA> EXA 19C250
SDA> EXA @19C250
SDA> EXA/INS @19C250;10
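As a cross-check on the return PC: the CALLS at 00195D30 should be 7 bytes long (one opcode byte, one byte for the short-literal argument count, and a one-byte specifier plus four-byte address for the deferred target), which puts the next instruction at exactly 00195D37. The short sketch below does that arithmetic and also computes how far the crash PC 195FEB is from this return PC. The operand lengths are my reading of the addressing modes in the listing, so treat them as an assumption.

```python
CALLS_ADDR = 0x00195D30   # address of the CALLS instruction in the listing
RETURN_PC = 0x00195D37    # return PC seen on top of the traceback call stack
CRASH_PC = 0x00195FEB     # interrupted PC found on the kernel stack

# Assumed encoding lengths in bytes for: CALLS #01,@0019C250
OPCODE_LEN = 1            # CALLS opcode
ARG_COUNT_LEN = 1         # short literal #01
TARGET_LEN = 5            # mode byte + 32-bit address/displacement


def next_pc(addr: int, *operand_lengths: int) -> int:
    """Address of the instruction following one that starts at addr."""
    return addr + sum(operand_lengths)


if __name__ == "__main__":
    computed = next_pc(CALLS_ADDR, OPCODE_LEN, ARG_COUNT_LEN, TARGET_LEN)
    print(f"computed return PC: {computed:08X}")
    print(f"matches traceback:  {computed == RETURN_PC}")
    print(f"crash PC - return PC = {CRASH_PC - RETURN_PC:#x} bytes")
```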

The following is the code stream from the interrupted PC found on the kernel stack:

SDA> exa/ins 195feb;50
%SDA-W-INSKIPPED, unreasonable instruction stream - 1 bytes skipped
00195FEC: HALT
00195FED: PUSHL R0
00195FEF: PUSHL R1
00195FF1: ADDL3 #10,08(AP),R1
00195FF6: JSB @#EXE$ALONONPAGED
00195FFC: BLBC R0,00196032
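If 195FEB really is a procedure entry point, the two bytes there form a VAX entry mask: bits 0-11 name the registers (R0-R11) that CALLS saves, bit 14 enables integer overflow traps and bit 15 decimal overflow traps. Assuming the bytes are FC (what SDA decoded as XFC) followed by 00 (what it decoded as HALT), the little-endian word is 00FC. The sketch below decodes a mask under that assumption; the function name is made up for illustration.

```python
def decode_entry_mask(mask: int) -> dict:
    """Decode a 16-bit VAX procedure entry mask."""
    if not 0 <= mask <= 0xFFFF:
        raise ValueError("entry mask must fit in 16 bits")
    return {
        # Bits 0-11: registers saved on the stack by CALLS/CALLG.
        "registers": [f"R{i}" for i in range(12) if mask & (1 << i)],
        # Bit 14: integer overflow trap enable; bit 15: decimal overflow.
        "IV": bool(mask & 0x4000),
        "DV": bool(mask & 0x8000),
    }


if __name__ == "__main__":
    # ASSUMPTION: byte FC at 195FEB and byte 00 at 195FEC, read as a
    # little-endian word. Confirm with EXA 195FEB before relying on it.
    low_byte, high_byte = 0xFC, 0x00
    mask = (high_byte << 8) | low_byte
    print(f"mask = {mask:04X}", decode_entry_mask(mask))
```

Under that assumption the mask would save R2 through R7, and the first real instruction of the routine would be the PUSHL R0 at 00195FED.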

Please also provide the stack limits (see my previous entry).

Volker.