Mark_Corcoran
Frequent Advisor

F$PID() DCL process crash - fixed/guarded against in a version of OpenVMS more recent than VAX v6.2?

I have a DCL command procedure that I use to get bullet-point information about a process, and which is invoked with a short DCL symbol and a user-specified P1 parameter that is the process ID to be examined.

[It's shorter to type, and gives me pertinent information without having to pick out what I want from SHOW PROCESS (DCL or SDA)]


Because I'd made it available to others for use, originally, if P1 .EQS. "", it would output usage information then exit.

I'd grown fed up with it not defaulting to my own process ID when I don't specify a P1, so I made some changes earlier this week to alter the behaviour.

Unfortunately, a misplaced '.' resulted in a DCL warning, and failure to execute an ELSE path...

$ IF (P1 .EQS. "?") .OR. (P1 .EQS. "HELP") .OR. (P1 .EQS. "USAGE") .OR. (P1. EQS. "INFO")
$ THEN
$        !Output help text here
$ ELSE
$        P1 = F$GETJPI("", "PID")
$ ENDIF

resulted in:

$ IF (P1 .EQS. "?") .OR. (P1 .EQS. "HELP") .OR. (P1 .EQS. "USAGE") .OR. (P1. EQS. "INFO")
%DCL-W-IVOPER, unrecognized operator in expression - check spelling and syntax
\.\
$ THEN
$ ENDIF
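
For comparison, the only difference between the failing line and a working one is where the period sits in the last comparison: "P1. EQS." needs to be "P1 .EQS.". A corrected version of the block above would look something like this (the help text is still just a placeholder):

$ IF (P1 .EQS. "?") .OR. (P1 .EQS. "HELP") .OR. (P1 .EQS. "USAGE") .OR. (P1 .EQS. "INFO")
$ THEN
$        !Output help text here
$ ELSE
$        P1 = F$GETJPI("", "PID")     ! fall back to the current process
$ ENDIF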


As it was only a warning, code execution continued, until the code later on got to this:

$ IF F$TYPE(CONTEXT) .NES. "" THEN DELETE /SYMBOL CONTEXT
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", P1, "EQL")
$ PID = F$PID(CONTEXT)


When the F$CONTEXT() call was executed, the following warning was output (not unexpected; after all, P1 = ""):
%DCL-W-ARGREQ, missing argument - supply all required arguments


However, when F$PID() was called, my interactive process was terminated after a stack dump was attempted:

Improperly handled condition, bad stack or no handler specified.

  Signal arguments              Stack contents

  Number = 00000005                7FED35A0
  Name   = 0000000C                7FED3428
           00000001                7FED3428
           000001B8                7FF5148D
           7FF514B5                7FED341C
           0A800000                00000000
                                   7FF533EC
                                   FFFFFFFF
                                   03190004
                                   7FFEC939

  Register dump

  R0 = 042A3528  R1 = 000001B8  R2 = 00003528  R3 = 000001B8
  R4 = 00000001  R5 = 7FFEC96C  R6 = 7FED5720  R7 = 00000201
  R8 = 00000019  R9 = 00000000  R10= 7FFED7D4  R11= 7FFE2BDC
  AP = 7FFEC960  FP = 7FFED7D4  SP = 7FFEC919  PC = 7FF514B5
  PSL= 0A800000
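
For anyone reading the dump: the Name value in the signal arguments, 0000000C, can be translated with F$MESSAGE. As far as I know that value is SS$_ACCVIO (access violation), which would make 00000001 the reason mask and 000001B8 the faulting virtual address, followed by the same PC and PSL that appear in the register dump. A one-line check, offered as a sketch rather than a definitive decode:

$ WRITE SYS$OUTPUT F$MESSAGE(%X0000000C)   ! translate the condition value from the dump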


The problem stems from a typo in the code, and arguably failure to check the status of the F$CONTEXT call.

Clearly, the pointer value of the CONTEXT symbol (of type PROCESS_CONTEXT) was trash, and this led to F$PID() having a bad day.
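
A minimal sketch of the missing status check, assuming $STATUS reflects the %DCL-W-ARGREQ warning once the F$CONTEXT line has been processed (CTX_STATUS is a made-up symbol name, not part of the original procedure):

$ IF F$TYPE(CONTEXT) .NES. "" THEN DELETE /SYMBOL CONTEXT
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", P1, "EQL")
$ CTX_STATUS = $STATUS              ! capture it before another command overwrites it
$ IF .NOT. CTX_STATUS               ! even status = warning, error or fatal
$ THEN
$     WRITE SYS$OUTPUT "F$CONTEXT failed with status ", CTX_STATUS
$     EXIT 'CTX_STATUS'
$ ENDIF
$ PID = F$PID(CONTEXT)              ! only reached when the context was built

With P1 = "" this should exit after the warning instead of ever reaching F$PID().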

 

I was just curious as to whether or not a later version of (presumably) DCL.EXE (or on a different architecture) now guards against this particular instance of users shooting themselves in the foot?

This is OpenVMS/VAX v6.2, running under CharonVAX.

The small reproducer is as follows (but only within the context of a .COM which you @ - if you issue the commands interactively, there's no big bang):


$ IF F$TYPE(CONTEXT) .NES. "" THEN DELETE /SYMBOL CONTEXT
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", P1, EQL)
$ PID = F$PID(CONTEXT)

As you might expect, the reproducer also works if this:
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", P1, EQL)

is changed to this:
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", , EQL)

or this:
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", "", EQL)

 

I did the compulsory Google search, but the only real reference I could find to the Bad Stack was under AXP, when interrupting an image with ^Y, defining a logical name, then CONTINUE-ing execution.

I'm not looking for, or expecting, a fix (the offending code has since been corrected); I'm just curious as to whether or not the problem persists in later versions, and on AXP/Itanium.

 

Vaguely related...

In this case, the code doesn't check $STATUS after the F$CONTEXT call because the P1 should have had an acceptable value, but this was tripped up by a typo during testing and prior to release.

Does anyone religiously check the return status of every single lexical function call they make in a command procedure?

[I do tend to have /ERROR handling for OPEN, READ, WRITE, but not CLOSE;  mostly don't check the status for COPY, RENAME, DIRECTORY, SORT or DIFFERENCES, but do check on SEARCH]
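
One alternative to checking each call individually is to lower the ON threshold so that warnings are trapped as well. The default is ON ERROR THEN EXIT, which is why both warnings above sailed past. A rough sketch, with BAIL_OUT and SAVE_STATUS as made-up names:

$ ON WARNING THEN GOTO BAIL_OUT     ! trap warnings, not just errors
$ IF F$TYPE(CONTEXT) .NES. "" THEN DELETE /SYMBOL CONTEXT
$ DUMMY = F$CONTEXT("PROCESS", CONTEXT, "PRCNAM", P1, "EQL")
$ PID = F$PID(CONTEXT)
$ EXIT
$BAIL_OUT:
$ SAVE_STATUS = $STATUS             ! keep the status that triggered the handler
$ WRITE SYS$OUTPUT "Stopped on warning or worse, status ", SAVE_STATUS
$ EXIT 'SAVE_STATUS'

Bear in mind that an ON action is used up once it fires, so it has to be re-established if the procedure carries on after the handler.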

 

Mark

[Formerly appearing as woeisme]
Volker Halle
Honored Contributor


Mark,

After fixing the typo in the reproducer (it should be "EQL", in quotes, rather than a bare EQL), I could reproduce the problem on OpenVMS VAX V6.2 and V7.3.

On OpenVMS Alpha (V8.2) or OpenVMS I64 (V8.2), there is an error message and the process does not crash:

I64VMS $ @x
%SYSTEM-F-IVBUFLEN, invalid buffer length
\CONTEXT\

Volker.

Mark_Corcoran
Frequent Advisor


>after fixing the typo in the reproducer "EQL" instead of just EQL
Apologies. I had copied the original code verbatim, and had actually tried the variant reproducers (without the typo), but had typed them by hand (a "bad" habit that stems from M$ Word altering the presentation of code when pasting it in, particularly in the middle of an indented, bullet-pointed or numbered paragraph).

>I could reproduce the problem on OpenVMS VAX V6.2 and V7.3
That's reassuring (in a sense) to know - as I just mentioned in another post, there's little prospect of us upgrading to v7.3, so knowing that the problem persists in the last available version doesn't provide any more incentive to attempt an upgrade (it was down to a DCL coding issue which wouldn't ordinarily have occurred, and which was picked up in testing).

[I would like to attempt the upgrade on a test VM instance, just to see whether or not the apps would conceivably work, whether there would be RTL issues with C & Pascal, and whether the jurassic (non-DEC) compiler might still work.

We'd have access to more recent improvements (though necessity is the mother of invention, so in some cases, I've already written .COMs to provide the same functionality as lexical functions that we don't yet have), but the systems are due to be replaced, so my time is better spent on maintaining the applications, fixing what few bugs remain, and adding new features to breathe life into the old dog.]

Glad to see that the problem doesn't persist in AXP/IA64 versions (though whether that was a by-product of porting & architectural changes, or specifically fixing the bug, is another matter).

 

Mark

 

 
