Batch process going straight to error label with no error message in the logfile

Maddog1
Advisor

We are running OpenVMS 8.3-1H1 V10 on an rx2660 connected to an EVA4400 disk storage unit.

Two batch procedures have started to fail, jumping to the ON ERROR label without displaying the error that triggered the jump.

After I added a WRITE SYS$OUTPUT $STATUS after the error label, the log showed the following status.

$ WRITE SYS$OUTPUT "SY1:[ITPS.DAT]VDO1COL.WRK;1"

SY1:[ITPS.DAT]VDO1COL.WRK;1

$ IF MULTI_WORK .EQS. "" THEN GOTO DAYMULTIWRK_END

$ EOF = F$FILE_ATTRIBUTES(MULTI_WORK,"EOF")

$FATAL:

$write sys$output $STATUS

%X100184D4

This is the RMS DME (dynamic memory exhausted) error.

The only recent changes that could affect these procedures are logical names added to the process and group tables.
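
For anyone wanting to confirm the decoding, here is a minimal sketch. It relies on the standard condition-value layout, in which the top four bits are control bits and %X10000000 is the inhibit-message flag; masking those off leaves %X000184D4. The symbol names raw_status and code are made up for illustration.

```dcl
$ ! Sketch: decode the status value reported in the batch log.
$ raw_status = %X100184D4
$ ! Bits 28-31 are control bits; %X10000000 is the inhibit-message flag.
$ code = raw_status .AND. %X0FFFFFFF
$ ! Show the masked condition value in hex, then translate it to message text.
$ WRITE SYS$OUTPUT F$FAO("Condition value: %X!XL", code)
$ WRITE SYS$OUTPUT F$MESSAGE(code)
```

The inhibit-message flag being set is consistent with no message text appearing in the log file.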

 

After reading some other threads, it looks like I will need to increase the PIOPAGES and CTLPAGES system parameters via MODPARAMS.DAT. Can anyone advise how much I can safely increase them by?

Current settings below.

 

Parameter Name    Current    Default    Minimum    Maximum  Unit      Dynamic
--------------    -------    -------    -------    -------  ----      -------
PIOPAGES              975        975        410         -1  Pagelets  D
CTLPAGES             1056       1056         64         -1  Pagelets

 

Also, I don't use SET RMS_DEFAULT in any login procedures, so I am not sure whether any RMS_DEFAULT settings should also be modified.

 

The only non-zero values are

32 for the system multiblock count

8 for the system network block count.

  

 

 

 

 

3 REPLIES
Hoff
Honored Contributor
Solution

Re: Batch process going straight to error label with no error message in the logfile

With 2,097,152 pagelets per gigabyte of physical memory, and with the far larger virtual address space available with backing storage on hundreds-of-gigabytes to terabyte-scale disk storage arrays, you have some room to increase these values substantially.  Five- or ten-fold would likely have no impact on the system, unless you're right on the edge of some (other) limit.

 

But for the sake of discussion, start by doubling the values in the MODPARAMS.DAT file, run AUTOGEN (with FEEDBACK), and try again.
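
As a rough sketch of what that doubling could look like, assuming the current values of 975 and 1056 quoted in the question: the commands below append two entries to MODPARAMS.DAT and then run AUTOGEN. The MIN_ prefix is one possible choice (it lets AUTOGEN pick a higher value if feedback suggests one); plain PIOPAGES = and CTLPAGES = assignments would also work. Editing the file by hand is equally fine, and the logical name modparams is just an illustrative name.

```dcl
$ ! Sketch: append doubled minimums to MODPARAMS.DAT (requires system privileges).
$ OPEN/APPEND modparams SYS$SYSTEM:MODPARAMS.DAT
$ WRITE modparams "MIN_PIOPAGES = 1950   ! doubled from 975"
$ WRITE modparams "MIN_CTLPAGES = 2112   ! doubled from 1056"
$ CLOSE modparams
$ ! Recalculate and set parameters using feedback data; no reboot is requested here.
$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS FEEDBACK
```

CTLPAGES is not marked dynamic in the listing above, so expect its new value to take effect only at the next reboot.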

 

See HELP /MESSAGE /FACILITY=RMS DME for related details, if you've not already reviewed that.

 

Given the logical names that are referenced in this DCL, this environment could involve some ancient RSX-11 code that's been ported across, and the norms from the vintage of RSX-11 code can introduce various constraints on the environment.

 

Do enable DCL procedure verification and see exactly where the "jump" happened from the in-line code to the error handling.  That might tell you more about what was happening when this DCL face-planted.  
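
A minimal sketch of switching verification on around the suspect section only, using the F$VERIFY lexical so the previous setting is restored afterwards. The symbol names old_verify and ignore are placeholders, not anything from the original procedure.

```dcl
$ ! Sketch: turn procedure verification on and remember the previous setting.
$ old_verify = F$VERIFY(1)
$ !
$ ! ... the suspect commands go here, for example:
$ EOF = F$FILE_ATTRIBUTES(MULTI_WORK,"EOF")
$ !
$ ! Restore whatever verification setting was in effect before.
$ ignore = F$VERIFY(old_verify)
```

A plain SET VERIFY at the top of the procedure does the same job for the whole run, at the cost of a longer log file.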

 

Do go looking for channel leaks, too; that's one of the potential triggers for DME errors.
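
One low-effort way to watch for a leak, sketched here as an assumption rather than a definitive diagnostic, is to print the remaining open-file quota at a few checkpoints in the procedure. FILCNT and FILLM are standard F$GETJPI items; the checkpoint labels are arbitrary.

```dcl
$ ! Sketch: log the remaining open-file quota (FILCNT) against the limit (FILLM).
$ WRITE SYS$OUTPUT "Checkpoint A: FILCNT=", F$GETJPI("","FILCNT"), " FILLM=", F$GETJPI("","FILLM")
$ !
$ ! ... section of the procedure under suspicion ...
$ !
$ WRITE SYS$OUTPUT "Checkpoint B: FILCNT=", F$GETJPI("","FILCNT"), " FILLM=", F$GETJPI("","FILLM")
```

If FILCNT keeps dropping between checkpoints that should be balanced, something is being opened and never closed. This will not catch every limit, so treat a steady FILCNT as inconclusive.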

John Gillings
Honored Contributor

Re: Batch process going straight to error label with no error message in the logfile

Looks like it's the F$FILE_ATTRIBUTES that's breaking, so the most likely cause is running out of process-permanent file (PPF) channels - that is, too many OPENs with different file logical names. Here's a simple reproducer that generates your symptom:

 

$ c=1
$ self=F$PARSE(";",F$ENVIRONMENT("PROCEDURE"))
$ ON WARNING THEN GOTO Error
$ loop:
$   SHOW SYM c
$   OPEN/READ f'c' 'self'
$   WRITE SYS$OUTPUT "F''c' = ",F$LOGICAL("f''c'")
$   c=c+1
$   WRITE SYS$OUTPUT F$FILE("SYS$LOGIN:LOGIN.COM","EOF")
$ GOTO loop
$ Error: WRITE SYS$OUTPUT $STATUS
$ SHOW LOG/PROCESS

 

 and here's the last few lines of the run:

 

$ loop:
$   SHOW SYM c
  C = 58   Hex = 0000003A  Octal = 00000000072
$   OPEN/READ f58 USERS:[JG]BREAKDME.COM;
$   WRITE SYS$OUTPUT "F58 = ",F$LOGICAL("f58")
F58 = __DSA30
$   c=c+1
$   WRITE SYS$OUTPUT F$FILE("SYS$LOGIN:LOGIN.COM","EOF")
1
$ GOTO loop
$ loop:
$   SHOW SYM c
  C = 59   Hex = 0000003B  Octal = 00000000073
$   OPEN/READ f59 USERS:[JG]BREAKDME.COM;
$   WRITE SYS$OUTPUT "F59 = ",F$LOGICAL("f59")
F59 = __DSA30
$   c=c+1
$   WRITE SYS$OUTPUT F$FILE("SYS$LOGIN:LOGIN.COM","EOF")
1
$ GOTO loop
$ loop:
$   SHOW SYM c
  C = 60   Hex = 0000003C  Octal = 00000000074
$   OPEN/READ f60 USERS:[JG]BREAKDME.COM;
$   WRITE SYS$OUTPUT "F60 = ",F$LOGICAL("f60")
F60 = __DSA30
$   c=c+1
$   WRITE SYS$OUTPUT F$FILE("SYS$LOGIN:LOGIN.COM","EOF")
$ Error: WRITE SYS$OUTPUT $STATUS
%X000184D4
$ SHOW LOG/PROCESS

 

 

Notice that the SHOW LOG/PROCESS doesn't work. I was running this interactively, so I can execute a SHOW LOG/PROCESS after the run (since the command procedure has been closed, I've freed up a channel to which the output can be written). It shows the open PPF channels:

 

JG> show log/proc

(LNM$PROCESS_TABLE)

  "F1" = "_DSA30"
  "F10" = "_DSA30"
  "F11" = "_DSA30"
  "F12" = "_DSA30"
  "F13" = "_DSA30"
  "F14" = "_DSA30"
  "F15" = "_DSA30"
  "F16" = "_DSA30"
  "F17" = "_DSA30"
  "F18" = "_DSA30"
  "F19" = "_DSA30"
  "F2" = "_DSA30"
  "F20" = "_DSA30"
  "F21" = "_DSA30"
  "F22" = "_DSA30"
  "F23" = "_DSA30"
  "F24" = "_DSA30"
  "F25" = "_DSA30"
  "F26" = "_DSA30"
  "F27" = "_DSA30"
  "F28" = "_DSA30"
  "F29" = "_DSA30"
  "F3" = "_DSA30"
  "F30" = "_DSA30"
  "F31" = "_DSA30"
  "F32" = "_DSA30"
  "F33" = "_DSA30"
  "F34" = "_DSA30"
  "F35" = "_DSA30"
  "F36" = "_DSA30"
  "F37" = "_DSA30"
  "F38" = "_DSA30"
  "F39" = "_DSA30"
  "F4" = "_DSA30"
  "F40" = "_DSA30"
  "F41" = "_DSA30"
  "F42" = "_DSA30"
  "F43" = "_DSA30"
  "F44" = "_DSA30"
  "F45" = "_DSA30"
  "F46" = "_DSA30"
  "F47" = "_DSA30"
  "F48" = "_DSA30"
  "F49" = "_DSA30"
  "F5" = "_DSA30"
  "F50" = "_DSA30"
  "F51" = "_DSA30"
  "F52" = "_DSA30"
  "F53" = "_DSA30"
  "F54" = "_DSA30"
  "F55" = "_DSA30"
  "F56" = "_DSA30"
  "F57" = "_DSA30"
  "F58" = "_DSA30"
  "F59" = "_DSA30"
  "F6" = "_DSA30"
  "F60" = "_DSA30"
  "F7" = "_DSA30"
  "F8" = "_DSA30"
  "F9" = "_DSA30"
  "SYS$COMMAND" = "_FTA2:"
  "SYS$DISK" = "DISK_USER:"
  "SYS$ERROR" = "_FTA2:"
  "SYS$INPUT" = "_FTA2:"
  "SYS$OUTPUT" [super] = "_FTA2:"
  "SYS$OUTPUT" [exec] = "_FTA2:"
  "TT" = "_FTA2:"

 

Since you're in a batch job, you can't do that. However, you CAN jacket your procedure to get a similar effect. Instead of submitting your suspect procedure, submit one like this:

 

$ ON WARNING THEN CONTINUE
$ @<yoursuspect> "''p1'" "''p2'" "''p3'" "''p4'" "''p5'" "''p6'" "''p7'" "''p8'"
$ SHOW SYM $STATUS
$ SHOW LOGICAL/PROCESS
$ EXIT

This should write the logical names of the open files to the log file, and hopefully point you in the right direction to solve the problem. If that doesn't work, scatter some SHOW LOGICAL/PROCESS commands throughout the procedure and watch for leaks.
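
Once a leak is found, the usual fix is to pair every OPEN with a CLOSE on all exit paths. Here is a hedged sketch of that pattern; the logical name work_file, the symbol record, and the labels are invented for illustration, and MULTI_WORK is borrowed from the log in the question.

```dcl
$ ! Sketch: make sure the file is closed on both the normal and the error path.
$ ON WARNING THEN GOTO fatal
$ OPEN/READ work_file 'MULTI_WORK'
$read_loop:
$ READ/END_OF_FILE=done work_file record
$ ! ... process the record here ...
$ GOTO read_loop
$done:
$ CLOSE work_file
$ EXIT
$fatal:
$ saved_status = $STATUS
$ SET NOON                  ! keep cleanup errors from re-entering the handler
$ CLOSE/NOLOG work_file     ! /NOLOG: no message if the file was never opened
$ EXIT saved_status
```

Reusing one logical name per file, rather than generating a new name on each pass through a loop, also keeps the number of open PPFs from growing.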

 

Note that messing with quotas is unlikely to help. Although PPF files are constrained by quotas, I seem to recall there's also a hardcoded magic number (64?) independent of quotas.  You're most likely hitting the solid wall.

 

If you want to experiment with my procedure above, I recommend you execute it in a SPAWNed subprocess, since hitting a DME wall typically "poisons" a process. Recovery isn't necessarily achieved by just closing the files. It's easier to LOGOUT and start a new process.
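
For example, assuming the reproducer has been saved as BREAKDME.COM in the current default directory (the file name is taken from the run above; the log file name is arbitrary):

```dcl
$ ! Sketch: run the reproducer in a subprocess so the parent process stays healthy.
$ SPAWN/WAIT/OUTPUT=SYS$LOGIN:BREAKDME.LOG @BREAKDME.COM
$ ! The subprocess goes away when the procedure ends; review the captured output.
$ TYPE SYS$LOGIN:BREAKDME.LOG
```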

 

A crucible of informative mistakes
Maddog1
Advisor

Re: Batch process going straight to error label with no error message in the logfile

I increased PIOPAGES by 10% and the error has not recurred.

The new value has been added to MODPARAMS.DAT.