System Error

 
Riverhawk
Advisor

System Error

I received the following error message.

  Entry  Jobname         Username     Blocks  Status
  -----  -------         --------     ------  ------
   3730  MONTHLY_RESET   SYSTEM               Retained on error
         %DELETE-W-FILNOTPUR, error deleting !AS
         On available batch queue SYS$BATCH_NRCAVA
         Submitted 1-APR-2008 17:00:01.58 /KEEP /NOPRINT /PRIORITY=100
         File: _$1$DGA253:[CLUSTER_COMMON.SYSMGR]MONTHLY_RESET.COM;18
         Completed 1-APR-2008 17:00:08.35 on queue SYS$BATCH_NRCAVA

   3731  MONTHLY_RESET   SYSTEM               Pending (queue stopped)
         On stopped batch queue SYS$BATCH_NRCAVC
         Submitted 1-APR-2008 17:00:01.61 /KEEP /NOPRINT /PRIORITY=100
         File: _$1$DGA253:[CLUSTER_COMMON.SYSMGR]MONTHLY_RESET.COM;18


It looks like I could just cancel the job, but I'm not exactly sure what it's doing.
6 REPLIES
Hoff
Honored Contributor

Re: System Error

Part of a site-specific batch job failed.

Specifically, a PURGE command failed to delete a file.

You'll need to figure out whether this is a problem, or a minor corner case in some DCL that wasn't handled quite right.

Based on the /KEEP /NOPRINT in the batch entry, the batch job was created and has been preserved.

Read the log file contents, and see what happened with the PURGE.

The log is probably MONTHLY_RESET.LOG, and I'd look for this file over in _$1$DGA253:[CLUSTER_COMMON.SYSMGR] and in SYS$MANAGER:. You might have to expand the search from there.
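
A minimal sketch of that lookup, using the entry number and paths quoted in the question. The log name MONTHLY_RESET.LOG is the guess made above, not something confirmed by the queue display:

```
$! Show the full queue entry, including the retained completion status
$ SHOW ENTRY 3730 /FULL
$! List every version of the log in the two likely locations
$ DIRECTORY /DATE /SIZE _$1$DGA253:[CLUSTER_COMMON.SYSMGR]MONTHLY_RESET.LOG;*
$ DIRECTORY /DATE /SIZE SYS$MANAGER:MONTHLY_RESET.LOG;*
$! Read the newest version one screen at a time
$ TYPE /PAGE SYS$MANAGER:MONTHLY_RESET.LOG
```

Once the log has been read and understood, the retained entry can be cleared with DELETE /ENTRY=3730. That removes only the queue entry, not the log file.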
Riverhawk
Advisor

Re: System Error

After doing some browsing I came across this:
I found that a process (proc/cont/id=2060080F) has the SECURITY.AUDIT file locked for the log file SYS$SYSROOT:[SYSMGR]MONTHLY_RESET.LOG;69.

Current processes that have any SECURITY file in use:
00000000 [VMS$COMMON.SYS$LDR]SECURITY.EXE;1
00000000 [VMS$COMMON.SYSLIB]DECW$SECURITY_VMS.EXE;1
SECURITY_SERVER 20600815 [VMS$COMMON.SYSEXE]SECURITY_SERVER.EXE;1
SECURITY_SERVER 20600815 [SYS0.SYSEXE]NET$PROXY.DAT;1
00000000 [VMS$COMMON.SYSLIB]DECW$SECURITY.EXE;1
AUDIT_SERVER 2060080F [SYS0.SYSMGR]SECURITY.AUDIT$JOURNAL;133
The error is:
( 1-APR-2008 17:00:08.22)$ if FINDFILE .nes. "" then Delete CLUSTERDISK:[SYSMGR.OLDLOGS]SECURITY_NRCAVA.AUDIT$JOURNAL_OLD;*
( 1-APR-2008 17:00:08.22)$ Set audit/journal=security-
/destination=sys$manager:security.audit$journal
( 1-APR-2008 17:00:08.25)$ Set audit/server=new_log
( 1-APR-2008 17:00:08.28)$ copy sys$manger:security.audit$journal;-0 CLUSTERDISK:[SYSMGR.OLDLOGS]SECURITY_NRCAVA.AUDIT$JOURNAL_OLD
%COPY-E-OPENIN, error opening SYS$MANGER:[SYSMGR]SECURITY.AUDIT$JOURNAL;-0 as input
-RMS-F-DEV, error in device name or inappropriate device type for operation
( 1-APR-2008 17:00:08.30)$ Purge sys$manager:security.audit$journal
%PURGE-W-FILNOTPUR, error deleting SYS$SYSROOT:[SYSMGR]SECURITY.AUDIT$JOURNAL;132
-RMS-E-FLK, file currently locked by another user
%PURGE-I-FILPURG, SYS$SYSROOT:[SYSMGR]SECURITY.AUDIT$JOURNAL;131 deleted (29602 blocks)
( 1-APR-2008 17:00:08.34)$!
Art Wiens
Respected Contributor

Re: System Error

"copy sys$manger"

Your output looks like a cut and paste so I assume this is a typo in your procedure?

Art
labadie_1
Honored Contributor

Re: System Error

>>> "copy sys$manger"

You have forgotten to translate from French to English; I guess it should be
$ copy sys$eat

:-)
Jan van den Ende
Honored Contributor

Re: System Error

Riverhawk,

I suggest giving Gerard (Labadie) a generous handful of points for the laugh I got out of that!
French "manger" = English "to eat"

Seriously, SYS$MANGER can perhaps exist if you care to define it, but the default location of SECURITY.AUDIT$JOURNAL is SYS$MANAGER.
The error message refers to SYS$MANGER:[SYSMGR]SECURITY.AUDIT$JOURNAL, which is exactly what is to be expected if the current default is SYS$MANAGER (which translates to SYS$SYSROOT:[SYSMGR]).
From the syntax, the parser considers the part of a filespec before the colon to be a device or a logical name. If it is not a logical name, it must be a device. Add the directory spec from the default to that, and you get SYS$MANGER:[SYSMGR].

In summary: there is a typo in your procedure. It uses SYS$MANGER where it should use SYS$MANAGER.
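
To check this on the system itself, something like the following should do. It is a sketch only: the comments describe what I would expect to see, and the last line is the COPY from the posted log with only the logical name corrected.

```
$ SHOW LOGICAL SYS$MANAGER      ! should show a translation ending in [SYSMGR]
$ SHOW LOGICAL SYS$MANGER       ! should report that there is no translation
$! F$TRNLNM returns an empty string when the name is not defined
$ IF F$TRNLNM("SYS$MANGER") .EQS. "" THEN WRITE SYS$OUTPUT "SYS$MANGER is not a logical name"
$! The COPY line with the logical name spelled correctly
$ copy sys$manager:security.audit$journal;-0 CLUSTERDISK:[SYSMGR.OLDLOGS]SECURITY_NRCAVA.AUDIT$JOURNAL_OLD
```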

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Hoff
Honored Contributor
Solution

Re: System Error

This batch procedure has not worked "right" since the edit that left it this way. The procedure has always exited on error.

More simply, the DCL code here is broken.

DCL is reminiscent of a BASIC interpreter of an obscure and apostrophe-unbalanced language dialect that's used as a command shell, but if you're going to work on OpenVMS, you're going to need to know about it.

The following is implementing an audit log roll-over.

--
$ Set audit/server=new_log
--

Candidate passes.

--
$ copy sys$manger:security.audit$journal;-0 CLUSTERDISK:[SYSMGR.OLDLOGS]SECURITY_NRCAVA.AUDIT$JOURNAL_OLD
%COPY-E-OPENIN, error opening SYS$MANGER:[SYSMGR]SECURITY.AUDIT$JOURNAL;-0 as input
-RMS-F-DEV, error in device name or inappropriate device type for operation
--

That's a real error.

It's not the error that's triggering the batch job to be held, though. (That's the "last error" that causes that.)

--
$ Purge sys$manager:security.audit$journal
%PURGE-W-FILNOTPUR, error deleting SYS$SYSROOT:[SYSMGR]SECURITY.AUDIT$JOURNAL;132
-RMS-E-FLK, file currently locked by another user
%PURGE-I-FILPURG, SYS$SYSROOT:[SYSMGR]SECURITY.AUDIT$JOURNAL;131 deleted (29602 blocks)
--

There's the error that is getting the job retained.

That's a "normal" case when aiming a PURGE command at an active file. (There are ways to avoid generating the error or to mask it, but the DCL required is somewhat more complex than most folks want to embed into their procedures.)
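
One simple way to sidestep the locked version, offered as a sketch and not necessarily the approach alluded to above, is to have PURGE keep two versions. This assumes the locked file is always the second-highest version, as it was in the posted log (;133 active, ;132 still locked, ;131 deleted):

```
$! Keep the two highest versions: the journal now being written and the one
$! the audit server may not have released yet. Older versions are deleted.
$ Purge /keep=2 sys$manager:security.audit$journal
```

The cost is that one extra old journal stays on disk until the next monthly run.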

As for what to do to check the execution of a batch procedure (automatically), I've been known to use a SEARCH (often for the targets "-W-", "-E-" and "-F-") and then a grep or a DIFFERENCES on the connected errors from an operation. This is analogous to what is found within DECset MMS to determine if the last run looked like the reference run; some errors are normal and expected, and others are not.
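
A rough sketch of that SEARCH and DIFFERENCES check, meant to be run after the batch job has finished. The file names MONTHLY_RESET.ERRORS and MONTHLY_RESET.EXPECTED are made up for illustration, and the log location is the SYS$MANAGER one reported earlier in the thread:

```
$! Collect every warning, error and fatal message line from the latest log
$ SEARCH SYS$MANAGER:MONTHLY_RESET.LOG "-W-","-E-","-F-" -
        /OUTPUT=SYS$SCRATCH:MONTHLY_RESET.ERRORS
$! Compare against a hand-checked list of the messages that are expected
$ DIFFERENCES SYS$SCRATCH:MONTHLY_RESET.ERRORS SYS$MANAGER:MONTHLY_RESET.EXPECTED
```

Messages that embed a file version number, such as the ;132 in the PURGE warning, will differ from run to run, so a reference file built this way needs some care before the comparison is useful.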

VMS error handling is comparatively primitive, but it's functional and fairly easy to deal with using some simple site-specific add-on tools.

And do take a look at how OpenVMS generates its signal arrays (it's quite elegant -- look for the % and then the - on subsequent messages from the signal) and at the OpenVMS User's Guide and (as you get going) the DCL Dictionary. Knowing this will help you code in DCL.
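
As a small illustration of that message layout, here is the PURGE message pair from the log taken apart in comments, followed by a sketch of capturing the status in DCL. The symbol names status and severity are arbitrary:

```
$! %PURGE-W-FILNOTPUR, error deleting ...     primary message
$!    facility = PURGE, severity = W (warning), ident = FILNOTPUR
$! -RMS-E-FLK, file currently locked ...      secondary message: the reason
$!
$ SET NOON                   ! do not abort the procedure on an error
$ Purge sys$manager:security.audit$journal
$ status = $STATUS           ! capture before another command overwrites it
$ severity = $SEVERITY       ! 0=warning 1=success 2=error 3=info 4=fatal
$ SET ON
$ WRITE SYS$OUTPUT F$MESSAGE(status)
$ WRITE SYS$OUTPUT "Severity code: ", severity
```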

Another, subtler case that's latent here is the ;32767 version limit. With a monthly roll-over, it'll take a very long time to get there, but on files that roll, you can and usually do encounter problems when the version reaches ;32767, as there can be no higher version. (There's a two or three line command sequence in the OpenVMS FAQ, http://www.hoffmanlabs.com/vmsfaq, to rename a whole stack of versions from a high range into the same sequence of versions with lower version numbers.

The Freeware DFU tool can spot big file version numbers, as can DCL itself starting around OpenVMS V8.2 and its DIRECTORY /SELECT mechanism.)
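
For the DCL route, a sketch along these lines should work on V8.2 or later. The 32000 threshold and the directory searched are arbitrary choices for illustration:

```
$! List files whose version number is getting close to the ;32767 ceiling
$ DIRECTORY /SELECT=VERSION=MINIMUM=32000 SYS$MANAGER:*.*;*
```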

Do spend some time with DCL and with the DCL documentation. If you're working with OpenVMS at all regularly, this will be time well spent.

Oh, and need a patch for this DCL? If you don't want to wade into the DCL code, but you do want the PURGE error to go away, simply zonk the error with a successful EXIT added immediately after the PURGE -- this assumes the PURGE is the last command in the procedure. This will suppress the "last" error that would otherwise be picked up by the queue manager:

$ EXIT 1

Stephen Hoffman
HoffmanLabs LLC