
CMS problem: %CMS-F-BUG

 
John McL
Trusted Contributor

CMS problem: %CMS-F-BUG

Usually we have no problems with our CMS system, but under one account, when we try to run $ CMS SHOW HISTORY, we get these errors from CMS:

%CMS-F-BUG, there is something wrong with CMS or something it calls
-CMS-F-NOQIO, $QIO failed
-RMS-F-WER, file write error
-SYSTEM-F-BADPARAM, bad parameter value

Some investigation showed:
(a) the problem also exists for CMS SHOW ELEMENT and other commands
(b) the UIC group for this account differs from that of the library owner, but all CMS files have protection (,,,RE)

Any thoughts would be appreciated, solutions even more so.
Volker Halle
Honored Contributor

Re: CMS problem: %CMS-F-BUG

John,

by just reading the error message you've posted:

RMS-F-WER indicates WRITE access failed

Protection: RE indicates only READ access is granted

What if that account were temporarily granted SYSPRV or BYPASS? Would the CMS SHOW still fail?

Volker.
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

It made no difference, Volker. The same error message appeared. (I know BYPASS can enable some special things in CMS, but apparently it won't solve this problem.)

(Sorry for delayed response. I was away from work on Friday.)
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

I should add that in a Batch job everything works fine, but failed in a detached job (a web server), reportedly even when the owner account of the detached job was given BYPASS and SYSPRV as default privileges.

(I'm relaying this for someone without direct access to this forum.)
John Gillings
Honored Contributor

Re: CMS problem: %CMS-F-BUG

John,

>I should add that in a Batch job everything
>works fine, but failed in a detached job

Interesting... I'd check for logical names or other actions in LOGIN.COM that may be missing in the detached job.

I'd also check where the quotas are set - explicit PQL or from SYSGEN? Perhaps BYTLM is too low?
A crucible of informative mistakes
Hein van den Heuvel
Honored Contributor

Re: CMS problem: %CMS-F-BUG

>> I should add that in a Batch job everything works fine, but failed in a detached job (a web server), reportedly even when the owner account of the detached job was given BYPASS and SYSPRV as default privileges.

John, ask them if it ever worked. I doubt it.

I suspect CMS might be using SYS$SCRATCH or SYS$LOGIN, neither of which is set up for a detached job.
So either make the job start LOGINOUT and then chain to the real task, or define a suitable SYS$LOGIN and/or SYS$SCRATCH in a suitable logical name table.

For now, just to confirm the suspicion, define both in LNM$SYSTEM.
Any 'normal' use will resolve above that.

hth,
Hein

John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

The failing process, APACHE$WW_nnnn, is running as a subprocess to process APACHE$SWS0000, which I think is detached.

APACHE$WW_nnnn has a SCRIP$LOGIN and SCRIP$SCRATCH defined.

The right hand column of SHOW QUOTA says
Direct I/O limit: 300
Buffered I/O limit: 300
Open file quota: 279
Subprocess quota: 19
AST quota: 609
Shared file limit: 0
Max active jobs: 0


John
Hein van den Heuvel
Honored Contributor

Re: CMS problem: %CMS-F-BUG

>>> APACHE$WW_nnnn has a SCRIP$LOGIN and SCRIP$SCRATCH defined.

That's just great, but why would CMS care? It would only care about CMS$mumbles and SYS$mumbles... if it cares at all.

How about running the success case with SET WATCH FILE/CLASS=MAJOR enabled and explaining each file access you see, to work out whether the same file or directory could be accessed from the other context?

fwiw,
Hein
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

Sorry Hein, my mistake. I was distracted. I should have written SYS$SCRATCH and SYS$LOGIN.
Brad McCusker
Respected Contributor

Re: CMS problem: %CMS-F-BUG

Your base note said "under one account" you get errors. That of course implies that under other accounts you do not get errors.

For the two accounts in question (one that gets errors, and one that does not get errors) is everything else the same? You are using the same application (apache), you are performing the same tasks, etc? I suspect the answer is no, but please clarify.

The reason for my question is to try to better understand the situation you are in.

I'd also like to see the answer to someone's previous statement questioning if it ever worked as you expect it should.




Brad McCusker
Software Concepts International
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

Brad, the answer is of course no, or there would be no problem.

I have a list of the differences that I'll have to work through, but that will take some time now that the person with this problem will be away for the rest of this week.

Can we work back from the other end and maybe get some clues as to what we should look for, especially given that SYS$SCRATCH and SYS$LOGIN look okay? What exactly does the error message mean and why is it returning a non-specific "BADPARAM" message? I'm surprised that CMS doesn't check parameters or doesn't catch the error and produce something more informative.
John Gillings
Honored Contributor

Re: CMS problem: %CMS-F-BUG

John,

> I'm surprised that CMS doesn't check
>parameters or doesn't catch the error and
>produce something more informative.

Can't check everything! Think about what's happening here. (note that I know very little about CMS, so I may make some invalid assumptions...)

CMS SHOW HISTORY is presumably pulling some data out of one or more files, formatting and displaying the output. I'd assume all the accesses to CMS files are read only, so that really just leaves the displaying output part as a suspect for WER and BADPARAM.

What are the devices SYS$OUTPUT, SYS$ERROR, SYS$COMMAND and SYS$INPUT for your apache process? Is CMS assuming the output is a terminal device perhaps? Maybe it's using some terminal driver function code which isn't working?

If the CMS command is in a command procedure, a quick check might be something like:

$ CMS SHOW HISTORY/OUTPUT=tmpfile
$ TYPE tmpfile

or even:

$ PIPE CMS SHOW HISTORY | TYPE SYS$PIPE

Maybe there are qualifiers, or logical names which "dumb down" the CMS output (like DFU$NOSMG)?

You could also try SET WATCH FILE/CLASS=MAJOR for clues.
John Gillings
Honored Contributor

Re: CMS problem: %CMS-F-BUG

John,

>What exactly does the error message mean
>and why is it returning a non-
>specific "BADPARAM" message?

Just expanding on this...

$ HELP/MESSAGE BADPARAM
...

BADPARAM, bad parameter value

Facility: SYSTEM, System Services

Explanation: A value specified for a system function is not valid. Several conditions can cause this error:
...(bunch of possibilities, none of which look like good candidates for your case)...

$ HELP/MESS WER
...
WER, file write error

Facility: RMS, OpenVMS Record Management Services

Explanation: An error occurred during an RMS file system write operation.

User Action: The status value (STV) field of the RAB contains a system status code that provides more information about the condition. Take corrective action based on this status code.

So the sequence of events is...

A system service, probably $QIO, found something wrong and returned BADPARAM to RMS, which put that in the STV and returned WER to CMS. The CMS output layer built a signal array with the RMS and system service conditions, then added NOQIO and signalled it. Since there weren't any condition handlers which recognised the condition, the CMS last chance handler caught it, added CMS$_BUG and resignalled to VMS.

It's non-specific because it was detected inside $QIO. You're in inner mode, possibly at high IPL, when the condition is detected. You don't have time or cycles to be more specific; it's just a case of "let's get out of here safely". Maybe $QIO and/or the lower level device drivers could be changed to give better, more specific messages, but realise that they're not signalling the condition (if they did, they'd crash the system), so they've only got the return status and the IOSB to communicate with (rather than a signal array with space for parameters).

That means you would need to define specific condition codes for each possible error condition. Remember there are lots of device drivers with different uses for different parameters. It's a combinatorial explosion of things that can go wrong!

Generic conditions like BADPARAM, NOPRIV, EXQUOTA and others are a pain in the proverbial, but the reality is, it's not always possible to do much better.
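
To make the layering described above concrete, here is a minimal Python sketch that splits the reported message chain into facility, severity and identifier. It is not anything CMS or VMS provides: the name `parse_chain` and the regular expression are my own assumptions, and the only facts relied on are the `%FACILITY-S-IDENT, text` layout and the four lines quoted in the base note.

```python
import re

# Severity letters used in OpenVMS message codes.
SEVERITY = {
    "S": "success",
    "I": "informational",
    "W": "warning",
    "E": "error",
    "F": "fatal",
}

# The first message in a chain starts with "%", each linked message with "-".
MESSAGE_RE = re.compile(
    r"^[%-](?P<facility>[A-Z0-9$_]+)-(?P<sev>[SIWEF])-(?P<ident>[A-Z0-9$_]+),\s*(?P<text>.*)$"
)


def parse_chain(lines):
    """Return one dict per message line, outermost condition first."""
    chain = []
    for line in lines:
        match = MESSAGE_RE.match(line.strip())
        if match:
            chain.append(
                {
                    "facility": match["facility"],
                    "severity": SEVERITY[match["sev"]],
                    "ident": match["ident"],
                    "text": match["text"],
                }
            )
    return chain


# The four lines exactly as quoted in the base note.
REPORTED = """\
%CMS-F-BUG, there is something wrong with CMS or something it calls
-CMS-F-NOQIO, $QIO failed
-RMS-F-WER, file write error
-SYSTEM-F-BADPARAM, bad parameter value
"""

chain = parse_chain(REPORTED.splitlines())
for depth, msg in enumerate(chain):
    # Indent each level to show which layer added which condition.
    print("  " * depth + f"{msg['facility']}/{msg['ident']} ({msg['severity']}): {msg['text']}")

innermost = chain[-1]
print("Innermost condition:", innermost["facility"], innermost["ident"])
```

The last entry is the condition raised furthest down (here SYSTEM/BADPARAM), and each entry above it was added by a layer closer to the user, which is why the first line is the least specific.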
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

Rather than provide log files, which would have been useful, the person with the problem emailed me an HTML file of differences.

From that I see the following logicals for the failing job:
"SYS$COMMAND" [super] = "_BG53295"
"SYS$COMMAND" [exec] = "_NLA0:"
"SYS$DISK" [super] = "apache$root:"
"SYS$DISK" [exec] = "apache$root:"
"SYS$ERROR" [super] = "_BG53300"
"SYS$ERROR" [exec] = "_BG53297:"
"SYS$INPUT" [exec] = "_NLA0:"
"SYS$OUTPUT" [super] = "_BG53297:"
"SYS$OUTPUT" [exec] = "_BG53297:"
"SYS$SCRATCH" = "APACHE$ROOT:[000000]"
"TT" = "_NL:"

Where null devices (NL:, NLA0:) appear here, the process used for comparison, which looks to be interactive, has real devices (e.g. a terminal).
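
For anyone wanting to compare such listings mechanically, the following is a rough Python sketch that tags each logical in the listing above by the kind of device it points to. The helper names (`classify`, `parse_listing`) and the three categories are my own assumptions for illustration; the listing text is copied from this post, and the only device knowledge used is that NL/NLA0 is the null device and that BG units are network devices.

```python
import re

# Listing copied verbatim from the failing job's logical names.
LISTING = '''\
"SYS$COMMAND" [super] = "_BG53295"
"SYS$COMMAND" [exec] = "_NLA0:"
"SYS$DISK" [super] = "apache$root:"
"SYS$DISK" [exec] = "apache$root:"
"SYS$ERROR" [super] = "_BG53300"
"SYS$ERROR" [exec] = "_BG53297:"
"SYS$INPUT" [exec] = "_NLA0:"
"SYS$OUTPUT" [super] = "_BG53297:"
"SYS$OUTPUT" [exec] = "_BG53297:"
"SYS$SCRATCH" = "APACHE$ROOT:[000000]"
"TT" = "_NL:"
'''

# "NAME" [mode] = "EQUIVALENCE"; the [mode] part is optional.
ENTRY_RE = re.compile(
    r'^"(?P<name>[^"]+)"\s*(?:\[(?P<mode>\w+)\])?\s*=\s*"(?P<equiv>[^"]+)"$'
)


def classify(equiv):
    """Hypothetical helper: label an equivalence name by its device prefix."""
    device = equiv.strip().lstrip("_").rstrip(":").upper()
    if re.fullmatch(r"NL(A\d+)?", device):
        return "null device"
    if re.fullmatch(r"BG\d+", device):
        return "network (BG) device"
    return "other"


def parse_listing(text):
    """Return (name, mode, equivalence, category) for each recognised line."""
    rows = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            rows.append(
                (match["name"], match["mode"] or "", match["equiv"], classify(match["equiv"]))
            )
    return rows


rows = parse_listing(LISTING)
for name, mode, equiv, category in rows:
    print(f"{name:<12} {mode:<6} {equiv:<22} {category}")
```

In this listing it flags SYS$OUTPUT and SYS$ERROR as network devices and SYS$INPUT and TT as null devices, which are the entries that differ from a normal interactive process.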

David B Sneddon
Honored Contributor

Re: CMS problem: %CMS-F-BUG

Have you recently upgraded CMS?
A while ago I upgraded DECset to the latest and
it broke the callback mechanism.
I reinstalled the previous version of CMS.
I seem to recall the error was also a BADPARAM error.
On investigating it, it seems some parameters were pushed on the stack in the wrong order.
Don't know why it would have changed but there you go.
I'll see if I can track down my notes on it.

Dave
John Gillings
Honored Contributor

Re: CMS problem: %CMS-F-BUG

John,

Note that your SYS$OUTPUT points directly to a network device. Perhaps CMS is writing to it as if it were a terminal? That might account for a BADPARAM. Why that might happen for one user and not others, I don't know.

See if dumping the output to a temp file and TYPEing the file makes a difference.

Is everything happy with 5 digit unit numbers?
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

The person has returned to work and I've now done some further investigation.

As John G suggested, it looks like CMS doesn't like writing to a device with the characteristics listed below (as shown by SHOW DEV/FULL SYS$OUTPUT):

Device BG11125:, device type unknown,
is online, mounted,
record-oriented device,
carriage control,
network device, mailbox device.

and "Default buffer size 32767"

I wonder what the bad param is on the QIO call ... an unknown device type, buffer size??

The workaround is to direct the CMS output to a file and then just TYPE the file (to copy it to SYS$OUTPUT).
John McL
Trusted Contributor

Re: CMS problem: %CMS-F-BUG

See posting above