Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

 
Jeremy Begg
Trusted Contributor

Hi,

A customer site has two clustered AlphaServer DS25s running VMS 8.2. On the weekend they had a power outage and when power was restored two batch queues didn't start.

The queues in question are configured with /AUTOSTART_ON=(NODE1::,NODE2::) so that (hopefully) they would fail over from one machine to the other in the event the "current" machine went down for some reason.

The system startup procedure does this:

$ INIT/QUEUE/BATCH/AUTOSTART_ON=(NODE1::,NODE2::) -
PROD$DETACH /BASE=3 /JOB=10
$ INIT/QUEUE/BATCH/AUTOSTART_ON=(NODE1::,NODE2::) -
TEST$DETACH /BASE=2 /JOB=20
$ ENABLE AUTOSTART/QUEUES

Is that incorrect (i.e. is that the wrong way to set up an autostart queue at system boot)? The OpenVMS "Cluster Systems" manual doesn't make it entirely clear (in my opinion).

Or was it a result of the sudden failure of both systems, e.g. because the queue manager got confused when they rebooted?

Thanks,
Jeremy Begg
Volker Halle
Honored Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Jeremy,

Did the customer capture the output of the startup procedure or of the INIT/QUEUE commands? Were there any error messages?

Did the queues get re-created? If so, you would be missing the START/QUEUE command needed to actually start them.

ENABLE AUTOSTART/QUEUES only starts AUTOSTART_ON queues that have previously been started. If such a queue had been stopped manually, the command would not have started it.
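As a rough sketch of that distinction, using the queue name from the original post (exact behaviour should be checked against the DCL Dictionary for the VMS version in use): an autostart queue is first activated, and ENABLE AUTOSTART/QUEUES then lets activated queues start on the node where it is issued.

```dcl
$! Create the queue and activate it for autostart in one step.
$ INITIALIZE/QUEUE/BATCH/START/AUTOSTART_ON=(NODE1::,NODE2::) PROD$DETACH
$!
$! Alternatively, activate a queue that already exists.
$ START/QUEUE PROD$DETACH
$!
$! On each node, allow activated autostart queues to start there.
$ ENABLE AUTOSTART/QUEUES
```

Only one of the two activation commands is needed; both are shown here as alternatives.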

Volker.
Jeremy Begg
Trusted Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Hi Volker,

Logs from both machines show the message
%JBC-I-QUENOTMOD, modifications not made to running queue
but don't indicate which queues the messages refer to. These messages appear before the ENABLE AUTOSTART/QUEUE command. (There are multiple batch queues being started besides the two autostart queues. I had better find out why the startup procedure tries to start the "other" node's batch queues.)

What do you mean by "did the queues get re-created"? The startup procedure does an INIT/QUEUE/BATCH command (see my original query, above) but the queues would have existed prior to the INIT/QUEUE commands. The queues would have been running before power was removed from the systems.

(Note that the systems were *not* shut down; someone just turned off the power, and the systems rebooted by themselves when power was restored.)

Thanks,
Jeremy Begg
Volker Halle
Honored Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Jeremy,

%JBC-I-QUENOTMOD can only refer to queues already running, so this message couldn't have been related to those 2 autostart batch queues.

By 're-created' I meant: if the queues had not existed at startup, the commands shown would not have started them. If there were existing jobs in those queues, that would prove they had existed before.

Did you check OPERATOR.LOG for any QMAN-related error messages?

Does SHOW QUE/MANA/FULL show the same queue manager database location on every node in the cluster?

Volker.
Jeremy Begg
Trusted Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Hi Volker,

The queues existed during startup and had jobs retained in them. (I know this because the jobs currently in the PROD$DETACH queue were submitted before the reboot, and the queue is not the target of a generic queue.)

No suspicious messages in OPERATOR.LOG on either node.

Queue manager database is same on both nodes:

%SYSMAN-I-OUTPUT, command execution on node NODE1
Master file: DISK2:[SYSTEM.FILES]QMAN$MASTER.DAT;
Queue manager SYS$QUEUE_MANAGER, running, on NODE1::
/ON=(*)
Database location: DISK2:[SYSTEM.FILES]
%SYSMAN-I-OUTPUT, command execution on node NODE2
Master file: DISK2:[SYSTEM.FILES]QMAN$MASTER.DAT;
Queue manager SYS$QUEUE_MANAGER, running, on NODE1::
/ON=(*)
Database location: DISK2:[SYSTEM.FILES]
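
For reference, output of this form can be collected from every cluster member with SYSMAN. A minimal sketch, assuming it is run from a suitably privileged account and written as it would appear in a command procedure:

```dcl
$! Run the same queue manager query on all nodes in the cluster.
$ RUN SYS$SYSTEM:SYSMAN
SET ENVIRONMENT/CLUSTER
DO SHOW QUEUE/MANAGER/FULL
EXIT
```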

Thanks,
Jeremy Begg
Volker Halle
Honored Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Jeremy,

Did the ENABLE AUTOSTART/QUEUES command actually get executed during startup? Are these the only autostart queues on the system?

What did you do to start the queues again?

Sorry for those questions, but in general I assume that OpenVMS is working and I'm trying to find the error elsewhere.

Volker.
Jeremy Begg
Trusted Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Hi Volker,

The ENABLE AUTOSTART/QUEUES command was run on both nodes during startup. There are a number of autostart printer queues which appear to have started OK; only the two autostart batch queues didn't start.

To get them going I used $ START/QUEUE.

Regards,
Jeremy Begg
Volker Halle
Honored Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Jeremy,

What was the state of the queues before you started them manually?

You could add the /START qualifier to the INIT command, but that would start the autostart queues even if they had previously been stopped manually by a system manager.

Do you have a test system where you can try the various permutations?

Volker.
Jeremy Begg
Trusted Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

They were "stopped, autostart inactive" before I started them.

I don't currently have a test cluster to reboot at will. (Perhaps I should consider building an emulated one.)

Thanks
Jeremy
Volker Halle
Honored Contributor

Re: Correct combination of INIT/QUEUE/BATCH and ENABLE AUTOSTART/QUEUE

Jeremy,

'stopped, autostart inactive' means that the queue had DEFINITELY been stopped somehow. 'autostart inactive' means that the queue would NOT be started by ENA AUTO/QUEUES.

This is the expected state if you have issued your INIT/QUEUE... command and the queue had just been created, or if you had previously stopped the queue manually.

As a workaround, consider adding a START/QUEUE command to your startup procedure.
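
One possible shape for such a workaround is sketched below. It assumes the F$GETQUI item QUEUE_AUTOSTART_INACTIVE is available on the VMS version in use; the symbol QUEUE_NAME is only illustrative. Note that an unconditional start at boot also restarts a queue that a system manager had stopped on purpose.

```dcl
$! Hypothetical startup fragment: activate the queue only when it is
$! reported as an autostart queue that is currently inactive.
$ QUEUE_NAME = "PROD$DETACH"
$ IF F$GETQUI("DISPLAY_QUEUE","QUEUE_AUTOSTART_INACTIVE",QUEUE_NAME) .EQS. "TRUE"
$ THEN
$     START/QUEUE 'QUEUE_NAME'
$ ENDIF
$! Then let activated autostart queues start on this node.
$ ENABLE AUTOSTART/QUEUES
```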

You should try to reproduce this on a single test system first. If you can't reproduce it there, you might need a cluster. An emulator would make sense here; you now have the choice between PersonalAlpha and FreeAXP.

Volker.