Operating System - OpenVMS

Re: Moving the queue manager to a non-system disk

 
SOLVED
MarkOfAus
Valued Contributor

Moving the queue manager to a non-system disk

Hi all,

Queue Manager has been moved to dkc300:[common.system]. VMS=7.3. The queue manager fails to start up. I set startup_p2 to "D" in modparams.dat, and observed the following in the startup.log.

There is a common disk, called COMMON$DISK. Within sylogicals.com, QMAN$MASTER is defined to common$disk:[common.system].

In sylogicals.com, the disk is mounted using clu_mount_disk.com. The disk mounts ok, BUT, has %MOUNT-I-REBUILD, volume was improperly dismounted; rebuild in progress.

Later, the following command is issued within sylogicals.com:
START/QUEUE/MANAGER QMAN$MASTER
At this point this error occurs:
%SYSTEM-F-DEVOFFLINE, device is not in configuration or not available

(This is not true as a show device and a directory of the device (within sylogicals.com) show it is very much alive and running)

Anticipating that this error message could somehow be due to the disk rebuild, I modified clu_mount_disk.com and added /NOREBUILD. Then, after the START/QUEUE/MANAGER QMAN$MASTER, I put a SET VOLUME/REBUILD COMMON$DISK:. Still the same error message appears and the queue manager fails to start. I suspect this is some sort of timing issue.

The disk is running ok, and has reported no errors or other anomalies. I guess I could just as easily start a new queue manager from scratch pointing to the newly defined QMAN$MASTER, but I would much rather solve this problem.

Does anyone have any clues or hints as to the obvious thing I am missing here?

Thanks,
Mark.
23 REPLIES
Thomas Ritter
Respected Contributor

Re: Moving the queue manager to a non-system disk

Please execute the following command and post the results.

For example,

Nodea> sh que/manager/full
Master file: COMMON:[SYSEXE]QMAN$MASTER.DAT;

Queue manager SYS$QUEUE_MANAGER, running, on nodea::
/ON=(*)
Database location: COMMON:[SYSEXE]

Our queue manager uses a common cluster-wide disk. I cannot recall the command required to move the .dat file, but it was straightforward.
MODPARAMS.DAT was not involved.
Thomas Ritter
Respected Contributor

Re: Moving the queue manager to a non-system disk

From VMS help
start/queue/manager

3.  $ START/QUEUE/MANAGER/NEW_VERSION -
    _$ /ON=(SATURN,VENUS,NEPTUN,*) DUA5:[SYSQUE]


In our case we specified the new location common:[sysmgr].

Simon Fedele
Advisor

Re: Moving the queue manager to a non-system disk

Can you check if QMAN$QUEUE_MANAGER.EXE is in SYS$SYSTEM? I've seen a similar problem before.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Thomas,

sho /queue/manager/full:
Master file: DKC300:[COMMON.SYSTEM]QMAN$MASTER.DAT;

Queue manager SYS$QUEUE_MANAGER, running, on EMU2::
/ON=(EMU2)
Database location: DKC300:[COMMON.SYSTEM]

The files exist, as I can start the queue manager manually after startup; it just won't start up within sylogicals.com.

All our batch queues are autostart, all our print queues are lpd symbiont controlled.

Thanks,
Mark.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Simon,
"Can you check if QMAN$QUEUE_MANAGER.EXE is in SYS$SYSTEM. I've seen a similar problem like this before."

Yes:
Directory SYS$COMMON:[SYSEXE]

QMAN$QUEUE_MANAGER.EXE;1 350KB/379KB 14-APR-2006 09:01:58.24


As I stated to Thomas, and omitted initially, the queue manager can be started up without a problem AFTER startup, just not during.

Thanks,
Mark
Simon Fedele
Advisor

Re: Moving the queue manager to a non-system disk

I may be wrong here, but I think SYLOGICALS.COM is too early in the startup. It runs before the job controller process has started, and START/QUEUE/MANAGER works by sending a request to the job controller to start the queue manager. Maybe try it in SYSTARTUP_VMS.COM.
Thomas Ritter
Respected Contributor

Re: Moving the queue manager to a non-system disk

Mark, our sequence is to mount the disk in the first step of sylogicals.

Extract from sys$manager:sylogicals.com:

$ mount/system/noassist/norebuild DSA15: /shadow=($1$dga15:,$1$dga1501:) common common

Then we run the common sylogicals:

$ @common:[sysmgr]sylogicals
Simon Fedele
Advisor

Re: Moving the queue manager to a non-system disk

This might help also, from Chapter 13 of VMS system manager's manual

http://www.itec.suny.edu/scsys/vms/OVMSDOC073/V73/6017/6017pro_054.html#start_q_mgr

13.3.1 Specifying the Location of the Queue Master File

"If the location you specify is on a disk other than the node's system disk, add a command in SYLOGICALS.COM to mount the disk. SYLOGICALS.COM is normally used to define logical names; however, it is important that SYLOGICALS.COM contain the command to mount the disk holding the master file so that the master file is available before the job controller starts the queue manager. "

Volker Halle
Honored Contributor
Solution

Re: Moving the queue manager to a non-system disk

Mark,

there is no need to manually issue the START/QUEUE/MANAGER command (since approximately V5.5-2) during startup.

All you need to do in SYLOGICALS.COM is:

- define the QMAN$MASTER logical to point to the disk and directory where QMAN$MASTER.DAT resides. The location of the queue-manager database files is stored in QMAN$MASTER.DAT.

- MOUNT the QMAN disk

The %MOUNT-I-REBUILD operation is a synchronous operation, so the MOUNT command does not return to DCL until the volume has been rebuilt.
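
A minimal sketch of such a SYLOGICALS.COM fragment, assuming the device and directory from the original post (DKC300:[COMMON.SYSTEM]). The volume label SYS3 and the use of a plain MOUNT in place of clu_mount_disk.com are assumptions for illustration only:

```dcl
$! Mount the disk that holds QMAN$MASTER.DAT before anything needs it.
$! The label SYS3 is an assumption; use the real volume label.
$ MOUNT/SYSTEM/NOASSIST DKC300: SYS3
$!
$! Point QMAN$MASTER at the directory (not the file) holding QMAN$MASTER.DAT.
$ DEFINE/SYSTEM/EXECUTIVE_MODE QMAN$MASTER DKC300:[COMMON.SYSTEM]
$!
$! No START/QUEUE/MANAGER here: the job controller starts the queue
$! manager later in startup, once the master file is reachable.
```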

Volker.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Simon, the disk is mounted before the queue manager is started via the clu_mount_disk command provided by HP. All of this is done in sylogicals.



Incidentally, if I move the files back to sys$system and remove the start/queue/manager from sylogicals, the system starts the queue manager correctly at startup.
Volker Halle
Honored Contributor

Re: Moving the queue manager to a non-system disk

Mark,

the DEVOFFLINE error is most likely NOT related to access to the qman disk, as one might think, but to a non-existent mailbox used to communicate with the JOB_CONTROL process, which does not exist when SYLOGICALS.COM is being invoked!

Just leave the START/QUE/MANA command out of SYLOGICALS.COM and try again. It will work (if you have the disk mounted and the logical assigned).

Volker.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Volker,

"there is no need to manually issue the START/QUEUE/MANAGER command (since approximately V5.5-2) during startup.
"

Ok. So I will remove that from the commands.
It still doesn't start up. Ack!

"All you need to do in SYLOGICALS.COM is:

- define the QMAN$MASTER logical to point to the disk and directory where QMAN$MASTER.DAT resides. The location of the queue-manager database files is stored in QMAN$MASTER.DAT."

Done it!

"- MOUNT the QMAN disk"

Done it!

"The %MOUNT-I-REBUILD operation is a synchronous operation, so the MOUNT command does not return to DCL until the volume has been rebuilt."

As I thought, hence I added norebuild "just in case"...


Volker.
Volker Halle
Honored Contributor

Re: Moving the queue manager to a non-system disk

Mark,

is there an error message - please also check OPERATOR.LOG !

Did you manually do a STOP/QUE/MANA/CLUSTER before moving the data to the common disk ? Try manually starting the queue manager ONCE:

$ START/QUE/MANAGER DKC300:[COMMON.SYSTEM]

Volker.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Volker,

"is there an error message - please also check OPERATOR.LOG !"

The only error message is in the startup.log. There is 1 error in the operator.log referring to decdtm. This has never been an issue though.


"Did you manually do a STOP/QUE/MANA/CLUSTER before moving the data to the common disk ? "

Yes, the order of details was:
stop the queue, wait for it to drop its process (and close the files), then copy the files from the current location to the new one. It should have been a simple thing.



"Try manually starting the queue manager ONCE:

$ START/QUE/MANAGER DKC300:[COMMON.SYSTEM]"

Ok, I have done this, but I will try it again.

I moved the files back to their original position and then rebooted. Voilà, the queue manager started! No logical defined, either. Is that the problem?

Here's the sylogicals.com excerpt:
$ THIS_NODE = F$GETSYI("NODENAME")
$ if THIS_NODE .eqs. "EMU2"
$ then
$ @sys$sysdevice:[vms$common.sysmgr]clu_mount_disk dkc300 sys3
$ sho dev/mount
$ dir dkc300:[000000]
$ define/system/exec common$disk dkc300:
$ endif
$ if THIS_NODE .eqs. "EMU1"
$ then
$ @sys$sysdevice:[vms$common.sysmgr]clu_mount_disk dka300 set02
$ define/system/exec common$disk dka300:
$ endif
$
$ define/system/exec comsys common$disk:[common.system]
$ define/system/exec datsys common$disk:[common.system]
$ define/system/exec comadm common$disk:[common.admin]
$
$ DEFINE/SYSTEM/EXECUTIVE SYSUAF comsys:sysuaf.dat
$ DEFINE/SYSTEM/EXECUTIVE RIGHTSLIST comsys:rightslist.dat
$ DEFINE/SYSTEM/EXECUTIVE NETPROXY comsys:NETPROXY.DA
$ define/system/executive LMF$LICENSE comsys:LMF$LICENSE.LDB
.
.
.
$ write sys$output "SYLOGICALS> Defining Qman$Master"
$ DEFINE/SYSTEM/EXECUTIVE QMAN$MASTER common$disk:[common.system]
$ sho log qman$master
$ write sys$output "SYLOGICALS> Setting rebuild of common$disk"
$ SET VOLUME/REBUILD COMMON$DISK:


So there is nothing startling in here - except a lot of debugging info.

The startup.log:

%STDRV-I-STARTUP, OpenVMS startup begun at 2-AUG-2007 15:30:23.26
Searching for disk DKC300...
%MOUNT-I-MOUNTED, SHAD03B mounted on _$4$DKC300: (EMU2)
%MOUNT-I-REBLDREQD, rebuild not performed; some free space unavailable; diskquo

Device                  Device      Error   Volume         Free  Trans  Mnt
 Name                   Status      Count    Label        Blocks Count  Cnt
$4$DKC0:      (EMU2)    Mounted         0   ALPHASYS    57731334   188    1
$4$DKC300:    (EMU2)    Mounted         0   SYS3        66528972     1    1

Directory DKC300:[000000]

000000.DIR;1 BACKUP.DIR;1 BACKUP.SYS;1 BADBLK.SYS;1
BADLOG.SYS;1 BITMAP.SYS;1 COMMON.DIR;1 CONTIN.SYS;1
CORIMG.SYS;1 INDEXF.SYS;1 ORACLE_CONV.DIR;1 RESTORE.DIR;1
RSDBLIST.LIS;1 SECURITY.SYS;1 SMART.DIR;1 SMART_BASE.BCK;1
VOLSET.SYS;1

Total of 17 files.
SYLOGICALS> Defining Qman$Master
"QMAN$MASTER" = "COMMON$DISK:[COMMON.SYSTEM]" (LNM$SYSTEM_TABLE)
SYLOGICALS> Starting Queue Manager.
%SYSTEM-F-DEVOFFLINE, device is not in configuration or not available
SYLOGICALS> Queue Manager started.
SYLOGICALS> Setting rebuild of common$disk


MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Apologies the startup.log is an older one, but essentially all the information is the same except the volume label of the disk.
Volker Halle
Honored Contributor

Re: Moving the queue manager to a non-system disk

Mark,

you still seem to issue the START/QUE/MANA command in SYLOGICALS.COM - remove it.
Then reboot - any error message ?

$ SHO QUE/MANA/FULL

If it's not running: $ START/QUE/MANA COMMON$DISK:[COMMON.SYSTEM]

Then reboot - any errors ? Is the queue manager running now ?

Volker.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Volker,

Apologies, the log was an old one. Sorry.

"you still seem to issue the START/QUE/MANA command in SYLOGICALS.COM - remove it.
Then reboot - any error message ?"

No errors.


"$ SHO QUE/MANA/FULL"

SRS$USER:MARK> SHO QUE/MANA/FULL
Master file: DKC300:[COMMON.SYSTEM]QMAN$MASTER.DAT;

Queue manager SYS$QUEUE_MANAGER, running, on EMU2::
/ON=(EMU2)
Database location: DKC300:[COMMON.SYSTEM]

So, the queue manager is now running!!

Therefore, the sylogicals.com explanation is incorrect? You should only run the START/QUEUE/MANAGER QMAN$MASTER once after the copy?

To quote sylogicals.com:
"$! default. To place the queue manager database in another location, enter
$! the START/QUEUE/MANAGER command with a parameter. For example, to put
$! the queue database files in the same cluster common directory as the
$! queue master file, enter the command:
$!
$! $ START/QUEUE/MANAGER/ON=(NodeX,NodeY,NodeZ) QMAN$MASTER
"

Thanks,
Mark.
Volker Halle
Honored Contributor

Re: Moving the queue manager to a non-system disk

Mark,

you need to ONCE run the START/QUE/MANA disk:[dir] manually after moving the files after STOP/QUE/MANA/CLUSTER.

So the comments in SYLOGICALS.TEMPLATE are a little bit misleading...

The disk:[dir] information is stored in QMAN$MASTER.DAT, so the QUEUE_MANAGER will find this information if you correctly define QMAN$MASTER and mount the disk.

You need to issue the START/QUE/MANA disk:[dir] command only ONCE to WRITE that info into QMAN$MASTER.DAT.
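
A rough sketch of that one-time move, run interactively once the system is up. The source directory SYS$COMMON:[SYSEXE] and the target DKC300:[COMMON.SYSTEM] are assumptions taken from this thread, and the wildcard is assumed to pick up the default SYS$QUEUE_MANAGER queue and journal files:

```dcl
$! One-time move of the queue database; do NOT put this in SYLOGICALS.COM.
$! Directories below are assumptions from this thread; adjust to your system.
$ STOP/QUEUE/MANAGER/CLUSTER
$! Wait until the QUEUE_MANAGER process has exited and closed its files.
$ COPY SYS$COMMON:[SYSEXE]QMAN$MASTER.DAT DKC300:[COMMON.SYSTEM]
$ COPY SYS$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$* DKC300:[COMMON.SYSTEM]
$ DEFINE/SYSTEM/EXECUTIVE_MODE QMAN$MASTER DKC300:[COMMON.SYSTEM]
$! Issue this ONCE; it records the database location in QMAN$MASTER.DAT.
$ START/QUEUE/MANAGER DKC300:[COMMON.SYSTEM]
$ SHOW QUEUE/MANAGER/FULL
```

On later reboots only the MOUNT and the QMAN$MASTER definition should be needed in SYLOGICALS.COM.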

Volker.
John Gillings
Honored Contributor

Re: Moving the queue manager to a non-system disk

Mark,

Could you please show your exact and complete logical name:

$ SHOW LOG/FULL QMAN$MASTER*

Make sure it's EXEC mode.

I suspect the problem is the disk name in the definition. "dkc300" may work for most things, but the queue manager is a bit more fussy than most. The definition of QMAN$MASTER must be identical across all nodes, and must point to the same file. Some ways of referring to a device work differently across nodes. It looks like the cluster-wide device name should be $4$DKC300.
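
As an illustration only, one way to avoid hard-coding the short name is to ask the system for the full device name and define the logical from that. This is a sketch, assuming DKC300: is the device in question and that COMMON$DISK is the logical being defined, as elsewhere in this thread:

```dcl
$! Ask the system for the full device name rather than hard-coding it.
$ full_name = F$GETDVI("DKC300:", "FULLDEVNAM")
$ WRITE SYS$OUTPUT "Full device name: ", full_name
$! Define the common-disk logical with that name (hypothetical usage).
$ DEFINE/SYSTEM/EXECUTIVE_MODE COMMON$DISK 'full_name'
```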
A crucible of informative mistakes
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Volker,

You are correct. Running it once, after moving the files and removing the start command in the sylogicals.com fixed the problem and the queue manager starts up as expected.

However, more for curiosity's sake, I wonder why the message %SYSTEM-F-DEVOFFLINE appears? It is misleading, is it not?

Playing around with the queues (I'm real rusty on all this stuff - it's been TOO MANY years), the stop/queue/manager/cluster is the only way to completely kill the queue manager and stop queues being displayed. Perhaps this has something to do with the apparently nonsensical message?

But, once this is done, a start/queue/manager qman$master and enable autostart gets things back on track.

Thanks Volker for your insight and wisdom. Well done, once again.

Thanks, Mark.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

John,
Mark,

"Could you please show your exact and complete logical name:

$ SHOW LOG/FULL QMAN$MASTER*

Make sure it's EXEC mode.
"

Sure:
SRS$USER:MARK> SHOW LOG/FULL QMAN$MASTER*

(LNM$PROCESS_TABLE) [kernel]
[no protection information]

(LNM$JOB_811ACB00) [kernel] [shareable] [Quota=(3552,4096)]
[Protection=(RWCD,RWCD,,)] [Owner=[COMCEN,MARK]]

(LNM$GROUP_000001) [kernel] [shareable,group]
[Protection=(RWCD,R,R,)] [Owner=[COMCEN,*]]

(LNM$SYSTEM_TABLE) [kernel] [shareable,system]
[Protection=(RWC,RWC,R,R)] [Owner=[COMCEN,SYSTEM]]

"QMAN$MASTER" [exec] = "COMMON$DISK:[COMMON.SYSTEM]"

(LNM$SYSCLUSTER_TABLE) [kernel] [shareable,system]
[Protection=(RWC,RWC,R,R)] [Owner=[COMCEN,SYSTEM]]

(DECW$LOGICAL_NAMES) [exec] [shareable]
[Protection=(RWCD,RWCD,R,R)] [Owner=[COMCEN,SYSTEM]]

And COMMON$DISK:

SRS$USER:MARK> sho log/full COMMON$DISK*

(LNM$PROCESS_TABLE) [kernel]
[no protection information]

(LNM$JOB_811ACB00) [kernel] [shareable] [Quota=(3552,4096)]
[Protection=(RWCD,RWCD,,)] [Owner=[COMCEN,MARK]]

(LNM$GROUP_000001) [kernel] [shareable,group]
[Protection=(RWCD,R,R,)] [Owner=[COMCEN,*]]

(LNM$SYSTEM_TABLE) [kernel] [shareable,system]
[Protection=(RWC,RWC,R,R)] [Owner=[COMCEN,SYSTEM]]

"COMMON$DISK" [exec] = "DKC300:"

(LNM$SYSCLUSTER_TABLE) [kernel] [shareable,system]
[Protection=(RWC,RWC,R,R)] [Owner=[COMCEN,SYSTEM]]

(DECW$LOGICAL_NAMES) [exec] [shareable]
[Protection=(RWCD,RWCD,R,R)] [Owner=[COMCEN,SYSTEM]]



"I suspect the problem is the disk name in the definition. "dkc300" may work for most things, but the queue manager is a bit more fussy than most. The definition of QMAN$MASTER must be identical across all nodes, and must point to the same file."

For the cluster, they all point to the same device, but this is not yet in the cluster.

I just wanted to move the queue manager (among other things, such as UAF and RIGHTSLIST.DAT) off the system disk onto a faster disk.

" Some ways of referring to a device work differently across nodes. It looks like cluster wide working name should be $4$DKC300"

So, redefine it as $4$DKC300 instead? Could this then be causing this %SYSTEM-F-DEVOFFLINE? Still, all in all, it seems the start/queue/manager qman$master in sylogicals.com is the culprit, it is just that the error message is deceiving at best.

This server is not clustered as yet, but is in the process of being added to the cluster - when my time allows (which isn't much...)

Thanks,
Mark
Volker Halle
Honored Contributor

Re: Moving the queue manager to a non-system disk

Mark,

please do not only think of the disk device when trying to find an explanation of the %SYSTEM-F-DEVOFFLINE error message.

The START/QUE/MANA command needs to communicate with the job controller (JOB_CONTROL). This communication is done via the Job Controller Mailbox MBA1:, which is a fixed device name and is created by the executive during system initialization.

I've just reproduced your scenario by adding the following commands to SYLOGICALS.COM:

$ SHOW DEV/FULL MBA1:
$ SHOW SYS/PROC=JOB_CONTROL
$ START/QUE/MANA

MBA1: does exist
JOB_CONTROL does NOT exist
START/QUE/MANA fails with %SYSTEM-F-DEVOFFLINE

If you look at the description of the possible return status values of the $SNDJBC system service, you'll find:

SS$_DEVOFFLINE The job controller process is not running.
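
A small diagnostic sketch along the same lines, which could be dropped temporarily into SYLOGICALS.COM to confirm whether JOB_CONTROL exists at that point in startup. The symbol names ctx, tmp and pid are arbitrary, and the procedure is assumed to run with enough privilege (WORLD) to see other processes:

```dcl
$! Diagnostic only: report whether JOB_CONTROL exists at this point.
$ ctx = ""
$ tmp = F$CONTEXT("PROCESS", ctx, "PRCNAM", "JOB_CONTROL", "EQL")
$ pid = F$PID(ctx)
$ IF pid .EQS. ""
$ THEN
$   WRITE SYS$OUTPUT "JOB_CONTROL not running; START/QUEUE/MANAGER would fail"
$ ELSE
$   WRITE SYS$OUTPUT "JOB_CONTROL is running, PID ", pid
$ ENDIF
$! Release the search context if it is still active.
$ IF F$TYPE(ctx) .EQS. "PROCESS_CONTEXT" THEN tmp = F$CONTEXT("PROCESS", ctx, "CANCEL")
```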

Volker.
MarkOfAus
Valued Contributor

Re: Moving the queue manager to a non-system disk

Thanks to all for their assistance, especially Volker, for the solution. Also, again thanks to Volker for his explanation of the machinations of queue manager.