- Re: Two Queuemanagers without shared db on cluster...
07-17-2007 04:30 AM
We are building a multisite cluster with the main machines located on two sites, using a SAN that is also stretched over these two sites. We plan to place a quorum node on a third site.
For economic reasons it is neither possible nor even desirable to attach the quorum system to the SAN; it just boots from local disks and provides a vote in the cluster.
Starting a separate queuemanager (i.e. a queuemanager not using the common queuemanager database) on this quorum node is unsupported, but it would be very handy to be able to print and submit batch jobs on this node. We have also considered MSCP mounting the disk on which the queue database resides, but this is thought to be undesirable, considering the network load and instability this might cause.
From experience (doing things wrong :-) ) I know that having 2 queue managers using independent queue manager databases in a cluster does work, but what are the actual risks of doing this?
07-17-2007 05:22 AM
Huh? What instability might be caused by MSCP-serving a disk?
In general, supported configurations are preferable to unsupported ones, for a production system.
Unless you are doing a staggering amount of queue manager stuff from the remote node, I can't imagine that the I/O load will be an issue.
-- Rob
07-17-2007 06:36 AM
I fully agree with Rob!
MSCP mount is hardly ever an issue.
And considering you obviously do not intend to mount your SAN disks on the quorum node, I expect not many jobs to run on the quorum node.
The occasional (print, batch) job that IS run on the quorum node certainly does NOT warrant the extra complexity!
Just MSCP-mount your QMAN$MASTER disk, and forget about it!
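A minimal sketch of what that could look like on the quorum node, assuming the production nodes already MSCP-serve the disk holding the queue database. The device name and volume label are hypothetical placeholders, and the sketch assumes CLUSTER$COMMON (the logical name quoted later in this thread) is defined on the quorum node exactly as on the production nodes:

```dcl
$! On the quorum node, for example in SYS$MANAGER:SYLOGICALS.COM.
$! Device name and volume label below are hypothetical placeholders.
$ MOUNT/SYSTEM $1$DGA100: COMMON
$!
$! Point QMAN$MASTER at the directory that holds the shared QMAN$MASTER.DAT.
$! Assumption: CLUSTER$COMMON is defined here the same way as on the
$! production nodes, so every node uses the same file specification.
$ DEFINE/SYSTEM/EXECUTIVE_MODE QMAN$MASTER CLUSTER$COMMON:[SYSEXE]
```

The point of the second command is that the quorum node then opens the same master file as every other member, rather than creating one of its own on the local system disk.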
hth
Proost.
Have one on me.
jpe
07-17-2007 07:01 AM
If you want to know if this is supported, you might want to contact HP directly and more formally. (AFAIK, there can be only one queue manager master database; all queue managers operating in a cluster must know about each other.)
Host queue files and such can certainly be local to a lobe, but I'd park the queue manager master file on the same disk with the authorization database.
Stephen Hoffman
HoffmanLabs LLC
07-18-2007 05:15 AM
I would still be very interested to hear from anyone who has seen problems using an independent second queuemanager.
I will also submit a formal request to HP (with a pointer to this discussion) as to what exactly is supported and what is not. The documentation says it's not, but this might no longer be the current view.
07-18-2007 05:32 AM
Note that many things that are not "supported" may still work, sometimes even correctly. However, things that seem to "work" can mysteriously fail, frequently at very inconvenient times.
I still vigorously assert that MSCP-serving is the correct choice here.
-- Rob (ex-VMS Engineering, still an HP employee, who spent a fair amount of time digging into the MSCP server and DUDRIVER (the MSCP client))
07-18-2007 01:20 PM
>I would still be very interested in anyone
>having ever seen any problems using an
>independent second queuemanager.
Yes I've seen problems. BIG problems, numerous times.
There is no such thing as an "independent second queue manager". Having two separate queue manager databases is a very dangerous configuration. Although it SEEMS to work when you initially set it up, WHEN (not IF) you have a queue manager failover event you will lose all queues, entries, and forms. Splat! Gone, vapourised!
Why? Because no matter what you do, the multiple queue managers know about each other. On failover, the algorithm to recover doesn't understand that two nodes have completely different "views" of the data, therefore assumes it's all bad and deletes it all.
This is NOT a bug. The configuration does not work, was never intended to work, and will never work.
If you have a cluster, you can only have a single physical queue manager master file (the managers themselves have their own journal and queue definition files). There are numerous other files which MUST be physically shared between all cluster nodes. That's how clusters work. You need to have some shared storage area visible to all cluster nodes in which the common files live.
The queue manager has an additional constraint - the file specification used to reference QMAN$MASTER.DAT must be identical on all nodes.
(my preference would be to have queue managers refuse to start if they don't correctly reference the QMAN$MASTER used by existing queue managers)
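One way to check the "identical file specification" constraint by hand is sketched below. It only uses SYSMAN to run the same SHOW LOGICAL on every member, then looks at the master file the running queue manager reports; it does not detect a mismatch automatically, and what counts as a correct value depends on the site:

```dcl
$! Run from any cluster member with suitable privileges.
$ MCR SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO SHOW LOGICAL QMAN$MASTER
SYSMAN> EXIT
$!
$! Compare the "Master file:" line with the translations shown above.
$ SHOW QUEUE/MANAGER/FULL
```

If one node reports a different translation for QMAN$MASTER than the others, or none at all, that node is a candidate for ending up with its own master file.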
07-18-2007 11:03 PM
Interesting. So, if there never is a failover from the nodes running queuemanager 1 to nodes running queuemanager 2 it's ok?
That's configurable behaviour, did you ever try that?
07-18-2007 11:17 PM
>>>That's configurable behaviour, did you ever try that?
<<<
No, that is _NOT_ configurable.
Any time a node running a queue manager goes down, the manager is transferred. And if the node crashes, any surviving node takes over.
That is one of the things that make VMS clusters so resilient to the various ways hardware can fail.
And John,
>>>
(my preference would be to have queue managers refuse to start if they don't correctly reference the QMAN$MASTER used by existing queue managers)
<<<
If you ever need any backing vote on this, mine is given herewith!
Proost.
Have one on me.
jpe
07-18-2007 11:38 PM
The quorum node is not participating in providing production services, it's just there to keep the cluster from splitting.
Jobs running on the 'real' production machines will never run on the quorum node, or the other way around. This node will also share almost no users, identifiers, processes or what have you with the 'real' production machines.
In the case of real trouble I am afraid of what MSCP would do when the nodes serving the common disk start to become unreachable.
Initially we just went for this node having no queuemanager at all, as per the guidelines in the documentation. This proved to be a major issue for the management software that is supposed to run on all machines.
I'll try to schedule some tests this weekend or early next week to see what the exact behaviour of the independent queue managers is when nodes leave the cluster etc., and respond with the results.
07-18-2007 11:54 PM
1st queue manager:
Master file: CLUSTER$COMMON:[SYSEXE]QMAN$MASTER.DAT;
Queue manager SYS$QUEUE_MANAGER, running, on NodeA::
/ON=(NodeA,NodeB)
Database location: CLUSTER$COMMON:[SYSEXE]
second manager:
Master file: SYS$SYSROOT:[SYSEXE]QMAN$MASTER.DAT;
Queue manager SYS$QUEUE_MANAGER, running, on NodeQ::
/ON=(NodeQ)
Database location: SYS$COMMON:[SYSEXE]
07-19-2007 12:19 AM
>>>
Using MSCP would introduce a dependency of the quorum node on the other nodes serving the common disk.
<<<
If _ONE_ of the production nodes is reachable, MSCP will work as well. _IF_ neither node is there (as seen from the quorum node), then either your whole production environment is gone, which makes the disk holding the queue database irrelevant, or the production nodes still see one another and happily go on "producing". Interestingly, the latter case is the one John warns about: the continuing part of the cluster no longer sees the quorum node, and so WILL be FORCED to start that queue manager on one of the production nodes: the fatal scenario.
Please take John G's advice!
Proost.
Have one on me.
jpe
07-19-2007 12:52 AM
Considering the contradictory answers and experiences people have reported, not to mention the unclear support status, I will do some testing to see whether it actually works or not.
07-19-2007 01:13 AM
You certainly cannot use the queues/printers on the other nodes. But for having a local batch queue on the 'quorum' node, this setup seems to work. The location of the qman database files, as well as the nodes to be run on, is stored in QMAN$MASTER.DAT. This file is opened by JOB_CONTROL, which then creates the QUEUE_MANAGER process. The related locks have a parent lock, which includes the name of the device on which the QMAN$MASTER.DAT file resides. This is supposed to be unique in a cluster!
If you want to run a 'pure' quorum node, you don't want to mount the disks in the SAN.
Volker.
07-19-2007 02:17 AM
If I understand Volker correctly, I have to take care not to put the 'quorum' queue database on a device called dsa3?
07-19-2007 02:25 AM
Just put the 'local' qman database onto the system disk of your quorum node (in SYS$SYSTEM: as by default). The disk device name must be unique within the cluster anyway...
Volker.
07-19-2007 02:44 AM
This was a hypothetical question. In a cluster all disknames are supposed to be unique, aren't they? Local disks get the allocation class before the device name, shared disks are seen by all members, and so have to be unique.
So I didn't really understand your point, as by design the device name should be unique anyway.
Would there be a problem if you would make another qman$master.dat in another directory on the same disk?
No idea what happens if you attach the quorum node to another SAN having the same DGA devices as the 'production' SAN; probably you would get 'unpredictable results', but that's another discussion.
07-19-2007 04:54 AM
Solution
>Would there be a problem if you would make another qman$master.dat in another directory on the same disk?
No problem, the parent resource name includes the file-id of the QMAN$MASTER.DAT file. Not that this would make sense...
Volker.
07-22-2007 03:38 AM
like this (node name obfuscated):
start/que/manager/new/on=(qnode)
show queue/manager/full shows this:
Master file: SYS$SYSROOT:[SYSEXE]QMAN$MASTER.DAT;
Queue manager SYS$QUEUE_MANAGER, running, on QNODE::
/ON=(QNODE)
Database location: SYS$COMMON:[SYSEXE]
On the production nodes (there are 4: A, B, C and D) the queuemanager was running on node C. A reboot of node C made the 'production' queuemanager shift to node A.
A stop/queue/manager/cluster command on QNODE stopped the queuemanager on QNODE, but NOT on the production nodes, as hoped.
A reboot of the quorum node did not affect the queuemanager on the production nodes, and in the places I looked (operator.logs) there was no evidence of any queuemanager panicking about what to do.
I could have repeated this test 100 times, and rebooted all cluster nodes 100 times, but there really was no indication this would change the results, so I didn't.
Conclusions:
1 An independent queuemanager works without problems, provided the start/queue/manager command has a carefully crafted node list in the /on qualifier.
2 Although this is expected behaviour given the qualifiers available for starting the queuemanager, it is rather challenging to extract this from the documentation. Many people weren't up to this challenge.
Any suggestions for more tests or things to be tested?
07-22-2007 03:55 AM
STOP/QUEUE/MANAGER is a rather controlled way of terminating a queue manager.
As far as I understood John Gillings' description, the real potential for trouble is when another node notices that a remote queue manager is gone (because the queue manager crashed, the node crashed, or connectivity disappeared).
You did not report on any such "catastrophe" scenario.
I am still very much in doubt about the wisdom of this configuration.
Proost.
Have one on me.
jpe
07-22-2007 05:06 AM
try a STOP/ID of the QUEUE_MANAGER process, it will just be restarted.
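A rough sketch of that test, for a test window only. The PID shown is a made-up placeholder to be replaced by the one SHOW SYSTEM reports, and the comments describe what this post says should happen, not verified output:

```dcl
$! On the node currently running the queue manager (suitable privileges needed).
$ SHOW SYSTEM/PROCESS=QUEUE_MANAGER    ! note the PID of the QUEUE_MANAGER process
$ STOP/IDENTIFICATION=2040011C         ! hypothetical PID, substitute the real one
$!
$! Per the post above, the process should simply be restarted.
$ SHOW SYSTEM/PROCESS=QUEUE_MANAGER
$ SHOW QUEUE/MANAGER/FULL              ! check master file, node and /ON list afterwards
```

Running the last two commands on both the quorum node and a production node would show whether either queue manager changed its node or master file as a result.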
The major issue is the correct specification of the QMAN$MASTER file location and its contents, i.e. the node(s) to run on and the physical location of the QMAN database files.
I see one lock (QMAN$ORB_LOCK) which is not a child of the Master File Access Lock and therefore is NOT unique for each QMAN$MASTER.DAT file in the cluster. This could be a potential problem, but it's only used if you set ACLs on the queues.
Volker.