Operating System - OpenVMS

Re: Moving to one system disk

 
Willem Grooters
Honored Contributor

Moving to one system disk

Something I may have missed in the manuals, but I don't get what I want...

I have 3 (older) systems, each with its own system disk, each running VMS 8.2 and patched up to date. All systems connect to an HSZ50 (shared SCSI), and each one can access all disks. All systems have 2 NICs, one of which is intended for cluster communication only.

Now I want to use one disk on the HSZ50 as the system disk for all systems. I set up node A from scratch on that disk and designated it as a cluster member. I did a basic configuration and then ran CLUSTER_CONFIG to create the cluster. After the reboot, the system came up nicely as a cluster member.
Next I started CLUSTER_CONFIG to add node B (not a satellite). All seems to go smoothly: [SYS1] is created and I am asked to boot node B, while the procedure keeps waiting.
I booted B from the newly created [SYS1], it comes up, is shown in SHOW CLUSTER on node A, will try to connect, starts MSCP - everything seems to go Ok. Node A is contacted - but node A keeps waiting and node B keeps trying to connect. None of the nodes signals an error....

The other approach, creating a system root [SYS1] for a new node and adding all required, missing files manually (or by procedure) on the shared disk, allows me to boot node B, but for some reason it will not finish the boot sequence. I deduce that it hangs while accessing the pagefile, which is also on that disk (see another thread).

What did I miss?
Willem Grooters
OpenVMS Developer & System Manager
23 REPLIES
Robert Brooks_1
Honored Contributor

Re: Moving to one system disk

On the node(s) that fail to completely boot,
please boot conversationally. While in SYSBOOT, issue the command SHOW /CLUSTER
and post that output here.

-- Rob
Willem Grooters
Honored Contributor

Re: Moving to one system disk

See attachment.
Willem Grooters
OpenVMS Developer & System Manager
Willem Grooters
Honored Contributor

Re: Moving to one system disk

and then:

SYSBOOT> cont
%SYSBOOT-W-NOERLDUMP, Unable to locate SYS$ERRLOG.DMP
%SYSBOOT-W-NODUMP, unable to locate system dump file


OpenVMS (TM) Alpha Operating System, Version V8.2
(C) Copyright 1976-2005 Hewlett-Packard Development Company, L.P.

%DECnet-I-LOADED, network base image loaded, version = 05.12.00

%DECnet-W-NOOPEN, could not open SYS$SYSROOT:[SYSEXE]NET$CONFIG.DAT

%SYSINIT-I- waiting to form or join an OpenVMS Cluster
%VMScluster-I-LOADSECDB, loading the cluster security database
%EWA0, Twisted-Pair mode set by console
%EWB0, Twisted-Pair mode set by console
%SYSTEM-I-MOUNTVER, $116$DKA100: (DIDO PKB) is offline. Mount verification in progress.

%SYSTEM-I-MOUNTVER, $116$DKA100: (DIDO PKB) has completed mount verification.

%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%SYSTEM-I-MOUNTVER, $116$DKA100: (DIDO PKB) is offline. Mount verification in progress.

%SYSTEM-I-MOUNTVER, $116$DKA100: (DIDO PKB) has completed mount verification.

%MSCPLOAD-I-CONFIGSCAN, enabled automatic disk serving
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Have connection to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA
%CNXMAN, Sending VMScluster membership request to system DIANA

On DIANA (where the whole activity was started), the procedure keeps reporting that it is waiting for DIDO to be booted.
However, DIDO is seen as a cluster member:
$ sho cluster
View of Cluster from system ID 21505 node: DIANA 28-DEC-2005 21:54:32
+-----------------------+---------+
| SYSTEMS | MEMBERS |
+--------+--------------+---------+
| NODE | SOFTWARE | STATUS |
+--------+--------------+---------+
| DIANA | VMS V8.2 | MEMBER |
| DIDO | VMS V8.2 | NEW |
+--------+--------------+---------+


Willem Grooters
OpenVMS Developer & System Manager
Robert_Boyd
Respected Contributor

Re: Moving to one system disk

Have you tried booting node B with bootflags using 20000? There may be some additional information there.

Also, do a REPLY/ENABLE=CLUSTER on NODE A (and C) and see what kind of messages (if any) are coming out.

I don't know if you're being affected by this, but in the past I have had problems where if a node goes away and comes back with a different nodename but is using the same NIC or same nodename on a different NIC (or CI) that the other members of the cluster would refuse to let the new member join.

This has been less of a problem with pure NI clusters, but when there are other hardware links I believe it can still be a problem.

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Robert_Boyd
Respected Contributor

Re: Moving to one system disk

Willem,

Did you intend for the system to have 0 votes?
That was the only odd thing I noticed in your /CLUSTER parameter list. You might also post the results of SHOW /SCS just for grins. There might be something interesting there.

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Willem Grooters
Honored Contributor

Re: Moving to one system disk

Robert:


%%%%%%%%%%% OPCOM 28-DEC-2005 22:13:51.88 %%%%%%%%%%%
22:13:51.88 Node DIANA (csid 00010001) received VMScluster membership request from node DIDO

$
%%%%%%%%%%% OPCOM 28-DEC-2005 22:13:51.88 %%%%%%%%%%%
22:13:51.88 Node DIANA (csid 00010001) proposed addition of node DIDO

$
%%%%%%%%%%% OPCOM 28-DEC-2005 22:13:51.88 %%%%%%%%%%%
22:13:51.88 Node DIANA (csid 00010001) completed VMScluster state transition

for each request received....
Willem Grooters
OpenVMS Developer & System Manager
Willem Grooters
Honored Contributor

Re: Moving to one system disk

Robert: on your second point: 0 votes = satellite, but I deliberately answered NO to the question of whether this node (node B) was to be a satellite! I'll retry and set this to 1 (during SYSBOOT).
Willem Grooters
OpenVMS Developer & System Manager
Uwe Zessin
Honored Contributor
Solution

Re: Moving to one system disk

The output from SHOW/CLUSTER says that:
> EXPECTED_VOTES = 2
and
> VOTES = 0

If the first node that has created the cluster has VOTES = 1, the new node WILL NOT be let into the cluster, because the current cluster QUORUM is 1 and EXPECTED_VOTES = 2 would require a QUORUM of 2 for the cluster to continue.
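The arithmetic behind this can be sketched in a few lines of Python. This is only a simplified model, not how the connection manager is implemented: it uses the documented rule quorum = (EXPECTED_VOTES + 2) / 2, rounded down, and it assumes that DIANA runs with VOTES = 1 and EXPECTED_VOTES = 1 (DIANA's EXPECTED_VOTES is not shown in this thread). The function names are made up for the illustration.

```python
def quorum(expected_votes):
    """Quorum derived from EXPECTED_VOTES: (EV + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def join_keeps_quorum(member_votes, member_expected, new_votes, new_expected):
    """Simplified check: would the cluster still have quorum after the join?

    member_votes / member_expected: VOTES and EXPECTED_VOTES of current members.
    new_votes / new_expected: the same parameters on the joining node.
    """
    total_votes = sum(member_votes) + new_votes
    # The cluster works with the largest EXPECTED_VOTES it has seen,
    # and never with less than the votes actually present.
    effective_expected = max(max(member_expected), new_expected, total_votes)
    return total_votes >= quorum(effective_expected)


# DIANA alone (assumed VOTES=1, EXPECTED_VOTES=1), DIDO as posted (VOTES=0, EXPECTED_VOTES=2)
print(join_keeps_quorum([1], [1], new_votes=0, new_expected=2))  # False

# Same situation with DIDO's VOTES raised to 1
print(join_keeps_quorum([1], [1], new_votes=1, new_expected=2))  # True
```

In the first case the join would raise quorum to 2 while only 1 vote is present, so the cluster cannot accept the node without hanging itself.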

I have seen this many years ago and it took me some time to figure this out - too bad that VMS has still not learnt to give out at least a helpful message :-(
.
Jan van den Ende
Honored Contributor

Re: Moving to one system disk

Willem,

maybe Uwe already has the tiger by the tail, but it struck me, that you have
ALLOCLASS = 2
but the disk is seen as $116$DK...
It seems DIANA has ALLOCLASS = 116.
Unless you use the alternate naming scheme (but that requires you to have built SYS$SYSTEM:SYS$DEVICES.DAT, which I did not understand you had done), DIDO's alloclass needs to be the same as DIANA's.
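To make the comparison concrete, here is a small Python sketch that splits a device name such as the `$116$DKA100:` from the boot log into its parts and compares the numeric prefix with a node's ALLOCLASS. The helper names are hypothetical, and the pattern only covers names of the form `$class$DDcu`. Note that the prefix may come from a port allocation class instead of the node's ALLOCLASS, so a mismatch is a hint to check, not proof of a misconfiguration.

```python
import re

# $<allocation class>$<2-letter device type><controller letter><unit>, optional trailing colon
DEVICE_RE = re.compile(r"^\$(\d+)\$([A-Z]{2})([A-Z])(\d+):?$")


def parse_device(name):
    """Return (allocation_class, device_type, controller, unit) for $class$DDcu names."""
    match = DEVICE_RE.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a $class$DDcu device name: {name!r}")
    alloclass, dev_type, controller, unit = match.groups()
    return int(alloclass), dev_type, controller, int(unit)


def prefix_matches_node(name, node_alloclass):
    """True if the class in the device name equals the node's ALLOCLASS."""
    return parse_device(name)[0] == node_alloclass


print(parse_device("$116$DKA100:"))            # (116, 'DK', 'A', 100)
print(prefix_matches_node("$116$DKA100:", 2))  # False: DIDO has ALLOCLASS = 2
```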

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Art Wiens
Respected Contributor

Re: Moving to one system disk

"What did I miss?"

Uwe's "10 point" help that he provided me a short while ago:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=980824

(EV = 2) + (V = 0) == wait forever

Cheers,
Art
Willem Grooters
Honored Contributor

Re: Moving to one system disk

Sorry for the late reply.
Indeed: votes = 1 did the trick. The system did come up and ran its AUTOGEN, and then failed to boot due to problems with shared SCSI (see http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=987016)
Jan:
ALL systems have a DKA0 device that is to be considered "local", though served. FWIK, this same ID means each system needs to have it allocated locally. I _could_ disable MSCP, but enabling it is very handy when setting up the system (copying data is somewhat simpler).
The HSZ50 however (on shared SCSI) controls all the disks that are to be accessible by all machines. I have used CLUSTER_CONFIG to set this port allocation class to `116', an arbitrary value taken straight from the manuals ;-)
Willem Grooters
OpenVMS Developer & System Manager
Dale A. Marcy
Trusted Contributor

Re: Moving to one system disk

Is your alloclass set to a unique value on each system?
Uwe Zessin
Honored Contributor

Re: Moving to one system disk

> ALL systems have a DKA0 device that is to be considered "local" -

Yes, this used to be a big problem in early versions of VMS before port allocation classes were developed. Sometimes the only workaround was to put the devices on different SCSI addresses.

The requirement that the (VMS) host allocation had to match the allocation class of a DSSI or CI storage controller was lifted quite some time ago. From my memory, the latest problems got solved in V7.2-1.
.
Jan van den Ende
Honored Contributor

Re: Moving to one system disk

Willem,

IMMSMW, the devices behind an HSZxx have a disk ID of DKLx0y (L is the controller sequence number, x is the SCSI channel, y is the LUN-in-channel ID).

LOCAL drives however, are $$DKetc

The LOCAL drives get their alloclass from the local SYSGEN parameter, and those values should differ between nodes, so that local devices do not otherwise inherit conflicting names.

The SCSI card that the HSZ is connected to defines the controller letter for the drives behind the HSZ, which NEEDS TO BE IDENTICAL ON ALL NODES, unless you have (VMS > 7.2) DEVICE_NAMING activated, in which case you need to specify the differing information in SYS$SYSTEM:SYS$DEVICES.DAT.

If you need more specific details: you know how to contact me directly

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: Moving to one system disk

Isn't that exactly what Willem said he did?
.
Jan van den Ende
Honored Contributor

Re: Moving to one system disk

Willem,

Did some history checking.
Found these points:
- the systems connected to the shared SCSI bus MUST have the SAME Alloclass.
- the SCSI connectors must all have the SAME controller letter (but: see SYSGEN param DEVICE_NAMING to circumvent this)
- ALL disk drives (including CDs) on ALL nodes must have CLUSTERWIDE unique slot numbers! (1 MK and 1 MD device are allowed to have the same number)

If any of the above conflicts, there is no valid cluster config.

HTH

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: Moving to one system disk

> the systems connected to the shared SCSI bus MUST have the SAME Alloclass.

http://h71000.www7.hp.com/doc/731FINAL/4477/4477pro_010.html

says:
"""6.2.4.1 Port Allocation Classes for Devices Attached to a Multi-Host Interconnect
...
5. Each node for which MSCP serves a device **should** have the same nonzero allocation class value."""

from the wording it sounds like "Each node" means an MSCP client.
The same class like ... what?
- the CI/DSSI/SCSI/FC device?
- the same Alpha server?

And why is it **should**?
.
Jan van den Ende
Honored Contributor

Re: Moving to one system disk

Uwe,

the underlying reason:

_IF_ a device has multiple access paths (like disks on a dual-(multi-)hosted SCSI bus), then the NAMING via all access paths should be the same.
Thus, as long as the allocation class part of the naming is taken from SYSGEN, the alloclasses should be identical on all systems that share that bus.
If the naming derives some other way (port allo class in effect) then it must be arranged such that the naming condition is met, and system alloclass is not important. For SAN disks alloclass always is 1, so that solves the issue as well.
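A rough Python sketch of that naming condition, under stated assumptions: without a port allocation class the name is built from the node's (nonzero) ALLOCLASS and the local controller letter; with a port allocation class the class replaces the node value and the controller letter becomes A, which matches the boot log above where the disk on port PKB shows up as `$116$DKA100`. The function name, the node allocation classes 1 and 2 and the controller letters are illustrative only.

```python
def scsi_disk_name(node_alloclass, controller, unit, port_alloclass=None):
    """Name a node would give a SCSI disk (simplified, nonzero classes only)."""
    if port_alloclass:
        # Port allocation class in effect: node ALLOCLASS and the real
        # controller letter no longer appear in the name.
        return f"${port_alloclass}$DKA{unit}"
    return f"${node_alloclass}$DK{controller}{unit}"


# Two nodes on the same shared bus, hypothetical ALLOCLASS 1 and 2,
# with the shared adapter showing up as PKB on one and PKA on the other.
without_pac = {scsi_disk_name(1, "B", 100), scsi_disk_name(2, "A", 100)}
with_pac = {scsi_disk_name(1, "B", 100, port_alloclass=116),
            scsi_disk_name(2, "A", 100, port_alloclass=116)}

print(sorted(without_pac))  # ['$1$DKB100', '$2$DKA100'] -> two names for one disk
print(sorted(with_pac))     # ['$116$DKA100'] -> same name via every path
```

A set with more than one entry means the same physical disk would be known under different names, which is exactly the condition to avoid.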

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: Moving to one system disk

Come on, Jan, haven't you read any previous replies from me?

I have been working with multipath environments since VAX/VMS V4 - you should really expect me to know why allocation classes exist and how they work!
.
Jan van den Ende
Honored Contributor

Re: Moving to one system disk

Sorry Uwe,

no offence meant.

I am a bit ashamed :-(

My common sense must have taken a holiday.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: Moving to one system disk

Welcome back, hope you enjoyed your holiday ;-)
.
Willem Grooters
Honored Contributor

Re: Moving to one system disk

Uwe,

Your comment is correct: figure 6.8 points out my situation: Multiple nodes on one, shared SCSI bus (BTW - that is where "116" comes from).
I guess the issue now is not the naming. That's OK, and I was able to boot the second node from this shared disk, but it failed basically because of the shared SCSI problems (see the thread on that). I think I can't go on before this is settled.
Willem Grooters
OpenVMS Developer & System Manager
Willem Grooters
Honored Contributor

Re: Moving to one system disk

See previous entry; the main problem seems to me to be the issue on shared SCSI (http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=987016), so that needs to be solved first.
Willem Grooters
OpenVMS Developer & System Manager