Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

 
Adrian O
Occasional Advisor

We are trying to satellite boot OpenVMS Alpha V6.2-1H3 on a fully patched AlphaServer 4100, with the system disk served from an AlphaServer 800 5/500 running V7.2-1, which has both Fibre Channel and locally attached disks.

If we satellite boot V7.2-1 on the 4100 everything works as expected.

When we try to boot V6.2-1H3, we get the following console output:

bootstrap code read in
FRU table creation disabled
base = 200000, image_start = 0, image_bytes = 7d878
initializing HWRPB at 2000
initializing page table at 1f2000
initializing machine state
setting affinity to the primary CPU
jumping to bootstrap code

%VMScluster-I-MOPSERVER, MOP server for downline was node DEV004
%VMScluster-I-SYSDISK, Satellite system disk is _$255$DKA100:
%VMScluster-I-SYSROOT, Satellite system root is
%VMScluster-I-REINIT_WAIT, Waiting for access to the system disk server
...
%VMScluster-I-REINIT_WAIT, Waiting for access to the system disk server
%VMScluster-I-REINIT_WAIT, Waiting for access to the system disk server
%VMScluster-I-REINIT_WAIT, Waiting for access to the system disk server
...
Repeat until ^P

Does anyone have any ideas as to where or what to look at?

Regards,

Adrian.
22 REPLIES
Jon Pinkley
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Is $255$DKA100: MSCP served by DEV004? Is DEV004 the Alphaserver 800 5/500 running 7.2-1?
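
One way to check this on DEV004, as a rough sketch only: the commands below are standard DCL and SYSGEN commands, and the device name is the one from this thread. The output will depend on the configuration.

```dcl
$! On DEV004: list the devices the MSCP server is currently serving
$ SHOW DEVICE/SERVED
$! Check how the V6.2-1H3 system disk is mounted
$ SHOW DEVICE/FULL $255$DKA100:
$! Check the MSCP server parameters in the running system
$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW MSCP_LOAD
SYSGEN> SHOW MSCP_SERVE_ALL
SYSGEN> EXIT
```

If $255$DKA100: is missing from the served list, the satellite presumably has no server to connect to, which would be consistent with the repeated REINIT_WAIT messages.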

Jon
it depends
Jon Pinkley
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Are you attempting to satellite boot V6.2-1H3 from a V6.2-1 system or from the 7.2-1 system? I have never attempted satellite booting from anything but a common system disk, and multiple VMS versions off a single system disk isn't supported.

More details about the configurations would help diagnosis.

Jon
it depends
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

dev003 has no locally attached storage, and we are trying to boot it as a satellite from dev004.

dev004, the alpha 800, is serving all disks (MSCP_SERVE_ALL=1 and MSCP_LOAD=1).

$255$dka100 is mounted /cluster and (should be) served to the 4100 (dev003) from dev004.

dev004 is running 7.2-1 from its dedicated system disk.

$255$dka100 is a dedicated V6.2-1H3 system disk.

The 2 servers are connected via ethernet crossover cable, and the satellite booting works fine when both nodes are booting 7.2-1.

Only seeing the problems when trying to boot 6.2-1H3 on dev003.
Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Did you verify the MC LANCP node or NCL MOP client configuration? Can you post it?

Wim
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

In a mixed-version OpenVMS Cluster system, serving "all available disks" is restricted to its pre-Version 7.2 meaning, that is, serving locally attached disks and disks connected to HSx and DSSI controllers whose node allocation class matches that of the system's node allocation class. To serve "all available disks" in a mixed-version cluster, you must specify the value 9.


Have you tried setting MSCP_SERVE_ALL to 9?

Wim
Wim
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

The MOP load seems to work ok:

LANCP> show node dev003

Node Listing, volatile database:
DEV003 (08-00-2B-C5-C2-23):
MOP DLL: Load file: APB.EXE
Load root: $255$DKA100:
Last file: $255$DKA100:[SYSCOMMON.SYSEXE]APB.EXE
Boot type: Alpha satellite
33 loads requested, 6 volunteered
6 succeeded, 0 failed
Last request was for a system image, in MOP V4 format
Last load initiated 23-JUL-2008 11:06:58 on EWA0 for 00:00:05.26
3224595 bytes, 25143 packets, 0 transmit failures

Totals:
Requests received 33
Requests volunteered 6
Successful loads 6
Failed loads 0
Packets sent 12558
Packets received 12585
Bytes sent 3160548
Bytes received 64047
Last load DEV003 at 23-JUL-2008 11:06:58.59

Rgds

Adrian.
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Wim,

Tried MSCP_SERVE_ALL=9 yesterday.
Set it to 1 this morning.

Same difference.

Trying with 9 again now...

Rgds,

Adrian
Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

The param is not dynamic. You need to reboot the disk server.
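
A minimal sketch of making the change stick on the disk server, assuming the usual MODPARAMS.DAT and AUTOGEN route. The parameter values are the ones discussed in this thread; the AUTOGEN phases shown are one common choice, not the only one.

```dcl
$! Assumption: these two lines have been added to SYS$SYSTEM:MODPARAMS.DAT on DEV004
$!   MSCP_LOAD = 1
$!   MSCP_SERVE_ALL = 9
$! Regenerate the system parameters and reboot so the new value takes effect
$ @SYS$UPDATE:AUTOGEN GETDATA REBOOT NOFEEDBACK
$! After the reboot, confirm the value in the running system
$ MCR SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW MSCP_SERVE_ALL
SYSGEN> EXIT
```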

Wim
Wim
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Changed MSCP_SERVE_ALL to 9.
Removed quorum disk!
Autogenned, rebooted.

No change.

Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Could it be that the CLUSTER_AUTHORIZE.DAT file on the system disk does not match that of the other cluster members?

Wim
Wim
Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

I also remember that you get problems when a node name was previously used in the cluster with a different SCSSYSTEMID, or when an SCSSYSTEMID was previously used with a different node name. I'm not sure whether the "waiting for" message is given in this case.

I rebooted the whole cluster to solve it (all down, all up again).

Wim
Wim
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Wim,

CLUSTER_AUTHORIZE files are identical on both system disks.

I don't think we are getting as far as trying to form the cluster, as dev003 is waiting for a connection to the V6.2 system disk served by dev004.

Adrian.
Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Suppose 2 systems serve the same disk? How will the cluster know which one may answer? Based on the CLUSTER_AUTHORIZE file.

Wim
Wim
Hoff
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

I will assume there are the usual sorts of impediments to upgrading the gear and the software; this is ancient bits and ancient gear. Big, hot and slow, all things considered.

From what I see, V7.2-1 and V6.2-1H3 don't appear to be a migration pair, per a contemporary SPD. Which means you're in relatively uncharted territory.

Here's a little detail on the cluster protocol version span:

http://64.223.189.234/node/412

I don't know if your current pairing is inside that range.

And you'll want to ensure that CLUSIO and other "current" and mandatory ECOs are loaded on your V6.2-1H3 box, if you really want to try this mixed-version cluster. This CLUSIO ECO kit was needed for V6.2 (and its close friends, the Ghost, Zombie and Goblin releases) when V7.1 arrived.

Stephen Hoffman
HoffmanLabs LLC
Jess Goodman
Esteemed Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Way back in Feb. 2000 I tried network booting a VMS 6.2-1H3 AlphaServer 4100 (from a different bootserver). It failed with the same looping error message you posted.

I logged a call and within a few days VMS support gave me a fix - a new version of APB.EXE (Alpha Primary Bootstrap image). But my fix was never included in a VMS 6.2 ECO.

Since it might be difficult to get this fix from VMS support today, I have attached my APB.EXE as a ZIP file attachment to this post.

After unzipping it you need to do this:

$ BACKUP APB.EXE SYS$SYSDEVICE:[VMS$COMMON.SYSEXE]/NEW/OWNER=ORIGINAL
$ DIRECTORY/FULL SYS$SYSDEVICE:[VMS$COMMON.SYSEXE]APB.EXE;
$!Verify the file still has "Contiguous" and "MoveFile disabled" attributes.
$ MCR WRITEBOOT
Update VAX portion of boot block (default is Y) : N
Update AXP portion of boot block (default is Y) : Y
Enter AXP boot file : SYS$SYSDEVICE:[VMS$COMMON.SYSEXE]APB.EXE

And hopefully that will solve your problem. Note: I know this APB.EXE works for network and non-network boots on a VMS 6.2-1H3 AS 4100. I have not attempted to use it on any other hardware.
I have one, but it's personal.
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Jess,

Many thanks for the file - We can not get to a SYSBOOT prompt, but now get the console output shown in the attachment:


OpenVMS (TM) Alpha Operating System, Version V6.2-1H3

%VMScluster-I-SYSDISK, Satellite system disk is _$255$DKA100:
%VMScluster-I-SYSROOT, Satellite system root is
%VMScluster-I-BUSONLINE, LAN adapter is now running 08-00-2B-C5-C2-23
%VMScluster-W-PROTOCOL_TIMEOUT, NISCA protocol timeout
%VMScluster-I-REINIT_WAIT, Waiting for access to the system disk server

%VMScluster-I-BUSONLINE, LAN adapter is now running 08-00-2B-C5-C2-23
%VMScluster-W-PROTOCOL_TIMEOUT, NISCA protocol timeout
%VMScluster-I-REINIT_WAIT, Waiting for access to the system disk server

Repeats until ^P

Any idea as to what is causing the Protocol Timeout?
Jon Pinkley
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

You are in unsupported territory, so these types of issues are not uncommon.

How much is it worth to be able to boot this without any directly attached storage? Are you 100% sure that the Alphaserver 800 5/500 is supported by 6.2-1H3? (I didn't look.)

If it is supported, my next step would be to put a local disk on the box and boot from that. That will probably be cheaper than calling in help to try and get the box to boot as a satellite.

Jon

it depends
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Jon,

We are trying to source local storage for the 4100. The reason behind trying the satellite boot was because there is no local storage atm.

Thanks for all the help so far.

Rgds,

Adrian.
Wim Van den Wyngaert
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Is it possible that you need to patch OpenVMS V6.2 too, or is the code for making the MSCP connection only present in APB.EXE? (I would think it's not, because you can have MSCP with a local disk.)

Wim
Wim
Hoff
Honored Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

I'd think you're left to upgrade your OpenVMS V6.2-1H3 box, downgrade your OpenVMS Alpha V7.2-1 box, or continue struggling with something that is known to be unsupported and that is here found to be non-functional.

SCS doesn't like long version spans, and if CLUSIO and such don't allow this, you're likely not going to get a connect here.

This "downgrade or upgrade" approach is a bigger hammer solution, yes, but I'd apply a yet larger hammer here and get all these boxes upgraded to V7.3-2 minimally, or to current.
Adrian O
Occasional Advisor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

Have not yet managed to get a successful satellite boot, although I think we were getting close.

Time constraints have forced us to connect an external pedestal to the V6.2 server and boot locally from this. We now have a working mixed version cluster, and when the devs give me a slot, I will have another tinker about with the satellite boot.

I think that we are missing one or more of the 'CPU' patches for V6.2-1H3 - I need to investigate this a bit more thoroughly.

Many thanks for the interest and especially the patched APB.EXE

Rgds,

Adrian.
Jess Goodman
Esteemed Contributor

Re: Satellite boot Alpha V6.2-1H3 from Alpha 7.2-1

At a minimum you must have these three patches installed on your VMS 6.2 AlphaServer 4100:

ALPCLUSIO01_062
ALPCPU1603_062
ALPLAN06_062

They are available here:
ftp://ftp.itrc.hp.com/openvms_patches/alpha/V6.2X/

Note that ALPCLUSIO01_062 contains a version of APB.EXE, so after installing it you must reinstall the newer version of APB.EXE that I gave you.
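
As a rough sketch of that sequence: this assumes the kits are installed with VMSINSTAL on the running V6.2-1H3 system. The kit directory DKA200:[KITS] and the location of the saved copy of the newer APB.EXE are hypothetical; the BACKUP and WRITEBOOT steps are the ones given earlier in this thread.

```dcl
$! Install the ECO kit (DKA200:[KITS] is a hypothetical kit location)
$ @SYS$UPDATE:VMSINSTAL ALPCLUSIO01_062 DKA200:[KITS]
$! The kit replaces APB.EXE, so put the newer APB.EXE back afterwards
$! (DKA200:[KITS]APB.EXE is a hypothetical saved copy of the unzipped file)
$ BACKUP DKA200:[KITS]APB.EXE SYS$SYSDEVICE:[VMS$COMMON.SYSEXE]/NEW/OWNER=ORIGINAL
$! Verify the file still has "Contiguous" and "MoveFile disabled" attributes
$ DIRECTORY/FULL SYS$SYSDEVICE:[VMS$COMMON.SYSEXE]APB.EXE;
$! Re-run WRITEBOOT as before, answering N for the VAX portion and Y for the AXP portion
$ MCR WRITEBOOT
```

The other two kits would presumably be installed the same way, before the APB.EXE step.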
I have one, but it's personal.