
Re: openVMS boot strap failure

 
SOLVED
nipun_2
Regular Advisor

openVMS boot strap failure

Hi,
I have a 3-node common-environment cluster on an Ethernet LAN running OpenVMS 7.3-1.

The systems were running fine, and I recently rebooted them.
I started the main node and, while it was starting up, I started the first satellite node (Sat1).

When these two nodes were up, I started the second satellite node (Sat2).

However, Sat2 did not boot.

This is how the events unfolded:

>>>boot ewa0
starting MOP boot
....

bootstrap failure
^C to abort
retrying..

and this goes on.

Is the boot order a problem?

I can start with the server node, then Sat2, and then Sat1.

Let me know if you have any ideas.

I am not well versed in OpenVMS, so please give me specific commands or point me to the specific documentation.

Thanks in advance
16 REPLIES
John Donovan_4
Frequent Advisor

Re: openVMS boot strap failure

I have always started 1 node at a time when bringing up the whole cluster. main, sat1 & sat2. Your main system should be up completely before starting the satellite nodes.
"Difficult to see, always in motion is the future..."
Volker Halle
Honored Contributor

Re: openVMS boot strap failure

Nipun,

please check SYS$MANAGER:OPERATOR.LOG on your boot member (main node) for any messages logged while trying (unsuccessfully) to boot SAT2, and consider including these messages in your next reply.

Volker.
nipun_2
Regular Advisor

Re: openVMS boot strap failure

Hi Hein,
Here is the operator log file.
This morning I tried again to boot Sat2, in the following sequence:
Main server (completely up and running)
boot Sat2 (same error as before: bootstrap failure)

boot Sat1 (successfully booted)

I am attaching the log file.

Please let me know what you guys think.

Note: SC4238 is the name of the Server node
Volker Halle
Honored Contributor

Re: openVMS boot strap failure

Nipun,

there is nothing in the OPERATOR.LOG file from a successful or even an unsuccessful boot attempt of Sat2. This implies that the MOP boot request did not even reach node SC4238.

Did it ever work ? How do you normally boot Sat2 ? Via DECnet MOP (MC NCL SHOW MOP CLIENT *) or LANCP MOP (MC LANCP SHOW NODE) ?

At 10:52, there are LANACP messages showing that EIA0 and EWA0 are being enabled for MOP downline loading, but this is 1 minute before shutdown of node SC4238 ?!

If you only have a DTSS server configured on your LAN, you might want to get rid of those 'Too Few Servers Detected' DECnet event messages by using the following NCL command:

$ MC NCL
BLOCK EVENT DISPATCHER OUTBOUND STREAM local_stream GLOBAL FILTER -
((NODE,DTSS), Too Few Servers Detected)

This command can be executed interactively. The supported place for this command is SYS$MANAGER:NET$EVENT_LOCAL.NCL (see the comments in SYS$MANAGER:NET$EVENT_LOCAL.TEMPLATE).

Volker.
Joseph Huber_1
Honored Contributor

Re: openVMS boot strap failure

In your operator.log (it is dated 5-JUN-2003; is it really from today?), LANCP MOP services are enabled, but there is no record of any boot attempt, successful or unsuccessful.
This means the MOP boot requests from Sat2 do not arrive at the server.
Is EWA0 connected to the right LAN?
http://www.mpp.mpg.de/~huber
nipun_2
Regular Advisor

Re: openVMS boot strap failure

Following the discussion here, I changed the Ethernet cable and retried the boot. It still did not work.
Then I shut down the Sat2 node (turned off the power) and started it again (it was configured to auto-boot through ewa0).

The following message first showed up.

ED.E8.E7....
ewa0: link failed: Using 100BaseTX:full duplex.

Then it went through some other messages and finally came to the SRM console prompt.
>>>

Also, I have been using a 2608 Gigabit Ethernet switch. The ports for the other two nodes each show two blinking green LEDs (indicating Gb data transfer), but the port for the Sat2 cable shows only one blinking amber LED (indicating 100 Mb data transfer?).
So it looks like this (the last column is the Sat2 port):
SPD/LNK/ACT   Green (blinks)   Green (blinks)   Amber (blinks)
FDX/HDX       Green (blinks)   Green (blinks)   NO LIGHT

So I am wondering whether the Ethernet card could be the problem. Is there any way to test this from the SRM console?

I can boot Sat2 as a standalone machine, if that might help with some hardware test.

nipun_2
Regular Advisor

Re: openVMS boot strap failure

I should have added this
OpenVMS 7.3-1, common system environment
The network setup is as shown below

Server -> 2608 Gigabit switch
Sat1 -> 2608 Gigabit Switch
Sat2 -> 2608 Gigabit Switch

Thus all machines communicate over the switch.


comarow
Trusted Contributor
Solution

Re: openVMS boot strap failure

Assuming nothing has changed from VMS,
your network manager may have filtered out SCS or MOP traffic.

Issue the command Reply/network to see if you are getting MOP requests.

The system could also be broken. As I always say, try booting from a CD to help isolate hardware, versus disk or networks.

Bob
Jan van den Ende
Honored Contributor

Re: openVMS boot strap failure

Slight addendum, and slight correction, to Comarow's last answer:

you need to have OPER priv for the REPLY command, and the syntax is:
$ REPLY/ENABLE=NET
(switch off again by /DISABLE)

As always: if not exactly sure about any syntax, then HELP is a real friend!!
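
As a rough illustration of the REPLY syntax above, the session on the boot server might look something like the sketch below. It assumes the account is authorized for OPER, and that MOP load events are reported to the NETWORK operator class on this system; both are assumptions to verify, not facts from this thread.

```dcl
$! Sketch only: run on the boot server (SC4238) before retrying the satellite boot.
$ SET PROCESS/PRIVILEGES=OPER      ! assumes the account is authorized for OPER
$ REPLY/ENABLE=NETWORK             ! make this terminal a network operator terminal
$! Now retry ">>> boot ewa0" on the Sat2 console and watch this terminal
$! for any MOP / downline load messages.
$ REPLY/DISABLE=NETWORK            ! switch the operator terminal off again
```

If nothing at all appears while Sat2 retries, that would be consistent with the boot requests never reaching the server.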

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Joseph Huber_1
Honored Contributor

Re: openVMS boot strap failure

The message
ewa0: link failed: Using 100BaseTX:full duplex.
clearly indicates the ethernet link is not working.
This does not necessarily mean the interface or cable is broken; it may just be a protocol mismatch between the ewa0 interface and the Gigabit switch.
Is 100 Mbit full duplex set in the ewa0_mode variable, or is it the result of autonegotiation?
Do a SHOW ewa0* to see, and then try the variations
set ewa0_mode auto / fast / fast_fdx
(I'm not quite sure about the syntax; SRM should tell you the valid options if these are wrong.)
http://www.mpp.mpg.de/~huber
nipun_2
Regular Advisor

Re: openVMS boot strap failure

Hi Joseph,
You are right, I think it is more of a network problem. Does the info I posted before support that? (Pasting it again for convenience.)

Also, I have been using a 2608 Gigabit Ethernet switch. The ports for the other two nodes each show TWO blinking green LEDs (indicating Gb data transfer), but the port for the Sat2 cable shows only ONE blinking amber LED (indicating 100 Mb data transfer?).
So it looks like this (the last column is the Sat2 port):
SPD/LNK/ACT   Green (blinks)   Green (blinks)   Amber (blinks)
FDX/HDX       Green (blinks)   Green (blinks)   NO LIGHT

Thus (I think) it clearly indicates that Main server has no link with Sat2

Regarding ewa0 mode

I did sh ewa0* and it displayed
ewa0_mode Auto Negotiate

I did set ewa0_mode Fast
but it did not make a difference

when I again did
set ewa0_mode Auto
It negotiated and then gave
Auto-Negotiate and 100 Mbps full

Note: The above mode has been working perfectly fine to date, so I don't think changing it would make any difference. But if you feel it might please let me know.

The following modes are available
Twisted-Pair
Full Duplex, Twisted-Pair
AUI
BNC
Fast
Fast FD (Full Duplex)
Auto-Negotiate





nipun_2
Regular Advisor

Re: openVMS boot strap failure


Hi,
I think the description of the LEDs I gave before is confusing. Please have a look at the attached txt file.

Terry Beales
Occasional Contributor

Re: openVMS boot strap failure

Hi,
I run a large cluster with _many_ satellite nodes and have occasionally had symptoms similar to yours on the odd node. One cause has been too many satellite nodes attempting to boot simultaneously (obviously not the cause in your case); another seems to be a poor-quality network connection. Setting EWA0_MODE to twisted pair will often 'fix' this, albeit with reduced performance. Otherwise, if I still can't get a MOP load, I call in the network tech to test the connections with a Fluke tester, which often identifies a dodgy cable or connection.
Incidentally - are both your satellites identical hardware?
nipun_2
Regular Advisor

Re: openVMS boot strap failure

Thanks for the reply Terry,

My Main Server is DS25
Sat1 - EV 68 (DS 25)
Sat2 - XP1000 (few years old)

I can provide more specifications if you need.


I tried to boot with Twisted Pair
set ewa0 Twisted-Pair

The light FDX/HDX still did not come to life.

What would cause the FDX/HDX light to stay off?
Does it indicate an error on the Sat2 side, on the server side, or in the switch itself?

If anyone has any ideas, please let me know.
Nipun


Antoniov.
Honored Contributor

Re: openVMS boot strap failure

Nipun,
>>> set ewa0 twisted_pair
sets the network device to 10 Mb/s, half duplex on my machine.
I guess you have to set auto mode; you can type
>>> set ewa0
and see the various options you can type.

Antonio Vigliotti
Antonio Maria Vigliotti
comarow
Trusted Contributor

Re: openVMS boot strap failure

Using lancp you can set the characteristics of the port.

If you use a Cisco switch, the command show tech will display all the settings (and nothing confidential).
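
On the LANCP point, a minimal sketch of what checking and setting the port characteristics could look like. It assumes Sat2 is booted standalone, that the adapter SRM calls ewa0 is also EWA0 under OpenVMS, and that the adapter's driver accepts these qualifiers; check HELP inside LANCP before relying on any of them.

```dcl
$! Sketch only: run on Sat2 while it is booted standalone.
$! EWA0 is an assumed device name; SHOW DEVICE with no name would list what is present.
$ MC LANCP SHOW DEVICE EWA0/CHARACTERISTICS    ! current speed, duplex and link settings
$ MC LANCP SHOW DEVICE EWA0/COUNTERS           ! error counters can hint at a bad cable or port
$! Either let the adapter negotiate with the switch ...
$ MC LANCP SET DEVICE EWA0/AUTONEGOTIATE
$! ... or force 100 Mb/s full duplex (the switch port should then be set to match).
$ MC LANCP SET DEVICE EWA0/SPEED=100/FULL_DUPLEX
```

These settings apply only to the running system. The console MOP boot still uses the SRM ewa0_mode value, so this is mainly useful for confirming whether the adapter can hold a link to the switch at all.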