Operating System - OpenVMS
DECServers Rebooting

 
Zeni B. Schleter
Regular Advisor

We have added 4 ES47s to our network. After several weeks, all of a sudden the DECServer 200s started rebooting. The reboots are simultaneous, and the elapsed time between reboots varies from a few minutes to an hour or more. We will be checking the network, but has anyone experienced the same problem? Could we have enabled something on the ES47s when turning on LAT connections that could have caused this?

The DecServers appear to reboot successfully and are functional. No errors are being reported anywhere that I can find.

Details: A VAX running VMS V7.1 is doing the downloads. The new ES47s are VMS V7.3-2. DECnet IV on all systems.
16 REPLIES
Robert Gezelter
Honored Contributor

Re: DECServers Rebooting

Zeni,

Obviously, running a monitor on the network to capture the simultaneous failure would be a good idea.

Another possible cause of the problem would be a power glitch of some kind. Are there any other symptoms on any other equipment in the area?

I cannot think of a reason why the presence of an ES47 would cause a problem with terminal servers. If anything caused a problem, I would suspect 7.3-2, but any such problem should have been noted a long time ago.

I hope that the above is helpful.

- Bob Gezelter, http://www.rlgsc.com
Zeni B. Schleter
Regular Advisor

Re: DECServers Rebooting

The terminal servers are seeing a lot of Unrecognized destinations. An HP network support person said that a burst of these could exceed the terminal servers' capabilities and cause them to crash. The source of these frames is what we will look for.

I was wondering if there were limits to the LAT services or whether the ES47 would broadcast more frequently. In setting up the LAT services we took the defaults. If an ES47 does not do anything more than any other VMS host servicing LAT, then I do not see how adding these hosts to the network caused the problem we are experiencing. The ES47s replaced AlphaServers that were at the same VMS version.

I am focusing on DECNET in that we have IP traffic but I thought that should be transparent to the DECServers. That may be a wrong assumption. The DECServers are model 200. They have been in service for many many years.

Thanks for your response.
Veli Körkkö
Trusted Contributor

Re: DECServers Rebooting

what is the version of software running on those DECserver 200? I think latest would be V3.3.

Ages ago I had an issue with non-DEC terminal servers after a VMS upgrade. That upgrade changed LAT from the very old "slave only LAT" to LATmaster (master and slave capabilities), and somehow the third-party terminal server disliked the new LAT on VMS.

You have here though VAX/VMS V7.1 so there is not that much difference in LAT protocol on VAX V7.1 and Alpha V7.3-2 probably.

One could of course attach a terminal to the port 1 of terminal server and see what kind of messages the server issues on the "console" when dying?

_veli
Volker Halle
Honored Contributor

Re: DECServers Rebooting

Zeni,

did you look at the SHOW SERVER STATUS display? Especially the Software Status line at the bottom of the screen? As far as I remember - and it's a long time ago - if the DECserver crashed with a bugcheck, it would record information about the bugcheck in that line.

You could compare, if all DECservers would be crashing with the same bugcheck information.

Make sure you are running the most recent available software version for the DECservers.

Do the DECservers write dumps to their load host when crashing? If so, watch out for the disk space!

There could be a timing problem in the LAT protocol implementation on the DECserver that does not get triggered with older/slower OpenVMS systems.

Volker.
Uwe Zessin
Honored Contributor

Re: DECServers Rebooting

Unrecognized destinations...

If I recall correctly, this means that the DECserver has received Ethernet frames with protocol types it does not understand, e.g. TCP/IP.
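
To make the protocol-type point concrete, here is a minimal Python sketch that reads the type field of an Ethernet II frame and flags anything a LAT terminal server would probably not handle. The EtherType values are the registered ones. The SERVER_PROTOCOLS set (LAT plus MOP) and the function names are assumptions for illustration, not something taken from the DECserver firmware.

```python
# Registered Ethernet protocol types (EtherType values).
ETHERTYPES = {
    0x6001: "DEC MOP dump/load",
    0x6002: "DEC MOP remote console",
    0x6003: "DECnet Phase IV",
    0x6004: "DEC LAT",
    0x0800: "IPv4",
    0x0806: "ARP",
}

# Assumption: the terminal server only services LAT and MOP traffic.
SERVER_PROTOCOLS = {0x6001, 0x6002, 0x6004}


def ethertype_of(frame: bytes) -> int:
    """Return the 16-bit type field of an Ethernet II frame (bytes 12-13)."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    return int.from_bytes(frame[12:14], "big")


def would_be_unrecognized(frame: bytes) -> bool:
    """True if the frame carries a protocol outside SERVER_PROTOCOLS."""
    return ethertype_of(frame) not in SERVER_PROTOCOLS


def describe(frame: bytes) -> str:
    """Human-readable name of the frame's protocol type."""
    value = ethertype_of(frame)
    return ETHERTYPES.get(value, f"unknown (0x{value:04X})")
```

Fed with frames from a network capture, a count of would_be_unrecognized() per source address might help point at whichever host is sending the burst.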
Zeni B. Schleter
Regular Advisor

Re: DECServers Rebooting

The DecServer software versions are old: v1.0.

Show Status line appears to give current status and not status of crash. Did give useful info though. CPU is 6% used and memory is 12% used. It also confirmed that we are well below max number of services, nodes, and the like.

Digging up a terminal and a printer for Port 1 on one of the terminal servers is a good idea. The software is so old that the dumps are unreadable.
Veli Körkkö
Trusted Contributor

Re: DECServers Rebooting

Well, maybe upgrading to the latest software (even though that is very old too) would be useful.

_veli
Robert Gezelter
Honored Contributor

Re: DECServers Rebooting

Zeni,

If the DECservers are very far behind the last version of the software issued, I would certainly upgrade at least ONE of them before going too far down the debugging path.

As I noted, the ES47 should not be doing anything special or different.

- Bob Gezelter, http://www.rlgsc.com
Andy Bustamante
Honored Contributor

Re: DECServers Rebooting

>>> VAX VMS 7.1

There was an Alpha VMS 7.1x bug in LAT. On a "busy" network, LAT packets that generated a collision and required a retransmission were padded with random values. The DECserver requests retransmission of the mangled packet, the packet is padded once again... repeat until the DECserver reboots. Depending on your network, the value of "busy" may have been reached.

Check for LAT ECOs or consider an upgrade to VAX 7.3 if possible.

If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Lawrence Czlapinski
Trusted Contributor

Re: DECServers Rebooting

Zeni:
1. If it were one server, I would suspect a susceptibility to heat or a problem with the power to the terminal server. Since the reboots are simultaneous (I assume in various locations), I would update at least one of the DecServer 200's to V3.3 and see if that server stays up when the others go down. If it stays up, there is a good chance your problem is solved. The later versions fixed crash problems. In any case, the first thing HP would say is upgrade your DecServer 200's to V3.3.
2. A bugcheck message would be reported on the console port (port 1; 9600 baud).
" 3.3 SEVERE ERRORS
Severe errors may cause your DECserver 200 to hang or bugcheck.
Server hangs are usually recovered after 20 seconds by a automatic power fail, followed by a downline load. If this should occur, please describe as best you can the operating conditions on the server at the time of the hang.
If a FATAL BUGCHECK occurs, a bugcheck message will be printed out on the console terminal, showing the vital registers at the time of the bugcheck. Normally, an upline crash dump will automatically be created upon a fatal bugcheck error."
3. The second thing HP would say would be to upgrade to VAX VMS 7.3.
Lawrence
Volker Halle
Honored Contributor

Re: DECServers Rebooting

Zeni,

re: Unrecognized destinations...

Ethernet packets counted as 'unrecognized destinations' will most probably be broadcast packets. The packets are being received by the DECserver (because they are addressed to its MAC address or an enabled multicast address, or are broadcast packets). The DECserver would only expect LAT unicast or multicast packets and would not be able to handle other protocols.
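
As a rough illustration of the three address classes, here is a small Python sketch that classifies a destination MAC address as unicast, multicast or broadcast. It relies only on the standard Ethernet rule that the least-significant bit of the first octet marks a group address. The function names are hypothetical, and the sample addresses in the comments are only examples.

```python
def parse_mac(text: str) -> bytes:
    """Parse 'AA-00-04-00-01-04' or 'aa:00:04:00:01:04' into 6 raw bytes."""
    parts = text.replace(":", "-").split("-")
    if len(parts) != 6:
        raise ValueError(f"not a 6-octet MAC address: {text!r}")
    return bytes(int(part, 16) for part in parts)


def classify_destination(mac: str) -> str:
    """Return 'broadcast', 'multicast' or 'unicast' for a destination MAC."""
    raw = parse_mac(mac)
    if raw == b"\xff" * 6:
        return "broadcast"      # all ones: every station receives it
    if raw[0] & 0x01:
        return "multicast"      # group bit set, e.g. 09-00-2B-00-00-0F (LAT)
    return "unicast"            # addressed to a single station
```

Tallying classify_destination() over the destination addresses in a capture would show whether the suspect traffic is mostly broadcast, as guessed here.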

Do you also see high multicast frames received? Do you have a high percentage of broadcast messages on your LAN, maybe something like a 'broadcast storm' from time to time?

If the DECserver would bugcheck, the bugcheck message should be indicated on the SHOW SERVER STATUS screen after the reboot (just checked the docu).

Volker.
Dale A. Marcy
Trusted Contributor

Re: DECServers Rebooting

Zeni is not in this morning; she will be in later. I checked the patches on the VAX that boots the DECServers (running VMS V7.1), and it has the VAXLAT02_071 patch applied, which appears to be the latest patch for V7.1 VMS LAT. Upgrading VMS is not allowed at this time because applications would have to be re-certified to run under a newer version of VMS. We are looking into upgrading the DECServer software. I am not sure how far Zeni got in getting a terminal connected to port 1 of the DECServer. The only reason Zeni mentioned the ES47s was that the only change we were aware of when the DECServers started rebooting was a cluster reboot of 2 of the ES47s. It might have just been a fluke that it happened at the same time. They should not have any effect on the DECServers. The DECServers boot from a VAX and are not generally used to connect to the ES47s.
Dale A. Marcy
Trusted Contributor

Re: DECServers Rebooting

Was able to capture output from Port 1 on a rebooting DECServer. I am manually typing this in, so could be subject to error:

Local -913- Fatal Bugcheck PC=184EC8, SP=05D8A8, SR=2710, MEM=000000, CODE=100
Local -905- Waiting for image dump
Local -906- Dumping to host "xx-xx-xx-xx-xx-xx"
Local -907- Image dump complete
Local -908- Resetting console terminali
Local -901- Initializing DECserver "xx-xx-xx-xx-xx-xx" -- ROM BL20, H/W Rev D.A
Local -902- Waiting for image load
Local -903- Loading from host "xx-xx-xx-xx-xx-xx"
Local -904- Image load complete

I changed the real addresses to xx and the terminali was not a typo, that is how it printed out. I have the distribution for the upgrade to V3.3, but have not been able to move it to the VAX system because of other problems (not with the VAX, but the other systems I have to use to transfer it to the VAX).
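
If the other servers get a console attached too, a small script could pull the register fields out of each -913- line so the crashes can be compared. This is only a sketch based on the one line captured above: the field layout is assumed from that line, the values are kept as text because the radix is not stated anywhere, and the function names are made up for illustration.

```python
import re
from typing import Iterable, Optional

FIELDS = ("PC", "SP", "SR", "MEM", "CODE")

# Layout assumed from the single captured line: NAME=value pairs
# separated by commas, following the "-913- Fatal Bugcheck" text.
BUGCHECK_RE = re.compile(
    r"-913- Fatal Bugcheck\s+"
    + r",\s*".join(rf"{name}=(?P<{name}>[0-9A-Fa-f]+)" for name in FIELDS)
)


def parse_bugcheck(line: str) -> Optional[dict]:
    """Return the bugcheck fields as strings, or None for any other line."""
    match = BUGCHECK_RE.search(line)
    return match.groupdict() if match else None


def distinct_signatures(lines: Iterable[str]) -> set:
    """Collect the distinct (PC, CODE) pairs seen across console lines."""
    signatures = set()
    for line in lines:
        fields = parse_bugcheck(line)
        if fields is not None:
            signatures.add((fields["PC"], fields["CODE"]))
    return signatures
```

If distinct_signatures() over the console logs of all servers comes back with a single pair, they are presumably all dying at the same place.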
Veli Körkkö
Trusted Contributor

Re: DECServers Rebooting

Code 100: Memory parity error.

However,

MEM = The illegal memory address of an addressing error or the address of the instruction that caused the error.

and this was 0... I would not spend time upgrading to VMS V7.3. I would much rather upgrade to the latest DECserver software, i.e. install the DS2033 kit on the load host(s) and reboot the terminal server.

v1.0 must be quite old since I can see V3.0 on CONDIST from Nov 1989 to March 1991, V3.1 from May 1991 to Jan 1994 and V3.3 from May 1994 to 1997 March.

_veli
Zeni B. Schleter
Regular Advisor

Re: DECServers Rebooting

We updated the DecServer software to v3.3 and triggered a reboot of one of the servers. After 50 minutes the other two rebooted. The server with v3.3 did not reboot. This is promising.

I appreciate all the information that has been posted. It has been very helpful.
Zeni B. Schleter
Regular Advisor

Re: DECServers Rebooting

We have not experienced any more rebooting since the DecServer 200 software was upgraded to v3.3 and loaded. Thanks for all the information and insight.