
 
Doug_81
Frequent Advisor

Telnet problem decnet ok

Two Alpha node cluster:
Node1 - VMS V7.3, TCPIP V5.1
Node2 - VMS V7.1-1H2, TCPIP V4.2

The problem is that new telnet connections to either node result in the user seeing a string of y's (each with two dots above it, i.e. ÿ) and being unable to communicate.

Accessing with a SET HOST command (i.e. DECnet) works fine.

After a "while" (not sure how long), the problem on Node2 cleared. However, Node1 required a reboot to clear the problem.

During this problem time, all users who were already connected experienced no problem.

Does anyone have any ideas, or has anyone seen this before?

11 REPLIES
Uwe Zessin
Honored Contributor

Re: Telnet problem decnet ok

I remember there was a TELNET negotiation bug in, I believe, V5.0A, but it resulted in a waste of CPU cycles on the destination system. Not that it helps you here, but perhaps it helps to trigger somebody else's brain.
Jan van den Ende
Honored Contributor

Re: Telnet problem decnet ok

Well,

I'm not sure if it IS the same, but (IIRC, at home now) we had something like it with TCP/IP 5.1.
Under some condition(s) (not really trappable, not reproducible, but rather frequent) we had a 'runaway' IP slot allocation, and once it started, it kept creating more slots than could be released. And then the Citrix servers started retrying and retrying and crashed with exhausted resources.
A patch for TCP/IP (can't remember which) helped us.
Please check that you have the latest patches installed!


hth

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Martin P.J. Zinser
Honored Contributor

Re: Telnet problem decnet ok

Hello Doug,

I did not directly experience this problem, but the situation where you are likely to see a y dieresis is when you connect a terminal via a serial line and have not set the line characteristics correctly. So, is it possible that someone had fiddled with IP parameters on these systems in the volatile database, so that they were restored from the permanent settings by the reboot?
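
One possibly relevant detail: a y dieresis (ÿ) is how byte 0xFF is drawn in ISO Latin-1, and 0xFF is also the Telnet IAC ("Interpret As Command") byte defined in RFC 854. So a run of ÿ on a new telnet session would be consistent with raw option negotiation being displayed instead of interpreted, though that is only a guess about these systems. Here is a minimal Python sketch of the mapping; the sample bytes and the function names are made up for illustration and were not captured from Node1 or Node2.

```python
# Telnet command bytes from RFC 854. If a client prints these bytes
# instead of interpreting them, a Latin-1 terminal shows accented letters.
TELNET_COMMANDS = {
    0xFF: "IAC",   # Interpret As Command, displayed as y dieresis
    0xFE: "DONT",
    0xFD: "DO",
    0xFC: "WONT",
    0xFB: "WILL",
}


def as_latin1(data):
    """Render raw bytes the way an ISO Latin-1 terminal would display them."""
    return data.decode("latin-1")


def name_commands(data):
    """Label Telnet command bytes; any other byte is shown as hex."""
    return [TELNET_COMMANDS.get(b, "0x%02X" % b) for b in data]


# Illustrative sample only (an assumption, not a real trace):
# IAC DO 0x18 (the TERMINAL-TYPE option), then three bare IAC bytes.
sample = bytes([0xFF, 0xFD, 0x18, 0xFF, 0xFF, 0xFF])

print(repr(as_latin1(sample)))
print(name_commands(sample))
```

If a network trace of a failing connection is available, feeding the first bytes sent by the server through something like this would show whether the ÿ characters are negotiation bytes or line noise.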

Greetings, Martin
Doug_81
Frequent Advisor

Re: Telnet problem decnet ok

I just took over support of these systems and the previous support person had very little knowledge of VMS. I just spoke to that person and found out some additional information.

This cluster (did I mention that they are clustered?) is also running Raxco's PerfectCache, which apparently has a bug.

What they have done to work around a problem is to start the cacher on Node2, then start it on Node1. After it starts on Node1, stop it on Node2. If it is not stopped on Node2, then Node2 will crash.

This is what happened here. Node2 crashed and rebooted. These nodes reboot so fast that the person who reported this to me saw it as one minute it had a problem, and then it cleared itself.

I suspect it "cleared itself" by crashing/rebooting. After rebooting Node1, the problem cleared.

The crash dump on Node2 indicates:
CPU 00 reason for Bugcheck: MACHINECHK, Machine check while in kernel mode

Process currently executing on this CPU: PMDFFE l

Current image file: $1$DKA100:[SYS0.SYSCOMMON.][SYSEXE]MAIL_SERVER.EXE

This is now looking like a conflict between Raxco's "Perfect"Cache and Process Software's PMDF mail server.
Andreas Fassl
Frequent Advisor

Re: Telnet problem decnet ok

Hi Doug,

Even if this looks like the typical support answer, you should try to upgrade and patch both systems to at least a supported configuration. TCPIP Services 4.2 in particular is a very old dog.
Neither Raxco nor PMDF will accept your analysis "problem between raxco and PMDF" until you have a stable, supported OS and TCPIP stack configuration.

If possible, upgrade to 7.3-2 and the newest TCPIP-Release.

If not, you should first try to have a look at the system without Raxco.

Regards

Andreas
Ian Miller.
Honored Contributor

Re: Telnet problem decnet ok

Upgrade to VMS V7.3-2. Try disabling raxco perfectcrash, enable XFC, and check the performance of the system. You may find an opportunity to save money by not paying for the raxco product.

Try disabling the raxco product anyway and see if the system is more stable. A slower but stable system is better than a fast and unreliable one.
____________________
Purely Personal Opinion
Doug_81
Frequent Advisor

Re: Telnet problem decnet ok

Thank you all for your input.
I'm shutting down PMDF mail on Node2 tomorrow and upgrading TCP/IP to V5.1 next week.

Same old story with the OS upgrade: the Application Support group can't support a newer version. I'll try the IP upgrade and let you know. It could be a month before I know if it has solved the problem. I'll be in touch.
Martin P.J. Zinser
Honored Contributor

Re: Telnet problem decnet ok

I do concur with Ian on the suggestion to try XFC. It can do you quite some good. You should stay current on patches, though, as XFC is a non-trivial part of the OS (e.g. for 7.3-2 the Update 1 patch contains a number of fixes for XFC).

Greetings, Martin
Ian Miller.
Honored Contributor

Re: Telnet problem decnet ok

Note that XFC will only work if all nodes in the cluster have XFC enabled. So V7.1 is no good.
____________________
Purely Personal Opinion
Andreas Fassl
Frequent Advisor

Re: Telnet problem decnet ok

Hi,

Before upgrading from 4.2 to 5.1 (or higher, if possible), please read the release notes in depth. There are some caveats to consider.

When starting up TCPIP after the upgrade, you should think about enabling a higher debug level; please check the docs for how to do this.

Regards

Andreas
Doug_81
Frequent Advisor

Re: Telnet problem decnet ok

Thank you all for all your input.
I've upgraded UCX on both nodes to the latest version supported on the VMS version running on each node, i.e.:
OpenVMS V7.1-1H2 - UCX Version V5.1 - ECO 5
OpenVMS V7.3 - UCX Version V5.4

The strange characters on a telnet connection haven't shown up since. It's been long enough, but the system has been crashing during this time, so the results aren't conclusive.

The crashing problem appears to be related to a suspect FDDI card, which we replaced last Friday. That could also be a cause of the telnet problem. Personally, I think it's a combination of the two.

Time will tell.
Thanks again,
Doug