Operating System - OpenVMS

David J Dachtera
New Member

TCP/IP and "Round Robin" issues

Preface:
I use the terms "UCX" and "TCP/IP Services for OpenVMS" interchangeably, mostly for reasons of brevity.

I've been asking this question everywhere I can and, to date, no one has been able to come up with an answer.

Our site recently migrated to UCX (from Multinet), mostly at our application vendor's request. So far, "faster is better", with one very serious limitation.

(Apologies if the forum scrambles the format of this...)

Consider:

$ ucx sho int
                                                 Packets
Interface  IP_Addr        Network mask        Receive        Send   MTU

LO0        127.0.0.1      255.0.0.0           5288801     5288801  4096
WE0        xxx.xx.xx.205  255.255.255.0    3346932877  3543180500  1500
WE1        xxx.xx.xx.206  255.255.255.0    2596612243  2567812743  1500

If WE1 loses IP connectivity due to a downstream issue, UCX never detects it as long as the interface still has a physical link. Because outbound packets destined for the xxx.xx.xx/24 subnet are routed "round robin" across both interfaces, production effectively goes down: connectivity to the network falls to less than 40% reliability.
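
To make the effect concrete, here is a minimal sketch (in Python, purely for illustration) of strict per-packet alternation across two routes when one of them silently drops traffic. The function name and the assumption of strict alternation with no retransmission are illustrative simplifications, not a description of how UCX actually schedules packets.

```python
from itertools import cycle


def round_robin_delivery(routes, healthy, packets):
    """Fraction of packets delivered when outbound packets alternate
    strictly across `routes` and only routes in `healthy` deliver.

    Assumption: strict per-packet alternation, no retransmission.
    """
    chooser = cycle(routes)
    delivered = sum(1 for _ in range(packets) if next(chooser) in healthy)
    return delivered / packets


# WE1 still has link but has lost IP connectivity downstream.
print(round_robin_delivery(["WE0", "WE1"], {"WE0"}, 1000))  # prints 0.5
```

Under these assumptions half of the individual packets are lost. Any exchange that needs several consecutive packets to get through does worse than that, which is at least consistent with the reliability figure quoted above.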

Any thoughts as to how to fix this?

The only way to restore production that I've found to date has been to manually reduce the routing table from this:

$ ucx show route

DYNAMIC

Type  Destination       Gateway

AN    0.0.0.0           xxx.xx.xx.1
AH    127.0.0.1         127.0.0.1
AN    xxx.xx.xx.0/24    xxx.xx.xx.205
AN    xxx.xx.xx.0/24    xxx.xx.xx.206
AH    xxx.xx.xx.205     xxx.xx.xx.205
AH    xxx.xx.xx.206     xxx.xx.xx.206

...to this:

$ ucx show route

DYNAMIC

Type  Destination       Gateway

AN    0.0.0.0           xxx.xx.xx.1
AH    127.0.0.1         127.0.0.1

...provided the default route is still associated with a working interface.

UCX Engineering is stumped, as is support, and no one in the vmsnet.networks.tcp-ip.ucx or comp.os.vms newsgroups has come up with anything useful.

Any ideas anyone may have will be tremendously helpful.

Thanx much.
7 REPLIES
Karl Rohwedder
Honored Contributor

Re: TCP/IP and "Round Robin" issues

Have you considered FailsafeIP? I haven't used it myself so far, but I think it monitors the send/receive counters and moves the IP address to another interface.

regards Kalle
David J Dachtera
New Member

Re: TCP/IP and "Round Robin" issues

Yes, I have considered it (Failsafe IP), but it does not look like it will solve the problem since it only moves IP addresses when it detects failures, and this failure is not detectable.

Thanx for your reply, though. All input is appreciated and welcome.
Volker Halle
Honored Contributor

Re: TCP/IP and "Round Robin" issues

David,

welcome to the OpenVMS ITRC forum !

Regarding the 'detectability' of the problem: wouldn't a PING some-dest-addr drop to about 50% packet loss in such a situation ?

FailsafeIP just monitors the 'bytes received' counters of each interface, so unless one of the interfaces stops receiving packets, this won't help.
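
A minimal sketch of that kind of counter-based check (in Python, for illustration only; this is not FailsafeIP's actual code) shows why it misses this failure. The first sample uses the Receive values from the SHOW INTERFACE output in the original post; the second sample is hypothetical and assumes WE1 still hears some local traffic even though its path further downstream is broken.

```python
def stalled_interfaces(previous, current):
    """Names of interfaces whose receive counter did not advance
    between two samples (the only failure this check can see)."""
    return sorted(
        name
        for name, count in current.items()
        if name in previous and count == previous[name]
    )


# First sample: Receive counters from the original post.
before = {"WE0": 3346932877, "WE1": 2596612243}
# Second sample: hypothetical values; WE1 still receives local traffic.
after = {"WE0": 3346990000, "WE1": 2596612900}

print(stalled_interfaces(before, after))  # prints []
```

Because both counters keep moving, nothing is flagged, even though packets sent out through WE1 no longer reach their destination.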

Volker.
Volker Halle
Honored Contributor

Re: TCP/IP and "Round Robin" issues

David,

just came across TCPIP> HELP ROUTE

...
12. To change existing network route 206.98.17 using interface device tu0 and gateway 206.98.17.45 to use device tu1 and
gateway 206.98.17.162, enter the following command:

TCPIP> route change -net 206.98.17 206.98.17.162 -olddev tu0 -dev tu1 \
-oldgateway 206.98.17.45

Would a similar command be worth trying in your scenario?

Volker.
David J Dachtera
New Member

Re: TCP/IP and "Round Robin" issues

Changing the default route is only needed when the interface currently indicated by the default route loses IP connectivity, and this requires manual intervention.

We're looking for something that completely sidesteps any need for manual intervention ("lights out, unattended").

Thanx for replying. I appreciate all the input I can get.

D.J.D.
Volker Halle
Honored Contributor

Re: TCP/IP and "Round Robin" issues

David,

if you're looking for a 'configuration option' to automatically take care of this problem, this may not be possible.

My last 2 comments were trying to suggest procedures to be able to 'detect' the problem and trying to 'fix' it, manually or within an appropriate DCL procedure.

Using FailsafeIP and creating each interface's IP address with a failover address on the other interface may also provide the mechanism to easily fail over the IP address of the 'failing' interface (if you can detect that one) to the 'working' interface (using the TCPIP$FAILSAFE_FAILED_int logical).
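
The detect-then-fail-over idea could be sketched as follows (Python for illustration; the same decision logic could live in a DCL procedure). Everything here is hypothetical: `probe` stands for whatever per-interface reachability test turns out to be possible on this stack, the thresholds are arbitrary, and the step that actually triggers the failover (for example via the logical mentioned above) is deliberately left to the caller because its exact syntax is not given in this thread.

```python
def plan_failover(probe, interfaces, attempts=10, max_loss=0.2):
    """Decide which interfaces should hand their address to which peer.

    probe(name) -> bool is a caller-supplied, hypothetical reachability
    test for one interface. Returns {failing_interface: working_interface}.
    """
    loss = {}
    for name in interfaces:
        failures = sum(1 for _ in range(attempts) if not probe(name))
        loss[name] = failures / attempts

    working = [name for name in interfaces if loss[name] <= max_loss]
    failing = [name for name in interfaces if loss[name] > max_loss]
    if not working:
        return {}  # nowhere safe to move an address to
    return {name: working[0] for name in failing}


# Hypothetical probe: only WE0 can still reach the test destination.
print(plan_failover(lambda name: name == "WE0", ["WE0", "WE1"]))
# prints {'WE1': 'WE0'}
```

Run periodically from a batch job, a plan like this would only be acted on when at least one interface still passes its probe, which avoids moving addresses around during a total outage.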

Volker.
David J Dachtera
New Member

Re: TCP/IP and "Round Robin" issues

Thanx, Volker!

I certainly appreciate the input.

As you can imagine, this is rather a thorn in our side. We made this change because the vendor was convinced it would solve the high interrupt service demand time we were seeing with Multinet. At least Multinet could survive the loss of an interface (other than the default route) without bringing production down.