Operating System - OpenVMS

Decnet V question, (DecNet over IP).

 
SOLVED
The Brit
Honored Contributor

At our site we are using mostly bl860c blades with OpenVMS 8.3-1H1, Decnet Phase V and TCPIP Services 5.6 ECO 5.

We have 4 standalone systems (Testing/Dev/etc) and a production cluster. The cluster is 3 blades + 1 Alpha (DS10 running OpenVMS 8.3, DecNet phase V and TCPWare 5.8.)

I am experiencing a problem with "Set Host" to/from the DS10 and any other cluster node.

Set Host To/from the DS10 to standalone nodes is fine.

Set Host To/From Blade to Blade within the cluster is fine.

The only place there is a problem is when Setting Host from a cluster blade to the DS10 (also a cluster member), or from the DS10 to another cluster member (i.e. a blade).

When a "Set Host" command is executed, it succeeds; however, it takes 40-50 seconds to connect (in either direction).

note: The blades are in c7000 enclosures. The standalone blades are in a different location (which is the reason for DECnet over IP).

The 3 cluster blades are in a different enclosure and the ds10 is connected directly to a port on the Ethernet interconnect, which provides the path for SCS.

Can anyone suggest a reason why DECnet should be so sluggish between a cluster blade and the DS10?

Note: This problem is with DECNet only, Telnet works fine to/from all nodes.

thanks

Dave.





14 REPLIES
Hoff
Honored Contributor

Re: Decnet V question, (DecNet over IP).

Check your DNS server setting for both of the DECnet Phase V boxes involved; DECnet-Plus (these days) is usually set for local and BIND translations, and delays as described can be triggered by forward or reverse translation (back translation) time-outs within BIND.
Steve Reece_3
Trusted Contributor

Re: Decnet V question, (DecNet over IP).

Is it any quicker if you do SET HOST IP$xx.xx.xx.xx and go directly to the IP address of the clustered blade or DS10? Similarly, what happens if you do SET HOST x.y and use DECnet protocols directly to the clustered blade/DS10?

Do all of the clustered blades/DS10 have all of their addresses in the local address databases and towers or are they looking for DNS translation for the IP addresses? Do all of the DNS servers return the IP addresses? (i.e. what happens if you do a TCPIP SHOW HOST ?)

The magic incantation that sometimes helps is the one for clearing cached names/addresses:
MC NCL FLUSH SESSION CONTROL NAMING CACHE ENTRY "*"

Bear in mind that DECnet Plus keeps the cache of addresses (the naming cache) and maintains this across reboots. The cached addresses may still be in cache and causing DECnet to "go there" first rather than the real address where they've been updated.

Steve
tsgdavid
Frequent Advisor

Re: Decnet V question, (DecNet over IP).

Could this be a DECnet phase IV vs. DECnet phase V issue?

I would make sure that, in DECNET_REGISTER on the node on which you issue SET HOST, the TP4 addresses are defined first and the NSP addresses are defined second (or not at all).

You can attempt to connect specifically to a phase IV or phase V address, to see if this is an issue, using

SET HOST NET$

The address ending in "20" is phase IV and the address ending in "21" is phase V. For example:

SET HOST NET$490028AA0054A021 will connect using a phase V address.

David Williams
The Brit
Honored Contributor

Re: Decnet V question, (DecNet over IP).

Steve

If I do a "set host XX.XX.XXX.XXX" the connection is created immediately.

If I do a "set host 1.7", the result is the same as using the hostname, i.e. 30-40 sec delay.

The "magical incantation" had no effect.

I have attached the output from DECNET_REGISTER for both the blades and the DS10.

I tried connecting to the net$... addresses with the following result

from BUD (Blade) to SPEEDY (DS10)

Bud:System>set host net$490001AA000400070420
%SYSTEM-F-UNREACHABLE, remote node is not currently reachable
Bud:System>set host net$490001AA000400070421
%SYSTEM-F-LINKEXIT, network partner exited

from BUD (Blade) to CITIUS (Blade)

Bud:System>set host net$490001AA000400230420

TESSCO Technologies - Unauthorized use is prohibited

Username: Exit

Bud:System>set host net$490001AA000400230421

TESSCO Technologies - Unauthorized use is prohibited

Username:

So this works between blades, but not between blade and DS10.
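
As a cross-check on the NET$ addresses tried above, here is a minimal Python sketch (Python 3.9+) that splits such an address into its trailing selector byte and the six-byte system ID in front of it. The function name `decode_net_address` is hypothetical, and the layout is an assumption based on the addresses shown in this thread: the last byte is the selector (20 or 21, as described earlier), and a system ID beginning AA-00-04-00 carries a Phase IV address in its last two bytes (little-endian, area in the top 6 bits, node in the low 10 bits).

```python
def decode_net_address(name: str) -> dict:
    """Split a NET$<hex> address into selector, system ID and Phase IV address.

    Assumes the last byte is the transport selector and the six bytes
    before it are the system ID, as in the addresses shown in this thread.
    """
    hex_part = name.upper().removeprefix("NET$")
    raw = bytes.fromhex(hex_part)
    if len(raw) < 8:
        raise ValueError("address too short to hold a system ID and selector")

    selector = raw[-1]
    system_id = raw[-7:-1]

    # Selector meanings as described in the thread (20 = Phase IV, 21 = Phase V).
    transport = {0x20: "NSP (Phase IV)", 0x21: "OSI transport (Phase V)"}.get(
        selector, "unknown"
    )

    # A system ID starting AA-00-04-00 embeds a Phase IV area.node address.
    phase_iv = None
    if system_id[:4] == bytes.fromhex("AA000400"):
        value = int.from_bytes(system_id[4:], "little")
        phase_iv = f"{value >> 10}.{value & 0x3FF}"

    return {
        "selector": f"{selector:02X}",
        "transport": transport,
        "system_id": system_id.hex("-").upper(),
        "phase_iv": phase_iv,
    }


if __name__ == "__main__":
    for addr in ("net$490001AA000400070420", "net$490001AA000400070421"):
        print(addr, decode_net_address(addr))
```

Under these assumptions the SPEEDY addresses decode to Phase IV address 1.7, which matches the "set host 1.7" test, so the two NET$ forms differ only in which transport is requested.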

I think this is something to do with the setup of Decnet Phase V host resolution in the TCPWARE environment on the DS10.

Does this information and the attachment shed any new light on the problem??

thanks

Dave.
Ian Miller.
Honored Contributor

Re: Decnet V question, (DecNet over IP).

What is the result of
ncl show session control naming search path
ncl show session control back search path
ncl show session control transport Precedence
____________________
Purely Personal Opinion
tsgdavid
Frequent Advisor

Re: Decnet V question, (DecNet over IP).

It sounds like your attempt to access SPEEDY using a phase IV address is not getting routed properly. Does the SET HOST with the phase V address (ending in 21) connect quickly? If so, that is a sign that this is the problem.

As stated before, I would redefine the order in DECNET_REGISTER so that the :21 address comes before the :20 address. This way, connections should try phase V first before attempting to use phase IV.

After making the change, you may have to clear the naming cache to correct this using:

MCR NCL FLUSH SESSION CONTROL NAMING CACHE ENTRY "*"

David Williams
The Brit
Honored Contributor

Re: Decnet V question, (DecNet over IP).

Ian,
I have attached the output from the three commands executed on the blade and on the DS10. As far as I can see, the output looks the same.

David
I was not able to set host to the DS10 from the blade, using either the Phase IV or the Phase V address. Both methods took a significant length of time, however they returned different errors.

Phase IV: %SYSTEM-F-UNREACHABLE, remote node is not currently reachable

Phase V: %SYSTEM-F-LINKEXIT, network partner exited

This was the result for both Incoming and Outgoing "set host" to/from the DS10.

Just as a reminder, the "set host" command always works, eventually. The problem is that it takes an inordinately long time to connect. I currently have a ticket in with Process Software because I believe that there is a problem related to host name resolution and our implementation of TCPWARE.

I need to go offline now for ~3 hours, however I will respond to any new suggestions when I return.

thanks

Dave.
Walt McGaw
Occasional Advisor
Solution

Re: Decnet V question, (DecNet over IP).

Hello,

When you connect by name using DECnet, there is an order that addresses are tried. The address lookups occur first, and the information is cached in CDI. If you have 3 addresses configured for the remote node, then the connection order is DECnet OSI (21 selector address), DECnet over IP, and then DECnet using NSP (Phase IV style connection and the address has a 20 selector).

If DECnet protocol is being blocked between the source node and target node, we have to time out the connection for the DECnet OSI address (which is about 40 seconds or so) and then we will try the DECnet over IP connection (which appears to work for you). This could quite well explain the delay you are seeing trying to connect by name, but quick response to the ip$nnn.nnn.nnn.nnn address (which forces DECnet over IP).

If this is the case, you should remove the DECnet addresses for the target node from the source node in DECNET_REGISTER and then flush the source node's CDI cache using the command $ MCR NCL FLUSH SESSION CONTROL NAMING CACHE ENTRY "*".
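
The ordering described in this reply can be illustrated with a small Python model. This is only a sketch of the reasoning, not DECnet's actual implementation: the function name `time_to_connect` is hypothetical, the 40-second figure is the approximate timeout quoted above, and applying the same timeout to every failed transport is an assumption made to keep the model simple.

```python
# Order in which addresses are tried when connecting by name, per the reply above.
CONNECT_ORDER = ["osi", "ip", "nsp"]


def time_to_connect(registered, reachable, timeout_s=40.0):
    """Return (transport used, seconds spent waiting on failed attempts).

    registered: transports for which the name lookup returns an address
    reachable:  transports that can actually get through the network
    timeout_s:  assumed wait before a blocked attempt is abandoned
    """
    waited = 0.0
    for transport in CONNECT_ORDER:
        if transport not in registered:
            continue  # no address of this kind, so nothing to try
        if transport in reachable:
            return transport, waited
        waited += timeout_s  # blocked attempt has to time out first
    return None, waited


if __name__ == "__main__":
    # All three addresses registered, but only DECnet over IP gets through.
    print(time_to_connect({"osi", "ip", "nsp"}, {"ip"}))
    # DECnet addresses removed from DECNET_REGISTER for this target.
    print(time_to_connect({"ip"}, {"ip"}))
```

In this model the first case waits out one timeout before DECnet over IP is tried, while the second connects with no wait, which is the effect of removing the unreachable DECnet addresses and flushing the cache.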
Hoff
Honored Contributor

Re: Decnet V question, (DecNet over IP).

If you're using DECdns for the host name translations (and that is one of several options that can be available), make sure that the associated name server(s) are reachable and didn't accidentally get shut off, and that you have the "right" look-up order set in your DECnet configuration.
The Brit
Honored Contributor

Re: Decnet V question, (DecNet over IP).

Walt,
It seems like you have hit the nail on the head.

I was able to test this from one of my standalone nodes, by adding the DS10 to the DECNET_Register DB on that node. Note, there was no entry for the DS10 node in DecNet_Register on the standalone, and "SET HOST" to/from the DS10 was previously working fine. Also, since the standalone node is in another building, on a different vLAN, then both the Phase V and IV connectivity are not available anyway, however DecNet over IP should still work.

After entering the DS10 into DECNET_REGISTER on the standalone, and clearing the naming cache, I am now experiencing the ~40 sec delay connecting to the DS10 from the standalone. However SET HOST from the DS10 to the Standalone is still working fine (it has no entry in Decnet_Register for the standalone node)

So it seems like that issue might be resolved.

Anyway, A Question!! Where is the DECNET_REGISTER information stored?? Specifically, in a cluster, do the nodes share the same DB/File??

Dave.

Ian Miller.
Honored Contributor

Re: Decnet V question, (DecNet over IP).

doing
ncl set session control transport precedence {nsp,osi}

can help too
____________________
Purely Personal Opinion
The Brit
Honored Contributor

Re: Decnet V question, (DecNet over IP).

Ian,
will this cause DECnet-over-IP to be attempted first??

Dave
Walt McGaw
Occasional Advisor

Re: Decnet V question, (DecNet over IP).

Hello Dave,

Changing the transport precedence will not help in this case. DECnet over IP is considered an OSI application connection, not NSP. The best thing is to remove the 21 tower and 20 tower information from decnet_register for any remote nodes that you can only reach via DECnet over IP.

The decnet register database is called NET$LOCAL_NAME_DATABASE.DAT and resides in the SYS$SYSTEM: directory. Typically this is in sys$common, so it is shared across a cluster. If you remove the :21 and :20 addressing information for each remote node that is only reachable via DECnet over IP, you can then issue the NCL> FLUSH SESSION CONTROL NAMING CACHE ENTRY "*" command on all cluster nodes and the connection should then work quickly.

Best regards,
Walt
The Brit
Honored Contributor

Re: Decnet V question, (DecNet over IP).

Finally figured out what was causing the problem in the first place. (We didn't have this problem a few months ago.)

The solution jumped out when testing Walt's solution above. I was going to test between my TEST system and the DS10. It crossed my mind that this was not quite like the problem in my cluster, since the TEST system is in a different building/network, where DECnet addressing (phase IV or V) would not work anyway. The test/development systems use DECnet-over-IP by default, and do not have references to any other systems in their respective DECNET_REGISTER databases.

I induced the timeout delay on my TEST system by creating an entry in DECNET_REGISTER for the DS10, and then clearing the naming cache. The delay was removed by reversing this action.

I then re-examined the cluster situation and the light went on. A couple of months ago we set about moving all of the cluster nodes to a new VLAN. After the move, everything was fine except that we were having a problem with Advanced Server. As a result, the DS10 was rolled back to its old address (but the blades remained on the new one). Now that the DS10 and the blades were on different VLANs, the Phase V connection attempt would be blocked by default, and at that point we would begin to incur the timeout delay.

Because this node is dedicated to a few specific apps, and because "set host" to this node is very rare (usually telnet), the problem was not observed until about 10 days ago.

My thanks to everyone who contributed. It greatly helped me understand some fundamental DECNet Phase V issues.

Dave.