11-01-2012 10:13 PM
DECnet cluster alias breaks outgoing connections
Hi,
This site has a dozen VMSclusters configured more-or-less identically: each has a pair of AlphaServer GS1280s running VMS 8.3 and a pair of BL860c i2 blades running 8.4. Each cluster has DECnet-Plus and TCP/IP Services, and there are cluster aliases for each cluster.
The problem concerns the two Integrity blades ("MEBT03" and "MEBT04") in one cluster ("MEBA"). When these machines make an outgoing DECnet connection that has the "Outgoing Alias" attribute set to "True", the connection usually fails.
For example, trying to list a directory on a remote node or open a file typically results in an error like
%DIRECT-E-OPENIN, error opening MELT04::*.*;* as input
-RMS-E-FND, ACP file or directory lookup failed
-SYSTEM-F-UNREACHABLE, remote node is not currently reachable
after a few minutes. Occasionally the connection succeeds (still with some delay before results come back) but only because it was able to switch network stacks before the outgoing timer expired.
Setting the "Outgoing Alias" attribute to False makes the problem go away, but then the remote node associates the actual local node name with the connection rather than the cluster alias name. This isn't a huge issue (it mainly affects DECnet proxies), and at least it gives us a workaround.
As part of the problem investigation I ran SYS$SYSTEM:CDI$TRACE on the remote node and initiated another DIRECTORY command on MEBT04; see the attached file. It seems to show a number of incoming connection attempts, all of which are correctly associated with the MEBA cluster alias, so my suspicion is that whatever acknowledgement is supposed to be sent back to the originating node isn't being received. On this occasion the originating node eventually tried connecting using TCP/IP rather than NSP, and that worked straight away. (The nodes are configured to try NSP first, then TCP/IP, because NSP usually works and saves having to recreate a whole lot of proxy entries.)
I'm just wondering where to look next. The naming cache has been flushed and the DECnet local database appears correct on all nodes. As far as I can tell, the hardware and DECnet configuration on these two servers is the same as on the other Integrity servers (with the obvious exception of the actual hostnames and addresses). If someone can point me to a specific configuration issue which would cause this behaviour, I can try harder to find it.
Thanks,
Jeremy Begg
11-03-2012 09:27 AM
Re: DECnet cluster alias breaks outgoing connections
Hello
I think one possible explanation is that the two nodes in question are end-nodes, and there aren't any L1 routers on the same segment with those two nodes.
MCR NCL SHOW ROUTING TYPE
MCR NCL SHO ROUTING CIRC CSMACD-xxx ADJ * LAN ADDR, NEIGHBOR NODE TYPE
You need at least one adjacent Phase V router for the cluster alias to work correctly.
11-06-2012 06:30 AM
Re: DECnet cluster alias breaks outgoing connections
Jeremy,
as you may well know, DECnet-Plus (Phase V) End Systems should not need a router to connect to a DECnet cluster alias.
The initial communication (if no entry exists in the End-Node Cache) is done via multicasts (ALL-ES).
Are these systems in the various clusters multi-circuit (more than one DECnet circuit) end systems?
What do you see in the End-Node Cache on the initiating and the receiving system when you initiate a connection?
SDA> net show routing cache
To see what is really happening, you'll have to use something like Wireshark to get a trace.
Unfortunately, the analysis is a bit cumbersome, as Wireshark only decodes NSP when it runs over the Phase IV routing protocol, not over ISO 8473 as in this case.
John
11-06-2012 03:17 PM
Re: DECnet cluster alias breaks outgoing connections
Hi,
All of the Itanium servers (21 of them) are configured identically; only two of them show this problem. Note that the issue concerns initiating an outgoing connection. It doesn't matter which node or cluster alias they try to connect to. If I set "Outgoing Alias = False" on the FAL object, the outgoing connection works without any problems. But we'd rather have "Outgoing Alias = True".
All of them have two DECnet CSMA-CD circuits, the routing type is Endnode, and there is a DECnet router.
For example, on MEBT04 ...
MEBT04> mcr ncl
NCL>show session control application fal all char
Node 0 Session Control Application FAL
at 2012-11-07-09:55:12.443+11:00Iinf
Characteristics
Client = <Default value>
Addresses = { number = 17 }
Outgoing Proxy = True
Incoming Proxy = True
Outgoing Alias = True
Incoming Alias = True
Node Synonym = True
Image Name = SYS$SYSTEM:FAL.EXE
User Name = <Default value>
Incoming OSI TSEL = <Default value>
OutgoingAlias Name = <Default value>
Network Priority = 0
NCL>show routing type
Node 0 Routing
at 2012-11-07-09:55:17.979+11:00Iinf
Characteristics
Type = Endnode
NCL>show routing circuit *
Node 0 Routing Circuit CSMACD-0
at 2012-11-07-09:55:22.371+11:00Iinf
Identifiers
Name = CSMACD-0
Node 0 Routing Circuit CSMACD-1
at 2012-11-07-09:55:22.371+11:00Iinf
Identifiers
Name = CSMACD-1
NCL>show routing circuit csmacd-0 adj * lan addr, neighbor node type
Node 0 Routing Circuit CSMACD-0 Adjacency RTG$0001
at 2012-11-07-09:56:03.970+11:00Iinf
Status
LAN Address = AA-00-04-00-1F-08 (LOCAL:.MELR01)
Neighbor Node Type = Phase V Router
NCL>show routing circuit csmacd-1 adj * lan addr, neighbor node type
Node 0 Routing Circuit CSMACD-1 Adjacency RTG$0001
at 2012-11-07-09:56:09.051+11:00Iinf
Status
LAN Address = AA-00-04-00-1F-08 (LOCAL:.MELR01)
Neighbor Node Type = Phase V Router
NCL>
Here's the routing cache (as shown by SDA) on the target system (MELT04) when I issue a DIR MELT04:: command on MEBT04:
DECnet-OSI for OpenVMS Routing ES Cache Dump
--------------------------------------------
Routing Prefix DataBase Address B0739EF8
Prefix Table Start: 91A4E4CC , End: 91A4E6CC, Size 0
Routing Cache DataBase Address B0739EE0
Cache Table Start: 91A4FF0C , End: 91A5070C, Size 4
Cache Entry at Address 91A55714
NSAP:
4900 02AA0004 00160820 .....ª..I 00000000
NSP Transport - (2.22)
Cache Circuit Entry Count : 2, Probe Count: 894
Cache Circuit List: 91A579C0
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 002B
Holding Time: 012C
Data Link Address:
41CF 3DA41700 ..¤=ÏA 00000000
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 012C
Holding Time: 012C
Data Link Address:
41CF 3DA41700 ..¤=ÏA 00000000
Cache Entry at Address 91A5569C
NSAP:
4900 02AA0004 001B0820 .....ª..I 00000000
NSP Transport - (2.27)
Cache Circuit Entry Count : 1, Probe Count: 851
Cache Circuit List: 91A571E0
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 012B
Holding Time: 012C
Data Link Address:
76CF 3DA41700 ..¤=Ïv 00000000
Cache Entry at Address 91A556EC
NSAP:
4900 02AA0004 00320820 .2...ª..I 00000000
NSP Transport - (2.50)
Cache Circuit Entry Count : 1, Probe Count: 997
Cache Circuit List: 91A5E570
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 0121
Holding Time: 012C
Data Link Address:
18DC 3DA41700 ..¤=Ü. 00000000
Cache Entry at Address 91A556C4
NSAP:
4900 02AA0004 00360820 .6...ª..I 00000000
NSP Transport - (2.54)
Cache Circuit Entry Count : 1, Probe Count: 0
Cache Circuit List: 91A5E5A0
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Reverse
Blocksize: Non-FDDI
Remaining LifeTime: 024D
Holding Time: 0258
Data Link Address:
0836 000400AA ª...6. 00000000
(2.54)
The initiating node (MEBT04) has DECnet address 2.54 and its cluster alias (MEBA) is 2.50, both of which appear in the cache dump above.
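For readers matching NSAP lines in the dump to node addresses, here is a minimal Python sketch of the decoding. It assumes the hex groups on each NSAP line, read left to right, form a 1-byte AFI (49), a 2-byte area, a 6-byte ID and a 1-byte selector, and that the last two ID bytes hold the Phase IV address in little-endian order as area * 1024 + node. That layout is inferred from the dump agreeing with its own "NSP Transport - (area.node)" lines; the function name is hypothetical and not part of any DECnet tool.

```python
# Prefix that Phase IV-compatible system IDs start with (AA-00-04-00).
PHASE4_PREFIX = bytes.fromhex("AA000400")


def nsap_to_phase4(hex_groups: str) -> tuple[int, int]:
    """Return (area, node) from an NSAP line such as '4900 02AA0004 00160820'.

    Assumed layout: AFI (1 byte) + area (2) + ID (6) + selector (1).
    """
    raw = bytes.fromhex(hex_groups.replace(" ", ""))
    if len(raw) != 10:
        raise ValueError("expected 10 bytes: AFI + area + ID + selector")
    system_id = raw[3:9]
    if system_id[:4] != PHASE4_PREFIX:
        raise ValueError("ID does not start with AA-00-04-00")
    # The 16-bit Phase IV address is stored low byte first.
    value = int.from_bytes(system_id[4:6], "little")
    # Top 6 bits are the area, bottom 10 bits are the node number.
    return value >> 10, value & 0x3FF


if __name__ == "__main__":
    for line in ("4900 02AA0004 00320820", "4900 02AA0004 00360820"):
        area, node = nsap_to_phase4(line)
        print(f"{line} -> {area}.{node}")
```

Applied to the entries above, this gives 2.50 for the MEBA alias and 2.54 for MEBT04, matching the addresses SDA prints under each NSAP.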
And here's the cache on MEBT04 at the same time:
DECnet-OSI for OpenVMS Routing ES Cache Dump
--------------------------------------------
Routing Prefix DataBase Address AE739EF8
Prefix Table Start: 91A6CCCC , End: 91A6CECC, Size 0
Routing Cache DataBase Address AE739EE0
Cache Table Start: 91A6E70C , End: 91A6EF0C, Size 3
Cache Entry at Address 91A73EC4
NSAP:
4900 02AA0004 00030820 .....ª..I 00000000
NSP Transport - (2.3)
Cache Circuit Entry Count : 1, Probe Count: 995
Cache Circuit List: 91A75D50
Cache Circuit Entry:
Type: BroadCast
Format: PhaseIV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 00A5
Holding Time: 0258
Data Link Address:
0803 000400AA ª..... 00000000
(2.3)
Cache Entry at Address 91A73F14
NSAP:
4900 02AA0004 00150820 .....ª..I 00000000
NSP Transport - (2.21)
Cache Circuit Entry Count : 1, Probe Count: 144
Cache Circuit List: 91A7D1A0
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 0127
Holding Time: 012C
Data Link Address:
14DC 3DA41700 ..¤=Ü. 00000000
Cache Entry at Address 91A73EEC
NSAP:
4900 02AA0004 001D0820 .....ª..I 00000000
NSP Transport - (2.29)
Cache Circuit Entry Count : 1, Probe Count: 999
Cache Circuit List: 91A75D80
Cache Circuit Entry:
Type: BroadCast
Format: PhaseV
Reachability: Direct
Blocksize: Non-FDDI
Remaining LifeTime: 011D
Holding Time: 012C
Data Link Address:
367C 77A41700 ..¤w|6 00000000
MELT04 has address 2.29 and that's the last entry in the cache listing.
All of these nodes have OPENVMS-I64-MCOE licences so maybe I could try changing the node routing type to L1 router, but I don't see why it should be necessary on two nodes out of 21.
Thanks,
Jeremy Begg
11-07-2012 04:36 AM
Re: DECnet cluster alias breaks outgoing connections
Jeremy,
I just tested your configuration(?) in a similar environment and it worked for me:
V8.3 OpenVMS Alpha Multicircuit ES => V8.4 OpenVMS IA64 Multicircuit ES in a router (OpenVMS host based) environment.
These are the DECnet versions I was using:
Implementation =
{
[
Name = OpenVMS I64 ,
Version = "V8.4 "
] ,
[
Name = HP DECnet-Plus for OpenVMS ,
Version = "V8.4 ECO02 14-JUN-2012 16:46:39.40"
]
}
Implementation =
{
[
Name = OpenVMS Alpha ,
Version = "V8.3 "
] ,
[
Name = HP DECnet-Plus for OpenVMS ,
Version = "V8.3 ECO03 25-NOV-2008 15:49:46.19"
]
}
What surprised me from your output was that you only had a 'Cache Circuit Entry Count' of 1, even on your MEBT04 system,
considering you have 2 active DECnet circuits over which you can see the Router adjacencies.
I assume you are running a meshed and not a dual-railed LAN?
Are you using a host-based router?
Would it be possible to collect some more info, possibly in the form of an attachment?
On MEBT04, MELT04 and MELR01
$ mc lancp sho config
$ mc ncl show implementation
$ mc ncl sho csma-cd stat * all stat
$ mc ncl sho routing all char
$ mc ncl sho routing circ * all
On MEBT04, MELT04
$ mc ncl sho address
$ mc ncl sho alias port * all
$ mc ncl sho session control all
On MELR01 (router)
$! Address 2.50
$ mc ncl sho rou dest node AA-00-04-00-32-08 all
$! Address 2.54
$ mc ncl sho rou dest node AA-00-04-00-36-08 all
$! Address 2.29
$ mc ncl sho rou dest node AA-00-04-00-1D-08 all
John
PS. I may not respond for the next few days as I'm attending the local TUD.
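The AA-00-04-00-xx-xx values in the router commands above follow from the Phase IV node addresses by simple arithmetic: area * 1024 + node, written low byte first after the fixed AA-00-04-00 prefix. The Python sketch below shows the conversion in both directions so the pairs can be checked by hand. The function names are hypothetical and the range checks (areas 1 to 63, nodes 1 to 1023) reflect the usual Phase IV limits rather than anything stated in this thread.

```python
# Fixed high-order bytes of a Phase IV style LAN address.
HIORD = bytes.fromhex("AA000400")


def phase4_to_mac(area: int, node: int) -> str:
    """Convert a Phase IV address (area.node) to its AA-00-04-00-xx-xx form."""
    if not (1 <= area <= 63 and 1 <= node <= 1023):
        raise ValueError("area must be 1..63 and node 1..1023")
    value = (area << 10) | node  # same as area * 1024 + node
    raw = HIORD + value.to_bytes(2, "little")  # low byte first
    return "-".join(f"{b:02X}" for b in raw)


def mac_to_phase4(mac: str) -> tuple[int, int]:
    """Convert an AA-00-04-00-xx-xx LAN address back to (area, node)."""
    raw = bytes.fromhex(mac.replace("-", ""))
    if len(raw) != 6 or raw[:4] != HIORD:
        raise ValueError("not a Phase IV style LAN address")
    value = int.from_bytes(raw[4:], "little")
    return value >> 10, value & 0x3FF


if __name__ == "__main__":
    for area, node in ((2, 50), (2, 54), (2, 29)):
        print(f"{area}.{node} -> {phase4_to_mac(area, node)}")
    print("AA-00-04-00-1F-08 ->", mac_to_phase4("AA-00-04-00-1F-08"))
```

Run against the adjacency reported on MEBT04 (AA-00-04-00-1F-08, MELR01), the reverse conversion yields address 2.31.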
11-07-2012 06:03 PM
Re: DECnet cluster alias breaks outgoing connections
Hi John,
Thanks for helping out. I have attached the output from those commands for MEBT04 and MELT04.
MELR01 is a router box somewhere (I've never seen it); it's not a VMS system and I don't have access to it.
I should explain the LANCP configuration. These are blade servers configured using Virtual Connect and each blade gets assigned 16 virtual ethernet interfaces by the hardware, even though there are only two physical ethernet ports on the chassis. We have configured 6 of the 16 ethernet ports for use by VMS:
EWA0 & EWB0 are reserved for SCA (cluster traffic)
EWD0 & EWL0 are dedicated to a TCP/IP subnet used for backing up filesystems across the network
EWI0 & EWJ0 are general-purpose interfaces used for DECnet and TCP/IP.
I have raised a support case with HP. If you want to continue to assist with my problem, feel free to do so.
I shall share the results either way.
Regards,
Jeremy Begg
11-11-2012 09:49 PM
Re: DECnet cluster alias breaks outgoing connections
Well, I have an update of sorts.
HP logged in and had a look and found a number of "issues" with our DECnet configuration. But I'm not convinced that they explain why we're seeing this problem on only one of our many VMSclusters.
One thing I've found which might be significant: the router for the test/dev systems has a DECnet-IV address which is higher than all the test/dev node addresses EXCEPT for the nodes in the "problem" cluster.
On our production systems, the two production routers' DECnet-IV addresses are higher than all the production nodes.
Is it possible that our problem could be due to the DECnet address, and if so, why?
Thanks,
Jeremy Begg
11-12-2012 01:18 AM
Re: DECnet cluster alias breaks outgoing connections
Jeremy,
from the output that you made available I found nothing untoward except in combination with the following statement you made
"MELR01 is a router box somewhere (I've never seen it), it's not a VMS system and I don't have access to it."
Can you access it via NCL?
NCL> SHOW NODE 2.31 all
If it is not an old DIGITAL router (e.g. DECNIS) or a host-based router, then I would expect your NCL ROUTING characteristic attribute 'Routing Mode' to be set to SEGREGATED and not INTEGRATED as in your case, as most non-DEC router implementations were based on SIN (Ships In the Night). But I imagine this setting is also present on all the other cluster systems where you made your connectivity tests(?).
What do you mean when you say:
"the router for the test/dev systems has a DECnet-IV address which is higher than all the test/dev node addresses EXCEPT for the nodes in the "problem" cluster."
If all your systems are DECnet-Plus End Systems, you will only need a router if you are routing between areas. As I have said before, unlike DECnet Phase IV systems, you will not need a router to serve a cluster alias.
Once this is hopefully resolved, you can join me in persuading Engineering to move the tcpdump functionality from the TCP stack down to the LAN driver level (as was the case with Tru64) so that we can resolve these problems a lot quicker than is the case now with utilities such as CTF.
John