Operating System - OpenVMS

50% packet loss problem.

 
SOLVED
thomas_220
Occasional Advisor

50% packet loss problem.

I clustered two ES47s (OpenVMS 7.3-2) and am using two PCI dual-port 10/100 UTP Ethernet adapters to connect the systems: one port serves as the heartbeat and the other as the network connection. My problem is that I consistently get 50% packet loss between the two systems. Any suggestions? Thanks.

Thomas.
15 REPLIES
Andy Bustamante
Honored Contributor
Solution

Re: 50% packet loss problem.

If this is a new installation, check the network speed/duplex setting. Use

$ MCR LANCP SHOW DEVICE /CHAR

You can set and define line speed and duplex here. SET changes the running system for the current boot; DEFINE changes the permanent configuration used at the next reboot. For example:

$ MC LANCP SET DEVICE EIA0 /SPEED=100/FULL_DUPLEX

I've found that hard-setting both the host and the network equipment leads to fewer problems.

What do you define as the heartbeat? Are you using a cross over cable? This should also be configured manually.

Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
thomas_220
Occasional Advisor

Re: 50% packet loss problem.

Hi Andy,

Thank you very much for your quick response. Yes, I am using a crossover cable to connect the two servers together. This "heartbeat" connection is used to determine whether the primary is down, in which case the secondary is supposed to kick in. Let me try the command that I got from you and I'll get back to you. Thanks again Andy.

Thomas.
thomas_220
Occasional Advisor

Re: 50% packet loss problem.

I am using the crossover cable to connect the two servers. How do I manually configure this in OpenVMS? Thanks.
Andy Bustamante
Honored Contributor

Re: 50% packet loss problem.

To manually configure the crossover connection: plug in the cable and make sure both ends are configured for the same network speed and duplex setting. Use the same LANCP command as above, but substitute DEFINE for SET to update the permanent database.

The default behavior of OpenVMS in a cluster configuration is to use all available paths for cluster traffic. Having a cross over cable in the second NIC will spare you from having problems if some bright networking type decides to "quickly do some switch maintenance during lunch when no one will notice a brief outage." Your systems will also pass cluster traffic on the primary interface. You won't need to configure this behavior, it will happen.

I've assumed that you have a DE-602 NIC based on the information in your question. The interfaces will show up as EIA0 and EIB0. For an even more robust configuration, add a second DE-602, connect its second interface with a crossover cable, and use LAN Failover or failSAFE IP on the first two interfaces.

Andy
Michael Yu_3
Valued Contributor

Re: 50% packet loss problem.

Hi Thomas,

>>> consistently got the 50% packet loss

Does the packet loss occur when using IP? If so, it might be due to the setup of the interfaces and the routing table. Please post your interface configuration and routing table.

Thanks and regards.

Michael

Volker Halle
Honored Contributor

Re: 50% packet loss problem.

Thomas,

Which operation or command are you using that reports '50% packet loss'?

TCPIP PING?

If so, what does your TCPIP interface configuration look like (TCPIP SHOW INT)? Did you configure both interfaces on both nodes into the same subnet? TCPIP might do round-robin on transmit and, if it can't reach the other node on one of the IP interfaces, this would account for 50% of the packets being lost.

Also consider checking your SCS channels with

$ MC SCACP SHOW CHANNEL

and make sure that each node has 2 channels open to the other node (via the crossover cable and the network connection).

Volker.
thomas_220
Occasional Advisor

Re: 50% packet loss problem.

Andy,

I tried the DEFINE, but that has not resolved the issue yet. Here are the characteristics of the interfaces on both servers:

LANCP> show dev /char

Device Characteristics EIA0:
Value Characteristic
----- --------------
1500 Device buffer size
Normal Controller mode
External Internal loopback mode
00-13-21-08-18-1A Hardware LAN address
Multicast address list
CSMA/CD Communication medium
FF-FF-FF-FF-FF-FF Current LAN address
128 Minimum receive buffers
256 Maximum receive buffers
Yes Full duplex enable
Yes Full duplex operational
TwistedPair Line media type
100 Line speed (mbps)
Disabled/No Failset Logical LAN state
0 Failover priority

Device Characteristics EIB0:
Value Characteristic
----- --------------
1500 Device buffer size
Normal Controller mode
External Internal loopback mode
00-13-21-08-18-1B Hardware LAN address
Multicast address list
CSMA/CD Communication medium
FF-FF-FF-FF-FF-FF Current LAN address
128 Minimum receive buffers
256 Maximum receive buffers
Yes Full duplex enable
Yes Full duplex operational
TwistedPair Line media type
100 Line speed (mbps)
Disabled/No Failset Logical LAN state
0 Failover priority

LANCP> show dev /char

Device Characteristics EIA0:
Value Characteristic
----- --------------
1500 Device buffer size
Normal Controller mode
External Internal loopback mode
00-13-21-08-18-00 Hardware LAN address
Multicast address list
CSMA/CD Communication medium
FF-FF-FF-FF-FF-FF Current LAN address
128 Minimum receive buffers
256 Maximum receive buffers
Yes Full duplex enable
Yes Full duplex operational
TwistedPair Line media type
100 Line speed (mbps)
Disabled/No Failset Logical LAN state
0 Failover priority

Device Characteristics EIB0:
Value Characteristic
----- --------------
1500 Device buffer size
Normal Controller mode
External Internal loopback mode
00-13-21-08-18-01 Hardware LAN address
Multicast address list
CSMA/CD Communication medium
FF-FF-FF-FF-FF-FF Current LAN address
128 Minimum receive buffers
256 Maximum receive buffers
Yes Full duplex enable
Yes Full duplex operational
TwistedPair Line media type
100 Line speed (mbps)
Disabled/No Failset Logical LAN state
0 Failover priority
Volker,

Yes, I used the PING command and got the packet loss error. Here is what my TCPIP interface configuration looks like on both servers:

UNITY1$ tcpip
TCPIP> show int
                                            Packets
Interface   IP_Addr        Network mask   Receive   Send   MTU

IE0         10.10.10.10    255.0.0.0      10        6      1500
IE1         10.1.1.1       255.0.0.0      6         6      1500
LO0         127.0.0.1      255.0.0.0      0         0      4096


TCPIP> show int
                                            Packets
Interface   IP_Addr        Network mask   Receive   Send   MTU

IE0         10.10.10.20    255.0.0.0      5         9      1500
IE1         10.1.1.2       255.0.0.0      5         6      1500
LO0         127.0.0.1      255.0.0.0      16        16     4096
TCPIP> exit

Also, here are my SCS channels for servers 1 and 2 respectively:

UNITY1$ mc scacp
SCACP> show channel

UNITY1 PEA0 Channel Summary 4-OCT-2005 10:08:12.66:

Remote  LAN Dev   Channel  Total   ECS       Priority        Buffer  Delay    Load   Total      ---- Most Recent ----
Node    Loc  Rmt  State    Errors  State     Cur  Mgt  Hops  Size    (uSec)   Class  Pkts(S+R)  CH Opened Time      CH Closed Time
----    ---  ---  -----    ------  -----     ---  ---  ----  ----    ------   -----  ---------  --------------      --------------
UNITY2  EIA  EIA  Open     2       N(T,P,S)  0    0    2     1426    28605.4  100    20666      04-OCT 09:41:56.66  (No time)
UNITY2  EIB  EIB  Open     2       Y(T,P,F)  0    0    2     1426    21534.0  100    48916      04-OCT 09:41:52.41  (No time)
UNITY1  LCL  LCL  Open     1       Y(T,P,F)  0    0    2     1426    253.3    0      2729       04-OCT 09:41:48.39  (No time)

UNITY2 PEA0 Channel Summary 4-OCT-2005 10:07:27.51:

Remote  LAN Dev   Channel  Total   ECS       Priority        Buffer  Delay    Load   Total      ---- Most Recent ----
Node    Loc  Rmt  State    Errors  State     Cur  Mgt  Hops  Size    (uSec)   Class  Pkts(S+R)  CH Opened Time      CH Closed Time
----    ---  ---  -----    ------  -----     ---  ---  ----  ----    ------   -----  ---------  --------------      --------------
UNITY1  EIA  EIA  Open     5       Y(T,P,F)  0    0    2     1426    5119.2   100    20520      04-OCT 09:41:56.52  (No time)
UNITY1  EIB  EIB  Open     4       Y(T,P,F)  0    0    2     1426    4394.5   100    48704      04-OCT 09:41:52.28  (No time)
UNITY2  LCL  LCL  Open     1       Y(T,P,F)  0    0    2     1426    253.3    0      2649       04-OCT 09:41:49.15  (No time)

Michael,

Here is my Route table:

TCPIP> show route

DYNAMIC

Type Destination Gateway

AN 0.0.0.0 10.10.10.254
AN 10.0.0.0/8 10.10.10.10
AN 10.0.0.0/8 10.1.1.1
AH 10.1.1.1 10.1.1.1
AH 10.10.10.10 10.10.10.10
AH 127.0.0.1 127.0.0.1
TCPIP>

For server 2:

UNITY2$ tcpip
TCPIP> show route

DYNAMIC

Type Destination Gateway

AN 0.0.0.0 10.10.10.254
AN 10.0.0.0/8 10.10.10.20
AN 10.0.0.0/8 10.1.1.2
AH 10.1.1.2 10.1.1.2
AH 10.10.10.20 10.10.10.20
AH 127.0.0.1 127.0.0.1
TCPIP>


Thanks all for your quick response. I really appreciate all your help. Hopefully, my issue will be resolved soon.

Thomas.
rick jones
Honored Contributor

Re: 50% packet loss problem.

If the network is otherwise idle, a duplex mismatch will not result in ping losses. Ping has only one packet in flight at a time, so both ends never try to talk at the same moment, and the CSMA/CD failure that a mismatch triggers on the half-duplex side never gets a chance to occur.

So, something else must be wrong.

It might be good to make sure that the cable is truly good - if you can verify its operation somewhere else do so.
there is no rest for the wicked yet the virtuous have no pillows
Andy Bustamante
Honored Contributor

Re: 50% packet loss problem.

Here you go

IE0 10.10.10.10 255.0.0.0 10 6 1500
IE1 10.1.1.1 255.0.0.0 6 6 1500

Both of these addresses are in the same subnet (10.0.0.0/8 with that mask), but the second interface is connected only to the other ES47 by a crossover cable. You don't need TCPIP configured on an interface for it to carry cluster traffic (SCS), so remove TCPIP from IE1:

$ TCPIP SET NOINTERFACE IE1
$ TCPIP SET CONFIG NOINTERFACE IE1
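
The same-subnet observation can be checked off the VMS box with a few lines of Python. This is only an illustrative sketch using the standard `ipaddress` module; the addresses and mask are copied from the SHOW INTERFACE output for UNITY1 posted earlier in the thread, and the variable names are made up for the example.

```python
import ipaddress

# Addresses and mask taken from the UNITY1 "show int" output in this thread.
ie0 = ipaddress.ip_interface("10.10.10.10/255.0.0.0")
ie1 = ipaddress.ip_interface("10.1.1.1/255.0.0.0")

# With a 255.0.0.0 mask, both interfaces resolve to the same network.
same_subnet = ie0.network == ie1.network

print(ie0.network)
print(ie1.network)
print("same subnet:", same_subnet)
```

Because both interfaces resolve to one network, the IP stack has two equally valid local routes to any 10.x.x.x destination, which matches the two `10.0.0.0/8` entries in the posted route table.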


Andy
rick jones
Honored Contributor

Re: 50% packet loss problem.

ok, i'm not a VMS networking expert but...

does VMS really "like" having multiple physical NICs configured into the same subnet?

does VMS only accept traffic destined for a given IP address on the interface on which that address is configured?

perhaps the ping responses (or requests) are getting round-robined, and hitting the "wrong" interface?

if it doesn't mess-up anything else, you might try putting the interfaces in separate subnets (at both ends)
Andy Bustamante
Honored Contributor

Re: 50% packet loss problem.

>>does VMS really "like" having multiple physical NICs configured into the same subnet?

You can have multiple NICs in the same subnet. See http://h71000.www7.hp.com/openvms/journal/v2/articles/tcpip.html for some failSAFE IP configuration ideas.

In this case, the second NIC is connected by a cross over cable to the other ES-47. Any IP traffic using this interface won't reach the network. TCPIP seems to be dividing outbound traffic between the two interfaces configured in the same subnet.

Since the second interface is only there for cluster traffic, the easiest solution is to disable TCPIP on these interfaces.
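
A toy model shows why alternating between two interfaces produces exactly the figure Thomas reported. This is a simplified sketch, not a description of how the TCPIP stack actually schedules transmits: it assumes strict round-robin over two interfaces and assumes that only one of them (labelled `IE0` here) has a working path to the ping target. The function name `simulate_pings` is hypothetical.

```python
from itertools import cycle


def simulate_pings(count, interfaces, reachable):
    """Send `count` pings, rotating through `interfaces` in order.

    A ping counts as lost when the interface chosen for it is not in
    `reachable`. Returns the fraction of pings lost.
    """
    chooser = cycle(interfaces)
    lost = 0
    for _ in range(count):
        if next(chooser) not in reachable:
            lost += 1
    return lost / count


# Assumption for illustration: only IE0 can reach the target address.
loss = simulate_pings(100, ["IE0", "IE1"], reachable={"IE0"})
print(f"{loss:.0%} packet loss")
```

With two interfaces and one dead path, every second ping goes nowhere, so the loss settles at one half regardless of how many pings are sent (for an even count).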


Andy

thomas_220
Occasional Advisor

Re: 50% packet loss problem.

Andy,

It worked! My question is: what if I still want to use TCPIP on IE1? How should I configure it so that I do not have any packet loss?

Thanks Andy, and all you guys are really helpful.

Regards.

Thomas.
Andy Bustamante
Honored Contributor

Re: 50% packet loss problem.

>>> what if I still want to use TCPIP on IE1

What are you trying to accomplish here? With a two-node cluster, a crossover cable carrying cluster traffic is about as reliable an interface as you can get. You also keep unnecessary TCPIP traffic off the LAN segment.

If you connect these interfaces to the same network as your IE0 interface, you create a single point of failure, unless you also use redundant physical switches that are cross-connected. Note to network specialists: two VLANs on the same physical super switch do not count as redundant.

You could configure another subnet on this interface, 192.168.1.1 and .2 with a /24 mask for example, if you have a business requirement for it. You could also add additional networking equipment and configure failSAFE IP among all four interfaces. Cluster traffic will automatically find these paths. Matt Muggeridge's article in the link above has nice tips on configuring failSAFE IP in a clustered environment.

Your business requirements should drive the decision here.
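
If the separate-subnet option is chosen, the addressing plan can be sanity-checked before touching either node. The sketch below uses Python's standard `ipaddress` module with the example addresses suggested above (192.168.1.1 and .2 with a /24 mask) and the 10.0.0.0/8 network from the posted route table; the variable names are only illustrative.

```python
import ipaddress

# Existing LAN network, as shown in the posted route table.
lan = ipaddress.ip_network("10.0.0.0/8")

# Example crossover addressing suggested in the reply above.
crossover = ipaddress.ip_network("192.168.1.0/24")
node1 = ipaddress.ip_interface("192.168.1.1/24")
node2 = ipaddress.ip_interface("192.168.1.2/24")

# The crossover subnet must not overlap the LAN subnet...
no_overlap = not lan.overlaps(crossover)
# ...and both ends of the cable must sit in the same crossover subnet.
peers_share_subnet = node1.network == node2.network == crossover

print("no overlap with LAN:", no_overlap)
print("peers share subnet:", peers_share_subnet)
```

When both checks hold, each destination matches only one local network, so there is no second same-subnet route for the stack to alternate onto.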


Andy
thomas_220
Occasional Advisor

Re: 50% packet loss problem.

Thanks Andy! I think your analysis resolved my issue. I learned a lot from this forum. Thanks guys! Btw, how do I assign points for the solution? Thomas.
Robert_Boyd
Respected Contributor

Re: 50% packet loss problem.

Thomas,

You can assign points by viewing your posting while logged in and next to each reply there is a box to fill in with the points for relevance. Then there is a button somewhere above or below to submit the points assignment.

Robert
Master you were right about 1 thing -- the negotiations were SHORT!