
NIC Teaming problem

 
Dave Behler
Frequent Advisor

NIC Teaming problem

We regularly receive the two warning messages below on startup of our DL380G4 servers. Several seconds after the first message, we receive another stating that a previously failed Network Link's receive status has been restored. Occasionally we also receive a Netlogon 5719 error.

I swapped out network cables, tried different ports, etc., all with the same result. This is happening on all of our DL380G4s where we have teaming installed and configured. We have teaming configured for NFT only. These two warnings occur after every reboot. When I open the teaming utility, both NICs are connected, but NIC #1 (the primary NIC) is in standby.

Our config is DL380G4's, w2k3 (no sp1), psp7.40a (also have tried version 8.15 & 8.20 of the NIC Teaming Utility), connected to two Cisco 6509 Switches with ports configured to Auto/Auto (also tried forcing 100/Full, but with no change) and made sure all ports are in the same vlan across both switches.

The problems that I am experiencing sound very similar to the following advisory, however upgrading to HP Network Configuration Utility Version 8.15 and/or 8.20 doesn't appear to help.
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00573469

Type: Warning
Source: CPQTeamMP
Event ID: 434
Event Time: 2/6/2006 3:32:22 PM
User: n/a
Computer: abc123
Description:
HP Network Team #1: PROBLEM: A non-Primary Network Link is not receiving. Receive-path validation has been enabled for this Team by selecting the Enable receive-path validation Heartbeat Setting.
ACTION: Please check your cabling to the link partner. Check the switch port status, including verifying that the switch port is not configured as a Switch-assist Channel. Generate Broadcast traffic on the network to test whether these are being received. Also make sure all teamed NICs are on the same broadcast domain. Run diagnostics to test card. Drop the NIC from the team, determine whether it is receiving broadcast traffic in that configuration.

Type: Warning
Source: CPQTeamMP
Event ID: 386
Event Time: 2/6/2006 3:32:23 PM
User: n/a
Computer: abc123
Description:
HP Network Team #1: PROBLEM: A Failover occurred: The Primary Network Link is not receiving. Receive-path validation has been enabled for this Team by selecting the Enable receive-path validation Heartbeat Setting.
ACTION: Please check your cabling to the link partner. Check the switch port status, including verifying that the switch port is not configured as a Switch-assist Channel. Generate Broadcast traffic on the network to test whether these are being received. Also make sure all teamed NICs are on the same broadcast domain. Run diagnostics to test card. Drop the NIC from the team, determine whether it is receiving broadcast traffic in an unteamed configuration.
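
For anyone comparing their own logs against the two entries above, here is a minimal Python sketch that pulls the Event ID and Event Time out of event text pasted in this layout and reports how far apart two events are. The function names and the `SAMPLE` string are illustrative only, not part of any HP or Windows tool, and the parser assumes the exact "Event ID:" / "Event Time:" layout and US date format shown in this post.

```python
import re
from datetime import datetime

# Sample text in the same layout as the two log entries quoted in this post.
SAMPLE = """\
Type: Warning
Source: CPQTeamMP
Event ID: 434
Event Time: 2/6/2006 3:32:22 PM

Type: Warning
Source: CPQTeamMP
Event ID: 386
Event Time: 2/6/2006 3:32:23 PM
"""

# Assumes "Event Time:" directly follows "Event ID:" on the next line.
EVENT_RE = re.compile(r"Event ID:\s*(\d+)\s*\nEvent Time:\s*([^\n]+)")


def parse_events(text):
    """Return a list of (event_id, datetime) tuples in the order they appear."""
    events = []
    for event_id, stamp in EVENT_RE.findall(text):
        when = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
        events.append((int(event_id), when))
    return events


def seconds_between(events, first_id, second_id):
    """Seconds from the last logged first_id event to the last second_id event."""
    times = dict(events)  # a later entry with the same ID overwrites an earlier one
    return (times[second_id] - times[first_id]).total_seconds()


events = parse_events(SAMPLE)
gap = seconds_between(events, 434, 386)
```

Comparing these timestamps with the server's boot time would show whether the warnings are confined to the first minute or so after startup.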
Connery
Trusted Contributor

Re: NIC Teaming problem

Hi David,
Here are a couple of things to try:

1. Try turning on PortFast on the switch ports that these NICs are connected to. The NICs exchange heartbeats, and STP may be blocking the switch ports for 30-50 seconds after link-up, which would prevent the heartbeats from getting through.

2. Try turning off heartbeats during a boot to see if the messages stop. If they do, then the focus should be on determining what's preventing the heartbeats from succeeding during boot.

3. Can you attach the switch config for the ports connected to these NICs? I'll look over it to see if I see anything out of the norm.
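
As a rough illustration of where the 30-50 second figure in item 1 comes from, here is a small Python sketch using the standard 802.1D default timers (forward delay 15 s, max age 20 s). The function names are made up for this example, and the 3-second heartbeat interval is only a placeholder assumption; substitute whatever interval is configured on the team.

```python
def stp_blocking_window(forward_delay=15, max_age=20):
    """Return (min_seconds, max_seconds) a port may not forward under classic STP.

    A port that comes up without PortFast spends forward_delay seconds in
    listening and another forward_delay in learning. If max age must also
    expire first, that time is added on top.
    """
    listen_learn = 2 * forward_delay
    return listen_learn, max_age + listen_learn


def heartbeats_missed(window_seconds, heartbeat_interval):
    """Whole heartbeat intervals that elapse while the port is not forwarding."""
    return window_seconds // heartbeat_interval


low, high = stp_blocking_window()
# heartbeat_interval=3 is an assumed value, not taken from this thread.
missed_low = heartbeats_missed(low, 3)
missed_high = heartbeats_missed(high, 3)
```

With the default timers this gives a 30 to 50 second window, which is long enough for several heartbeats in a row to go unanswered on a freshly booted server.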

Best regards,
-sean
Dave Behler
Frequent Advisor

Re: NIC Teaming problem

1. I forgot to mention in my initial post that PortFast was already enabled.

2. After upgrading to 8.15, the messages continued to appear on reboot. I turned off the heartbeat settings (both transmit-path and receive-path validation) and rebooted, and the messages stopped. Great to see that they stopped, but where do we go next?

3. Switch configs attached as provided by my network team. If this is not what you are looking for please let me know and I'll see what I can do.

Thanks,
Dave
Connery
Trusted Contributor

Re: NIC Teaming problem

Hi David,
I looked over the configs and everything looks fine.

Here are a few more troubleshooting steps:
1. Have the networking group set the ports to the equivalent of the command "set port host". This command on CatOS disables trunking and channeling autonegotiation and turns on PortFast. You have already turned on PortFast, but I have seen trunking and/or channeling autonegotiation (DTP and PAgP, respectively) cause port startup delays that produce the symptom you are seeing.

2. Try increasing the heartbeat timer in the Teaming GUI to a higher value - say, double. See if this changes the behavior.

3. You can also try plugging both ports into the same Cisco switch to see if that changes the behavior. I know you don't want to run like that (because you want switch redundancy) but it would provide troubleshooting information.

Regards,
-sean
Matthijs Wijers_1
Trusted Contributor

Re: NIC Teaming problem

One simple thing to try:

Edit the TCP/IP properties of your network card(s), go to Advanced, choose the WINS tab,
and untick "Enable LMHOSTS lookup".

Regards,
Matthijs
Connery
Trusted Contributor

Re: NIC Teaming problem

Using (or not using) LMHOSTS lookup should not have an effect on NIC Teaming heartbeat error messages.

I'd be very interested in an explanation if someone disagrees.
Matthijs Wijers_1
Trusted Contributor

Re: NIC Teaming problem

I've seen Netlogon (5719) errors during boot solved by disabling LMHOSTS lookup.

In this case the netlogon error makes sense if there's a hardware error causing failovers and lost links.
Is spanning tree disabled?

Regards,
Matthijs
Connery
Trusted Contributor

Re: NIC Teaming problem

Disabling LMHOSTS lookup may address his occasional 5719, but it won't have an effect on his more persistent problem of Teaming heartbeat event log entries.

Spanning tree has already been addressed in my first reply (PortFast). He replied that PortFast has already been implemented, and his attached config confirms it.

-sean
Matthijs Wijers_1
Trusted Contributor

Re: NIC Teaming problem

Team members can be split across more than one switch in order to achieve switch redundancy. However,
all switch ports that are attached to members of the same team must comprise a single broadcast domain
(in other words, same VLAN). Additionally, if problems exist after deploying a team across more than one
switch, all team members should be reattached to the same switch. If the problems disappear, then the
cause of the problem resides in the configuration of the switches and not in the configuration of the team. If
switch redundancy is required (in other words, team members are attached to two different switches), then
HP recommends that the switches be deployed with redundant links between them and Spanning Tree be
enabled (or other Layer 2 redundancy mechanisms) on the ports that connect the switches. This helps
prevent switch uplink failure scenarios that leave team members in separate broadcast domains.
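
A quick way to sanity-check the "same VLAN" requirement described above is to compare the VLAN assigned to each teamed port. The Python sketch below assumes you have already collected a port-to-VLAN mapping by hand from the switch configs; the port labels and VLAN numbers are hypothetical, and the function names are invented for this example.

```python
from collections import Counter


def same_broadcast_domain(port_vlans):
    """Return True when every teamed switch port is assigned the same VLAN."""
    return len(set(port_vlans.values())) == 1


def odd_ports_out(port_vlans):
    """Return the ports whose VLAN differs from the most common VLAN."""
    if not port_vlans:
        return []
    majority_vlan, _ = Counter(port_vlans.values()).most_common(1)[0]
    return sorted(p for p, v in port_vlans.items() if v != majority_vlan)


# Hypothetical data: one team member on each of two switches.
good_team = {"switchA 3/1": 10, "switchB 3/1": 10}
bad_team = {"switchA 3/1": 10, "switchB 3/1": 20, "switchA 3/2": 10}

ok = same_broadcast_domain(good_team)
mismatched = odd_ports_out(bad_team)
```

This only checks the configured access VLAN; it says nothing about whether the inter-switch links actually carry that VLAN, which is the uplink failure scenario the paragraph above warns about.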

More information:
"HP ProLiant Network Adapter Teaming White Paper"
ftp://ftp.compaq.com/pub/products/servers/networking/TeamingWP.pdf

Regards,
Matthijs
Connery
Trusted Contributor

Re: NIC Teaming problem

I wrote that paper. Glad to see it's being used! :-)

David has already verified that all ports are in the same VLAN. I also verified it by looking at his configs.

I also already took my own advice from that paper and recommended that David connect both Team members to the same switch to see if it solves the problem. I'm waiting on him to respond with the result.

Thanks for your ideas on the possible causes!

-sean