
Re: Weird Problem!!!

 
Sprint Unix Team
Frequent Advisor

Weird Problem!!!

We have a 6-node MC/SG cluster with Auto Port Aggregation on all of the nodes.
Recently I added a new node, also with Auto Port Aggregation. These nodes are connected to an XP512.

Now for the problem. When Auto Port Aggregation is down, there is no packet loss, but when the ports are aggregated there is severe packet loss and there are timeouts on the network with this server. Also, SG commands like cmviewcl and cmviewconf take much longer than usual to give any output.

I am not sure whether all these problems are inter-related. PLEASE HELP!!!
5 REPLIES
harry d brown jr
Honored Contributor

Re: Weird Problem!!!

>>>> And recently I added a new node with Auto Port Aggregation. <<<<
And what happens when you remove the new node from the cluster, does everything go back to normal?

>>>> When Auto Port Aggregation is down, there is no packet loss, but when the ports are aggregated there is severe packet loss and there are timeouts on the network with this server. <<<<

No packet loss where, on the 7th (new) node?

>>>> Also, SG commands like cmviewcl and cmviewconf take much longer than usual to give any output. <<<<

How are you trying to run and display these commands (over the network using X, or on an ASCII display like the console)?

What OS are you running, and what patch level?

What servers make up your nodes?

What kind of network cards are you APA'ing?

Maybe, if I ask enough questions, someone else can chime in with some other ideas?

live free or die
harry

Live Free or Die
Bill Hassell
Honored Contributor

Re: Weird Problem!!!

Sounds like an auto-negotiation failure at 100 Mbit. It could be due to cable length or differences between the switch and the CPU LAN cards. Verify this with lanadmin -x. A negotiation failure will revert the link to 100 Mbit half duplex, and you'll see VERY slow throughput with lots of FCS errors and collisions (which are not possible in full duplex) in the lanadmin stats.

For maximum reliability, change all switch ports and all LAN cards to manual setting: 100 Mbit full duplex.
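
As a rough illustration of the reasoning above (collisions should not occur on a full-duplex link), here is a minimal Python sketch. It assumes you have copied the collision counter by hand from the lanadmin stats; the `LinkStats` class and the function name are hypothetical and are not part of any HP-UX tool.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    """Hypothetical container for a counter read by hand from the LAN stats."""
    collisions: int


def looks_like_duplex_mismatch(stats: LinkStats, expected_full_duplex: bool = True) -> bool:
    """Return True when the collision counter contradicts a full-duplex link."""
    if not expected_full_duplex:
        # Collisions are normal on half duplex, so they prove nothing here.
        return False
    # A true full-duplex link cannot see collisions, so any nonzero count
    # suggests the card has fallen back to half duplex.
    return stats.collisions > 0


if __name__ == "__main__":
    print(looks_like_duplex_mismatch(LinkStats(collisions=0)))     # False
    print(looks_like_duplex_mismatch(LinkStats(collisions=1842)))  # True
```

This only flags a suspicion. The actual speed and duplex still need to be confirmed on both the card and the switch port.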


Bill Hassell, sysadmin
Krishna Prasad
Trusted Contributor

Re: Weird Problem!!!

Also check that the physical cards are set correctly in /etc/rc.config.d/hpbtlanconf. These would be the ports that are set up in hp_apaportconf to create the aggregate. In our case we need to make sure they are set to 100 Mbit full duplex and that the port on the switch is set the same way, or they will cause "runts" (what our network experts called them), which will slow down everything on that network. Basically, the card sends out different packet sizes to the switch than what the switch is expecting and configured for.
Positive Results requires Positive Thinking
Ron Kinner
Honored Contributor

Re: Weird Problem!!!

Ron,

Runts are packets that are discarded because they are smaller than the medium's minimum packet size. Any Ethernet packet that is less than 64 bytes is considered a runt.

The above from cisco.com

Runts get created by collisions. The original sender detects that someone else is transmitting, so it abruptly stops. This creates an unfinished or truncated packet with only a partial header.

You get a major increase in collisions when one end is half duplex and the other end is full duplex, since the full-duplex end doesn't listen before transmitting and will step all over the half-duplex sender.
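
To make the 64-byte rule concrete, here is a small Python sketch that counts runts in a list of frame lengths. The constant comes from the Cisco definition quoted above; the function names and the sample lengths are made up for illustration.

```python
# Minimum Ethernet frame size in bytes, per the definition quoted above.
MIN_ETHERNET_FRAME_BYTES = 64


def is_runt(frame_length_bytes: int) -> bool:
    """A frame shorter than the minimum size counts as a runt."""
    return frame_length_bytes < MIN_ETHERNET_FRAME_BYTES


def count_runts(frame_lengths) -> int:
    """Count how many of the given frame lengths are runts."""
    return sum(1 for length in frame_lengths if is_runt(length))


if __name__ == "__main__":
    # Invented sample: the 37-byte and 12-byte frames are truncated.
    sample_lengths = [64, 1518, 37, 12, 512]
    print(count_runts(sample_lengths))  # 2
```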

Ron
Sprint Unix Team
Frequent Advisor

Re: Weird Problem!!!

Thanks for all your help. The problem was with our network and the switch settings. Also, we had an incorrect routing table.