new 5406zl switches - heavy packet loss

Luqman Achmat
Regular Advisor

new 5406zl switches - heavy packet loss

Greetings

We recently replaced our 4108GL switches with five 5406zl switches, four of which connect to a single core 5406zl switch.

We are seeing heavy packet loss to some of our servers on the core switch, even when the client is on the same core switch. The problem is intermittent, and which servers are affected varies over time. We run physical and VMware machines with Windows, Linux and HP-UX.

I isolated a single switch with a single 24-port 10/100/1000 module installed and only two devices connected: (1) my PC and (2) an HP-UX server, where the symptoms are worst. Sometimes I get a response from ping; more often I don't. If I disconnect and reconnect my LAN cable while pinging the HP-UX server successfully, it takes around 35-40 seconds for the replies to resume. The 4108GL took around 3 seconds to recover. Sometimes when pinging I get a first response and then nothing more until a good few minutes have elapsed.

I'm sure there must be a problem with the switches or the config itself?

STP is enabled.
8 VLANs on each switch.

Can anyone out there help?

Regards
Luq A
21 REPLIES
Matt Hobbs
Honored Contributor

Re: new 5406zl switches - heavy packet loss

The 5400s only run MSTP. With MSTP you must configure the ports your clients connect to as edge ports.

e.g.

5400(config)# spanning-tree a1-a24 edge-port

Otherwise spanning-tree will block the port for 30 seconds before it starts forwarding.
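
For context on the 30-second figure: with the default IEEE 802.1D forward delay of 15 seconds, a port that is not treated as an edge port waits through two timed stages before it forwards traffic. A rough sketch of that arithmetic, assuming the default timers are unchanged on the switch (the function and constant names are illustrative only, not switch CLI):

```python
# Default spanning-tree forward delay in seconds (IEEE 802.1D default).
# Assumption: the switch is still using this default value.
FORWARD_DELAY_S = 15


def time_to_forwarding(forward_delay_s: int = FORWARD_DELAY_S,
                       edge_port: bool = False) -> int:
    """Approximate seconds before a newly linked port starts forwarding."""
    if edge_port:
        # An edge port skips the timed stages and forwards straight away.
        return 0
    # Two timed stages (listening/discarding, then learning),
    # each lasting one forward-delay interval.
    return 2 * forward_delay_s


print(time_to_forwarding())                 # non-edge port
print(time_to_forwarding(edge_port=True))   # edge port
```

This only accounts for the spanning-tree timers; link negotiation and other delays come on top of it.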
Mohieddin Kharnoub
Honored Contributor

Re: new 5406zl switches - heavy packet loss

Hi

I'd also advise isolating the servers in a separate VLAN if you can, and seeing whether it is possible to team two of each server's NICs to the core for more bandwidth.

Also, if you have backup servers, I recommend you consider enabling jumbo frames for this VLAN, because heavy packet loss can mean the port is saturated and can't forward newly arriving packets until it has served the old ones, and this could be caused by big or jumbo frames on the server ports.

Good Luck !!!
Science for Everyone
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi Matt

The problem began before implementing STP and I implemented STP in an effort to curb the problem.

Should I still set the "edge-port"?

Do I need to set "edge-port" on all ports (except switch to switch)?

Luq
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi Mohieddin

I don't think it is a bandwidth issue. I had a single server and my PC on the same module on the same switch, with nothing else connected. I tested with another server and then another, each separately. All show the same intermittent problem. The only constant is the recovery time after I disconnect and reconnect the LAN cable. Is this normal?

Jumbo frames are not necessary - The server I am testing to is doing nothing at all. I am simply pinging the server.

Help!

Luq
Mohieddin Kharnoub
Honored Contributor

Re: new 5406zl switches - heavy packet loss

Hi

In that case, setting the edge port as Matt suggested will make sure the ports start forwarding quickly.

Good Luck !!!
Science for Everyone
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi

I tried setting the edge-port and still experience the same problem.

Any other ideas?

Something I forgot to mention: my switch-to-switch connectivity has been set up to tag all VLANs except the default VLAN (untagged). I seem to only experience the problem on my default VLAN (VLAN 1).

Luq
Matt Hobbs
Honored Contributor

Re: new 5406zl switches - heavy packet loss

Are you seeing any packet drops in 'show interfaces'?

What firmware are you running? Make sure to update to K.12.05: http://www.hp.com/rnd/software/j86921205.htm

Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi Matt

I cannot see any packet drops as you suggested.

This is our first proper/full POE installation. I now think that there is a fundamental difference in the default config between the 5406zl L3 switches and the 4108GL L2 switches (the obvious one being L3 functionality). I simply unplugged the server and PC from the 4108GL and plugged them into the 5406zl, and the problem appeared immediately.

Other strange behaviours we just picked up:

1) While the ping from my PC is timing out, I plug a second PC into the same module and switch, and its pings to the same server get a RESPONSE.

2) Pings from outside the default network (or VLAN) always SEEM to get a response. E.g. if I ping the same server from our routed DMZ, the pings always seem to get a reply.

Any ideas about what the problem could be?

Luq

p.s. I have now enabled edge-port STP on all ports, except switch-to-switch connectivity.
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

btw, we are running firmware K.12.02.

Regards
Luq
Paulen
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi,

The one thing that springs to mind is IP issues: double-check that the subnet masks are identical, that there are no duplicate interfaces with two default gateways on the server, etc. Are you doing routing or just an L2 connection?

/BT
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi Paulen

The switches have the same IP addresses as the old switches. Also, we're not using L3 functionality at all (no routing). Only L2 is being used.

As for "double default gws on the server", the problem happens with many servers (a mix of physical and VMware), not just one.

Thanks for your input.

Luq
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Typical symptom when pinging a server from my PC. Both devices on the same 5406zl switch.


C:\>ping -t 172.18.16.77

Pinging 172.18.16.77 with 32 bytes of data:

Reply from 172.18.16.77: bytes=32 time<1ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Matt Hobbs
Honored Contributor

Re: new 5406zl switches - heavy packet loss

Can you attach the output of 'ipconfig /all' from both the server and problem client - and also capture the output of 'arp -a'.
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

C:\>ping -t 172.18.16.77

Pinging 172.18.16.77 with 32 bytes of data:

Reply from 172.18.16.77: bytes=32 time<1ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 172.18.16.77:
Packets: Sent = 11, Received = 1, Lost = 10 (90% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
Control-C
^C

CLIENT:
C:\>ipconfig /all

Windows IP Configuration

Host Name . . . . . . . . . . . . : luqman
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No

Ethernet adapter Local Area Connection:

Description . . . . . . . . . . . : Broadcom 440x 10/100 Integrated Controller
Physical Address. . . . . . . . . : 00-15-60-C7-B1-9A
Dhcp Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
IP Address. . . . . . . . . . . . : 172.18.17.68
Subnet Mask . . . . . . . . . . . : 255.255.254.0
Default Gateway . . . . . . . . . : 172.18.16.1
DHCP Server . . . . . . . . . . . : 172.18.16.65
DNS Servers . . . . . . . . . . . : 172.18.16.65
172.18.16.67
Primary WINS Server . . . . . . . : 172.18.16.65
Secondary WINS Server . . . . . . : 172.18.18.67
Lease Obtained. . . . . . . . . . : 30 April 2007 08:26:53
Lease Expires . . . . . . . . . . : 07 May 2007 08:26:53

C:\>

SERVER:
C:\>ipconfig /all

Windows IP Configuration

Host Name . . . . . . . . . . . . : carna
Node Type . . . . . . . . . . . . : Unknown
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No

Ethernet adapter Local Area Connection:

Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Broadcom NetXtreme Gigabit Ethernet
Physical Address. . . . . . . . . : 00-15-60-4F-76-2E
DHCP Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 172.18.16.77
Subnet Mask . . . . . . . . . . . : 255.255.254.0
Default Gateway . . . . . . . . . : 172.18.16.1
DNS Servers . . . . . . . . . . . : 172.18.16.65

C:\>
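
As a quick sanity check on the subnet-mask question raised earlier in the thread, the addresses and mask from the two ipconfig outputs above can be compared with Python's standard `ipaddress` module. This is only a sketch; the helper name `same_subnet` is made up for illustration.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, netmask: str) -> bool:
    """Return True if both addresses fall in the same network for this mask."""
    # strict=False lets us pass a host address and get its enclosing network.
    net_a = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{netmask}", strict=False)
    return net_a == net_b


# Values copied from the client and server ipconfig output in this thread.
client = "172.18.17.68"
server = "172.18.16.77"
gateway = "172.18.16.1"
mask = "255.255.254.0"

print(ipaddress.ip_network(f"{client}/{mask}", strict=False))
print(same_subnet(client, server, mask))
print(same_subnet(client, gateway, mask))
```

With a 255.255.254.0 mask, 172.18.16.x and 172.18.17.x share one /23 network, so client-to-server traffic here is delivered directly at layer 2 after an ARP lookup rather than through the default gateway.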
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

CLIENT:
C:\>arp -a

Interface: 172.18.17.68 --- 0x2
  Internet Address      Physical Address      Type
  172.18.16.1           00-09-43-e9-a8-c0     dynamic
  172.18.16.65          00-50-56-88-44-10     dynamic
  172.18.16.67          00-50-56-88-23-73     dynamic
  172.18.16.77          00-e0-b6-06-ef-6a     dynamic
  172.18.16.79          00-c0-b6-07-c6-69     dynamic
  172.18.16.91          00-0e-7f-21-0e-6d     dynamic
  172.18.16.120         00-11-43-cd-27-d4     dynamic

C:\>

Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi Matt

One more thing; the problem only occurs on the local LAN or so it seems. If I ping the same ip address from a routed network (eg. our DMZ), the ping always gets a response (no nat).

However, having said that, on the same internal LAN, while I get a response from one PC, I don't get a response from another.

Luq
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Matt

Something else very interesting.... when I clear the arp cache on the server side, it starts to work, but not for long:

SERVER:
C:\>ping -t 172.18.17.68 (my PC/client)

Pinging 172.18.17.68 with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.

Ping statistics for 172.18.17.68:
Packets: Sent = 3, Received = 0, Lost = 3 (100% loss),
Control-C
^C
C:\>arp -d *

C:\>ping -t 172.18.17.68

Pinging 172.18.17.68 with 32 bytes of data:

Reply from 172.18.17.68: bytes=32 time<1ms TTL=128
Reply from 172.18.17.68: bytes=32 time<1ms TTL=128
Reply from 172.18.17.68: bytes=32 time<1ms TTL=128
Reply from 172.18.17.68: bytes=32 time<1ms TTL=128

Ping statistics for 172.18.17.68:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms
Control-C
^C
C:\>
Matt Hobbs
Honored Contributor

Re: new 5406zl switches - heavy packet loss

It seems like there is an IP address conflict between the server and another device:

Physical Address. . . . . . . . . : 00-15-60-4F-76-2E
DHCP Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 172.18.16.77
Subnet Mask . . . . . . . . . . . : 255.255.254.0

172.18.16.77 00-e0-b6-06-ef-6a dynamic

You need to track down 00-e0-b6-06-ef-6a and fix the conflict.
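
The comparison above (the MAC the server reports for itself versus the MAC the client's ARP cache holds for the same IP) can be automated. Below is a minimal sketch that parses Windows-style `arp -a` text and flags entries that disagree with a list of known-good MACs. The function names and the `EXPECTED` table are hypothetical; the sample data is copied from this thread.

```python
import re

# Matches lines such as "172.18.16.77 00-e0-b6-06-ef-6a dynamic".
ARP_LINE = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)"
)


def parse_arp_table(text: str) -> dict:
    """Map each IP address in `arp -a` style output to its lower-cased MAC."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.match(line)
        if match:
            table[match.group(1)] = match.group(2).lower()
    return table


def find_mismatches(arp_table: dict, expected: dict) -> dict:
    """Return {ip: (expected_mac, seen_mac)} where the ARP cache disagrees."""
    mismatches = {}
    for ip, mac in expected.items():
        seen = arp_table.get(ip)
        if seen is not None and seen != mac.lower():
            mismatches[ip] = (mac.lower(), seen)
    return mismatches


# Excerpt of the client's ARP cache posted earlier in the thread.
ARP_OUTPUT = """\
Interface: 172.18.17.68 --- 0x2
Internet Address Physical Address Type
172.18.16.1 00-09-43-e9-a8-c0 dynamic
172.18.16.77 00-e0-b6-06-ef-6a dynamic
"""

# MAC the server reports for itself in its own ipconfig /all output.
EXPECTED = {"172.18.16.77": "00-15-60-4F-76-2E"}

mismatches = find_mismatches(parse_arp_table(ARP_OUTPUT), EXPECTED)
for ip, (want, seen) in mismatches.items():
    print(f"{ip}: expected {want}, ARP cache has {seen}")
```

A mismatch like this only shows that two different MACs are answering for one IP; finding which port the unexpected MAC sits on still has to be done on the switches.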
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Thanks Matt

I just noticed that myself. Will track it down and hopefully resolve the connectivity issues to the other servers too.

Luq
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

Hi

The problem looks resolved, except that I still cannot connect to two of my tagged VLANs from the HP-UX server; I will take that up with HP.

The problem was twofold:

1) Someone had plugged in a device on the internal LAN, creating a loop.

2) The server I was testing with had a problem with its NIC driver. HP advised me to install the latest hp-uxv2 patches, and this seems to have resolved that issue.

Thanks to all for your help.

Regards
Luqman Achmat
Regular Advisor

Re: new 5406zl switches - heavy packet loss

2 problems resolved.