excessive packet loss w/BL460cs - where to start?


Our new BL460cs are experiencing upwards of 5% packet loss under all conditions, rendering them unusable. I am at terminal frustration trying to troubleshoot this.

We've got a C7000 enclosure with 4 BL460cs and a single GbE2c switch; we're going to add a second switch module once this problem is worked out. The blades all run Windows 2008 x64 Enterprise with PSP 8, and NIC teaming is enabled.

The GbE2c has two trunk groups, ports 20+21 and 23+24. These trunk groups uplink to a pair of ProCurve 2848 rackmount switches, each of which has a corresponding trunk group on ports 14+16. All the ports are in the same static VLAN (120). The 2848s are cross-connected via a third trunk group on ports 45+46. The rackmount switches connect to a pair of firewalls (active/passive) for external and VPN access. I'm accessing these servers via a VPN.

Even a simple ping test from any of the BL460cs shows 5 - 10% packet loss, and they're under no load at all. I get the same results whether I ping the 2848 switches, other servers on the network, or anything else. Pinging from other nodes to the blades gives the same results.
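
To compare loss figures across targets (blade to 2848, blade to another server, another node to blade), the ping summaries can be collected and turned into numbers. The following is only a rough sketch: it assumes the standard summary line printed by Windows ping ("Packets: Sent = ..., Received = ..., Lost = ..."), and the function names are hypothetical helpers, not part of any HP or ProCurve tool.

```python
import re

# Assumption: the text comes from Windows ping, whose summary line looks like
# "Packets: Sent = 100, Received = 93, Lost = 7 (7% loss),"
SUMMARY = re.compile(r"Sent = (\d+), Received = (\d+), Lost = (\d+)")


def parse_ping_summary(output):
    """Return (sent, received, lost) from the raw text of one ping run."""
    match = SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received, lost = (int(group) for group in match.groups())
    return sent, received, lost


def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    if sent == 0:
        return 0.0
    return 100.0 * (sent - received) / sent


def summarize(outputs_by_target):
    """Map each target name to its loss percentage.

    outputs_by_target: dict of target name -> raw ping output text.
    """
    report = {}
    for target, output in outputs_by_target.items():
        sent, received, _lost = parse_ping_summary(output)
        report[target] = loss_percent(sent, received)
    return report
```

Feeding it the saved output of something like `ping -n 100 <target>` for each path makes it easier to see whether the loss is uniform on every path or concentrated on one hop.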

I tried disabling one of the GbE2c trunk groups, so that the GbE2c was uplinked to only one of the 2848s, and I get the same results. I can even break the trunk group and use only one uplink port (with the others disabled), and I still get the same results.

I can only reliably access the blades via iLO.

To make this more confusing, we have three p-class enclosures, each with _two_ GbE2 switches, each one uplinked to _both_ 2848s, as well as being cross-connected, and they have no packet loss, even under heavy load. They use two-port trunk groups in the same way as I was trying with the c-class. I've tried to set the spanning tree and other settings identically, but to no avail.

I know there are hundreds of possible variables in this installation. I'm trying to find some handbook examples on how a setup like ours might be configured. I'm finding the available documentation a little thin on that score. Short of that, I'm looking for any tools which might help with identifying my problem (ignorance aside).

Does having only one c-class switch with NIC teaming enabled cause this problem? Short of providing a network diagram, is there anything else glaringly wrong?

Re: excessive packet loss w/BL460cs - where to start?

A couple of places to start:

The GbE2c switches have a monitoring and analysis program you can install, and it has a free 5-switch license.

The ProCurves also have a monitoring and analysis program that you can install for 30 days.

And also look around here