2920 stack dropping majority of packets

I have a network with a 5412R zl2 as the core switch, and several stacks of four 2920s, each stack almost identically configured ( same VLANs, some small differences in which ports are assigned to which VLAN ). These are "real" stacks with stacking modules and cables, configured as a ring. There is a single Aruba 7200 wireless controller, with APs connected to each stack. Each stack has two fibre links back to the core and spanning tree is enabled.

After a period of some days one specific stack will stop passing traffic to newly connected edge devices. Traffic via the Aruba APs connected to the same stack ( tunnelled back to the controller ) is unaffected, and so wireless becomes the only practical way to access the switch.

When the fault arises approx 80% of pings are dropped. ssh'ing in to the stack via wireless and trying to ping a device connected to the stack nearly always fails. The MAC address of the device I can't ping does show in the output of show mac-address.
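For anyone wanting to put a number on the loss and timestamp when a stack starts misbehaving, here is a minimal sketch of how the ping summary could be parsed from a monitoring host. It assumes the summary line format printed by the usual Linux or macOS ping; the function names and the 50% threshold are my own and purely illustrative.

```python
import re

# Matches the summary line of Linux/macOS ping, e.g.
# "10 packets transmitted, 2 received, 80% packet loss, time 9012ms"
# (macOS prints "2 packets received", hence the optional group).
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def loss_percent(ping_output: str) -> float:
    """Return packet loss as a percentage, computed from the summary counts."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


def is_degraded(ping_output: str, threshold: float = 50.0) -> bool:
    """Hypothetical helper: flag a target whose loss meets the threshold."""
    return loss_percent(ping_output) >= threshold


sample = "10 packets transmitted, 2 received, 80% packet loss, time 9012ms"
print(loss_percent(sample))
```

Run periodically against a wired device on each stack and logged with a timestamp, something like this would show when the loss begins rather than relying on users noticing.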

It appears that this connectivity issue is limited to VLAN 1 ( which is untagged across all uplinks and most edge ports ) which is the VLAN that contains the switches, servers and PCs.

I appreciate that having all that equipment in a single VLAN, using VLAN 1, and having VLAN 1 untagged across all switches isn't best practice, but I don't "own" this network and I'm not in a position to get any of this changed.

If I remove one stacking cable between members 2 and 3, then members 1 and 2 start to work perfectly. Members 3 and 4 have no connectivity, despite it being a broken ring, not a chain.

A restart of the whole stack resolves the issue.

As I say, this only occurs on one stack. The other 8 stacks of 2920s all work exactly as I'd expect.

Any ideas welcome!



Given what's happening it looks like a very localized problem.

How about providing more details about the four-member 2920 stack that is experiencing the issue?

Sanitized outputs of these CLI commands run against the problematic Stack would help:

  • show stacking
  • show version (and/or show flash)
  • show lacp
  • show vlan

Do the stack's logs report anything useful for diagnosing the problem?

Hi. I'll grab this info on Monday - the site is inaccessible until then.

Just to close this off. When we got to site we saw that stack member four was missing. We took a "show tech all" to open a support case, rebooted it and left site. The next morning the switch was dead - no activity lights, and was missing from the stack. A reboot didn't bring it fully back online as it reported a faulty stacking card.

We replaced the switch and the stack has been fine for 3+ days now.

One oddity is that none of the APs that were connected to the dead switch would reconnect to the Aruba 7205 wireless controller - even when connected to a different stack. The core switch could ping the APs, but the 7205 could not. A new AP, in the new switch, worked perfectly.

The 7205 required a reboot before it could communicate with the APs. I'm at a loss to explain why. Can the 7205 blacklist APs? It didn't appear to be an issue with the switches, as all switches could ping the APs. It was just AP<->controller traffic that was affected.

I wasn't able to open a support case as the client needed everything up ASAP and the problem isn't reproducible.