2610 Crash after BPDU Filter and MSTP Change

 
a_gizmo
Visitor

Hi everybody,

 

Does anybody have an idea what would cause the following crashes on an HP 2610 48-port PoE switch running software R11.95?

 

 

# sh boot-history
Master -- Saved Crash Information (most recent first):
======================================================

SubSystem 0 went down:  03/13/13 13:01:42
Divide by Zero Error: IP=0x803c96f8 Task='mEaseCtrl'
  Task ID=0x85dcc710 fp:0x00000000 sp:0x85dcc5a0 ra:0x803c96d8 sr:0x1000fc01

SubSystem 0 went down:  03/08/13 13:02:14
TLB Miss:  Virtual Addr=0x00000060 IP=0x00000060 Task=''
  Task ID=0x85be8c90 fp:0x00000000 sp:0x85be7168 ra:0x00000060 sr:0x1000fc01
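
With this many switches involved, it can help to tally the crash records by error type rather than read each one by eye. The following is only a minimal sketch: it assumes the `sh boot-history` output has already been collected as text from each switch (how that is done is not shown), and the regular expressions are written against the two records pasted above, so other firmware versions may need adjustments.

```python
import re
from collections import Counter

# Sample text copied from the boot-history output above.
SAMPLE = """\
SubSystem 0 went down:  03/13/13 13:01:42
Divide by Zero Error: IP=0x803c96f8 Task='mEaseCtrl'
  Task ID=0x85dcc710 fp:0x00000000 sp:0x85dcc5a0 ra:0x803c96d8 sr:0x1000fc01

SubSystem 0 went down:  03/08/13 13:02:14
TLB Miss:  Virtual Addr=0x00000060 IP=0x00000060 Task=''
  Task ID=0x85be8c90 fp:0x00000000 sp:0x85be7168 ra:0x00000060 sr:0x1000fc01
"""

# Assumed record layout: a "went down" header line followed by a detail line.
HEADER = re.compile(r"^SubSystem \d+ went down:\s+(\S+) (\S+)$")
TASK = re.compile(r"Task='([^']*)'")


def parse_boot_history(text):
    """Return one dict per crash record: date, time, error type, task name."""
    records = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        header = HEADER.match(line.strip())
        if not header or i + 1 >= len(lines):
            continue
        detail = lines[i + 1].strip()
        # The error type is everything before the first colon on the detail line.
        error = detail.split(":", 1)[0]
        task = TASK.search(detail)
        records.append({
            "date": header.group(1),
            "time": header.group(2),
            "error": error,
            "task": task.group(1) if task else "",
        })
    return records


records = parse_boot_history(SAMPLE)
counts = Counter(r["error"] for r in records)
for r in records:
    print(r["date"], r["time"], r["error"], repr(r["task"]))
```

Running the same parser over the output of every switch and summing the counters would show quickly whether the 3/13/13 divide-by-zero event is a one-off.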

Basic network layout: We have an HP 8206zl as our core switch and primary router. From there, a 2610 switch in each building of our university campus connects back to the core. In some locations we have to daisy-chain another 2610 off the first to get enough port density. There are no redundant STP links between devices. So in simplest terms, the network radiates from the core switch and is never more than 2 switches deep.
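
The layout described above can be checked mechanically. The sketch below is illustrative only: the switch names and the `LINKS` list are hypothetical and stand in for a real inventory of uplinks. It relies on a standard graph fact: a connected topology has no redundant links exactly when it has one link fewer than it has switches.

```python
from collections import deque

# Hypothetical uplinks as (switch, upstream neighbour) pairs.
LINKS = [
    ("bldg-a-1", "core"),
    ("bldg-b-1", "core"),
    ("bldg-a-2", "bldg-a-1"),  # daisy-chained for port density
]


def depth_from_core(links, core="core"):
    """Breadth-first search from the core; returns {switch: hops from core}."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    depth = {core: 0}
    queue = deque([core])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth


def is_loop_free(links, core="core"):
    """True if every switch reaches the core and there are no redundant links."""
    nodes = {n for link in links for n in link} | {core}
    connected = len(depth_from_core(links, core)) == len(nodes)
    # A connected graph is a tree exactly when links == switches - 1.
    return connected and len(links) == len(nodes) - 1


depth = depth_from_core(LINKS)
print("loop free:", is_loop_free(LINKS))
print("deepest switch is", max(depth.values()), "hops from the core")
```

Adding a second uplink, for example `("bldg-a-2", "core")`, makes `is_loop_free` return `False`, which is the situation STP or loop protection exists to handle.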

 

When it started: For over 1.5 years our network had been the victim of MSTP topology recalculations that happened within hours, minutes, sometimes seconds of each other. We had talked to different HP engineers and third-party consultants about the matter, but nothing seemed to help. To continue the search for a solution, last Friday, 3/08/13, we redesigned our MSTP configuration. We reduced the instance count from 5 to 2, combining all of our regular network VLANs into instance 1 and moving the SAN VLAN to instance 2. Along with the MSTP change, we also enabled BPDU filtering on all of the edge ports. All of our work was completed by 12:00pm, and we checked the config of all the switches to make sure they all matched. Just after 1:00pm, 42 of our 91 2610 switches crashed and rebooted. They all reported an error similar to the one above timestamped 3/8/13. Over the past few days, 17 of the remaining switches that had not yet rebooted have crashed with the same message. Before and after the crashes, the switches appeared to be operating normally and nothing in the logs indicated an abnormality. I have a feeling the remaining 32 switches will crash one day too. Only one switch crashed more than once, and its error message was different (the event above timestamped 3/13/13), so it may be unrelated.
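
Checking that every switch matches matters because MSTP switches only form a single region when their configuration name, revision level, and VLAN-to-instance mapping are all identical. The sketch below shows one way to automate that comparison. Everything in `SWITCHES` is hypothetical (host names, region name, revision, and VLAN IDs are not from the post), and it assumes the settings have already been extracted from each switch's configuration.

```python
from collections import defaultdict

# Hypothetical per-switch MSTP region settings; VLAN IDs are made up.
SWITCHES = {
    "bldg-a-1": {"name": "campus", "revision": 2,
                 "instances": {1: {10, 20, 30}, 2: {100}}},
    "bldg-b-1": {"name": "campus", "revision": 2,
                 "instances": {1: {10, 20, 30}, 2: {100}}},
    # VLAN 30 is missing from instance 1 here, so this switch is the odd one out.
    "bldg-c-1": {"name": "campus", "revision": 2,
                 "instances": {1: {10, 20}, 2: {100}}},
}


def region_key(cfg):
    """Hashable summary of the three settings that define an MSTP region."""
    mapping = tuple(
        sorted((inst, tuple(sorted(vlans)))
               for inst, vlans in cfg["instances"].items())
    )
    return (cfg["name"], cfg["revision"], mapping)


def group_by_region(switches):
    """Group host names by identical region settings."""
    groups = defaultdict(list)
    for host, cfg in sorted(switches.items()):
        groups[region_key(cfg)].append(host)
    return list(groups.values())


groups = group_by_region(SWITCHES)
if len(groups) > 1:
    print("MSTP region mismatch:")
for hosts in groups:
    print("  same settings:", ", ".join(hosts))
```

More than one group in the output means at least one switch would treat its neighbours as a different region.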

 

Besides all of that, I'm happy to report that we finally tracked down all of the causes of our STP topology recalculations. It was a mix of HP Intellijacks and MSM APs that had errors and would crash, reboot, or throw errors on the uplink ports. On a network that has 111 switches and over 350 APs (half of which are MSM317s spread between two residence halls) the last recalculation was 27 hours ago and counting.

paulgear
Esteemed Contributor

Re: 2610 Crash after BPDU Filter and MSTP Change

Hi a_gizmo,

 

Regardless of what caused it, your best bet is to report it to HP and ask for fixed software.  Given that you've had HP engineers involved in diagnosing the problem, they should be able to reproduce it reasonably well.  (One would hope... ;-)

Regards,
Paul
Packet-Ghost
Occasional Advisor

Re: 2610 Crash after BPDU Filter and MSTP Change

Hi,

 

Just out of curiosity: you say there are "no redundant STP links between devices". Is this only true for connections between core and edge, or throughout the network?

 

And why run STP at all if there are "no redundant STP links"?

 

At the very least, I wouldn't enable STP on the edge switches in that case; I'd just use loop protection on the edge. That would reduce the number of STP devices and ease troubleshooting, I guess.

 

Or am I missing something?

 

K.