stp loop-protection (HP 1920)

Dunky
Regular Advisor

Can anyone tell me what the purpose of the "stp loop-protection" command is, and whether it should be enabled on inter-switch links and BAGG interfaces?

Thanks

13 REPLIES
Ian Vaughan
Honored Contributor

Re: stp loop-protection

Howdy,

The stp loop-protection feature on Comware-based switches is a stability mechanism for the spanning tree topology, so that it doesn't kick off topology changes if you get momentary BPDU starvation on your inter-switch links.

It should only be applied to the upstream links that you expect to be participating in spanning tree, i.e. the root port and the alternate (standby) ports that are "facing" the root bridge on the network.

The ports with stp loop-protection applied won't forward traffic until the switch logically above them sends them a BPDU, so it's kind of the opposite of BPDU guard, which shuts down a port if a BPDU is detected on it.
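
To make the behaviour described above concrete, here is a deliberately simplified Python model. It is only a sketch of the idea, not Comware's implementation: the function name, the role strings and the "unchanged" return value are all made up for illustration, and timers and per-instance state are ignored.

```python
def state_after_bpdu_timeout(role, loop_protection):
    """Toy model: where does a port end up once its BPDUs have aged out?

    role            -- "root", "alternate" or anything else (hypothetical labels)
    loop_protection -- True if stp loop-protection is configured on the port
    """
    if role not in ("root", "alternate"):
        # This sketch only models upstream-facing ports.
        return "unchanged"
    if loop_protection:
        # The port is held in discarding until BPDUs are seen again.
        return "discarding"
    # Without protection the port is re-elected as a designated port
    # and will eventually forward, which is how a loop can form.
    return "forwarding"


for role in ("root", "alternate"):
    for protected in (False, True):
        print(role, protected, state_after_bpdu_timeout(role, protected))
```

The point of the model is just the difference between the last two branches: with protection a silent upstream port stays blocked, without it the port can end up forwarding.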

The L2 configuration manual for a 5130ei covers the subject in some detail :-) follow the white rabbit down the Tech support / manuals link from the product page.

Should work just as well on BAGG ports as on normal skinny ports. Use the same version of STP throughout and get your MSTP parameters to match across the topology if that's what you are using - let us know how you get on.

HTH - GKIID

Ian

 

Hope that helps - please click "Thumbs up" for Kudos if it does
## ---------------------------------------------------------------------------##
Which is the only cheese that is made backwards?
Edam!
Tweets: @2techie4me
Dunky
Regular Advisor

Re: stp loop-protection

Hi Ian,

This is what I am seeing in the log of SWITCH-03

%Aug 22 19:27:49:163 2016 SWITCH-03 MSTP/5/MSTP_BPDU_RECEIVE_EXPIRY: Instance 0's port Bridge-Aggregation2 received no BPDU within the rcvdInfoWhile interval. Information of the port aged out.
%Aug 22 19:27:49:164 2016 SWITCH-03 MSTP/6/MSTP_DISCARDING: Instance 0's port Bridge-Aggregation2 has been set to discarding state.
%Aug 22 19:27:49:165 2016 SWITCH-03 MSTP/4/MSTP_LOOP_PROTECTION: Instance 0's LOOP-Protection port Bridge-Aggregation2 failed to receive configuration BPDUs.
%Aug 22 19:27:49:166 2016 SWITCH-03 MSTP/5/MSTP_BPDU_RECEIVE_EXPIRY: Instance 1's port Bridge-Aggregation2 received no BPDU within the rcvdInfoWhile interval. Information of the port aged out.
%Aug 22 19:27:49:167 2016 SWITCH-03 MSTP/6/MSTP_DISCARDING: Instance 1's port Bridge-Aggregation2 has been set to discarding state.
%Aug 22 19:27:49:168 2016 SWITCH-03 MSTP/4/MSTP_LOOP_PROTECTION: Instance 1's LOOP-Protection port Bridge-Aggregation2 failed to receive configuration BPDUs.
%Aug 22 19:27:49:169 2016 SWITCH-03 MSTP/5/MSTP_BPDU_RECEIVE_EXPIRY: Instance 2's port Bridge-Aggregation2 received no BPDU within the rcvdInfoWhile interval. Information of the port aged out.
%Aug 22 19:27:49:170 2016 SWITCH-03 MSTP/6/MSTP_DISCARDING: Instance 2's port Bridge-Aggregation2 has been set to discarding state.
%Aug 22 19:27:49:171 2016 SWITCH-03 MSTP/4/MSTP_LOOP_PROTECTION: Instance 2's LOOP-Protection port Bridge-Aggregation2 failed to receive configuration BPDUs.
%Aug 22 19:28:32:698 2016 SWITCH-03 LAGG/5/LAGG_INACTIVE_PARTNER: Member port GigabitEthernet1/0/23 of aggregation group BAGG2 becomes INACTIVE because the port's partner is improper for being attached.
%Aug 22 19:28:32:748 2016 SWITCH-03 MSTP/6/MSTP_FORWARDING: Instance 0's port Bridge-Aggregation2 has been set to forwarding state.
%Aug 22 19:28:32:750 2016 SWITCH-03 MSTP/6/MSTP_DETECTED_TC: Instance 0's port Bridge-Aggregation2 detected a topology change.
%Aug 22 19:28:32:751 2016 SWITCH-03 MSTP/6/MSTP_FORWARDING: Instance 1's port Bridge-Aggregation2 has been set to forwarding state.
%Aug 22 19:28:32:752 2016 SWITCH-03 MSTP/6/MSTP_FORWARDING: Instance 2's port Bridge-Aggregation2 has been set to forwarding state.
%Aug 22 19:28:32:929 2016 SWITCH-03 LAGG/5/LAGG_ACTIVE: Member port GigabitEthernet1/0/23 of aggregation group BAGG2 becomes ACTIVE.
%Aug 22 19:28:33:284 2016 SWITCH-03 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port Bridge-Aggregation2 was notified of a topology change.
%Aug 22 19:28:34:071 2016 SWITCH-03 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port Bridge-Aggregation2 was notified of a topology change.

 

The relevant parts of the config are here....

stp region-configuration
 region-name XXX
 revision-level 1
 instance 1 vlan 3 to 499
 instance 2 vlan 2 500 to 599
 active region-configuration
#
 stp bpdu-protection
 stp enable
#
interface Bridge-Aggregation1
 description LAG to -02
 port link-type trunk
 port trunk permit vlan all
 link-aggregation mode dynamic
 stp loop-protection
#
interface Bridge-Aggregation2
 description LAG to -00
 port link-type trunk
 port trunk permit vlan all
 link-aggregation mode dynamic
 stp loop-protection
#
interface GigabitEthernet1/0/21
 description LAG member to -02/g23
 port link-type trunk
 port trunk permit vlan all
 speed 1000
 duplex full
 port auto-power-down
 port link-aggregation group 1
#
interface GigabitEthernet1/0/22
 description LAG member to -02/g24
 port link-type trunk
 port trunk permit vlan all
 speed 1000
 duplex full
 port auto-power-down
 port link-aggregation group 1
#
interface GigabitEthernet1/0/23
 description LAG member to -00/g20
 port link-type trunk
 port trunk permit vlan all
 speed 1000
 duplex full
 port auto-power-down
 port link-aggregation group 2
#
interface GigabitEthernet1/0/24
 description LAG member to -00/g19
 port link-type trunk
 port trunk permit vlan all
 speed 1000
 duplex full
 port auto-power-down
 port link-aggregation group 2

 

What would be causing:

SWITCH-03 MSTP/5/MSTP_BPDU_RECEIVE_EXPIRY: Instance 0's port Bridge-Aggregation2 received no BPDU within the rcvdInfoWhile interval. Information of the port aged out.
SWITCH-03 MSTP/4/MSTP_LOOP_PROTECTION: Instance 0's LOOP-Protection port Bridge-Aggregation2 failed to receive configuration BPDUs.
SWITCH-03 LAGG/5/LAGG_INACTIVE_PARTNER: Member port GigabitEthernet1/0/23 of aggregation group BAGG2 becomes INACTIVE because the port's partner is improper for being attached.

 

I wonder if the SWITCH-03 LAGG/5/LAGG_INACTIVE_PARTNER error on 1/0/23 is the root cause of the problems I am seeing.  
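
One way to test that hunch is to look at the order of the events rather than just their presence. The following is a minimal Python sketch, assuming the log line layout shown above; the regular expression and the names LOG, LINE_RE, parse and offsets are illustrative only, and the messages are shortened copies of three lines from the SWITCH-03 log.

```python
import re
from datetime import datetime

# Three lines copied from the SWITCH-03 log above (message text shortened).
LOG = """\
%Aug 22 19:27:49:163 2016 SWITCH-03 MSTP/5/MSTP_BPDU_RECEIVE_EXPIRY: Instance 0's port Bridge-Aggregation2 received no BPDU
%Aug 22 19:28:32:698 2016 SWITCH-03 LAGG/5/LAGG_INACTIVE_PARTNER: Member port GigabitEthernet1/0/23 becomes INACTIVE
%Aug 22 19:28:32:748 2016 SWITCH-03 MSTP/6/MSTP_FORWARDING: Instance 0's port Bridge-Aggregation2 has been set to forwarding state.
"""

# %Mon DD HH:MM:SS:mmm YYYY host MODULE/severity/MNEMONIC:
LINE_RE = re.compile(
    r"^%(\w{3}) +(\d+) (\d\d:\d\d:\d\d):(\d{3}) (\d{4}) \S+ \w+/\d/(\w+):"
)


def parse(line):
    """Return (timestamp, mnemonic) for one log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    mon, day, hms, ms, year, mnemonic = m.groups()
    ts = datetime.strptime(f"{year} {mon} {day} {hms}.{ms}", "%Y %b %d %H:%M:%S.%f")
    return ts, mnemonic


events = [e for e in map(parse, LOG.splitlines()) if e is not None]
first = events[0][0]

# Seconds elapsed since the first event, keyed by message mnemonic.
offsets = {name: (ts - first).total_seconds() for ts, name in events}

for name, secs in offsets.items():
    print(f"{secs:8.3f}s  {name}")
```

On these three lines the LAGG_INACTIVE_PARTNER message arrives about 43.5 seconds after the first BPDU expiry and only 50 ms before the port goes back to forwarding, so on this evidence it looks more like part of the recovery than the trigger, although the log alone cannot prove that.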

FYI, the configuration at the other end of the BAGG2 on SWITCH-00 is...

stp region-configuration
 region-name XXX
 revision-level 1
 instance 1 vlan 3 to 499
 instance 2 vlan 2 500 to 599
 active region-configuration
#
 stp instance 0 root primary
 stp instance 1 root primary
 stp instance 2 root primary
 stp bpdu-protection
 stp enable
#
interface Bridge-Aggregation2
 description LAG to -03
 port link-type trunk
 port trunk permit vlan all
 link-aggregation mode dynamic
 stp loop-protection
#
interface GigabitEthernet1/0/19
 description LAG member to -03/g24
 port link-type trunk
 port trunk permit vlan all
 speed 1000
 duplex full
 port auto-power-down
 port link-aggregation group 2
#
interface GigabitEthernet1/0/20
 description LAG member to -03/g23
 port link-type trunk
 port trunk permit vlan all
 speed 1000
 duplex full
 port auto-power-down
 port link-aggregation group 2

Dunky
Regular Advisor

Re: stp loop-protection

This may provide some insight as to the frequency of the issue...

[SWITCH-03]dis stp his
--------------- STP slot 1 history trace ---------------
------------------- Instance 0 ---------------------

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/22 19:28:32
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/22 19:27:49
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/20 12:31:57
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/20 12:31:14
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/18 06:35:30
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/18 06:34:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/16 22:01:06
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/16 22:00:23
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation1
Role change : Root->Desi
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 36
32768.2c23-3a81-47c6 128.30

Port Bridge-Aggregation1
Role change : Desi->Root
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 36
32768.2c23-3a81-47c6 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/07 10:26:33
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/01 10:00:25
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/01 09:59:38
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/31 05:14:06
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/31 05:13:22
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/21 03:52:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/21 03:52:01
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/18 23:06:28
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/18 23:05:41
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2000/04/26 12:00:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 0
0.2c23-3a7f-d506 128.30
------------------- Instance 1 ---------------------

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/22 19:28:32
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/22 19:27:49
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/20 12:31:57
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/20 12:31:14
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/18 06:35:30
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/18 06:34:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/16 22:01:06
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/16 22:00:23
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation1
Role change : Root->Desi
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 36 32768.2c23-3a81-47c6 128.30

Port Bridge-Aggregation1
Role change : Desi->Root
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 36 32768.2c23-3a81-47c6 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/07 10:26:33
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/01 10:00:25
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/01 09:59:38
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/31 05:14:06
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/31 05:13:22
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/21 03:52:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/21 03:52:01
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/18 23:06:28
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/18 23:05:41
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2000/04/26 12:00:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30
------------------- Instance 2 ---------------------

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/22 19:28:32
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/22 19:27:49
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/20 12:31:57
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/20 12:31:14
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/18 06:35:30
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/18 06:34:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/16 22:01:06
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/16 22:00:23
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation1
Role change : Root->Desi
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 36 32768.2c23-3a81-47c6 128.30

Port Bridge-Aggregation1
Role change : Desi->Root
Time : 2016/08/07 10:27:21
Port priority : 0.2c23-3a7f-d506 36 32768.2c23-3a81-47c6 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/07 10:26:33
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/01 10:00:25
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/01 09:59:38
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/31 05:14:06
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/31 05:13:22
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/21 03:52:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/21 03:52:01
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/07/18 23:06:28
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/07/18 23:05:41
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2000/04/26 12:00:47
Port priority : 0.2c23-3a7f-d506 0 0.2c23-3a7f-d506 128.30
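
The history trace is easier to read once each "Root->Desi (Aged)" entry is paired with the "Desi->Root" entry that follows it. Below is a minimal Python sketch of that pairing, assuming the three-line record layout shown above; HISTORY is a trimmed copy of the newest Instance 0 entries, and REC_RE and outages are hypothetical names used only for this example.

```python
import re
from datetime import datetime

# Trimmed copy of the newest Instance 0 records (Port priority lines left out).
HISTORY = """\
Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/22 19:28:32

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/22 19:27:49

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/20 12:31:57

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/20 12:31:14

Port Bridge-Aggregation2
Role change : Desi->Root
Time : 2016/08/18 06:35:30

Port Bridge-Aggregation2
Role change : Root->Desi (Aged)
Time : 2016/08/18 06:34:47
"""

REC_RE = re.compile(
    r"Port (\S+)\nRole change : (.+)\nTime : (\d{4}/\d\d/\d\d \d\d:\d\d:\d\d)"
)


def outages(text, port):
    """Return (aged_at, seconds_until_root_again) pairs for one port."""
    recs = [
        (datetime.strptime(when, "%Y/%m/%d %H:%M:%S"), role.strip())
        for name, role, when in REC_RE.findall(text)
        if name == port
    ]
    recs.sort()  # the switch prints newest first; we want oldest first
    result = []
    aged_at = None
    for ts, role in recs:
        if role == "Root->Desi (Aged)":
            aged_at = ts
        elif role == "Desi->Root" and aged_at is not None:
            result.append((aged_at, int((ts - aged_at).total_seconds())))
            aged_at = None
    return result


durations = [secs for _, secs in outages(HISTORY, "Bridge-Aggregation2")]
for aged_at, secs in outages(HISTORY, "Bridge-Aggregation2"):
    print(aged_at, f"{secs}s until the port was root again")
```

For the three events in this sample the gap is 43 seconds each time, and working through the rest of the Instance 0 listing by hand gives gaps of between 43 and 48 seconds for every occurrence since 2016/07/18. That looks more like a timer-driven recovery than a random link fault, but that is an inference from the timestamps, not something the trace states.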

Ian Vaughan
Honored Contributor

Re: stp loop-protection

Hello,

Just for the sake of clarity could you please sketch out your topology on a napkin / fag packet / whiteboard and take a picture of it and post that up to the forum.

That would be awesome. I promise I'll take a peek later when I've finished juggling today's alligators. :-)

BTW - The MS "Office lens" app is really good at taking pictures of diagrams on whiteboards etc even at dodgy angles. I take a quick snap of everything I draw on the whiteboard now and it magically ends up in OneDrive - Happy Days!

Cheers

Ian

 

Ian Vaughan
Honored Contributor

Re: stp loop-protection

And just one more thing...

If you are having to nail your links to 1000/Full instead of auto/auto you most probably have a dodgy patch cable or an iffy bit of infrastructure in the way - that will deffo $%!* up your LACP and your STP.

Gig to gig should (almost) always be auto/auto - you might have a single dodgy copper wire out of the 8 in one of the leads.

Try to isolate which connection won't negotiate auto/auto to 1000/Full, and that's where I would start.

Ta

Ian

Dunky
Regular Advisor

Re: stp loop-protection

Hi Ian,

I uploaded a PNG of the LAN in one of the above posts.

Steve

Dunky
Regular Advisor

Re: stp loop-protection

Hi Ian,
The interfaces that form the BAGG are 1000/FULL only because that's how I hard-coded them from scratch, rather than leaving them at auto/auto.
Looking at these bits in the logs....

SWITCH-03 MSTP/5/MSTP_BPDU_RECEIVE_EXPIRY: Instance 0's port Bridge-Aggregation2 received no BPDU within the rcvdInfoWhile interval. Information of the port aged out.
SWITCH-03 MSTP/4/MSTP_LOOP_PROTECTION: Instance 0's LOOP-Protection port Bridge-Aggregation2 failed to receive configuration BPDUs.
SWITCH-03 LAGG/5/LAGG_INACTIVE_PARTNER: Member port GigabitEthernet1/0/23 of aggregation group BAGG2 becomes INACTIVE because the port's partner is improper for being attached.

... what would cause this?  As you can see from the "dis stp his" output in an earlier post, this happens during the night as well as during the day.  Then it will run for days with no issues whatsoever.  There are no errors showing on any of the interfaces, so I am at a bit of a loss to explain what the cause is.

Hoping you may be able to shed some light, I really appreciate you putting your time in to assist.

Steve

Ian Vaughan
Honored Contributor

Re: stp loop-protection

Howdy,

A handful of things after a quick scan:

1) Auto power-down the port if no packets go across it? Maybe so on an edge port but on a member of a LAGG uplink port that's part of the STP topology? For peace of mind and just to rule it out I would remove that line from the port config and take the hit on the electricity bill :-)

2) I'm still a fan of gigabit links being auto/auto. If you don't have nice stable gigabit links it's usually a layer 1 (physical) problem rather than a negotiation issue. If you can't get stability with auto/auto your Gig ports are trying to tell you something. Failed negotiation is a message that something isn't quite right.

3) Backhaul from the edge switches e.g. switch06/g26 - switch02/g26 described as "wireless" - is this a p2p wifi link or a laser or something of that ilk? Is it stable, does it ever flap?

4) You can get BPDU starvation if there's a lot of chat on the network (even a sudden burst) and the STP keepalives get lost in the wash. You could implement broadcast suppression and unicast suppression to lessen the impact of any storms that might be happening but it is flood prevention not a cure for the underlying ills.

broadcast-suppression pps 200
multicast-suppression pps 200
unicast-suppression pps 200
# the unicast one covers "unknown" unicast, not your normal prod traffic.

If 200 seems a bit high, halve it. If you run multicast streams as normal traffic, keep doubling it or use the % traffic counter in the config rather than the pps value. When you stop getting alerts you'll know a "happy medium" level for your network. Google for the broadcast suppression stuff and get your head around it first - there are a couple of blog posts and some posts on this forum as well.
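
To get a feel for how a pps threshold compares with a percentage, the arithmetic can be sketched in Python. This only uses standard Ethernet framing overhead (8 bytes of preamble/SFD plus a 12-byte inter-frame gap per frame); how the switch itself turns a percentage into a rate is not confirmed anywhere in this thread, and the function names are made up for the example.

```python
PREAMBLE_AND_IFG = 20  # bytes on the wire per frame besides the frame itself


def max_frames_per_second(link_bps, frame_bytes):
    """Theoretical frame rate of a link for a given frame size (FCS included)."""
    return link_bps / ((frame_bytes + PREAMBLE_AND_IFG) * 8)


def pps_as_percent(pps, link_bps=1_000_000_000, frame_bytes=64):
    """Express a pps value as a percentage of the theoretical maximum frame rate."""
    return 100.0 * pps / max_frames_per_second(link_bps, frame_bytes)


for size in (64, 1518):
    print(
        f"{size:>5}-byte frames: "
        f"max {max_frames_per_second(1_000_000_000, size):,.0f} pps, "
        f"200 pps is {pps_as_percent(200, frame_bytes=size):.4f}% of that"
    )
```

On a gigabit link, 200 pps of minimum-size frames is roughly 0.013% of the theoretical maximum, so a pps limit in the low hundreds is far tighter than even a 1% ratio would be.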

Are your switches aligned with an NTP timeserver? It makes troubleshooting way easier if the clocks are aligned.

Are you forwarding / aggregating at a syslog server? Same reason as above. You could always grab a copy of the IMC monitoring / management software as a 60 day free trial and see if it helps solve the problem, as it can build an MST instance / topology view.  Take a look at this.

One of the hardest of these to solve was (a bit like this) Intel NICs on new laptops having a massive IPv6 neighbour discovery storm when they were supposed to be asleep. The CPU on the switch got hammered and the switch turned into a massive hub and flooded everything everywhere, making everything worse. Is the CPU on SWITCH-00 impacted at the same time as the BPDUs stop arriving at SWITCH-03?

Another was cheap access points that undercut the core switch MAC address and asserted themselves as STP primary and secondary roots. They wobbled twice a day as their ARP tables filled up, and they brought the whole place to a halt while the topology rocked back and forth before stabilizing.

Another step in the right direction I hope.

Cheers

Ian

Dunky
Regular Advisor

Re: stp loop-protection

Hi Ian,

Thanks for the comprehensive reply :)

In response......

1) Agreed. It's just part of the default interface config we apply that I never removed.

2) I will change this when I can; I assume it will impact the network, so it will need to be done OOH.

3) Sorry, there are lots of VLANs, many wired and a handful of wireless SSIDs - I use MST instances and STP costs to 'route' the wired VLANs over one fibre and the wireless VLANs across the other for load balancing.

4) It is always SWITCH-03 not receiving BPDUs from SWITCH-00 via its BAGG interface that I see in the logs - no other switches are reporting loss of BPDUs.

I would be surprised if it was any sort of broadcast storm, given that it happens OOH and at different times (so it's not like a large backup kicking in etc., as that would be the same time every day).

There is no multicast (at least I haven't configured any IGMP queriers etc.) to the best of my knowledge.

Yes, the switches are aligned to the NTP server on the firewall, which is synced to the ntp.org pool.

Yes, I am logging to a syslog server.

The YouTube vid you linked to looks cool - I will look at the 60 day trial next Tuesday, which will be the earliest I will be able to get access.

The IPv6 stuff looks interesting. Is there a way of just blocking this on all the ingress access ports?  I have noticed some Apple MacBooks that do lots of down/up, even through the night when nobody is on-site.

I don't know if the CPU on SWITCH-00 spikes, as I am always looking at historical events - is there a command that allows you to see the CPU utilization over a period of time?

I don't think another device being elected as the STP root will be an issue, as I have forced switch-00 to be the primary and switch-01 to be the secondary for all MST instances.  (BTW, switch-00 is where the internet connection is.)

Many thanks for your advice so far Ian, I very much appreciate it.

Steve

 

 

 

Dunky
Regular Advisor

Re: stp loop-protection

I have been able to identify a bit more information on this problem this morning.

It appears that I do get a loop on the LAN as I have seen the following on the syslog server...

Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333397] net_ratelimit: 2883 callbacks suppressed
Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333400] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333463] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333475] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333492] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333553] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332217] net_ratelimit: 1068885 callbacks suppressed
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332220] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332226] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332232] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332237] vlan200: received packet on eth1.200 with own address as source address
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332243] vlan200: received packet on eth1.200 with own address as source address

Based on the 'callbacks suppressed' figures (2883 and then 1068885) it looks as though a huge number of kernel log messages were suppressed (I assume that's what it means, hence at least as many packets). BTW it's a WatchGuard firewall.
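
To put a rough number on that, the two net_ratelimit lines can be turned into a rate. This is a minimal Python sketch under two assumptions that the log itself does not confirm: that each count covers the period since the previous net_ratelimit line, and that each suppressed callback corresponds to one looped packet warning. SYSLOG, RATELIMIT_RE and suppressed_rates are names invented for the example.

```python
import re

# The two net_ratelimit lines from the firewall log above.
SYSLOG = """\
Aug 29 18:06:59 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:06:59) kernel [2072766.333397] net_ratelimit: 2883 callbacks suppressed
Aug 29 18:07:04 10.125.3.1 FIREWALL-00 local0 warning (2016-08-29T17:07:04) kernel [2072771.332217] net_ratelimit: 1068885 callbacks suppressed
"""

# Captures the kernel uptime stamp and the suppressed-message count.
RATELIMIT_RE = re.compile(
    r"kernel \[(\d+\.\d+)\] net_ratelimit: (\d+) callbacks suppressed"
)


def suppressed_rates(text):
    """Suppressed messages per second between consecutive net_ratelimit lines."""
    hits = [(float(stamp), int(count)) for stamp, count in RATELIMIT_RE.findall(text)]
    rates = []
    for (t_prev, _), (t_now, count) in zip(hits, hits[1:]):
        rates.append(count / (t_now - t_prev))
    return rates


rates = suppressed_rates(SYSLOG)
for rate in rates:
    print(f"about {rate:,.0f} suppressed messages per second")
```

Under those assumptions the second line works out to a little under 214,000 suppressed messages per second over a five-second window, which is far more than "thousands" and consistent with a genuine layer 2 loop on vlan200.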

At this stage I am not really sure where to start looking.  Lack of BPDUs could be caused by the loop creating a storm, but what is causing STP to transition to a forwarding state, and hence cause the loop in the first place, albeit only for a very short period before everything goes back to normal?  It appears that STP is not working as it should if it is creating loops for a short period.

Dunky
Regular Advisor

Re: stp loop-protection

Hi Ian,

Just downloaded IMC as you suggested and attempted to install it, but it appears that SQL is a prerequisite, so it's a bit of a non-starter I'm afraid.

 

Dunky
Regular Advisor

Re: stp loop-protection

I have removed "port auto-power-down" from the link aggregation member interfaces and set speed/duplex to auto on all of them.

Will wait and see if this makes any difference.

 

Dunky
Regular Advisor

Re: stp loop-protection

Problem has reappeared.

I can see why loops are being introduced - switch-06 is transitioning interfaces to FORWARDING on an uplink before it sets them to DISCARDING on the other uplink !!

%Sep 10 08:57:29:258 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/25 was notified of a topology change.
%Sep 10 08:57:30:905 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/25 was notified of a topology change.
%Sep 10 08:57:31:007 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/25 was notified of a topology change.
%Sep 10 08:57:31:135 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 2's port GigabitEthernet1/0/26 has been set to discarding state.
%Sep 10 08:57:31:136 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 2's port GigabitEthernet1/0/25 has been set to forwarding state.
%Sep 10 08:57:32:916 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/25 was notified of a topology change.
%Sep 10 08:58:00:255 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 0's port GigabitEthernet1/0/26 has been set to forwarding state.
%Sep 10 08:58:00:256 2016 SWITCH-06 MSTP/6/MSTP_DETECTED_TC: Instance 0's port GigabitEthernet1/0/26 detected a topology change.
%Sep 10 08:58:00:257 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 1's port GigabitEthernet1/0/26 has been set to forwarding state.
%Sep 10 08:58:00:258 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 2's port GigabitEthernet1/0/26 has been set to forwarding state.
%Sep 10 08:58:13:531 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/26 was notified of a topology change.
%Sep 10 08:58:13:532 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 0's port GigabitEthernet1/0/26 has been set to discarding state.
%Sep 10 08:58:13:533 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 1's port GigabitEthernet1/0/26 has been set to discarding state.
%Sep 10 08:58:13:534 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 2's port GigabitEthernet1/0/25 has been set to discarding state.
%Sep 10 08:58:13:720 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/25 was notified of a topology change.
%Sep 10 08:58:14:933 2016 SWITCH-06 MSTP/6/MSTP_NOTIFIED_TC: Instance 0's port GigabitEthernet1/0/25 was notified of a topology change.

 

Normal operation is:

1/0/25  Forwarding Instances 0,1 and Discard instance 2

1/0/26  Discard instances 0,1 and Forwarding instance 2

so it can be clearly seen in the log above that I end up with all three instances being forwarded out of both g1/0/25 and g1/0/26.
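
The overlap can be checked mechanically by replaying the state changes. The following is a minimal Python sketch, assuming the starting state is the "normal operation" described above and that every state change on the two uplinks was logged; LOG holds only the MSTP_DISCARDING and MSTP_FORWARDING lines from the paste, and STATE_RE, NORMAL and double_forwarding_windows are illustrative names.

```python
import re
from datetime import datetime

# Only the state-change lines from the SWITCH-06 log above.
LOG = """\
%Sep 10 08:57:31:135 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 2's port GigabitEthernet1/0/26 has been set to discarding state.
%Sep 10 08:57:31:136 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 2's port GigabitEthernet1/0/25 has been set to forwarding state.
%Sep 10 08:58:00:255 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 0's port GigabitEthernet1/0/26 has been set to forwarding state.
%Sep 10 08:58:00:257 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 1's port GigabitEthernet1/0/26 has been set to forwarding state.
%Sep 10 08:58:00:258 2016 SWITCH-06 MSTP/6/MSTP_FORWARDING: Instance 2's port GigabitEthernet1/0/26 has been set to forwarding state.
%Sep 10 08:58:13:532 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 0's port GigabitEthernet1/0/26 has been set to discarding state.
%Sep 10 08:58:13:533 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 1's port GigabitEthernet1/0/26 has been set to discarding state.
%Sep 10 08:58:13:534 2016 SWITCH-06 MSTP/6/MSTP_DISCARDING: Instance 2's port GigabitEthernet1/0/25 has been set to discarding state.
"""

STATE_RE = re.compile(
    r"^%\w{3} +\d+ (\d\d:\d\d:\d\d):(\d{3}) \d{4} \S+ "
    r"MSTP/6/MSTP_(DISCARDING|FORWARDING): Instance (\d+)'s port (\S+) has been set"
)

G25 = "GigabitEthernet1/0/25"
G26 = "GigabitEthernet1/0/26"

# Assumed starting point: the "normal operation" described in the post.
NORMAL = {
    (0, G25): "FORWARDING", (0, G26): "DISCARDING",
    (1, G25): "FORWARDING", (1, G26): "DISCARDING",
    (2, G25): "DISCARDING", (2, G26): "FORWARDING",
}


def double_forwarding_windows(log, initial):
    """Return (instance, seconds) for each period with both uplinks forwarding."""
    state = dict(initial)
    open_since = {}
    windows = []
    for line in log.splitlines():
        m = STATE_RE.match(line)
        if m is None:
            continue
        hms, ms, new_state, inst, port = m.groups()
        ts = datetime.strptime(f"{hms}.{ms}", "%H:%M:%S.%f")
        inst = int(inst)
        state[(inst, port)] = new_state
        forwarding = [p for (i, p), s in state.items() if i == inst and s == "FORWARDING"]
        if len(forwarding) > 1 and inst not in open_since:
            open_since[inst] = ts  # second uplink just started forwarding
        elif len(forwarding) <= 1 and inst in open_since:
            start = open_since.pop(inst)
            windows.append((inst, (ts - start).total_seconds()))
    return windows


windows = double_forwarding_windows(LOG, NORMAL)
for inst, secs in windows:
    print(f"instance {inst}: both uplinks forwarding for {secs:.3f}s")
```

Replayed this way, each of the three instances has both uplinks in forwarding for a little over 13 seconds, from 08:58:00 to 08:58:13. The result depends on the assumed starting state, so it is worth confirming with "display stp brief" before an event rather than relying on the log alone.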

 

This is when I start seeing looping packets in the network and may indicate why I get BPDU starvation on switch-02.

Is this a known bug? If so is there a fix?

Not sure what else I can do if STP is not functioning correctly.