LSP flap after A5500HI upgrade

itcoop
Occasional Advisor

I have a ring of 5500HIs that were upgraded from a5500hi-cmw520-r5501p25 to a5500hi-cmw520-r5501p32 early this morning. Since the upgrade, I have been receiving LSP up/down state changes in my syslog:

2017.06.29-05:14:03 <99.99.99.44>: <188>Jun 29 04:14:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.59 Changes to Down
2017.06.29-05:14:03 <99.99.99.44>: <188>Jun 29 04:14:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.85.60 Changes to Up
2017.06.29-05:14:14 <99.99.99.4>: <188>Jun 29 05:14:14 2017 CR-C %%10LSPM/4/TRAP(t): -DevIP=99.99.99.4; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.82.152 Changes to Down
2017.06.29-05:14:15 <99.99.99.4>: <188>Jun 29 05:14:15 2017 CR-C %%10LSPM/4/TRAP(t): -DevIP=99.99.99.4; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.82.153 Changes to Up
2017.06.29-05:14:22 <99.99.99.8>: <188>Jun 29 05:14:22 2017 WO %%10LSPM/4/TRAP(t): -DevIP=99.99.99.8; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.86.47 Changes to Down
2017.06.29-05:14:22 <99.99.99.8>: <188>Jun 29 05:14:22 2017 WO %%10LSPM/4/TRAP(t): -DevIP=99.99.99.8; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.86.48 Changes to Up
2017.06.29-05:14:23 <99.99.99.2>: <188>Jun 29 05:14:23 2017 CR %%10LSPM/4/TRAP(t): -DevIP=99.99.99.2; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.82.59 Changes to Down
2017.06.29-05:14:23 <99.99.99.2>: <188>Jun 29 05:14:23 2017 CR %%10LSPM/4/TRAP(t): -DevIP=99.99.99.2; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.82.60 Changes to Up
2017.06.29-05:14:31 <99.99.99.43>: <188>Jun 29 05:14:31 2017 CF %%10LSPM/4/TRAP(t): -DevIP=99.99.99.43; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.65 Changes to Down
2017.06.29-05:14:31 <99.99.99.43>: <188>Jun 29 05:14:31 2017 CF %%10LSPM/4/TRAP(t): -DevIP=99.99.99.43; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.85.66 Changes to Up
2017.06.29-05:14:33 <99.99.99.44>: <188>Jun 29 04:14:33 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.60 Changes to Down
2017.06.29-05:14:33 <99.99.99.44>: <188>Jun 29 04:14:33 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.85.61 Changes to Up
2017.06.29-05:14:44 <99.99.99.4>: <188>Jun 29 05:14:44 2017 CR-C %%10LSPM/4/TRAP(t): -DevIP=99.99.99.4; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.82.153 Changes to Down
2017.06.29-05:14:45 <99.99.99.4>: <188>Jun 29 05:14:45 2017 CR-C %%10LSPM/4/TRAP(t): -DevIP=99.99.99.4; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.82.154 Changes to Up
2017.06.29-05:14:52 <99.99.99.8>: <188>Jun 29 05:14:52 2017 WO %%10LSPM/4/TRAP(t): -DevIP=99.99.99.8; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.86.48 Changes to Down
2017.06.29-05:14:52 <99.99.99.8>: <188>Jun 29 05:14:52 2017 WO %%10LSPM/4/TRAP(t): -DevIP=99.99.99.8; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.86.49 Changes to Up
2017.06.29-05:14:53 <99.99.99.2>: <188>Jun 29 05:14:53 2017 CR %%10LSPM/4/TRAP(t): -DevIP=99.99.99.2; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.82.60 Changes to Down
2017.06.29-05:14:53 <99.99.99.2>: <188>Jun 29 05:14:53 2017 CR %%10LSPM/4/TRAP(t): -DevIP=99.99.99.2; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.82.61 Changes to Up
2017.06.29-05:15:01 <99.99.99.43>: <188>Jun 29 05:15:01 2017 CF %%10LSPM/4/TRAP(t): -DevIP=99.99.99.43; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.66 Changes to Down
2017.06.29-05:15:01 <99.99.99.43>: <188>Jun 29 05:15:01 2017 CF %%10LSPM/4/TRAP(t): -DevIP=99.99.99.43; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.85.67 Changes to Up
2017.06.29-05:15:03 <99.99.99.44>: <188>Jun 29 04:15:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.61 Changes to Down
2017.06.29-05:15:03 <99.99.99.44>: <188>Jun 29 04:15:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.85.62 Changes to Up
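
To check how regular the flaps are, the traps can be parsed and the spacing between consecutive Down events computed per device. The following is only a minimal sketch: the function name `down_intervals` and the regular expression are illustrative assumptions built around the line format shown above, and it uses the collector timestamp at the start of each line (the device clocks are not all in the same hour). The OID suffixes .0.1 and .0.2 line up with the Up and Down messages in these logs.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the syslog lines shown above: collector timestamp, source IP,
# then the LSPM trap text ending in "Changes to Up" or "Changes to Down".
TRAP_RE = re.compile(
    r"^(?P<ts>\d{4}\.\d{2}\.\d{2}-\d{2}:\d{2}:\d{2}) <(?P<dev>[\d.]+)>:"
    r".*LSPM/4/TRAP\(t\):.* LSP (?P<lsp>[\d.]+) Changes to (?P<state>Up|Down)$"
)

def down_intervals(lines):
    """Return {device_ip: [seconds between consecutive Down traps]}."""
    last_down = {}
    intervals = defaultdict(list)
    for line in lines:
        m = TRAP_RE.match(line.strip())
        if not m or m["state"] != "Down":
            continue  # skip Up traps and anything that is not an LSPM trap
        ts = datetime.strptime(m["ts"], "%Y.%m.%d-%H:%M:%S")
        dev = m["dev"]
        if dev in last_down:
            intervals[dev].append((ts - last_down[dev]).total_seconds())
        last_down[dev] = ts
    return dict(intervals)

# Four of the lines from device 99.99.99.44, copied from the log above.
sample = [
    "2017.06.29-05:14:03 <99.99.99.44>: <188>Jun 29 04:14:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.59 Changes to Down",
    "2017.06.29-05:14:03 <99.99.99.44>: <188>Jun 29 04:14:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.85.60 Changes to Up",
    "2017.06.29-05:14:33 <99.99.99.44>: <188>Jun 29 04:14:33 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.60 Changes to Down",
    "2017.06.29-05:15:03 <99.99.99.44>: <188>Jun 29 04:15:03 2017 DP %%10LSPM/4/TRAP(t): -DevIP=99.99.99.44; 1.3.6.1.2.1.10.166.2.0.2 LSP 0.0.85.61 Changes to Down",
]

print(down_intervals(sample))  # {'99.99.99.44': [30.0, 30.0]}
```

With one-second syslog timestamps this only shows the spacing to the nearest second, but it is enough to see that each device flaps on a fixed 30-second cycle.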

'display logbuffer reverse' does not show these events.

What are these logs/traps? What do they mean?

5 REPLIES
parnassus
Honored Contributor

Re: LSP flap after A5500HI upgrade

Not an expert but try to follow this Knowledge Base article:

http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=mmr_kc-0134919

It's not specific to the HPE 5500 HI, but it may help diagnose what those LSP flap log messages mean.
itcoop
Occasional Advisor

Re: LSP flap after A5500HI upgrade

parnassus,

  Thank you for the reply.  I have used that article to troubleshoot to no avail; the hex address in DESTINATION shows as the loopback interface/lsr-id of the local router.  I'm considering going back to P25 to see if the errors go away.  This doesn't seem to be service impacting.

parnassus
Honored Contributor

Re: LSP flap after A5500HI upgrade

...or try the latest R5501P33.
itcoop
Occasional Advisor

Re: LSP flap after A5500HI upgrade

I wish I could.  We must test the release before deployment in production.  It doesn't seem to be service impacting; however, it is filling up our logs with these notifications every 30 seconds - annoying.  This issue didn't show itself in testing and we cannot replicate this condition in the lab.

Thanks!

itcoop
Occasional Advisor

Re: LSP flap after A5500HI upgrade

After two months of running P32 with this issue, we upgraded to P33 last week. The log messages reporting LSP flaps were exacerbated after upgrading to P33.  Two tickets were opened with HPE to address the issue.  Here are the details:

  • No service symptoms accompany the LSP flapping reported in the logs; transit traffic passes with no upper-layer application issues or events.
  • L3 HP TAC was adamant that there was instability in the IGP (OSPF in this case).
  • After two weeks of trying to disprove this bias, I made it clear that the LSP flaps in the logs occur every ~30.23s with no correlating OSPF convergence event.
  • The case was forwarded to L4 HP TAC, who reaffirmed L3’s bias, to wit: this trap is generated under the following conditions: (1) there was a topology change, or (2) there was a route change. “[U]ltimately the trap messages are a result from instability external to the switch.”
  • I continued my search for route transitions by repeating a tracert from a remote end of the network to a device in my local campus every 2 seconds; I found no route change.  However, I did record a single packet timeout in my tracert which corresponded to the timing of LSP flapping event traps.
  • L3 HP TAC, still biased toward IGP instability as the root cause, suggested the following method to determine the IP address from the LSP trap:

Jun 29 02:04:45:655 2017 CR LSPM/4/TRAP: 1.3.6.1.2.1.10.166.2.0.1 LSP 0.0.80.1 Changes to Up

  • Convert the LSP value 0.0.80.1 in the log from quad-dotted notation (QDN) to decimal (80 × 256 + 1 = 20481 in this example)
  • Show the mpls table
    1. “Display mpls lsp verbose | begin 20481”
  • Cross reference the interface with LspIndex 20481
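
The QDN-to-decimal step above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the helper name `lsp_index` is made up here and is not part of any HPE tool.

```python
def lsp_index(dotted: str) -> int:
    """Convert a quad-dotted LSP value such as '0.0.80.1' to its decimal index."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a quad-dotted value: {dotted!r}")
    value = 0
    for o in octets:
        # Shift the running value one byte left and append the next octet.
        value = (value << 8) | o
    return value

print(lsp_index("0.0.80.1"))  # 20481
```

The result is the number to search for in the "display mpls lsp verbose" output.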

Every LSP trap in my logs originates from the Loopback interface:

  • No evidence (anywhere) of OSPF route transitions coinciding with the LSP flap
  • The lsr-id is the configured loopback address of the router
  • The ospf router-id is also the configured loopback address of the router
  • The flap is originating from the loopback

What would cause a Loopback interface to cause an LSP flap?

From 15 hops away, at a remote end of my network, I ran a "ping -r" using the loopback address of the router as the source; the recorded route showed one hop.  Using the IP address of a physical interface as the source, it showed 15 hops, as expected.

In Summary:

Loopback-sourced traffic is being tunneled via MPLS on the 5500s in my environment. This would be the cause of the 30-second LSP timeout and reestablishment: the 5500 prefers the direct MPLS route over the IGP (OSPF) route, so the LSP times out, reestablishes, times out, reestablishes, ...