
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks

 
parnassus
Honored Contributor

Interesting. That global setting was introduced, and enabled by default, in x.15.14.0007 (here); I recall it was already cited in this thread. But the OP's switch started out on KB.15.17, so it was already running a post-KB.15.14 software version. So, is it possible that the setting was in effect all along (because it has been enabled by default on every version since KB.15.14.0007) but didn't produce its negative effects on his network until the last upgrade to KB.16.02 came in? And why didn't the negative effects show up before, if no other changes were introduced (did traffic grow steadily)?


I'm not an HPE Employee
Michael Patmon
Trusted Contributor

We're still in the process of root causing it so I don't want to speculate here, but that "feature" was introduced with v1 modules many years ago.  The CLI command to enable/disable was added a few years back to address a particular issue at the time.  

We believe there was a recent change specific to the J9538A modules that made them more susceptible to HOL (head-of-line) blocking in some scenarios, which can cause latency and other performance issues.  Disabling tcp-push-preserve is a workaround.

I will post back when I have more information.  

jondehen
Advisor

Is it possible that "no tcp-push-preserve" will negatively impact other devices or throughput?  As it is a global setting, I am hesitant to try it during production hours.  Honestly, I'm not sure what other devices might be relying on that setting.  We did find that about 25% of the TCP packets had the push flag set, for what it's worth.
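For anyone who wants to reproduce a figure like that 25% from their own capture, here is a minimal Python sketch. It assumes the TCP flags value of each captured segment has already been extracted into a list of integers (the `sample` list below is made up for illustration). The only fixed fact it relies on is that PSH is bit 0x08 of the TCP flags field.

```python
# Bit mask of the PSH (push) flag in the TCP header flags field.
PSH = 0x08


def psh_fraction(flag_values):
    """Return the fraction of TCP segments that have the PSH bit set.

    flag_values: iterable of integer TCP flag values, one per segment
    (hypothetical input; how you export them from a capture is up to you).
    """
    flags = list(flag_values)
    if not flags:
        return 0.0
    pushed = sum(1 for value in flags if value & PSH)
    return pushed / len(flags)


# Made-up sample: ACK (0x10), PSH+ACK (0x18), ACK (0x10), SYN (0x02).
sample = [0x10, 0x18, 0x10, 0x02]
print(psh_fraction(sample))  # 0.25
```

This only counts segments; it says nothing about whether the switch treats them differently, which is what the tcp-push-preserve setting controls.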

We're planning to try a reboot sequence of the hosts and NAS devices, then those with the switch.  If the reboots do not alleviate the issue, we'll proceed with a firmware rollback, starting with 15.17.0007 (last known working).  If that works, we'll step up through the versions until we hit the issue: 15.17.0013 > 15.18.0013 > 16.01.0010 > 16.02.0013 (current)

Michael Patmon
Trusted Contributor
Solution

We believe this issue, specific to the J9538A module ports 4, 5, and 6, was introduced in K/KB.16.01.0008 and K/KB.16.02.0011.  
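As a rough aid for planning a rollback, the two build numbers above can be turned into a small check. This is only a sketch under stated assumptions: the helper names are hypothetical, the table contains nothing beyond the two builds named in this post, and it knows nothing about which later build (if any) fixes the problem, so it flags every build at or after those two within the same branch.

```python
import re

# First builds named in the post above as introducing the issue,
# keyed by (major, minor) branch. Nothing else is known to this table.
FIRST_SUSPECT_BUILD = {(16, 1): 8, (16, 2): 11}


def parse_version(text):
    """Split 'KB.16.02.0013' or '16.02.0013' into the tuple (16, 2, 13)."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)$", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_suspect(text):
    """True if the build is at or after the first suspect build of its branch."""
    major, minor, build = parse_version(text)
    first = FIRST_SUSPECT_BUILD.get((major, minor))
    return first is not None and build >= first


# Versions taken from the rollback plan earlier in the thread.
for version in ["15.17.0007", "15.17.0013", "15.18.0013", "16.01.0010", "16.02.0013"]:
    print(version, "suspect" if is_suspect(version) else "not listed as suspect")
```

Branches that are not in the table are reported as "not listed as suspect", which is not the same as known good.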

Disabling that feature should not negatively impact other devices or throughput.  Disabling it basically configures the switch to drop TCP push traffic as it would any other packet when the egress queue fills up.  The egress queue filling up incorrectly is the issue and "no tcp-push-preserve" helps in that condition.

 

jondehen
Advisor

Thank you for the information!  We'll try releases before those, and maybe those and later ones as well.  Actually, we're seeing issues on ports besides 4, 5, and 6 (for example, port 2).  Switching from port 2 to port 8 seems to have helped.  Additionally, we're using hardware v2 of module J9538A.

parnassus
Honored Contributor

That's interesting.

AFAIK the 8-port 10-GbE v2 zl Module (J9538A) has two static Channels: Channel 1 groups module ports 1, 4, 6 and 8, and Channel 2 groups module ports 2, 3, 5 and 7.

Each Channel provides a total aggregate bandwidth of 23.4 Gbps (so the total aggregate bandwidth of the entire module is 46.8 Gbps).

The port assignment on each Channel of the J9538A Module is fixed, so a Channel's aggregate bandwidth is shared within that specific set of physical ports (more precisely, among those ports that are in a "linked state", i.e. active). The Ports versus Channels schema is:

  • Channel 1: Port 1
  • Channel 1: Port 4
  • Channel 1: Port 6
  • Channel 1: Port 8
  • Channel 2: Port 2
  • Channel 2: Port 3
  • Channel 2: Port 5
  • Channel 2: Port 7

Basically this means that if you need full wire-rate transfer speeds (10 Gbps full duplex, so 20 Gbps) you must not connect more than one 10 GbE port per Channel, i.e. no more than one of the four ports in each Channel. That's because the module is oversubscribed: it simply cannot sustain 8 x 10 Gbps = 80 Gbps at wire rate [*].

That said, it's somewhat important to know which ports to connect (and which to leave unconnected) so that the connected ports can reach wire speed.
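To make the arithmetic concrete, here is a small Python sketch of the Channel layout described above. The port grouping and the 23.4 Gbps per Channel come from this post; the function names are made up, and the even split of a Channel's bandwidth among its linked ports is purely an assumption for illustration (the real scheduling inside the module may well differ).

```python
# Port-to-Channel layout and per-Channel bandwidth as described in the post above.
CHANNEL_PORTS = {1: (1, 4, 6, 8), 2: (2, 3, 5, 7)}
CHANNEL_GBPS = 23.4
# "Wire rate" as counted in the post: 10 Gbps full duplex = 20 Gbps per port.
WIRE_RATE_GBPS = 20.0


def channel_of(port):
    """Return the Channel number (1 or 2) that a module port belongs to."""
    for channel, ports in CHANNEL_PORTS.items():
        if port in ports:
            return channel
    raise ValueError(f"not a J9538A module port: {port}")


def linked_per_channel(linked_ports):
    """Count the linked ports on each Channel."""
    counts = {channel: 0 for channel in CHANNEL_PORTS}
    for port in set(linked_ports):
        counts[channel_of(port)] += 1
    return counts


def per_port_share(linked_ports):
    """Gbps available to each linked port, ASSUMING an even split per Channel."""
    counts = linked_per_channel(linked_ports)
    return {p: CHANNEL_GBPS / counts[channel_of(p)] for p in sorted(set(linked_ports))}


def oversubscribed_channels(linked_ports):
    """Channels whose linked ports could together demand more than 23.4 Gbps."""
    counts = linked_per_channel(linked_ports)
    return [c for c, n in sorted(counts.items()) if n * WIRE_RATE_GBPS > CHANNEL_GBPS]


print(oversubscribed_channels([1, 2]))  # one port per Channel -> []
print(oversubscribed_channels([1, 4]))  # both ports on Channel 1 -> [1]
print(per_port_share([1, 4]))
```

With one linked port per Channel nothing is flagged; as soon as two ports of the same Channel are linked, that Channel can no longer carry both at the full 20 Gbps.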

Another interesting thing to pay attention to is that v2 zl Modules (like the J9538A) get a maximum bandwidth of 40 Gbps per slot when the 5400R zl2 is operating in v2 Compatibility Mode. By contrast, v3 zl2 Modules (like the J9993A, successor of the J9538A) get a maximum bandwidth of 80 Gbps per slot, whether the 5400R zl2 operates in v2 Compatibility Mode or in v3-only Mode (v2 zl Modules are not supported in v3-only Mode).

[*] Question: do those 80 Gbps refer to full duplex or not?

Edit: it's also interesting to read the J9538A Module related defect report, ProVision CR_0000213551 (this particular issue was declared already fixed), in which, as a workaround (so it wasn't the cause of the issue we're discussing here!), HPE advised using SFP+ Transceivers instead of SFP Transceivers on ports 4, 5 or 6 or, if possible, using the remaining ports 1, 2, 3, 7 or 8.


I'm not an HPE Employee
jondehen
Advisor

We rolled back from 16.02.13 to 16.02.10 and the issues with discards and drops appear to be resolved.  We did not use the workaround command to ignore the PUSH flag.

We also rearranged our connections to more accurately adhere to the channels.  I find it strange that the channels are not 1-4/5-8 OR even/odd ports.

parnassus
Honored Contributor

That's good to know.

Yeah, the channel <--> port binding order looks strange to me too, but that's how it is. Look below (from the glorious HP 5400 zl Switches Installation and Getting Started Guide, Manual Part Number: 5998-2998, June 2013):

[Image: Screenshot_1.png]

or here (specifically sheet 9).

I'm not an HPE Employee
parnassus
Honored Contributor

The ArubaOS-Switch KB.16.02.0014 Release Notes are worth reading, especially regarding:

  • Enhancement: TCP Push Preserve mode is set to disabled by default now.
  • Fix: CR_0000216989 related to Switch Module, "Switch performance degrades when using ports 4, 5, or 6 on J9538A switch" (IMHO partially related to the subject of the above enhancement, TCP Push Preserve mode, judging by the "...may improve..." wording of its workaround).

I'm not an HPE Employee