- Re: HP 5412 10GbE Module Issues / Troubleshooting ...
10-20-2016 01:43 PM
We recently upgraded the firmware on our 5412Rzl2 (J9851A) as such: KB_15_17_0007 > KB_16_02_0013. All operations are normal except some issues with our single 8-Port 10-GbE SFP+ v2 zl module (J9538A) as follows:
- Occasional 1-2 second disconnect/reconnects in the switch log from the 10GbE ports
- DRASTIC, but occasional, slowdown of speeds, which seem to self-correct eventually
- High numbers of Discard Rxs
- High numbers of Drop Txs (but not nearly as high as Discard Rxs)
We are looking for any additional troubleshooting commands or techniques (other than show log and show interface <PORT>) which might yield insight into the issue.
Right now we're unsure if the issue is:
- The new firmware (both primary and secondary were updated but we can still rollback one of them)
- Failing hardware (cables, module, ports, NICs)
- Drivers (NICs)
- Hosts
More Info:
- All ports in this module are in an unroutable VLAN so there shouldn't be any communication in/out
- Hosts have Intel X710s (rev 01) NICs (recent but not the latest firmware)
- I cannot verify whether these ports had abnormal Discard or Drop counts before the firmware upgrade, so there is no baseline to compare against
Please let me know if I can provide any other details. Thank you!
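Since there is no pre-upgrade baseline, one option is to read the counters twice and compare how fast they grow rather than their absolute totals. Below is a minimal Python sketch, assuming the Discard Rx and Drop Tx values are copied by hand from two runs of show interface <PORT>; the class name, field names and numbers are illustrative assumptions, not real switch output:

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One manual reading of a port's counters (field names are illustrative)."""
    timestamp_s: float  # seconds since an arbitrary reference point
    discard_rx: int
    drop_tx: int


def rates_per_second(first: CounterSample, second: CounterSample) -> dict:
    """Return how fast each counter grew between two readings, per second."""
    elapsed = second.timestamp_s - first.timestamp_s
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    return {
        "discard_rx": (second.discard_rx - first.discard_rx) / elapsed,
        "drop_tx": (second.drop_tx - first.drop_tx) / elapsed,
    }


if __name__ == "__main__":
    # Hypothetical readings taken 60 seconds apart.
    before = CounterSample(timestamp_s=0, discard_rx=1000, drop_tx=200)
    after = CounterSample(timestamp_s=60, discard_rx=1600, drop_tx=230)
    print(rates_per_second(before, after))
```

The sketch does not handle counter wrap or counters cleared between readings, so both samples should come from the same uptime period.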
10-20-2016 02:52 PM - edited 10-20-2016 03:00 PM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
I'd suggest logging a case and getting them to escalate to level 2/3; you may find out they already know about this and a fix is slated for release. I've seen a few problems on Kx.16, and have already escalated one issue with a PoE software bug, which turned out to be a known bug with an unknown fix release date.
I wouldn't use Kx.16 in production yet; maybe test it in a lab for now until those early-release-cycle problems are resolved.
A useful troubleshooting tool is "show tech"; also have a look at "show instrumentation".
switch# show instrumentation ?
- cam Show internal version-dependent counters for debugging.
- monitor Show latest values for monitored parameters.
- port Show internal version-dependent counters for debugging the specified port.
- resptime Show service response time data for performance sensitive operations registered for response time measurement.
- routing Show routing related instrumentation parameters.
- vlan Show internal version-dependent counters for debugging the specified VLAN.
There is also a debug mode I've used in the past, that goes really deep into the "tech support" areas, but it gets quite complicated and probably is deeper than most customers would want to go.
http://networkgeekstuff.com/networking/procurve-and-hidden-command-line/
Search for term: edomtset
10-21-2016 02:07 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
I'm curious to know if (and how) those eight 10GbE ports of the HP 8-port 10-GbE SFP+ v2 zl Module are all used concurrently; port oversubscription may or may not play a role in the issue. First of all, start by collecting the status of each transceiver used on those ports. What does the command
show interfaces transceiver n detail (where n is the port number)
report?
Supposing that nothing but the firmware changed, the new firmware is the first culprit one would think of. But to diagnose that without being biased by the idea of "bad new firmware versus good old firmware" (that is, without overlooking other possible sources of the issue), you should be sure that nothing else had changed in your environment before you did that firmware upgrade.
I'm not an HPE Employee

10-21-2016 05:38 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Thanks for the replies! I dug through those commands and didn't find anything particularly useful, although perhaps much of it is past my understanding. I did check the transceiver statuses but not sure what to look for.
The firmware was the only thing which changed, unless you count the loss of network connectivity for the devices. The devices in question are some VMware hosts and a few NAS devices. We're going to try a full reboot of everything once we can afford downtime.
Can anyone explain the differences between firmware versions? (Major vs minor vs incremental) <MAJOR>.<MINOR>.<INCREMENTAL>
Perhaps an explanation of the three parts of the version number will help me decide which firmware I should choose when upgrading...
10-21-2016 05:47 AM - edited 10-21-2016 05:54 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Mmm...the best document I've read is the HP ProVision Software Release Process (2015): it should explain exactly what you're looking for...
Would you mind posting the result of the command above, run against your various 10Gb transceiver interfaces (first trim all serial numbers and other sensitive information about your products/configurations)?
I'm not an HPE Employee

10-21-2016 05:53 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Excellent PDF, thank you! The output of the show interface transceiver n detail command is identical for all 8 ports, except for the incrementing Interface Index and the Serial Number:
Transceiver in L1
Interface Index : 353 (varies)
Type : SFP+DA7
Model : J9285B
Connector Type : Vendor specific
Wavelength : n/a
Transfer Distance : 7m (copper),
Diagnostic Support : None
Serial Number : <VARIES>
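To double-check that nothing but the Interface Index and Serial Number differs between ports, the pasted output can be compared mechanically. A rough Python sketch, assuming each port's output is saved as text in the "Key : Value" layout shown above; the function names are made up for illustration:

```python
def parse_transceiver(text: str) -> dict:
    """Parse 'Key : Value' lines into a dict, skipping lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def differing_fields(ports: dict, ignore=("Interface Index", "Serial Number")) -> dict:
    """Return {field: {port: value}} for fields that are not identical on all ports."""
    diffs = {}
    keys = set().union(*(fields.keys() for fields in ports.values()))
    for key in sorted(keys - set(ignore)):
        values = {port: fields.get(key) for port, fields in ports.items()}
        if len(set(values.values())) > 1:
            diffs[key] = values
    return diffs


if __name__ == "__main__":
    # Shortened, hypothetical copies of the output for two ports.
    l1 = parse_transceiver("Interface Index : 353\nType : SFP+DA7\nModel : J9285B\n")
    l2 = parse_transceiver("Interface Index : 354\nType : SFP+DA7\nModel : J9285B\n")
    print(differing_fields({"L1": l1, "L2": l2}))
```

An empty result means the ports report the same transceiver details apart from the ignored fields.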
10-21-2016 06:35 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Are those DAC Cables installed correctly (respecting the minimum bend radius, not below 1")?
I'm not an HPE Employee

10-21-2016 07:47 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
@parnassus wrote: Are those DAC Cables installed correctly (respecting the minimum bend radius, not below 1")?
It appears that they are all installed with at least this minimum.
10-21-2016 11:37 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
We are in the process of root causing an issue for that specific module (J9538A) on ports 4, 5, and 6. In the meantime there is a configuration option you can disable that should alleviate the symptoms:
HP-Switch-5406Rzl2(config)# no tcp-push-preserve
There is a low-level issue causing head-of-line blocking on those ports in the presence of large amounts of TCP traffic with the push bit set.
HP-Switch-5406Rzl2(config)# tcp-push-preserve help
Usage: [no] tcp-push-preserve
Description: Enable TCP Push Preserve mode. This mode determines the
flow of the TCP packets that have the PUSH flag set. When
this mode is enabled and the egress queue is full, TCP
packets with the PUSH flag set are queued at the head of the
ingress queue for egress queue space. This might delay
subsequent incoming packets in the same queue. When this
mode is disabled and the egress queue is full, TCP packets
with the PUSH flag set are dropped from the head of the
ingress queue.
By default, this mode is enabled. Disable this mode when a
large number of TCP packets with the PUSH flag are being
dropped due to congestion.
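The trade-off described in that help text can be illustrated with a toy model. The Python sketch below is only an assumption-laden simulation (one ingress queue, one packet handled per tick, an egress queue that stays full for a fixed number of ticks, non-PUSH packets always dropped while egress is full); it is not how the switch hardware is actually implemented:

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int
    push: bool  # True if the TCP PUSH flag is set


def simulate(packets, egress_full_ticks: int, preserve: bool):
    """Return (forwarded, dropped); forwarded holds (tick, seq) pairs."""
    ingress = deque(packets)
    forwarded, dropped = [], []
    tick = 0
    while ingress:
        egress_full = tick < egress_full_ticks
        head = ingress[0]
        if not egress_full:
            forwarded.append((tick, ingress.popleft().seq))
        elif head.push and preserve:
            pass  # PUSH packet waits at the head and blocks everything behind it
        else:
            dropped.append(ingress.popleft().seq)
        tick += 1
    return forwarded, dropped


if __name__ == "__main__":
    burst = [Packet(0, True)] + [Packet(i, False) for i in range(1, 5)]
    print("preserve on: ", simulate(burst, egress_full_ticks=2, preserve=True))
    print("preserve off:", simulate(burst, egress_full_ticks=2, preserve=False))
```

In this toy model, with preserve=True nothing is lost but every packet behind the PUSH packet is delayed until the egress queue frees up; with preserve=False the head packets are dropped while the egress queue is full and the later packets go out without extra delay.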
10-21-2016 12:56 PM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Interesting, Michael. I think we're going to boot a dormant host on that module with a live DVD, mirror an active port to it, and use wireshark to investigate the actual traffic. We'll hopefully be able to see what is being dropped or discarded, as well as if any of the TCP packets are indeed using the PSH flag or not.
Our issues also seem low level, and we'll likely end up rolling back firmware: first to the previous version, and then forward to some newer ones (but not the absolute latest 16.02.xxxx).
I'll update here once we find more results.
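For the capture, Wireshark's display filter tcp.flags.push == 1 shows only segments with PSH set. If the headers are exported instead, the share of PSH segments can be computed directly. A small Python sketch, assuming raw TCP headers (at least 20 bytes each) have already been extracted from the capture; the helper names and sample data are hypothetical:

```python
import struct

TCP_FLAG_PSH = 0x08  # PSH bit within the TCP flags byte (header offset 13)


def has_push(tcp_header: bytes) -> bool:
    """Return True if the PSH flag is set in a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    return bool(tcp_header[13] & TCP_FLAG_PSH)


def push_fraction(headers) -> float:
    """Return the fraction of headers that carry the PSH flag."""
    headers = list(headers)
    if not headers:
        return 0.0
    return sum(has_push(h) for h in headers) / len(headers)


def make_header(flags: int) -> bytes:
    """Build a minimal 20-byte TCP header for testing (all other fields are dummies)."""
    return struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, flags, 65535, 0, 0)


if __name__ == "__main__":
    # One PSH+ACK segment (0x18) and three plain ACKs (0x10).
    sample = [make_header(0x18)] + [make_header(0x10)] * 3
    print(push_fraction(sample))
```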
10-21-2016 01:08 PM - edited 10-21-2016 01:09 PM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Interesting...that global setting was introduced, and enabled by default, in x.15.14.0007 (here); I recall it was already cited in this post. But the OP's switch was running KB.15.17 before the upgrade, so it was already on a post-KB.15.14 software version. At this point: is it possible that the setting was already in effect (because it has been enabled by default in every version since KB.15.14.0007) but didn't produce any negative effects on his network until the upgrade to KB.16.02? And why didn't the negative effects show up before, if no other changes (steadily growing traffic?) were introduced?
I'm not an HPE Employee

10-21-2016 02:34 PM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
We're still in the process of root causing it so I don't want to speculate here, but that "feature" was introduced with v1 modules many years ago. The CLI command to enable/disable was added a few years back to address a particular issue at the time.
We believe there was a recent change specific to the J9538A modules that made them more susceptible to HOL blocking in some scenarios, which can cause latency and other performance issues. Disabling tcp-push-preserve is a workaround.
I will post back when I have more information.
10-24-2016 07:31 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Is it possible that the no tcp-push-preserve will negatively impact other devices or throughput speed? As it is a global setting I am hesitant to try it during production hours. Honestly not sure what other devices might be using that setting. We did find that about 25% of the TCP packets had the push flag set, for what it's worth.
We're planning to try a reboot sequence of the hosts and NAS devices, then those with the switch. If the reboots do not alleviate the issue, we'll proceed with firmware rollback, as follows: 15.17.0007 (last known working). If that works, we'll try the versions until we hit an issue: 15.17.0013 > 15.18.0013 > 16.01.0010 > 16.02.0013 (current)
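The rollback plan above amounts to walking an ordered list of releases until the first one that misbehaves. A minimal Python sketch of that walk, assuming version strings in the <MAJOR>.<MINOR>.<INCREMENTAL> form used in this thread; is_bad is a hypothetical stand-in for "flash, reboot and watch the counters", and the outcome in the example is invented:

```python
def parse_version(text: str) -> tuple:
    """Turn '16.02.0013', 'KB.16.02.0013' or 'KB_16_02_0013' into (16, 2, 13)."""
    parts = text.replace("_", ".").split(".")
    numeric = [p for p in parts if p.isdigit()]
    if len(numeric) != 3:
        raise ValueError(f"expected three numeric parts in {text!r}")
    return tuple(int(p) for p in numeric)


def first_bad(versions, is_bad):
    """Try versions oldest to newest; return the first one reported bad, or None."""
    for version in sorted(versions, key=parse_version):
        if is_bad(version):
            return version
    return None


if __name__ == "__main__":
    candidates = ["16.02.0013", "15.17.0013", "16.01.0010", "15.18.0013"]
    # Hypothetical test results; in practice each entry means a flash and a soak test.
    observed_bad = {"16.02.0013"}
    print(first_bad(candidates, lambda v: v in observed_bad))
```

With many candidate releases a binary search over the sorted list would need fewer reboots, at the cost of assuming the problem never disappears again in a later release.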
10-24-2016 11:39 AM
Solution
We believe this issue, specific to the J9538A module ports 4, 5, and 6, was introduced in K/KB.16.01.0008 and K/KB.16.02.0011.
Disabling that feature should not negatively impact other devices or throughput. Disabling it basically configures the switch to drop TCP push traffic as it would any other packet when the egress queue fills up. The egress queue filling up incorrectly is the issue and "no tcp-push-preserve" helps in that condition.
10-24-2016 12:26 PM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
Thank you for the information! We'll try releases before those, and maybe with and after those. Actually, we're seeing issues on ports besides 4, 5, and 6 (for example, port 2). Switching from port 2 to port 8 seems to have helped. Additionally, we're using hardware v2 of module J9538A.
10-24-2016 03:00 PM - edited 10-26-2016 12:45 PM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
That's interesting.
AFAIK the 8-port 10-GbE v2 zl Module (J9538A) has two static Channels: module ports 1, 4, 6 and 8 are grouped in Channel 1 and module ports 2, 3, 5 and 7 in Channel 2.
Each Channel provides a total aggregated bandwidth of 23.4 Gbps (so the total aggregated bandwidth of the entire module is 46.8 Gbps).
The port assignment of each Channel on the J9538A Module is fixed, so the aggregated bandwidth is shared within that specific set of physical ports (more precisely, among those ports that are in a "linked" state and therefore active). The Ports versus Channels schema is:
- Channel 1: Port 1
- Channel 1: Port 4
- Channel 1: Port 6
- Channel 1: Port 8
- Channel 2: Port 2
- Channel 2: Port 3
- Channel 2: Port 5
- Channel 2: Port 7
Basically this means that if you need full wire-rate transfer speeds (10 Gbps full duplex, so 20 Gbps) you must not connect more than one 10 GbE port per Channel (that is, no more than one of the four ports in each Channel). That's because the module is oversubscribed: it simply isn't able to sustain 8 x 10 Gbps = 80 Gbps at wire rate [*].
That said, it's important to know which ports to connect (and which to leave unconnected) so that the connected ports can reach wire speed.
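To check a planned cabling against this scheme, the per-channel load can be worked out quickly. A Python sketch, taking the port-to-channel mapping and the 23.4 Gbps per-channel figure from this post as given (they are not verified here) and assuming every linked port wants the full 20 Gbps:

```python
# Mapping and bandwidth figures as stated in this post (AFAIK, not verified).
CHANNEL_PORTS = {1: (1, 4, 6, 8), 2: (2, 3, 5, 7)}
CHANNEL_GBPS = 23.4
PORT_DEMAND_GBPS = 20.0  # assumption: 10 Gbps full duplex per linked port


def channel_of(port: int) -> int:
    """Return the channel a module port belongs to."""
    for channel, ports in CHANNEL_PORTS.items():
        if port in ports:
            return channel
    raise ValueError(f"port {port} is not on this 8-port module")


def oversubscription(linked_ports) -> dict:
    """Return {channel: demanded / available bandwidth}; above 1.0 is oversubscribed."""
    demand = {channel: 0.0 for channel in CHANNEL_PORTS}
    for port in set(linked_ports):
        demand[channel_of(port)] += PORT_DEMAND_GBPS
    return {channel: gbps / CHANNEL_GBPS for channel, gbps in demand.items()}


if __name__ == "__main__":
    print(oversubscription([1, 2]))        # one port per channel
    print(oversubscription([1, 4, 6, 8]))  # all four ports of Channel 1
```

Real traffic rarely drives every port at full rate in both directions at once, so a ratio above 1.0 only marks where contention is possible, not where it will necessarily occur.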
Another thing to pay attention to is that v2 zl Modules (like the J9538A) get a maximum bandwidth of 40 Gbps per slot when the 5400R zl2 is operating in v2 Compatibility Mode. By contrast, v3 zl2 Modules (like the J9993A, successor of the J9538A) get a maximum bandwidth of 80 Gbps per slot whether the 5400R zl2 operates in v2 Compatibility Mode or in v3-only Mode (v2 zl Modules are not supported in v3-only Mode).
[*] Question: do those 80 Gbps refer to full duplex or not?
Edit: it's also interesting to read the J9538A Module related defect report ProVision CR_0000213551 (this particular issue was declared already fixed), in which HPE advised, as a workaround, to use SFP+ Transceivers instead of SFP Transceivers on ports 4, 5 or 6 or, if possible, to use the remaining ports 1, 2, 3, 7 or 8 (so it wasn't the cause of the issue we're discussing here!)...
I'm not an HPE Employee

11-07-2016 06:24 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
We rolled back from 16.02.13 to 16.02.10 and the issues with discards and drops appear resolved. We did not use the workaround command to ignore the PUSH flag.
We also rearranged our connections to more accurately adhere to the channels. I find it strange that the channels are not 1-4/5-8 OR even/odd ports.
11-07-2016 09:40 AM - edited 11-07-2016 10:29 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
That's good to know.
Yeah, the channel <--> port binding order looks strange to me too...but that's how it is; look below (from the glorious HP 5400 zl Switches Installation and Getting Started Guide, Manual Part Number: 5998-2998, June 2013):
or here (specifically sheet 9).
I'm not an HPE Employee

11-10-2016 11:09 AM - edited 11-11-2016 05:12 AM
Re: HP 5412 10GbE Module Issues / Troubleshooting Tricks
The ArubaOS-Switch KB.16.02.0014 Release Notes are worth reading, especially regarding:
- Enhancement: TCP Push Preserve mode is set to disabled by default now.
- Fix: CR_0000216989, related to the switch module: "Switch performance degrades when using ports 4, 5, or 6 on J9583A switch" (IMHO partially related to the subject of the above enhancement, TCP Push Preserve mode, given the "...may improve..." wording of the workaround).
I'm not an HPE Employee
