03-21-2007 02:33 PM
Internal frame loss on Procurve 2848 - need debug help!!!
Facts:
- I'm losing about 1 frame in 6 somewhere inside the switch
- the traffic rates in question are well below 1GE line rate.
- dropping membership in the LAG down to 1 GE link completely clears the condition.
- no discard counters are incrementing on any interface on the switch
- replacing the 2848 with another produces the same loss
- port mirroring one of the ingress LAG interfaces to a sniffer shows all expected traffic on that link being received
- port mirroring on the egress interface shows that many frames are missing
- comparing Rx interface stats on the ingress ports and Tx interface stats on the egress port confirms the loss. E.g. Sum of Rx stats over both ingress ports is greater than what's transmitted out the egress port.
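The counter comparison in the last point can be sketched as below. This is only an illustration: the counter values are made up (not taken from the switch), and it assumes the counters are deltas sampled over the same interval and that the test traffic is the only traffic on those ports.

```python
def loss_fraction(rx_counts, tx_count):
    """Fraction of frames received on the ingress ports that never left the egress port."""
    rx_total = sum(rx_counts)
    if rx_total == 0:
        return 0.0
    return (rx_total - tx_count) / rx_total

# Hypothetical counter deltas over one measurement interval (not real switch data).
ingress_rx = [300_000, 300_000]   # Rx frames on the two ingress LAG ports
egress_tx = 500_000               # Tx frames on the egress port

frac = loss_fraction(ingress_rx, egress_tx)
print(f"internal loss: {frac:.1%}")
```

With these made-up numbers the result is roughly 1 frame in 6, which is the order of loss described above.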
I dug out the "show tech stats" command. The "Drops Tx" stat pasted below is a bit noisy, but it does increment very much in line with the discards I'm measuring.
Does anyone know what this discard stat means?
Is there anywhere else in the switch I can look for better information about the discards?
sollabswitch12# show tech stat
internalstatistics
Status and Counters - System Wide Counters
External Totals (Since boot or last clear) :
Drops Tx : 1,319,882,752
03-21-2007 04:02 PM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
The RX/TX Drops counter indicates that some ports were too busy to receive the data transmitted by the other side.
In other words, slower ports could not keep up with the packet stream coming from the other side.
One way to troubleshoot this scenario is to enable more streamlined packet buffering on the switch by issuing the "qos-passthrough-mode" command at the config level.
More information in the following link: http://www.hp.com/rnd/library/troubleshoot_lan.htm
Good Luck !!!
03-21-2007 04:05 PM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
Are you using any 100Mbit devices in this test? If so, enable 'qos-passthrough-mode' (in fact I'd try that anyway).
At what rate are you trying to send this data through and what's your average packet size?
Also what firmware version are you using?
03-22-2007 10:30 AM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
Additional info based primarily on your questions:
- I have set "qos-passthrough-mode" to each of "typical", "balanced", "one-port" and "optimized". The "balanced" setting results in a 4% reduction in drops, but still 1 in 6 or 1 in 7 messages don't get through. No appreciable difference with any of the other settings.
- The ports in use were somewhat scattered around the switch. I'm now using ports 39, 40 and 44. No difference there.
- All links and peer devices are running at 1000Mbps auto.
- Aggregate steady-state traffic rate is just below 200Mbps.
- The data consists of multiple TCP flows. In the direction the loss is occurring in, these consist mainly of ~1400 byte, ~900 byte or ~300 byte frames at layer 2. The reverse path generally has only min-sized TCP ACKs.
- The switch is running this:
Image stamp: /sw/code/build/mako(ts_08_5)
May 5 2006 09:47:52
I.08.98
189
I do not understand the internal architecture of this switch. I find it very curious that none of the port counters are indicating any discards. Is there another command that can tell me in more detail what the global "Drops Tx" counter in 'show tech stat' is trying to say?
Thanks again for your input so far.
03-22-2007 06:36 PM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
03-23-2007 01:07 AM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
The traffic being LAGged into the ProCurve is being balanced using a hashing algorithm considering MAC DA/SA, IP DA/SA and TCP/UDP src/dest port. I've validated that the balancing is working properly. That is, individual TCP flows are fixed to the same physical interface (validated via port mirror and packet capture). My original suspicion was that the load balancing wasn't working properly resulting in TCP re-ordering and the resultant retransmissions and thus poor end/end application behaviour. This investigation did start out with reports of an application layer problem.
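As a rough illustration of why hash-based balancing keeps each TCP flow on one physical link: every frame of a flow carries the same header fields, so it hashes to the same LAG member. The sketch below uses CRC32 purely as a stand-in; the actual hash used by the sending device is not known here, and the addresses and ports are made up.

```python
import zlib

def pick_link(flow, n_links):
    """Map a flow tuple to a LAG member index.

    CRC32 is an illustrative stand-in, not the hash of any real device.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_links

# Hypothetical flows: (src MAC, dst MAC, src IP, dst IP, src port, dst port)
flows = [
    ("00:11:22:33:44:55", "66:77:88:99:aa:bb", "10.0.0.1", "10.0.1.1", 40001, 80),
    ("00:11:22:33:44:55", "66:77:88:99:aa:bb", "10.0.0.1", "10.0.1.1", 40002, 80),
    ("00:11:22:33:44:55", "66:77:88:99:aa:bb", "10.0.0.2", "10.0.1.1", 40001, 80),
]

for flow in flows:
    # Every frame of a given flow yields the same index, so no re-ordering
    # is introduced by the LAG itself.
    print(flow[2], flow[4], "-> link", pick_link(flow, 2))
```

Checking a packet capture against this property (one flow, one link) is essentially the validation described above.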
I have a test client which I can set up to vary the frame size and rate across multiple simultaneous TCP streams. Using this I've been able to determine that the most important factor leading to the loss is frame size. Frames of 350 bytes and above incur loss; frames below 300 bytes do not. The internal loss occurs even at offered rates below 8Mbps.
My working theory is that even though the traffic is being sent at a low rate, there is some burstiness that the switch can't handle when it's passing the frames from the ports to the internal switch fabric. With a single interface, the burstiness is smoothed on the way in because frames need to be serialized across the one link. But this is just a theory, and it would surprise me if the switch couldn't accommodate little bursts like this. I would also have thought there would be port counters for such a thing.
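To make the burstiness theory concrete, here is a toy model, not a description of the 2848's internals. It assumes equal-sized frames, one time slot per frame serialization time, an egress that drains one frame per slot, and a hypothetical buffer depth of 8 frames. It does not attempt to explain the frame-size dependence.

```python
def simulate(schedules, buffer_frames):
    """Count frames dropped at a single egress queue.

    schedules: one list per ingress link, 1 = a frame arrives in that slot.
    buffer_frames: hypothetical egress buffer depth, in frames.
    """
    queue = 0
    dropped = 0
    slots = max(len(s) for s in schedules)
    for t in range(slots):
        arrivals = sum(s[t] for s in schedules if t < len(s))
        for _ in range(arrivals):
            if queue < buffer_frames:
                queue += 1
            else:
                dropped += 1
        if queue > 0:
            queue -= 1  # egress transmits one frame per slot
    return dropped

# A 20-frame back-to-back burst followed by idle time: 20% average load per link.
burst = [1] * 20 + [0] * 80

one_link = simulate([burst], buffer_frames=8)
two_links = simulate([burst, burst], buffer_frames=8)
print("drops with one ingress link:", one_link)
print("drops with two ingress links:", two_links)
```

In this model a single ingress link can never overrun the egress, because arrivals are already serialized at the egress rate, while two links bursting at the same moment can overflow a shallow buffer even though the average load is well below line rate.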
What's more bizarre to me is the fact that we're able to actually capture the frames at one port on the switch and not see them leave another port. Again, not knowing how the internals of the switch are architected, the ability to capture and mirror the 'missing' frames is pretty conclusive proof that they were received fine. No Ethernet problems like IFG violations, FCS errors etc. Also the contents of the frame from the Ethernet header through to the TCP payload all check out.
Do you or anyone else know of any more internal counters I can pull out of this switch to better diagnose what it's doing?
03-23-2007 02:38 PM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
They may be able to test the same thing with a different hardware platform, e.g., 5300, 3500, etc., and see if this still occurs. Depending on the results, they'd likely escalate it internally to understand whether this is expected behaviour and whether anything can be done to rectify it on the 2800.
Provide support with a written summary of the issue and the testing you've done, include 'show tech all' reports from the switch, and reference this thread if need be.
03-25-2007 01:04 AM
Re: Internal frame loss on Procurve 2848 - need debug help!!!
Dan.