Switches, Hubs, and Modems

HP3400 not really manageable?!

I've been a big fan of HP's ProCurve switches for years, but now I'm starting to have doubts. Previously I only used them for L2 switching, but now I've started to use them for routing as well. I started with 2626 switches but learned the hard way that their host route tables are far too small for any serious routing.

HP then recommended their 3400, new at the time, instead. The 3400 solved all my performance issues and handles the routing of my network perfectly, with low latency and practically no CPU utilization, which suggests to me that the routing is done by the ASIC, as it should be.

BUT... For some reason still unexplained to me, the switch stops responding to management traffic (SNMP and pings) under load. First it starts dropping management packets sporadically, and if the load increases it stops responding completely! This is completely unacceptable to me, as it makes it impossible to monitor the switch's health and port utilization, resulting in false "device failure" alarms every night.

Now, if I were routing gigabits per second through this switch and maxing out its capacity, I would understand that it must prioritize switching and routing over management and thus start dropping management traffic, but my network only handles about 50 Mbps!! It's NOTHING, and the switch still can't answer one silly ping packet per second!

I opened a support case with HP about this over half a year ago and have provided them with all kinds of logs. I've set up test environments that show the same results with really easy-to-reproduce setups, but the only answer I get from them is: "It works as it's designed to do! The switch's priority 1 is to switch and route; management is a low-priority task".

I fully agree that's how the prioritization should be done, but what I can't understand is how a switch/router designed to cope with gigabit loads, running with practically no CPU utilization, can be so "overloaded" by routing 25-50 Mbps that it has to start dropping management packets! That seems like a design error to me, one that must be possible to fix without affecting switching/routing performance.

12 REPLIES
Anders_35
Regular Advisor

Re: HP3400 not really manageable?!

This has been a real headache for us, too.
Our 34xx and 6400 switches just go "black hole" on us with even very small loads.

For a while, we used ping tests against the switches' IP interfaces to check whether our mesh links were up when making changes to the spanning tree.
It turned out that whenever the spanning tree changed, that was enough load for all the switches to stop responding for a few seconds. For some time we thought our switches were totally useless, until we found out about the low priority given to the IP interface.

There is no documentation on it, and HP is telling us the same thing: "it's working as designed".


*sigh*
No, this is not acceptable. I can't live without logging, status, or any kind of control whenever my switches are a little busy. The auditors will have a field day with this.

Re: HP3400 not really manageable?!

Nice to hear that I'm not alone thinking this is completely unacceptable!

I just spoke with an HP support tech; he is expecting an explanation from the lab within a day or two. It had better be a good one this time :-) I'll let you all know when I get an answer.

//Mathias
Les Ligetfalvy
Esteemed Contributor

Re: HP3400 not really manageable?!

I never got very far when I complained about ping losses, only that they are normal. I had to lower the threshold in WhatsUp to stop the false alarms.
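For anyone whose monitoring tool has no such setting, the same effect can be approximated by alarming only after several consecutive missed polls. The sketch below is a generic illustration, not how WhatsUp works internally; the class name `ConsecutiveFailureAlarm` and the threshold of 3 are assumptions chosen for the example.

```python
class ConsecutiveFailureAlarm:
    """Raise an alarm only after `threshold` polls in a row go unanswered.

    Hypothetical helper for illustration; not part of any monitoring product.
    """

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.failures = 0  # current run of consecutive missed polls

    def record(self, reply_received: bool) -> bool:
        """Record one poll result and return True if the alarm should fire."""
        if reply_received:
            self.failures = 0  # any reply resets the run
        else:
            self.failures += 1
        return self.failures >= self.threshold


if __name__ == "__main__":
    alarm = ConsecutiveFailureAlarm(threshold=3)
    # True = ping answered, False = ping lost
    polls = [False, False, True, False, False, False]
    print([alarm.record(ok) for ok in polls])
```

This hides sporadic losses, but it also delays detection of a real outage by `threshold` polling intervals, so it treats the symptom rather than the cause.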

I have also been fighting an issue where PCM+ cannot sustain Top N, leaving my graphs with more notches than data. It too is going nowhere.

I also trend with MRTG, and it reports that CPU never exceeds 14%.
Regnar Bang Lyngsø_2
Frequent Advisor

Re: HP3400 not really manageable?!

Re: HP3400 not really manageable?!

Regnar,



Les,

I could probably get away with raising the thresholds in my monitoring software for now, but I'm thinking about what will happen in maybe a year, when the load I'm seeing at peak hour right now will be the normal load all day and the peaks will be double or triple what I'm seeing now. According to my simple UDP stream tests, that will put the switch completely out of management!

So should I then have to call my customers and ask them to please hold their on-line backups for half an hour so I can manage my router?!

I still wonder: is this a design error that will be fixed, or is the 3400cl simply not intended for any serious routing?

What about the 5300? Does anyone know whether it suffers from the same problem?
Sergej Gurenko
Trusted Contributor

Re: HP3400 not really manageable?!

The 5300 introduced a virus throttling feature. It actually protects you from overflows of the TCP/UDP session (host) table. Customers reported that the 5300 stops routing altogether during massive nmap scans. With virus throttling enabled, no such problems appear.

Re: HP3400 not really manageable?!

Sergej,

Again, that seems to do the reverse of what I want. If I understood you correctly, the "virus throttling" feature throttles management traffic (traffic destined for the switch/router itself) to ensure normal switching/routing is not affected. Or do you mean that this feature also throttles normal routing/switching to keep resources available for management traffic?
Sergej Gurenko
Trusted Contributor

Re: HP3400 not really manageable?!

This feature throttles normal routing to keep resources available for management traffic and routing. That is the downside of limiting each routed IP address to initiating N sessions per second. These limits are configurable, and the limiting is done in hardware (AFAIK).
More info about Connection Rate Filtering (aka Virus Throttling) here:
http://www.hp.com/rnd/pdfs/virus_throttling_tech_brief.pdf
http://www.hp.com/rnd/training/technical/5300xl_E.09.xx.htm
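
To make the "N sessions per second per IP" idea concrete, here is a rough software sketch of a per-source sliding-window limiter. It only illustrates the general technique; it is not HP's algorithm, which per the post above runs in hardware and is described in the linked documents. The class name, the one-second window, and the limit of 3 are all assumptions made for the example.

```python
from collections import defaultdict, deque


class ConnectionRateLimiter:
    """Allow each source IP at most `max_new` new sessions per time window.

    Illustrative only: a simple sliding-window counter, not the switch's
    actual Connection Rate Filtering implementation.
    """

    def __init__(self, max_new: int, window_s: float = 1.0) -> None:
        self.max_new = max_new
        self.window_s = window_s
        # source IP -> timestamps of recently allowed session starts
        self._recent = defaultdict(deque)

    def allow(self, src_ip: str, now: float) -> bool:
        """Return True if `src_ip` may open a new session at time `now`."""
        recent = self._recent[src_ip]
        # Forget session starts that have aged out of the window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_new:
            return False  # over the limit: throttle this source
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = ConnectionRateLimiter(max_new=3)
    for t in (0.0, 0.1, 0.2, 0.3, 1.05):
        print(t, limiter.allow("10.0.0.5", now=t))
```

Note that only the offending source is throttled; other sources keep their own independent budget.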

Re: HP3400 not really manageable?!

Thanks for the info. That seems to be a good feature but not really what I'm looking for.

As for the details of my lab stress test: in short, I used a Linux machine with the pktgen kernel module to generate a UDP stream. All I needed to generate was a stream of 2000 pps of 1400-byte packets (about 21 Mbps). That's nothing for a switch/router like the 3400cl, right? There is no chance in the world of filling up any host route or ARP table, as these were the only machines involved in the lab test, and still the switch starts dropping management traffic.
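
For anyone who wants to try a similar load without pktgen, here is a rough user-space sketch in Python. It is not the pktgen setup used in the test, and Python pacing is much less precise than a kernel generator. The function names, the 42-byte header overhead (14 Ethernet + 20 IPv4 + 8 UDP), and the target address in the usage note are assumptions for illustration. The rate helper shows that 2000 pps of 1400 bytes comes to about 22.4 Mbit/s in decimal units (about 21.4 in binary megabits), or about 23 Mbit/s if the 1400 bytes are payload only and headers are added.

```python
import socket
import time


def stream_rate_mbps(pps: int, packet_bytes: int, overhead_bytes: int = 0) -> float:
    """Bit rate in decimal Mbit/s for `pps` packets of the given size."""
    return pps * (packet_bytes + overhead_bytes) * 8 / 1e6


def send_udp_stream(dest, pps: int, payload_bytes: int, duration_s: float) -> int:
    """Send UDP datagrams to `dest` at roughly `pps` for `duration_s` seconds.

    Returns the number of datagrams sent. Pacing is best effort only.
    """
    payload = bytes(payload_bytes)  # zero-filled payload
    interval = 1.0 / pps
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.monotonic()
        next_send = start
        while True:
            now = time.monotonic()
            if now - start >= duration_s:
                break
            if now >= next_send:
                sock.sendto(payload, dest)
                sent += 1
                next_send += interval
            else:
                # Sleep briefly rather than spinning at full CPU.
                time.sleep(min(next_send - now, 0.001))
    return sent


if __name__ == "__main__":
    print(f"{stream_rate_mbps(2000, 1400):.1f} Mbit/s without headers")
    print(f"{stream_rate_mbps(2000, 1400, overhead_bytes=42):.1f} Mbit/s with headers")
```

A call such as `send_udp_stream(("192.0.2.10", 9), pps=2000, payload_bytes=1400, duration_s=60)` would then push the stream through the router while you ping its management address from another host; the address here is a documentation placeholder, so substitute a machine on the far side of your own lab setup.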

There should be no need to do any throttling at this really low load, should there??

And what about buffering? Is there no input buffering queue in this kind of device? If the switch is busy with a high-priority task at the very moment I send an ICMP echo request or SNMP request to it, I'd expect the request to end up in a buffer so it can be processed later. If the CPU were running at 100%, I'd expect the buffer to fill up fast, and thus prioritization and packet dropping would be the result. But in this case the switch's CPU utilization is never above 7%, so how come the switch doesn't have time to process this input queue??

If there is no such input buffer, why not implement one?
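
The expectation above can be put into a toy model: a CPU with a fixed work budget per tick, forwarding work served first with strict priority, and management packets waiting in a small bounded queue for whatever capacity is left over. This is purely a thought experiment with made-up numbers and names; nothing here is known about how the 3400cl actually schedules its CPU.

```python
from collections import deque


def simulate(ticks: int, capacity: int, forwarding_load: int,
             mgmt_every: int, queue_limit: int) -> tuple:
    """Return (answered, dropped) management packets in a toy priority model.

    capacity        -- work units the CPU can process per tick
    forwarding_load -- high-priority work units arriving per tick
    mgmt_every      -- one management packet arrives every this many ticks
    queue_limit     -- size of the bounded management input queue
    """
    queue = deque()
    answered = dropped = 0
    for tick in range(ticks):
        if tick % mgmt_every == 0:
            if len(queue) < queue_limit:
                queue.append(tick)  # buffered until the CPU has spare time
            else:
                dropped += 1        # buffer full: tail drop
        # Strict priority: forwarding is served first, management gets the rest.
        spare = max(0, capacity - forwarding_load)
        while spare > 0 and queue:
            queue.popleft()
            answered += 1
            spare -= 1
    return answered, dropped


if __name__ == "__main__":
    # 7% busy: plenty of spare capacity for management packets.
    print(simulate(1000, capacity=100, forwarding_load=7,
                   mgmt_every=10, queue_limit=8))
    # 100% busy: the queue fills and everything after that is dropped.
    print(simulate(1000, capacity=100, forwarding_load=100,
                   mgmt_every=10, queue_limit=8))
```

In this model drops only appear once the high-priority load leaves no spare capacity at all, which is why drops at single-digit CPU utilization look so odd.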
Sergej Gurenko
Trusted Contributor

Re: HP3400 not really manageable?!

According to the datasheets, the new series of switches (built on ProVision ASICs) have:

Switch CPU protection: provides automatic protection against malicious network traffic trying to shut down the switch

Re: HP3400 not really manageable?!

And? Are you saying that that might be the cause of my problems?

If so, how do I tell if it is that "feature" that is bugging me? There must be some kind of log or debugging command that can tell me if it's so, right?

I'm so tired of not being in control of my equipment! When I used Linux machines to do my routing, everything worked perfectly, and if there ever was a problem I could debug it as deeply as necessary. There must be some kind of debugging interface in these devices that I can use to profile the CPU utilization or monitor buffer levels, to help me understand what the switch is so busy with that it can't answer my management traffic.
Preston Gallwas
Valued Contributor

Re: HP3400 not really manageable?!

Well, if you're looking for alternatives, don't go with Allied Telesyn 9816GB or GFs... We decided to use a few of those to save on cost, and our video editing / photo editing labs now completely bypass that architecture because, for whatever reason, a 19 meg file takes 4.5 minutes to load when it goes through that... grr