
Procurve 4108GL occasional networkwide connection loss

Tonn_1
Occasional Visitor

Hello,

I searched the support material for quite a while but found nothing that helps with our somewhat strange problem:

The 4108GL is the heart of a network of ~120 clients. It is equipped with five 10/100 modules and runs the latest firmware.

Since the beginning of the year we have been experiencing strange network behavior that we have not been able to reproduce so far. Without any apparent cause, it starts with one or a couple of clients losing their connections to the servers / the DSL router. Once it has begun, one client after another loses its connection, sometimes a whole group nearly at once, sometimes slowly, one at a time. It's definitely not a client software problem - after restarting the switch everything works fine again.

The problem sometimes appears several times a day, sometimes only once a week. There's no logical connection to the amount of traffic - it happens when nearly all clients have been switched off and only a few are running, and it happens under heavy load.
No changes have been made that could explain the error - no new clients, no new network equipment, no changed cables or the like. It just started happening one fine day. The firmware update did not fix it.

We have run out of ideas as to what the reason could be - if anyone out there has an idea or a helpful hint about where/what to look at, we would be very happy. We have got to the point where it seems attractive to open the window and throw... But maybe there's a solution and we had better throw the admin...

Thank you for your patience, and excuse the rather pidgin-like English ;-).

Kind regards,
Lucifer
5 REPLIES
Matt Hobbs
Honored Contributor

Re: Procurve 4108GL occasional networkwide connection loss

I would speak to HP support about this one. What I would suggest right now is to capture a 'show tech all' from the switch when it is working normally, and then another 'show tech all' when the problem occurs.

Support should be able to compare the two and hopefully it will give some clues as to what may be happening.
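
If you want a first look yourself before support replies, the same comparison can be sketched in a few lines of Python. This is only a minimal sketch: it assumes you have saved the two 'show tech all' outputs as plain text, and the function names are made up for illustration. It simply lists the lines that differ between the two captures.

```python
import difflib


def diff_captures(baseline_text, incident_text):
    """Return unified-diff lines between two saved captures (no context lines)."""
    return list(difflib.unified_diff(
        baseline_text.splitlines(),
        incident_text.splitlines(),
        fromfile="baseline",
        tofile="incident",
        lineterm="",
        n=0,
    ))


def changed_lines(baseline_text, incident_text):
    """Keep only the removed ('-') and added ('+') lines of the diff."""
    lines = diff_captures(baseline_text, incident_text)
    # The first two lines are the '---' / '+++' file headers; '@@' lines
    # are hunk markers. Neither carries capture content.
    return [line for line in lines[2:] if not line.startswith("@@")]
```

Read both saved files into strings and pass them to `changed_lines`. Timestamps and uptime values will always differ, so expect some noise next to the interesting changes.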
Eirram_1
Frequent Advisor

Re: Procurve 4108GL occasional networkwide connection loss

Well, depends on your admin :)

It would be interesting to know the age of the firmware. There are some fixes in newer code that kinda hint towards what you are suffering from. Is the 4100 switching only?

As said, open a call with support.

Good luck.
Tonn_1
Occasional Visitor

Re: Procurve 4108GL occasional networkwide connection loss

Hi again,

thanks a lot for your hints / participation so far.
The switch is "switching only" and runs the latest firmware, 7-something with 79 or so at the end. We updated it a few weeks ago, hoping this would fix the problem, but it did not.

But we looked at the log again and found a strange mass of error packets on the port where one of the servers is connected. The error packets also occurred on every other port this server has been connected to. Additionally, we noticed a strangely constant blink frequency on the act LED of that port, which did not correspond to the actual activity, which is much lower. So our latest attempt (just today) was to deactivate the corresponding NIC and try another one. The error packets and the strange light show no longer appear (for the moment) - we will keep an eye on this.
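
Keeping an eye on it could look roughly like the sketch below: take two readings of the per-port error counters some time apart and flag the ports whose count rose sharply. This is only an illustration. How the counters are read from the switch is left out, and the function name and the threshold of 100 are arbitrary assumptions.

```python
def ports_with_rising_errors(before, after, threshold=100):
    """Flag ports whose cumulative error count grew by at least `threshold`.

    `before` and `after` map a port name to its cumulative error count at
    two points in time. Returns (port, delta) pairs, largest delta first.
    A negative delta (counters reset, e.g. by a switch restart) is ignored.
    """
    flagged = []
    for port, count in after.items():
        delta = count - before.get(port, 0)
        if delta >= threshold:
            flagged.append((port, delta))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A port that keeps showing up in the result across several readings, whichever port the server is moved to, points at the attached NIC or cable rather than at the switch port.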

My question to the specialists out there: could a slightly defective NIC (on this server) after some time cause so many errors that the switch simply cannot handle them any more and stops working, which would then lead to the described "crash" of the whole network?

Thanks,

kind regards,

Lucifer
Matt Hobbs
Honored Contributor

Re: Procurve 4108GL occasional networkwide connection loss

"My question to the specialists out there: could a slightly defective NIC (on this server) after some time cause so many errors that the switch simply cannot handle them any more and stops working, which would then lead to the described "crash" of the whole network?"

It shouldn't be possible, but I've seen a lot of weird things in the past and wouldn't want to rule it completely out of the question.
Sergej Gurenko
Trusted Contributor

Re: Procurve 4108GL occasional networkwide connection loss

If something affects your STP root switch, the whole switched network can flap. The bigger your L2 domains are, the bigger the problems you can have. A single loop (for example, an incorrectly connected hub) can kill a whole L2 domain with amplified ARP requests (80-90% CPU utilization on the PCs).
Another possibility (from personal experience): a single flapping port can overload a switch if logging is enabled (syslog and/or internal buffer).
Special tools exist for STP attacks.
Only L3 segmentation will isolate your network from such failures.
Quick overview here: http://www.ciscopress.com/content/images/1587201534/samplechapter/1587201534content.pdf