
Icemoose
New Member

Very Strange Time Out Problem

I have worked in networks for 10 years and this is just the weirdest problem I have ever seen!

10 days ago we had users complaining of problems such as being disconnected from servers, delayed-write failures to document servers, being disconnected from Exchange, etc., and generally slow network performance.

This company has a basic network with a couple of HP 4104GL switches in the computer room driving out to 4 closets of edge switches.

Anyway, when I ping all the switches they respond in anywhere from less than 1 ms to around 4 ms, but then suddenly one or more switches will go REQUEST TIME OUT for between 1 and 11 pings!

There is no pattern to which switches go REQUEST TIME OUT, but they do!

I have run Wireshark and it shows no problems at all: no broadcasts, ARP, or any other traffic that lines up with the switches going offline for those seconds.

When I check the logs on the switches, they don't mention anything about the ports going down. I have turned on full fault finding, but there is still nothing in there apart from my telnet sessions.

But the users are getting disconnected for split seconds while this is happening.

Any ideas from anyone!! JUST BIZARRE!!!!

The weird thing is that it is random which switch suddenly doesn't respond for 11 pings!
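To pin down whether the drops really are random, and whether they always last the same number of pings, it may help to log every probe with a timestamp and then group consecutive failures per switch. The following is only a rough sketch: `ping_once`, `poll` and `find_outages` are made-up helper names, it shells out to the system `ping` command, and the flags assume Windows or Linux ping syntax (other platforms may differ).

```python
import platform
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=1):
    """Send one ICMP echo via the system ping command; True if it exited with 0.

    Assumption: Windows (-n/-w in ms) or Linux (-c/-W in s) flag syntax.
    Exit code 0 is treated as "reply received", which is only approximate.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(int(timeout_s * 1000)), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(int(timeout_s)), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def poll(hosts, rounds, interval_s=1.0):
    """Ping every host once per round; return a list of (timestamp, host, ok)."""
    samples = []
    for _ in range(rounds):
        for host in hosts:
            samples.append((datetime.now(), host, ping_once(host)))
        time.sleep(interval_s)
    return samples


def find_outages(samples):
    """Group consecutive failures into (host, first_fail, last_fail, count).

    samples must be in time order. A run still open at the end is reported too.
    """
    open_runs = {}  # host -> [first_fail, last_fail, count]
    outages = []
    for ts, host, ok in samples:
        if not ok:
            run = open_runs.setdefault(host, [ts, ts, 0])
            run[1] = ts
            run[2] += 1
        elif host in open_runs:
            first, last, count = open_runs.pop(host)
            outages.append((host, first, last, count))
    for host, (first, last, count) in open_runs.items():
        outages.append((host, first, last, count))
    return outages
```

Something like `find_outages(poll(["10.0.0.248"], rounds=600))` would then list each outage with its start time and length, which can be lined up against the switch logs and a packet capture.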

Thanks as always! And I owe a seriously big pint to anyone who helps me crack this.
8 REPLIES
Ivan Krastev
Honored Contributor

Re: Very Strange Time Out Problem

MAC spoofing? Duplicate IP addresses? Someone testing a DHCP service on their own PC?

Try to get more information from the switches: set up a central syslog server so that you collect messages from all of them at the same time.
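If no syslog server is to hand, a throwaway UDP listener is enough to see what the switches send. This is a minimal sketch, not a production collector: `parse_pri`, `SyslogHandler` and `serve` are illustrative names, it assumes the switches send plain UDP syslog to port 514 with the usual `<PRI>` prefix (facility = PRI // 8, severity = PRI % 8), and binding to port 514 normally needs administrator rights. Pointing each switch at the collector is a switch-side setting, so check the switch manual for the exact command.

```python
import re
import socketserver
from datetime import datetime

PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(message):
    """Split a syslog line into (facility, severity, rest).

    Returns (None, None, message) if there is no <PRI> prefix.
    """
    match = PRI_RE.match(message)
    if not match:
        return None, None, message
    pri = int(match.group(1))
    return pri // 8, pri % 8, message[match.end():]


class SyslogHandler(socketserver.BaseRequestHandler):
    """Print each datagram with a local receive time and the sender's IP."""

    def handle(self):
        data, _sock = self.request  # for UDP servers, request is (data, socket)
        text = data.decode("utf-8", errors="replace").strip()
        facility, severity, body = parse_pri(text)
        stamp = datetime.now().isoformat(timespec="milliseconds")
        print(f"{stamp} {self.client_address[0]} fac={facility} sev={severity} {body}")


def serve(bind_addr="0.0.0.0", port=514):
    """Listen for syslog datagrams until interrupted."""
    with socketserver.UDPServer((bind_addr, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` on one machine and stamping every message with the collector's own clock makes it easier to compare events across switches whose clocks may not agree.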

regards,
ivan
Michael Steele_2
Honored Contributor

Re: Very Strange Time Out Problem

Hi

More than likely you have a loop in your backbone between a core and an edge switch, where you have more than one route, both primary and alternate, active.

Got older switches?
Added any additional switches?
Support Fatherhood - Stop Family Law
Michael Steele_2
Honored Contributor

Re: Very Strange Time Out Problem

Sorry, what is your multicast and broadcast traffic like?
Icemoose
New Member

Re: Very Strange Time Out Problem

Hi

I have taken out all the second links for redundancy so we have single links from the core to all the edge switches.

Having sniffed the network there is very little multicast or broadcast traffic.

It is so strange!

It is always 11 pings that they time out for!!

But there is nothing in any of the switch logs; it just doesn't make any sense.

Mike
Icemoose
New Member

Re: Very Strange Time Out Problem

Ok even stranger!

I replaced one of the HP switches with a Cisco switch this morning just to test.

I am seeing the same problem on that!

The trunk interface on the Cisco switch goes down occasionally and stays down until it is reset. It is not in an error state; it just stops working.

With the HP switches always timing out for 11 pings, I am wondering whether the interface on an HP switch resets itself after a set amount of time?

M.
Icemoose
New Member

Re: Very Strange Time Out Problem

Just to add further info.

I can ping through the switch fine.

So, for example, a switch at IP 10.0.0.248 will stop responding for 11 pings. But while it is not responding, I can still ping a device that is hanging off 10.0.0.248.

I checked the CPU on 10.0.0.248 and it never goes above 15%. I had assumed that if it were running at 99% it might stop responding to ICMP, but it isn't that.
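Pinging the switch's own IP and a device behind it in the same round separates "the switch's management address stopped answering" from "the switch stopped forwarding". A small sketch of that bookkeeping, assuming the paired results are already collected as `(switch_ok, downstream_ok)` booleans; `classify` and `summarise` are hypothetical names and the labels are only suggestions:

```python
from collections import Counter


def classify(switch_ok, downstream_ok):
    """Label one polling round from the two ping results."""
    if switch_ok and downstream_ok:
        return "healthy"
    if not switch_ok and downstream_ok:
        # Traffic still crosses the switch; only its own IP is silent.
        return "management address only"
    if switch_ok and not downstream_ok:
        return "downstream device or access port"
    return "path down"


def summarise(pairs):
    """Count how many rounds fell into each category."""
    return Counter(classify(s, d) for s, d in pairs)
```

If nearly all failed rounds land in "management address only", that matches what I am seeing here, whereas rounds in "path down" would point at the uplink or trunk instead.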

Mike
Michael Steele_2
Honored Contributor

Re: Very Strange Time Out Problem

Verify firmware levels; one node has to be flaky.
Icemoose
New Member

Re: Very Strange Time Out Problem

The firmware is a little out of date but has been stable on this level.

I can upgrade to the latest to be safe.

Do I just TFTP the images up and then reboot? I assume it keeps the old config?

(Sorry I am a Cisco guy!)

Thanks

Mike