Richard Munn
Frequent Advisor

ping spikes

I have a problem where several of our machines (mostly K580s, but not all) running 10.20 show spikes in the response time when you ping them. For example, if you ping any one of the machines it will normally show time=0, but every 100-200 pings it will jump to something like 200ms. The problem is that these spikes occur on ALL the HP-UX machines but never on any of the Tru64, Solaris or Linux boxes, even if they are on the same net. We also have some C3000s, and they too show the spike, but normally only around 50ms.
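
For anyone wanting to log spikes like these rather than watch the screen, here is a minimal sketch in Python. It assumes the ping output has been captured to text and that each reply line carries `icmp_seq=` and `time=` fields. The function name `find_spikes`, the 50ms threshold and the sample lines are all made up for illustration; they are not output from the machines in this thread.

```python
import re

# Matches both "icmp_seq=2. time=203. ms" and "icmp_seq=2 ttl=64 time=0.045 ms".
PING_LINE = re.compile(
    r"icmp_seq=(\d+)\.?\s+.*?time[=<]\s*(\d+(?:\.\d+)?)\.?\s*ms"
)

def find_spikes(lines, threshold_ms=50.0):
    """Return (seq, rtt_ms) for every reply whose RTT is at or above threshold_ms."""
    spikes = []
    for line in lines:
        m = PING_LINE.search(line)
        if m is None:
            continue  # skip headers, summaries and anything that is not a reply
        seq, rtt = int(m.group(1)), float(m.group(2))
        if rtt >= threshold_ms:
            spikes.append((seq, rtt))
    return spikes

# Hypothetical sample lines, invented for this example.
sample = [
    "64 bytes from 10.0.0.5: icmp_seq=0. time=0. ms",
    "64 bytes from 10.0.0.5: icmp_seq=1. time=0. ms",
    "64 bytes from 10.0.0.5: icmp_seq=2. time=203. ms",
    "64 bytes from 10.0.0.5: icmp_seq=3. time=0. ms",
]
print(find_spikes(sample))
```

Recording the sequence numbers makes it easier to check later whether the spikes line up with anything periodic on the host.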
U.SivaKumar_2
Honored Contributor

Re: ping spikes

Are the HP-UX servers connected on the same network or across a router?

regards,
U.SivaKumar
Innovations are made when conventions are broken
Richard Munn
Frequent Advisor

Re: ping spikes

In general there are 2 or 3 routers between the two ping hosts. But this is also true if you pick a non-HP-UX host.
U.SivaKumar_2
Honored Contributor

Re: ping spikes

Hi,

What are the servers serving? Is the CPU load normal?
For how many ping packets does the 200ms last?

regards,
U.SivaKumar
Innovations are made when conventions are broken
Richard Munn
Frequent Advisor

Re: ping spikes

We have been able to make all the machines tested idle, so there is no CPU load. The spike only ever lasts for one response, then returns to 0 again.
U.SivaKumar_2
Honored Contributor

Re: ping spikes

Hi.

From which machine are you running the ping command: Windows 98, Windows 2000 or Unix? Try from a different platform and see whether the same problem exists.

I don't think a single packet round-trip time of 200ms should be considered a problem.

regards,
U.SivaKumar
Innovations are made when conventions are broken
Ron Kinner
Honored Contributor

Re: ping spikes

I assume you see this spike when pinging between HPUX on the same LAN so we can eliminate the routers? And you have checked

lanadmin
lan
display

to verify that you are not getting too many collisions and that you do not have a Duplex mismatch?

My bet would be that there is something in the output queue that has to be processed before it can send the echo reply (or in the input queue on either end). Are you perhaps running GATED (RIP or OSPF?) or anything else which sends something out regularly? Or maybe something external like MRTG or OpenView which is asking for a lot of data?


A sniffer might tell you what's going on.

Ron
Richard Munn
Frequent Advisor

Re: ping spikes

In response to the last two replies, the ping box can be many things but we mostly use a Tru64 5.1A, HP-UX 10.20 or Solaris 7 machine. All of them show the same results.

We have pinged from one host to another on the same LAN via a switch to avoid the routers. The spike is still there. GATED is not running, and lanadmin shows no errors and very few collisions. nettl has error reports turned on and does show quite a few TCP dropped packets, location 20 with RST sent and others with no RST sent, but I believe this is normal. There are no reports against ICMP at all, and it has all WARNING reports turned on.

There is a big OpenView machine on the net but I don't think it is too invasive.
Ron Kinner
Honored Contributor

Re: ping spikes

Does

netstat -p icmp

show anything funny? Probably not. I'm inclined to believe it's further up in the layers. Perhaps the CPU has some housekeeping it needs to do and is a little slow about getting back with the echo reply?

Just for fun do a

ping hostname 1400

which will send out big pings and see if you see anything interesting happening.

We no longer have any 10.20s around that are easy to ping. I tried it on an 11.0 and on an old 10.01 and never saw any increase. Perhaps you need the latest ARPA patch?

Ron

Richard Munn
Frequent Advisor

Re: ping spikes

The spike is still there with a packet size of 1400. The spikes do seem to be closer together than with a packet size of 64, but it could be coincidence.

netstat -p icmp shows nothing in terms of errors etc.
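
One way to check the "closer together" impression is to compare the average gap between spikes for the two packet sizes. The sketch below assumes you have already noted the ping sequence numbers at which spikes occurred in each run. The names `spike_gaps` and `mean_gap` and the sequence numbers are invented for illustration and are not measurements from these machines.

```python
from statistics import mean

def spike_gaps(spike_seqs):
    """Number of pings between each consecutive pair of spikes."""
    return [b - a for a, b in zip(spike_seqs, spike_seqs[1:])]

def mean_gap(spike_seqs):
    """Average gap between spikes, or None if there are fewer than two spikes."""
    gaps = spike_gaps(spike_seqs)
    return mean(gaps) if gaps else None

# Hypothetical spike positions for a 64-byte run and a 1400-byte run.
small_packets = [120, 290, 410, 600]
large_packets = [90, 200, 330, 420]

print("64 bytes:  ", mean_gap(small_packets))
print("1400 bytes:", mean_gap(large_packets))
```

With only a handful of spikes per run, a difference like this could easily still be coincidence, so a longer capture would be needed before reading much into it.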
Richard Munn
Frequent Advisor

Re: ping spikes

The problem seems to have been found. We noticed that on all the affected machines diagmond was running continuously at around 9% CPU.

When it was stopped, the spikes seemed to disappear. Start it again and the spikes come back.

I'm not sure why diagmond should have this effect on the LAN, however.
Steven E. Protter
Exalted Contributor

Re: ping spikes

A few things to think about.

Patches, are you up to date?

We had a situation where ping times gradually climbed until the machine was rebooted. That NIC card eventually had to go.

If you have a linux box, do a ping -f IP_ADDY and see what percentage gets through. This is a stress test, so you need to be careful.

If you get a figure below 100% you have router or network issues and they could be physical.

I have never seen ping response times affected, even when the machine was at 100%, being stressed out by an Oracle process.

While the Oracle data was moving through the card, response times went up a little but remained constant. No spikes at all.

Steve
Steven E Protter