OLIVA_1
Regular Advisor

-- Network trouble --

Hello,

I have a strange behaviour with a D320 server.
When I ping a router (through LAN and WAN networks), the router seems to answer several times to the same echo request. In the end, the reported packet loss is -105%???

sppar1-nms-[/home/nms/nms/log]: ping bpar999
PING bpar999.adminnet.sita.net: 64 byte packets
64 bytes from 57.0.159.72: icmp_seq=0. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=0. time=41. ms
64 bytes from 57.0.159.72: icmp_seq=1. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=1. time=29. ms
64 bytes from 57.0.159.72: icmp_seq=2. time=3. ms
64 bytes from 57.0.159.72: icmp_seq=2. time=14. ms
64 bytes from 57.0.159.72: icmp_seq=2. time=26. ms
64 bytes from 57.0.159.72: icmp_seq=3. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=3. time=17. ms
...
64 bytes from 57.0.159.72: icmp_seq=12. time=167. ms
64 bytes from 57.0.159.72: icmp_seq=13. time=3. ms
64 bytes from 57.0.159.72: icmp_seq=13. time=23. ms
64 bytes from 57.0.159.72: icmp_seq=14. time=3. ms
64 bytes from 57.0.159.72: icmp_seq=14. time=30. ms
64 bytes from 57.0.159.72: icmp_seq=15. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=15. time=26. ms
64 bytes from 57.0.159.72: icmp_seq=16. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=17. time=3. ms
64 bytes from 57.0.159.72: icmp_seq=18. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=18. time=27. ms

----bpar999.adminnet.sita.net PING Statistics----
19 packets transmitted, 39 packets received, -105% packet loss
round-trip (ms) min/avg/max = 2/20/167
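
A note on where -105% comes from. Assuming this ping computes loss the way classic BSD-derived ping implementations do, as (transmitted - received) * 100 / transmitted in integer arithmetic, receiving more replies than requests sent yields a negative figure. The sketch below only illustrates that arithmetic; the function name is made up for this example and is not part of any HP-UX tool.

```python
def packet_loss_percent(transmitted: int, received: int) -> int:
    """Loss percentage as classic ping reports it (assumed formula).

    The result is truncated toward zero, as C integer division would do.
    """
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return int((transmitted - received) * 100 / transmitted)


if __name__ == "__main__":
    # Figures from the statistics above: 19 sent, 39 received.
    print(packet_loss_percent(19, 39))  # -105
```

So the negative loss is not a separate fault; it is simply the duplicate replies showing up in the summary line.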



This phenomenon doesn't happen from another identical server on the same LAN.


Thanks for your help.
23 REPLIES
OLIVA_1
Regular Advisor

Re: -- Network trouble --

One more point !!!
This behaviour is sporadic, the server can work several days without any trouble....
Pete Randall
Outstanding Contributor

Re: -- Network trouble --

What are the NIC settings? Run "lanscan" to get the instance number of the NIC, then run "lanadmin -x N" (where N is the instance number) to see what the card is set to.


Pete
OLIVA_1
Regular Advisor

Re: -- Network trouble --

Hi Pete,

# lanscan
Hardware Station        Crd Hardware Net-Interface  NM  MAC    HP DLPI Mjr
Path     Address        In# State    NameUnit State ID  Type   Support Num
8/16/6   0x080009D262F6 0   UP       lan0     DOWN  4   ETHER  Yes     52
8/8/1/0  0x001083F97F03 1   UP       lan1     UP    5   ETHER  Yes     115
#
# lanadmin -x 5
Current Speed = 100 Full-Duplex Auto-Negotiation-OFF
Tim D Fulford
Honored Contributor

Re: -- Network trouble --

o Why is lan0 down?
o You supplied lanadmin -x 5, but you only seem to have two LAN cards (0 & 1). What output do you get for lanadmin -x 0 & lanadmin -x 1?

Tim
-
Pete Randall
Outstanding Contributor

Re: -- Network trouble --

Well, that seems OK as long as the port on the switch is set the same way. Other than that, I don't have any other guesses at the moment.


Pete
Tim D Fulford
Honored Contributor

Re: -- Network trouble --

Could be linked to source quenching?? I do not know if you are running 10.20 or 11.0

If 11.0, it will be an ndd parameter:

ndd -h | egrep que

If 10.20 it is nettune & I have forgotten about it!!!

Regards

Tim
-
Jeff Schussele
Honored Contributor

Re: -- Network trouble --

Hi Oliva,

I'll bet you're running HP-UX 10.20, correct?
If so, that's why you're using 5 in the lanadmin command, as 10.20 needs the NMID, not the PPA.
I'd be interested to see netstat -in & netstat -rvn outputs to see what the setup & routes are, but I bet you're getting multiple ICMP echos because of a network configuration issue - but that's just an educated guess.
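
To make the NMID versus PPA distinction concrete, here is a minimal sketch that pulls both numbers out of lanscan text. It assumes the 10-column layout shown in the output posted earlier in this thread (other releases may print different columns), and the function name is purely illustrative.

```python
def parse_lanscan(output: str) -> dict:
    """Map interface name -> card instance, NMID and state from lanscan text.

    Assumes the 10-column layout posted in this thread:
    path, MAC, card instance, hardware state, name, interface state,
    NMID, MAC type, DLPI support, major number.
    """
    interfaces = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have 10 fields and a MAC address starting with 0x;
        # the two header rows fail the 0x check and are skipped.
        if len(fields) == 10 and fields[1].startswith("0x"):
            interfaces[fields[4]] = {
                "instance": int(fields[2]),
                "nmid": int(fields[6]),
                "state": fields[5],
            }
    return interfaces


SAMPLE = """\
8/16/6 0x080009D262F6 0 UP lan0 DOWN 4 ETHER Yes 52
8/8/1/0 0x001083F97F03 1 UP lan1 UP 5 ETHER Yes 115
"""

if __name__ == "__main__":
    for name, info in parse_lanscan(SAMPLE).items():
        print(name, info)
```

For lan1 this gives instance 1 and NMID 5, which matches the "lanadmin -x 5" used above.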

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
doug mielke
Respected Contributor

Re: -- Network trouble --

I'd bet on network config as well. A couple of traceroutes would be interesting, as well as looking for a spanning tree/trunking error somewhere in the route.
Bill Hassell
Honored Contributor

Re: -- Network trouble --

Multiple answers to a ping are very common when there are duplicate IPs on the network. Disconnect the router from the network and ping again. If you get an answer, then you do indeed have a duplicate IP address. The router has probably tried to complain about this in its logs. You can use OV Node Manager to locate the duplicates, or use your local arp to find the MAC address(es) for that IP and then look up the manufacturer's ID using the first half of the MAC address to identify the card.
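
The "first half of the MAC address" is the three-byte vendor prefix (OUI). As a rough sketch, the helper below normalises a MAC address and returns that prefix so it can be looked up in a vendor list. The function name is hypothetical, and the accepted input formats (0x-prefixed as in lanscan, or colon/dash separated with optional missing leading zeros) are assumptions about what your tools print.

```python
def oui(mac: str) -> str:
    """Return the first three bytes of a MAC address as 'xx:xx:xx'.

    Accepts the 0x-prefixed form printed by lanscan and colon- or
    dash-separated forms, including octets without leading zeros.
    """
    text = mac.strip().lower()
    if text.startswith("0x"):
        digits = text[2:]
        if len(digits) != 12:
            raise ValueError(f"expected 12 hex digits: {mac!r}")
        octets = [digits[i:i + 2] for i in range(0, 12, 2)]
    else:
        octets = [part.zfill(2) for part in text.replace("-", ":").split(":")]
        if len(octets) != 6:
            raise ValueError(f"expected 6 octets: {mac!r}")
    for octet in octets:
        if len(octet) != 2:
            raise ValueError(f"bad octet in {mac!r}")
        int(octet, 16)  # raises ValueError on non-hex input
    return ":".join(octets[:3])


if __name__ == "__main__":
    print(oui("0x080009D262F6"))
```

If two different prefixes answer for the same IP, that points toward two different cards claiming the address.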


Bill Hassell, sysadmin
OLIVA_1
Regular Advisor

Re: -- Network trouble --

Jeff,

# netstat -in
Name   Mtu   Network    Address     Ipkts        Ierrs  Opkts       Oerrs  Coll
ni0*   0     none       none        0            0      0           0      0
ni1*   0     none       none        0            0      0           0      0
lo0    4608  127        127.0.0.1   1471682064   0      1471682064  0      0
lan0*  1500  none       none        0            0      0           0      0
lan1   1500  57.7.19.0  57.7.19.10  -1131872614  0      1953706988  0      0
#
# netstat -rvn
Routing tables
Dest/Netmask                Gateway     Flags  Refs  Use         Interface  Pmtu  PmtuTime
57.7.19.10/255.255.255.255  127.0.0.1   UH     0     1471681772  lo0        4608
127.0.0.1/255.255.255.255   127.0.0.1   UH     0     292         lo0        4608
default/0.0.0.0             57.7.19.1   UG     13    1953347533  lan1       1500
57.7.19.0/255.255.255.0     57.7.19.10  U      0     335571      lan1       1500
#

OLIVA_1
Regular Advisor

Re: -- Network trouble --

Bill,

I don't think we have duplicate IPs on the network. Indeed, this issue is sporadic and it doesn't occur from another server (same config, same LAN, etc.).
Jeff Schussele
Honored Contributor

Re: -- Network trouble --

Hi (again) Oliva,

Your netstat outputs look fine.
I now believe Bill Hassell has the answer - dupe IPs.
Follow his advice & down the I/F on that destination system, then ping that IP from another system. If you get a response, you definitely have dupe IPs & you'll need to track it down & have one of them changed to not conflict. I've seen that before & I should have remembered, but that's why Bill's so good - he does. Then again he's probably seen it many more times than I have.

Good Hunting,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Shannon Petry
Honored Contributor

Re: -- Network trouble --

ICMP echo or Ping is a broadcast. The only way you get more than 1 response to ping is when more than 1 IP is configured to the same address.

SunOS has a great feature that detects duplicate IPs, warns "Someone is trying to be me", and will disable the NIC. With other OSes you are left on your own to find it.

If you shut down the HP and ping the address, look at your arp table:

arp -a

and find the address. Compare the MAC addresses listed against the NICs on your network to find the culprit.

Regards,
Shannon
Microsoft. When do you want a virus today?
OLIVA_1
Regular Advisor

Re: -- Network trouble --

Tim,

The only "icmp" parameter I have is :

# nettune icmp_mask_agent
0
#
# nettune -h icmp_mask_agent
icmp_mask_agent:
Enables the ICMP address mask reply function. When set, a host
will reply to ICMP address mask requests. The default is off.
doug mielke
Respected Contributor

Re: -- Network trouble --

If it's a duplicate IP, HP-UX should tell you with an 'arp table overwritten' type of message. However, the arp table only covers the local segment. If the duplicate is on another segment, you must look at the routers themselves.
Also look for a netmask error between Unix and the target. I've never seen it, but I suppose the address could be seen as a broadcast given the right netmask mismatch.
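
One way to test the netmask idea on paper: for a given address and mask, check whether the address falls on the subnet's broadcast address. This is only a sketch using Python's standard ipaddress module; the function name and the second example address and mask are made up for illustration and are not taken from this network.

```python
import ipaddress


def looks_like_broadcast(address: str, netmask: str) -> bool:
    """True if address is the directed broadcast address of its subnet."""
    network = ipaddress.ip_network(f"{address}/{netmask}", strict=False)
    # /31 and /32 networks have no separate broadcast address.
    if network.prefixlen >= 31:
        return False
    return ipaddress.ip_address(address) == network.broadcast_address


if __name__ == "__main__":
    # The router's address from the ping output, with a /24 mask (assumed).
    print(looks_like_broadcast("57.0.159.72", "255.255.255.0"))
    # Hypothetical address that is a broadcast under a /29 mask.
    print(looks_like_broadcast("57.0.159.71", "255.255.255.248"))
```

Running the check with the masks actually configured on each hop would show whether any of them treats the target as a broadcast.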
Shannon Petry
Honored Contributor

Re: -- Network trouble --

If you want to test whether it's a dup IP, then the next time you see the behavior, shut down the HP. If there is a duplicate, you should still get a response.


The sporadic nature is easily guessed as a bad DHCP setup, or bad bootp setup, or someone with a laptop who does not log in every day.

To understand why it's a DUP IP, let me explain how ping works.


Ping sends a broadcast out onto the network, asking the requested IP to respond. Ping neither knows nor cares whether the IP is there or not; it simply says "yell if you're here".

If something has the requested IP, it says "yes, I'm here". Of course more goes into this, but not that much more.

Beyond dup IPs, the only way you will receive multiple answers to a ping request is if the address you ping is the broadcast address. Then all systems will reply.


Regards,
Shannon
Microsoft. When do you want a virus today?
Tim D Fulford
Honored Contributor

Re: -- Network trouble --

Oliva

I'm afraid I'm super rusty on 10.20 (it was 18 months ago that I last did anything on 10.20, and that was to upgrade it to 11.00!!), so I'll have to decline on the nettune side of things. Maybe someone else could step in & help?

However, there is a lot of talk about duplicate IPs. The way I usually prove/disprove it is to look at the Network Transport Layer log. It will EXPLICITLY say if it believes the server has seen any duplicate IPs.

# netfmt -t 10 -f /var/adm/nettl.LOG00

The "-t 10" means tail the last 10 messages (so you can go back as far as you need). The log is time & date stamped, so you should be able to get back to your last "trouble spot".

Regards

Tim
-
Ron Kinner
Honored Contributor

Re: -- Network trouble --

I've seen this before and I'm trying to remember what caused it.

In the meantime let me clear up some confusion about the nature of ping. It is not a broadcast. It is a unicast ICMP packet sent to the destination you want to ping via whatever gateway your routing table tells it to go.

The only broadcast normally involved is possibly an ARP at the end of the line to get the MAC address associated with the IP. Normally when I get a duplicate IP either one or the other will work or neither but never both so I don't think it's a case of a duplicate address.

What I think is happening is that for some reason the ping request is getting sent down two different paths. If you look at the replies the second one is generally about 25 ms later than the first. This second packet has been wandering around a bit and did not take the direct route.
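
The timing gap is easy to check from the posted output. The sketch below groups replies by icmp_seq and reports how much later each duplicate arrived than the first reply for that sequence number. The regular expression assumes the HP-UX ping line format shown in the original post, and the function name is illustrative only.

```python
import re
from collections import defaultdict

# Matches lines such as "64 bytes from 57.0.159.72: icmp_seq=0. time=2. ms"
REPLY = re.compile(r"icmp_seq=(\d+)\.?\s+time=(\d+(?:\.\d+)?)\.?\s*ms")


def duplicate_delays(ping_output: str) -> dict:
    """Map icmp_seq -> extra delay (ms) of each duplicate reply.

    The delay is measured against the first reply seen for that sequence.
    Sequences with a single reply are left out.
    """
    times = defaultdict(list)
    for match in REPLY.finditer(ping_output):
        times[int(match.group(1))].append(float(match.group(2)))
    return {
        seq: [t - replies[0] for t in replies[1:]]
        for seq, replies in times.items()
        if len(replies) > 1
    }


SAMPLE = """\
64 bytes from 57.0.159.72: icmp_seq=0. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=0. time=41. ms
64 bytes from 57.0.159.72: icmp_seq=1. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=1. time=29. ms
64 bytes from 57.0.159.72: icmp_seq=2. time=3. ms
64 bytes from 57.0.159.72: icmp_seq=2. time=14. ms
64 bytes from 57.0.159.72: icmp_seq=2. time=26. ms
64 bytes from 57.0.159.72: icmp_seq=3. time=2. ms
64 bytes from 57.0.159.72: icmp_seq=3. time=17. ms
"""

if __name__ == "__main__":
    for seq, delays in duplicate_delays(SAMPLE).items():
        print(seq, delays)
```

On the first four sequence numbers from the original post, the duplicates trail the first reply by between 11 and 39 ms, which fits the idea of a second, slower path.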

It's coming back to me. We had a sat link with two receivers set up as a primary and backup link. For some reason the backup came up without the primary going down and this caused the echo requests to be duplicated. Everything downwind of the serial links was receiving duplicate packets which we didn't notice since TCP/IP allows for that and just discards extra packets. It was only when you did a ping that you noticed the problem.

So look for a backup WAN link that is coming up when you don't expect it, preferably one that would be transparent to the routers, since if they know there are two routes available they are usually pretty good at choosing one and not trying to use both. It's also possible that two routers in an HSRP configuration could have moments where they both think they are primary.

A traceroute (UNIX) or tracert -d (MS) would be your best bet to find the problem.

Ron
Ron Kinner
Honored Contributor

Re: -- Network trouble --

Just reread your question and finally noticed the last statement. Run a traceroute from both boxes to the destination. If they both take the same route, then look for a difference in patches. There is always the odd software bug which can mess you up.

If I were your network admin, I'd stick a sniffer on the LAN and see what was really happening. Are you perhaps sending out duplicate echo requests? Perhaps you have a software bug which sends the same request out of two different interfaces? Are you really getting duplicate replies, or is your box making them up? I might also build a little filter which would count the number of echo requests you sent out to the destination and the number of replies you received, and put it on each router in the chain to see where the dups are happening.

Ron
Steven E. Protter
Exalted Contributor

Re: -- Network trouble --

I had the exact same problem three years ago. I solved it by replacing the NIC.

It took two hardware visits and a nasty conversation with an HP manager to convince them, but they finally swapped out the card to shut me up.

It solved the problem.

I had the exact same symptoms as you, though over time my ping times got worse and worse.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
W.C. Epperson
Trusted Contributor

Re: -- Network trouble --

I think Ron's on the right track--looks like some sort of routing anomaly. The big difference in latencies lends some credibility to there being a backup WAN link in play, since backups are usually slower than the primary link.

Also look at netstat -r on the server and look for ambiguities about the route destination of the router's net.
"I have great faith in fools; self-confidence, my friends call it." --Poe
OLIVA_1
Regular Advisor

Re: -- Network trouble --

Thanks to all for your help !!!

I will investigate based on your advice...
U.SivaKumar_2
Honored Contributor

Re: -- Network trouble --

Hi,

I agree.

This effect can be caused by multipath load balancing between router links. The interesting part is that the links involved have different latencies (the round-trip time is different).

This can be seen in ISDN Dial on Demand setups also.

regards,

U.SivaKumar

Innovations are made when conventions are broken