
Mahesh Acharya
Frequent Advisor

netperf - close to 100% packet drop using UDP

When I run netperf with UDP_STREAM, I am getting almost 100% packet drop, as shown below:
netperf -t UDP_STREAM -H -l 20 -- -s 128K -S 128K -m 32K -M 32K

Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262142   32768   20.00       72916      0     955.63
262142           20.00           4              0.05

This is between two Linux boxes on the same subnet.
Steven E. Protter
Exalted Contributor

Re: netperf - close to 100% packet drop using UDP

Shalom,

Two Linux boxes, naturally an HP-UX question.

Possible causes:
1) Network congestion. Collisions on the LAN. Talk to the networking people.
2) Bad application. Talk to the people who wrote the application.
3) Boxes are overloaded.
4) Duplicate IPs on the network (probably not; connectivity would stop completely).
5) Network card is going bad.
6) Problem with port settings or physical network infrastructure.

Suggestions:
1) Try using tcpdump or wireshark for some packet analysis.
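For example, a quick capture on the receiving box might look something like this (the interface name eth0 and the sender's address are placeholders for your own setup):

tcpdump -i eth0 -n udp and host 16.138.181.45

and then look for missing or fragmented datagrams, or open the capture in wireshark.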

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Mahesh Acharya
Frequent Advisor

Re: netperf - close to 100% packet drop using UDP

But a similar test with TCP_STREAM shows a throughput of about 94 Mbit/s:
netperf -t TCP_STREAM -H 16.138.181.45 -l 20 -- -s 256K -S 256K -m 128K -M 128K

Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

262142 262142 131072    20.02      94.13
rick jones
Honored Contributor
Solution

Re: netperf - close to 100% packet drop using UDP

I am the people who wrote the application, and I can attest that while it may occasionally be nasty to a network, it isn't bad. At least not for a benchmark :)

Netperf establishes a TCP "control" connection in addition to whatever data "connection" (in this case UDP endpoints) is used. If there were duplicate IPs, likely as not it would have affected the establishment of the control connection and there would have been no results reported at all.

Box overload, at least at the level of CPU utilization, can be checked by adding the -c and -C options to the first part of the command line, which will report CPU utilization.
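For example, something along these lines (substitute your own receiver for the placeholder):

netperf -t UDP_STREAM -H <receiver> -c -C -l 20 -- -s 128K -S 128K -m 32K -M 32K

which adds local and remote CPU utilization and service demand columns to the output.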

My first guess is that if Mahesh were to check the netstat stats for UDP on the receiving side, he would see lots of UDP errors. That "4" in the last line of the UDP_STREAM output suggests that there were only four successful receives.
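On a Linux receiver that is easy to check with something like:

netstat -su

run before and after a test; a jump in the "packet receive errors" counter in the Udp: section would confirm it.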

Linux has intra-stack flow control for UDP, so the sending side reporting 955 Mbit/s implies the sender was on a gigabit link. That the subsequent TCP_STREAM test only shows 94 Mbit/s implies that the receiver, or something between the sender and the receiver, is only 100BT. That leads to the second guess: checking the stats on the switch between the two machines will show a lot of dropped traffic there.

And since the sends were 32768 bytes each, the IP datagrams carrying the UDP datagrams will be fragmented, and I'll wager that many of those fragments were lost at that 1G-to-100BT point. That would lead to lots of IP fragmentation reassembly failures at the receiver, which I think can be checked by looking at the IP statistics with netstat on the receiver.

Having been thinking as I type, if it is indeed the IP fragmentation business then it is rather less likely that the receiver UDP stats will show errors.
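To check the fragmentation angle on the receiver, the Ip: section of the netstat statistics is the place to look, e.g.:

netstat -s | grep -i reass

run before and after a test; counters along the lines of "reassemblies required" and "packet reassembles failed" climbing sharply would point at lost fragments.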

One other test to try would be a UDP_STREAM test with a send size of say 1024 bytes or 1472 bytes to avoid IP fragmentation. Then if it is the speed mismatch leading to issues with fragmentation, you will probably see the sender still sending near a Gbit/s but the receiver actually receiving at near 100Mbit/s.
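For example (the receiver host is a placeholder):

netperf -t UDP_STREAM -H <receiver> -l 20 -- -s 128K -S 128K -m 1472 -M 1472

With 1472-byte sends, each UDP datagram plus its 8-byte UDP and 20-byte IP headers fits in one 1500-byte Ethernet frame, so no fragmentation occurs.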
there is no rest for the wicked yet the virtuous have no pillows
Mahesh Acharya
Frequent Advisor

Re: netperf - close to 100% packet drop using UDP

Thank you Rick for the detailed analysis.

I was trying to compare the throughput of a VM and a non-VM system. Below are the UDP_STREAM results for the VM and non-VM systems; both systems are identical in terms of CPU, memory, and swap space.

Not sure why the service demand is so high in the VM case. At least from the sender's (first line) perspective, the service demand should have been the same in both cases.
For smaller message sizes (less than 1472 bytes), service demand is 30+, but for messages larger than 1472 bytes it is very high, as shown below:

[root@RHEL4U5 vm]# netperf -t UDP_STREAM -H 16.138.181.41 -c -C -l 20 -- -m 8K
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 16.138.181.41 (16.138.181.41) port 0 AF_INET : interval : demo
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

110592    8192   20.00      292533      0      958.4      7.44      2.546
109568           20.00      292465             958.2     24.26      2.074

[root@RHEL4U5 vm]# netperf -t UDP_STREAM -H 16.138.181.45 -c -C -l 20 -- -m 8K
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 16.138.181.45 (16.138.181.45) port 0 AF_INET : interval : demo
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

110592    8192   20.00      292535      0      958.4      7.27  38271.934
109568           20.00          19             0.1       20.56  27053.768
rick jones
Honored Contributor

Re: netperf - close to 100% packet drop using UDP

CPU utilization will be higher in a virtual machine guest than on a "bare iron" system because there is the overhead of the hypervisor. Now, where things become "complicated" is in actually measuring CPU utilization in a guest. What has been done elsewhere is to run netperf in the guest but look at the overall CPU utilization in the hypervisor.
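As a rough sketch of that approach, assuming for illustration a Xen-style host (other hypervisors have their own tools, e.g. esxtop; the receiver host is a placeholder):

# in the guest
netperf -t UDP_STREAM -H <receiver> -c -l 20 -- -m 8K
# on the hypervisor console, while the test runs
xentop -d 1

and treat the hypervisor-side utilization, rather than the guest's own -c/-C numbers, as the real cost.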
there is no rest for the wicked yet the virtuous have no pillows