ProCurve / ProVision-Based

LACP Active vs LACP Passive and is the bond working?

SOLVED
duncan9322
Frequent Visitor


Hi All,

I have been trying to set up link aggregation to connect two Linux servers together using 802.3ad. Both servers have two network interfaces and the switch is a ProCurve 1810-24G.

My question is which mode should I select for the trunk?

I reckon that it should be LACP Active, but I'm not sure, as I always see this message in the logs when the network interfaces come up. I suspect it is just a timing message:

bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond

I think that the bond is configured correctly but am having difficulty confirming that it is working, as both iperf and iperf3 report only 940 Mb/s. However, ifconfig shows that packets are being sent and received on both physical interfaces:

bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        inet 10.19.57.98  netmask 255.255.255.0  broadcast 10.19.57.255
        inet6 fe80::219:99ff:fe80:a170  prefixlen 64  scopeid 0x20<link>
        ether 00:19:99:80:a1:70  txqueuelen 1000  (Ethernet)
        RX packets 1722694  bytes 2468909572 (2.2 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1014071  bytes 1248791015 (1.1 GiB)
        TX errors 0  dropped 4 overruns 0  carrier 0  collisions 0

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        ether 00:19:99:80:a1:70  txqueuelen 1000  (Ethernet)
        RX packets 419145  bytes 620170137 (591.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1012435  bytes 1248593357 (1.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 23  memory 0xfc900000-fc920000

eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        ether 00:19:99:80:a1:70  txqueuelen 1000  (Ethernet)
        RX packets 1303549  bytes 1848739435 (1.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1636  bytes 197658 (193.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xfc800000-fc8fffff

Can I use Wireshark to see if the LACP packets are being sent? I tried this at some stage but didn't see any packets at the time.
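Wireshark or tcpdump will show LACPDUs (they use the Slow Protocols ethertype 0x8809), but the quickest check is usually the bonding driver's own status file, /proc/net/bonding/bond0. A minimal sketch of what to look for, run here against a hypothetical sample of that file's output:

```python
# Sketch: sanity-check the bonding driver's status file. The SAMPLE
# string is hypothetical output; on a real host you would read
# /proc/net/bonding/bond0 instead.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth0
MII Status: up
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Aggregator ID: 1
"""

def bond_looks_healthy(text):
    """All links up and every slave in the same aggregator."""
    statuses = [ln.split(":", 1)[1].strip()
                for ln in text.splitlines() if ln.startswith("MII Status")]
    agg_ids = {ln.split(":", 1)[1].strip()
               for ln in text.splitlines() if ln.startswith("Aggregator ID")}
    return all(s == "up" for s in statuses) and len(agg_ids) == 1

print(bond_looks_healthy(SAMPLE))   # True for this sample
```

If the slaves show different Aggregator IDs, or a slave's link is down, the 802.3ad negotiation with the switch has not completed.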

Thanks
Duncan

6 REPLIES
Vince-Whirlwind
Honored Contributor

Re: LACP Active vs LACP Passive and is the bond working?

The stats you have shared with us seem to indicate that the Linux server is not using eth1 for sending packets.

The switch seems to be sending packets to the server at a ratio of 13 (eth1) : 4 (eth0) - although some of that could be traffic from a different source for all we know.

So it appears the trunk is up and you're maybe getting a bit of balance from the switch, but none from the server.
I wonder what you see on the other trunk, the one between the switch and the other Linux server?
Normally if you're sending traffic from a single host to one other host, you would see all the traffic using a single physical interface unless you had some kind of load balancing configured.

Also, for fun, set up a direct LACP link between the two servers and see what you get.

As far as your question goes, I don't see any benefit to using "active" - why would you ever want some dynamic process to run on a physically static configuration?
However, when trying to set up link aggregation between two devices running completely different vendor implementations of a supposed "standard", my rule has always been: test and go with whatever works.

 

duncan9322
Frequent Visitor

Re: LACP Active vs LACP Passive and is the bond working?

Many thanks for your reply.

The other server had this:

bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        inet 10.19.57.99  netmask 255.255.255.0  broadcast 10.19.57.255
        inet6 fe80::219:99ff:fe9a:89aa  prefixlen 64  scopeid 0x20<link>
        ether 00:19:99:9a:89:aa  txqueuelen 1000  (Ethernet)
        RX packets 1013703  bytes 1250028610 (1.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1719962  bytes 2472016239 (2.3 GiB)
        TX errors 0  dropped 11 overruns 0  carrier 0  collisions 0

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        ether 00:19:99:9a:89:aa  txqueuelen 1000  (Ethernet)
        RX packets 804094  bytes 1088001997 (1.0 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1718234  bytes 2471284574 (2.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 23  memory 0xfd500000-fd520000  

eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        ether 00:19:99:9a:89:aa  txqueuelen 1000  (Ethernet)
        RX packets 209609  bytes 162026613 (154.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1728  bytes 731665 (714.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device memory 0xfd400000-fd4fffff  

So most of the traffic was between the two hosts.

I did some other tests, I ran the command:

iperf -c 10.19.57.98 -P 8 -R

from two hosts simultaneously, and would have expected much more than 940 Mb/s, but the result was only about 1.2 Gb/s.

tcpdump on eth0 showed that LACP packets were being sent and received, so the configuration looks correct.

I also tried changing xmit_hash_policy to layer3+4; this made no difference, and from what I read it might not be fully supported.
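For reference, the two transmit hash policies can be sketched from the formulas given in the kernel's bonding documentation. This is a simplification (the real driver also folds in the Ethernet packet type and has special cases for fragmented and non-IP traffic), and the port numbers below are hypothetical:

```python
# Simplified sketch of the Linux bonding driver's xmit_hash_policy
# formulas, per the kernel bonding documentation.

def layer2_hash(src_mac, dst_mac, n_slaves):
    # layer2: source MAC XOR destination MAC, modulo slave count.
    # One pair of hosts always maps to one slave.
    return (src_mac[-1] ^ dst_mac[-1]) % n_slaves

def layer34_hash(src_ip, dst_ip, src_port, dst_port, n_slaves):
    # layer3+4: ports and IPs both contribute, so separate TCP
    # streams between the same two hosts can use different slaves.
    return ((src_port ^ dst_port) ^ ((src_ip ^ dst_ip) & 0xFFFF)) % n_slaves

mac_a = bytes.fromhex("00199980a170")   # server MACs from the
mac_b = bytes.fromhex("0019999a89aa")   # ifconfig output above
print(layer2_hash(mac_a, mac_b, 2))     # same slave for all traffic

ip_a, ip_b = 0x0A133962, 0x0A133963     # 10.19.57.98 / 10.19.57.99
# Two parallel iperf streams with different (hypothetical) source ports:
print(layer34_hash(ip_a, ip_b, 40001, 5001, 2))
print(layer34_hash(ip_a, ip_b, 40002, 5001, 2))
```

With layer2 every frame between the same two servers hashes identically, which matches the single-link throughput seen in the iperf tests; layer3+4 only helps if the traffic actually consists of multiple flows.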

Instead of using mode 4 I tried mode 0 (balance-rr) with the switch set to a static trunk, and that setup did show a throughput of 1.8 Gb/s; sort of expected, but it proves that there is no hardware problem and everything is capable.

I don't have cross-over cables available to test what happens when connecting the hosts together.

The manual was not clear about the different trunking modes. The description of static trunks being set up by an administrator does not help very much, nor do the descriptions of active and passive. I was really trying to figure out which mode is used in which circumstance. So, for example, it could be that when connecting two switches together, LACP active is used on one switch and passive on the other.

To explain why I'm looking at this: I was asked to monitor the network for duplicate IP addresses, and after finding a whole bunch I discovered that the servers have three interfaces (one of them IPMI) connected to an unmanaged switch. So what I'm planning is to bond the interfaces together using mode 1, 5 or 6 and create an alias for the second IP address. Later I'll replace the switches with managed switches, create a VLAN for the IPMI traffic and trunk the other two interfaces together.
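As a sketch of the first step of that plan on a Debian-style system using ifenslave (the alias address here is illustrative, not taken from the thread):

```text
# /etc/network/interfaces -- illustrative active-backup (mode 1) bond
# with an alias interface for the second IP address.
auto bond0
iface bond0 inet static
    address 10.19.57.98
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode active-backup
    bond-miimon 100
    bond-primary eth0

auto bond0:0
iface bond0:0 inet static
    address 10.19.57.200
    netmask 255.255.255.0
```

Modes 1, 5 and 6 need no switch-side configuration, which is what makes them usable on an unmanaged switch.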

Thanks again,
Duncan

 

Vince-Whirlwind
Honored Contributor
Solution

Re: LACP Active vs LACP Passive and is the bond working?

Generally I will use "trunk" (a static trunk) on a ProCurve switch as my first choice, and only try LACP if the static trunk doesn't work.

Passive LACP usually only works if the other side of the trunk is Active LACP. So if a static trunk wasn't working, I would use Active LACP.

I suspect the full 2x interface throughput you are achieving with balance-rr <----> static trunk is just pure luck. Each of the two streams has a 50/50 chance of using either of the outgoing switch interfaces in the trunk, so in that setup you had about a 50% chance of the two streams landing on different links and giving the result you got.
You could manipulate this, by trial and error, by giving your server interfaces spoofed MAC addresses.
(I think the default switch hash method is source MAC.)

Link aggregation isn't load balancing. It will "balance" traffic flows better the more flows you have, via the law of averages.
That's why link aggregation is great on a core switch <----> access switch link: the access switch will have many hosts on it, and the more hosts, the more "balance" you will get on the uplink.
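The law-of-averages point can be illustrated with a toy per-flow hash (SHA-256 of a flow label here; this is purely illustrative and is not the switch's real SA/DA algorithm):

```python
# Toy illustration of "more flows -> better balance" on a 2-link
# aggregate. The hash function is illustrative only.
import hashlib

def pick_link(flow_label, n_links=2):
    digest = hashlib.sha256(flow_label.encode()).digest()
    return digest[0] % n_links    # deterministic per flow

# A single host-to-host flow hashes to the same link every time:
single = {pick_link("serverA->serverB") for _ in range(100)}
print(len(single))                # one link only, always

# Many distinct flows (an access switch full of hosts) spread out,
# so the uplink utilisation evens up on average:
many = [pick_link(f"host{i}->core") for i in range(1000)]
print(many.count(0), many.count(1))
```

Any deterministic per-flow hash behaves this way: perfect stability per flow, and only statistical balance across flows.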

duncan9322
Frequent Visitor

Re: LACP Active vs LACP Passive and is the bond working?

Thank you again for your help and thank you for the tip on the steps to take.

I see what you said about the 50% chance. With balance-rr there is a 100% chance that the traffic is sent to the switch over both interfaces, but only a 50% chance that it is sent on to the other host over two interfaces.

I also see that 802.3ad is not designed to distribute a single flow over the interfaces; it is more for spreading the traffic when many hosts are using the system, and if one of the interfaces goes down the connection remains up.

I hadn't thought about the xmit_hash_policy only working from the sending host and not working from the switch to the receiving host.

Thanks again and sorry for being such a noob.
Duncan

Mark Wibaux
Trusted Contributor

Re: LACP Active vs LACP Passive and is the bond working?

Just to confirm the algorithm for LACP load balancing.

The hashing algorithm used by the switch for LACP is SA/DA (source address/destination address). So from the switch's perspective, once a communication path between two devices has been established (and cached in the hashing table), the traffic flow between those two devices will always take the same path, and will therefore only ever use a single link.

By setting the host server to RR you can get it to push traffic out over both links. However, at the OS level a bonded interface pair will generally present a common MAC address, so the SA/DA algorithm will then only ever send the return traffic down a single path.
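The shared-MAC point can be shown with a toy SA/DA hash (XOR of the last MAC bytes is an illustration, not the 1810's actual, undocumented, algorithm):

```python
# Toy SA/DA hash: because eth0 and eth1 share the bond's MAC, every
# frame addressed to the server carries the same source/destination
# pair, so the switch resolves it to the same trunk member each time.

def sa_da_link(src_mac, dst_mac, n_links=2):
    # Illustrative hash, not the switch's real algorithm.
    return (src_mac[-1] ^ dst_mac[-1]) % n_links

bond_mac = bytes.fromhex("0019999a89aa")   # shared by eth0 and eth1
peer_mac = bytes.fromhex("00199980a170")   # the other server's bond

links = {sa_da_link(peer_mac, bond_mac) for _ in range(100)}
print(len(links))   # 1: all return traffic uses one link
```

This is why the server can transmit at 1.8 Gb/s with balance-rr while the switch still delivers the return traffic over a single trunk member.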

duncan9322
Frequent Visitor

Re: LACP Active vs LACP Passive and is the bond working?

Thanks; it took me a while, but I understand how LACP is working now.

The two ethX interfaces and the bonding adapter all have the same MAC address, so from the switch's point of view the traffic could be sent to either adapter, or both. The throughput of 1.8 Gb/s with balance-rr must have been going through both adapters; how the switch determines which port to use is not clear.

Thanks,
Duncan