Switches, Hubs, and Modems

Willman Lee
Occasional Advisor

Better load balancing on trunk lines

I have two identical 5300xl switches connected to each other with 3 gigabit lines in one trunk (trk1) on ports A1, B1 and C1. We are running into an issue where one of the trunk lines (A1) has a bandwidth utilization of 50-60% while the 2nd (B1) is at about 20% and the 3rd (C1) at about 15%. Due to the high amount of traffic between the switches, we are getting a lot of dropped packets on A1. The trunk is configured as a static trunk group, and I have tried both the LACP and Trunk protocols with no difference in the dropped packets.

We have already tested the cables, modules and switches for faults and everything is clean. We have about 20 VLANs currently configured and the two switches are running IP routing. Each VLAN is using approximately the same amount of bandwidth; no single VLAN uses significantly more than the others.

I need to figure out a way to properly balance out the traffic to eliminate the errors.

Would OSPF help balance the traffic?
Would adding a 4th trunk line help?
Would going to a MSTP design with three trunks (with 2 gigabit lines per trunk) and assigning 1/3 of the VLANs to each MSTP instance/trunk be possible?
Is there any difference between keeping the static trunk configuration and using dynamic LACP to configure the trunks instead?

Any other suggestions would be greatly appreciated, as we are really struggling to eliminate the dropped packets caused by the high utilization. This is a video network, not a data network, which is why eliminating dropped packets is critical.
17 REPLIES
Matt Hobbs
Honored Contributor
Solution

Re: Better load balancing on trunk lines

The trunking uses a basic algorithm which hashes the SA/DA MAC addresses to assign each conversation to a given link in an effectively random way. Usually this results in good distribution when many SA/DA pairs are involved.
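
To illustrate the idea of hashing SA/DA pairs onto trunk members, here is a minimal Python sketch. The hash used (XOR of the two MAC addresses, modulo the number of links) is an assumed stand-in chosen for illustration; the thread does not say what the 5300xl actually computes, and the MAC addresses are randomly generated rather than taken from any real network.

```python
import random
from collections import Counter

def pick_link(src_mac: int, dst_mac: int, n_links: int) -> int:
    """Toy SA/DA hash: XOR the two addresses, then take the result modulo
    the number of links. ASSUMPTION: not the switch's real algorithm."""
    return (src_mac ^ dst_mac) % n_links

def distribute(pairs, n_links):
    """Count how many conversations land on each link index."""
    counts = Counter(pick_link(s, d, n_links) for s, d in pairs)
    return [counts.get(i, 0) for i in range(n_links)]

rng = random.Random(1)  # fixed seed so repeated runs give the same result
# Hypothetical conversations: random 48-bit source/destination addresses.
many_pairs = [(rng.getrandbits(48), rng.getrandbits(48)) for _ in range(900)]
few_pairs = many_pairs[:4]

print("900 pairs over 3 links:", distribute(many_pairs, 3))
print("  4 pairs over 3 links:", distribute(few_pairs, 3))
```

Note that this counts conversations, not bytes. Every frame of a given SA/DA pair always takes the same link, so a handful of very heavy conversations can still overload one member even when the pair counts look even.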

With the 5300 series you can add up to 8 ports in a trunk group, so I would see if adding a 4th port gave you better distribution.

Don't bother with Dynamic LACP.

With MSTP that could work, but you'd need to spend a fair bit of time trying to determine your traffic patterns to see if that would work. It's also a large change without a guarantee of success.

OSPF is also unlikely to solve your problem, as it load-balances by destination network only (not by host).

One thing I would definitely try to prevent some of these dropped packets would be to simply enable 'flow-control'. I spent a lot of time recently with another customer that was seeing random drops and this miraculously fixed their problem. I didn't even need to enable flow-control per port, I just enabled it globally and bingo problem solved.

So I would try enabling flow-control first, followed by adding more links to the trunk (up to 8). If the majority of traffic is being generated by only a few hosts though, the load balancing isn't going to be as effective and you may need to look into 10GbE.
OLARU Dan
Trusted Contributor

Re: Better load balancing on trunk lines

Matt needs 10 points.

What I've done for a 4-gig uplink 4108GL==cisco 3750 is put 1 heavy VLAN on 1 optical circuit (not trunked, Cisco-wise), one other heavy VLAN on another optical circuit (not trunked, Cisco-wise), 2 lightly used VLANs trunked (Cisco-wise) on 1 optical circuit, and 10 non-important VLANs trunked (Cisco-wise) on the 4th optical circuit. Would this be OK for you? (I had NO packet drops on any of the 4 optical circuits.)

Cheers,
Dan
Willman Lee
Occasional Advisor

Re: Better load balancing on trunk lines

Thanks for the good information, Matt. I will definitely try enabling flow control to see the effect. The problem I can see is that flow control essentially attempts to relieve overloaded links by sending pause frames. In our network 99% of our data is live video streams, which cannot be retransmitted like packets of regular data, so sending pause frames would likely cause video frame drops.

I'm hoping just adding one or two more ports to the trunk will be enough to load balance. We seem to just be unlucky with our SA/DA hash right now. In our network we have a guaranteed, sustained 700 Mbit of unicast video streams spread over 17 VLANs, then a dynamic multicast video stream on another VLAN (potential max of 430 Mbit) and three more VLANs with minimal regular data packets (probably 100 Mbit). There are about 900 hosts generating the majority of the traffic, going to about 30 different destinations.
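
As a rough sanity check on these figures, the following Python sketch works out the per-link utilization a perfectly even split would give. It assumes all three loads cross the trunk in the same direction at the same time and that each member is a 1000 Mbit link; the numbers are the estimates quoted above, not measurements.

```python
LINK_CAPACITY_MBIT = 1000  # one gigabit trunk member

# Estimates quoted in the post, in Mbit.
loads_mbit = {
    "unicast video (17 VLANs)": 700,
    "multicast video (peak)": 430,
    "regular data (3 VLANs)": 100,
}

def ideal_utilization(total_mbit, n_links, capacity=LINK_CAPACITY_MBIT):
    """Fraction of each link used if traffic were split perfectly evenly."""
    return total_mbit / n_links / capacity

total = sum(loads_mbit.values())
print(f"total offered load: {total} Mbit")
for n in (3, 4, 5):
    print(f"{n} links: {ideal_utilization(total, n):.0%} per link if evenly balanced")
```

With three links the even-split figure is 41% per link, so a single member sitting at 50-60% while the others are at 15-20% points to uneven hashing rather than a shortage of total capacity.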


Could you explain to me how you have configured the 4-gig uplinks Dan? We may have to spread out the VLANs manually if nothing else will work. I had tried once to configure two different trunks between the switches and tagging different VLANs to the trunks but due to STP it ended up blocking a trunk from being used.

Thanks,
Will
Matt Hobbs
Honored Contributor

Re: Better load balancing on trunk lines

Regarding 'flow-control': I only needed to enable it globally, which I suspect enables it internally between the modules. Since there was no need to enable it on ports, there was no risk of pause frames being sent outside of the switch.

Syntax: [ no ] flow-control
Enables or disables flow-control globally on the switch, and is required before you can enable flow control on specific ports.
To use the no form of the command to disable global flowcontrol, you must first disable flow-control on all ports on the switch. (Default: Disabled)

Mainly I think it just helped improve buffering.
Willman Lee
Occasional Advisor

Re: Better load balancing on trunk lines

Ok we tried this last night:

Enabled flow control globally = no effect
Enabled flow control on trunked ports = no effect
Disabled LACP on all ports = no effect
Added two more ports to the trunk (for a total of five) = little to no effect
Hard coded port speeds on the trunk = no effect

This is getting really frustrating, as we can't figure out why the SA/DA hash is so bad at spreading conversations across all the trunk ports. The other strange issue is that all the dropped packets are occurring on one switch, whereas the other switch is almost at zero.

I've attached a snapshot of the port utilization on the two switches. The top switch is the one having all the dropped packets. Currently I have ports B15, B16, C15, C16 and D15 assigned to the trunk. As you can see, port B15 is way above the other trunk ports. If I physically remove the patch from B15, the +40% traffic moves to B16. If I remove B16, then the traffic moves to C15. It seems to always use the first trunk port for the majority of the traffic. That should rule out hardware problems with the actual module. We have even tried using fibre between the ports and there is no difference. Right now all the cables are tested CAT6. We have even replaced the switch chassis with a new spare one and there is no change.

To give a little more background on the network layout: there are about 800 cameras, each with a different MAC address, streaming to 17 network video recorders (unicast) and 12 viewing stations. Each recorder is in its own VLAN with a unique MAC address, and there are about 48 cameras in each VLAN streaming to the recorder in that VLAN. Each viewing station can randomly pull up to 34 multicast streams from the cameras (it all depends on what the operator wants to see). We are running IGMP, RIP and PIM on the switches and there are no errors in the logs other than the dropped packets. We have done the bandwidth calculations, and the theoretical maximum bandwidth that would need to go over a trunk line is about 1 Gbit, of which 700 Mbit is sustained unicast streams from the cameras to the recorders.
Matt Hobbs
Honored Contributor

Re: Better load balancing on trunk lines

Hi Willman,

A couple of other ideas. First of all, it looks as though you're using the J4907A 16-port gig module which is over-subscribed by design. I would avoid using this module for switch-to-switch links if at all possible. You're much better off using the 4-port Gigabit module for this purpose.

Further to this, I'm assuming that it's Drops TX that are incrementing. Have a look at the reviewers guide and check out the section on Packet Buffer Memory Management - http://www.hp.com/rnd/pdfs/5300_Reviewers_guide_May06.pdf

The 'walkmib ifoutdiscards' and 'walkmib ifindiscards' commands will give you some idea of whether the packets are being dropped inbound or outbound.
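
To compare two snapshots of those counters, a small Python sketch along these lines could help. It assumes the pasted output consists of lines shaped like `ifOutDiscards.39 = 1200` (counter name, interface index, value); check that against what your switch actually prints. The sample values below are invented for illustration, and counter wrap-around is not handled.

```python
import re

# ASSUMED line shape: "<counterName>.<ifIndex> = <value>"
LINE_RE = re.compile(r"^\s*(ifInDiscards|ifOutDiscards)\.(\d+)\s*=\s*(\d+)\s*$",
                     re.IGNORECASE)

def parse_walkmib(text):
    """Return {(counter_name_lowercase, if_index): value} from pasted output."""
    counters = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            counters[(m.group(1).lower(), int(m.group(2)))] = int(m.group(3))
    return counters

def discard_deltas(before, after):
    """Counters that grew between two snapshots, largest growth first."""
    deltas = {k: after[k] - before[k]
              for k in after if k in before and after[k] > before[k]}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# Invented sample snapshots, taken some minutes apart.
before = parse_walkmib("ifOutDiscards.39 = 1200\nifOutDiscards.40 = 5\nifInDiscards.39 = 0")
after = parse_walkmib("ifOutDiscards.39 = 4700\nifOutDiscards.40 = 5\nifInDiscards.39 = 0")

for (name, if_index), grew_by in discard_deltas(before, after):
    print(f"{name} on ifIndex {if_index} grew by {grew_by}")
```

If only the outbound counters grow, the drops are happening as the switch queues frames for transmission on that port, which is the case the buffer and queue settings are aimed at.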

What you may be able to do is adjust the size of the outgoing buffers by changing the queue depth with the Guaranteed Minimum Bandwidth commands.

Syntax: [ no ] int < port-list > bandwidth-min output [ < queue1% > < queue2% > < queue3% > < queue4% >]

Alternatively, you could try strict queuing by going into the port that is dropping packets and entering 'no bandwidth-min output'.

None of this, of course, will help you with better load balancing between the members of the trunk group. You seem to have enough devices that I would expect better distribution than what you're seeing, but the conversations definitely seem to be hashing onto that first link.

Right now, my strongest recommendation would be to use the 4-port gigabit module for your trunk.

Matt
OLARU Dan
Trusted Contributor

Re: Better load balancing on trunk lines

Willman,
this time I need 10 points ;-)

please find attached 2 combed configs, one for the cisco and one for the hp procurve

it worx for me brilliantly

Cheers,
Dan
OLARU Dan
Trusted Contributor

Re: Better load balancing on trunk lines

???
OLARU Dan
Trusted Contributor

Re: Better load balancing on trunk lines

!!!