
Core port rate-limiting causing issues on 'tenanted' switching infrastructure

hardwire82
Occasional Contributor


We have a 5400zl (J8697a) chassis with two J8702a 24-port 10/100/1000 PoE modules installed, acting as the core switch in a large 8-switch deployment. All edge switches are J9089a 2610s.

 

We have a 100 meg EFM circuit into the building. The router for it patches straight into the core switch on a port set to 1000 full duplex, untagged in VLAN 10, which we call the 'internet VLAN'. This VLAN is used to serve up internet connectivity to various devices on our TENANTED network.

 

Tenants obtain internet in one of two ways. Either they use a virtual interface on our internal firewall (with their own dedicated VLAN to carry traffic across the network; rate limiting is carried out by the site firewall, and this is their ONLY route to the internet), OR, and this is where we are seeing problems, they use their OWN router/firewall patched to a port in VLAN 10. Importantly, that port is rate-limited on the core switch using the syntax RATE-LIMIT ALL IN KBPS 1024 (for 1 meg in) and RATE-LIMIT ALL OUT KBPS 1024 (for 1 meg out). This lets them program their device with a public IP, subnet, and the site router as gateway, so their router is internet facing and does not touch our own firewall. This model allows us to sell bandwidth on a tenant-by-tenant basis. We have currently sold around 50 meg to 10 different clients on varying partitions, and therefore with different core switch port rate-limit values ranging from 2048 (2 meg) to 25600 (25 meg).
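To make the scheme concrete, here is a minimal Python sketch that turns a per-tenant plan into the rate-limit commands quoted above. The port names and package sizes are made up for illustration (they are not our real allocation), and the `interface <port>` / `exit` wrapper lines are an assumption about how the commands are entered; only the `rate-limit all in/out kbps` syntax comes from our config.

```python
# Hypothetical plan: core switch port -> package size in "meg".
# 1 meg = 1024 kbps, matching the 1024 / 2048 / 25600 values used on the switch.
PLAN_MEG = {"A1": 1, "A2": 2, "A3": 25}  # illustrative ports and sizes only

KBPS_PER_MEG = 1024


def rate_limit_commands(port: str, meg: int) -> list[str]:
    """Return the config lines for one tenant port, limited equally in and out."""
    kbps = meg * KBPS_PER_MEG
    return [
        f"interface {port}",  # assumed context line, not quoted in the post
        f"   rate-limit all in kbps {kbps}",
        f"   rate-limit all out kbps {kbps}",
        "   exit",
    ]


def total_sold_kbps(plan: dict[str, int]) -> int:
    """Sum of all packages sold, in kbps."""
    return sum(meg * KBPS_PER_MEG for meg in plan.values())


if __name__ == "__main__":
    for port, meg in PLAN_MEG.items():
        print("\n".join(rate_limit_commands(port, meg)))
    print(f"total sold: {total_sold_kbps(PLAN_MEG)} kbps")
```

The point of the sketch is simply that every tenant port in VLAN 10 carries its own, different limit, while the total sold stays under the 100 meg circuit.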

 

Both our largest bandwidth tenant and others on smaller partitions are complaining of poor download and upload speeds. We've ruled out local connectivity issues and want to focus here on the fact that we're rate limiting, to different degrees, ports that are untagged in VLAN 10 on the core switch. We have obtained bandwidth stats from our carrier for the circuit itself and, worryingly, we see max usage at about 15 meg. Yet our internal monitoring shows that the clients' core switch ports (essentially the feed to the WAN side of their router/firewall) are together carrying well in excess of this 15 meg, with most clearly reaching their allocated bandwidth.

 

Our concern is that the above doesn't make sense: summing the traffic at any one point across all core ports allocated to client firewalls clearly exceeds what our carrier says is actually getting to the web. Concerningly, we can also see the same kind of 'shape' in the client core port traffic graphs, i.e. both client A and client B will hit their lowest traffic mark at pretty much exactly the same time.
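The two checks described above can be sketched roughly as follows. The per-port samples are invented placeholders (real values would come from our monitoring), and the port names are hypothetical; only the 15 meg carrier figure comes from the stats we were given.

```python
# Hypothetical 5-minute samples in Mbps for two tenant ports (placeholder data).
SAMPLES = {
    "A1": [1.0, 0.9, 0.2, 0.8, 1.0],
    "A3": [24.0, 22.5, 6.0, 21.0, 24.5],
}
CARRIER_PEAK_MBPS = 15.0  # max usage reported by the carrier


def summed_peak(samples: dict[str, list[float]]) -> float:
    """Highest combined rate across all ports at any single sample time."""
    return max(sum(column) for column in zip(*samples.values()))


def trough_index(series: list[float]) -> int:
    """Index of the sample where this port's traffic is lowest."""
    return series.index(min(series))


if __name__ == "__main__":
    excess = summed_peak(SAMPLES) - CARRIER_PEAK_MBPS
    print(f"ports exceed carrier peak by {excess} Mbps")
    troughs = {port: trough_index(s) for port, s in SAMPLES.items()}
    print(f"lowest-traffic sample per port: {troughs}")
```

If the summed port figure is larger than the carrier peak and the troughs line up across ports, that is the pattern we are seeing.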

 

Questions for your support forum gurus :o)

 

Is this a sensible use of rate-limiting on the core switch? I.e. can I tenant our internet feed (from VLAN 10) into 'packages' of 2 meg, 4 meg, etc. using rate-limiting on all switch ports in VLAN 10? CRUCIALLY, can I vary these rate limits per port, or will the lowest, highest, or average limit dictate the actual volume allowed to traverse the VLAN at any one point, making this a bad thing to do for the larger clients?

 

In my head, given that this is a layer 3 device (kind of), traffic on VLAN 10 should only travel to the ports that need it. I.e. I have a public IP of 1.2.3.4 on my router in port A1 on the core; only traffic destined for 1.2.3.4 should travel down port A1? However, given that we are seeing similarly shaped graphs, albeit on differently rated ports, it would also fit if all traffic for VLAN 10 travels down every port untagged in VLAN 10. In that case traffic not destined for client A still arrives at their device and, at heavier usage times of the day, uses up all of their smaller pipe, resulting in poor performance for users behind that firewall and for traffic genuinely destined for that port and firewall/router?
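For reference, this is the textbook learning-switch behaviour I am assuming, written as a toy Python model. It is only a sketch of the general technique (learn source MACs, forward by destination MAC, flood when the destination is unknown), not a description of how the 5400zl is actually implemented; the port names and MAC labels are hypothetical.

```python
class LearningSwitch:
    """Toy single-VLAN learning bridge: learns source MACs, forwards by destination MAC."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        self.mac_table[src_mac] = in_port      # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:                   # unknown destination: flood the VLAN
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:                # destination is on the arrival port: drop
            return []
        return [out_port]                      # known destination: one port only


if __name__ == "__main__":
    sw = LearningSwitch(["A1", "A2", "A3", "A24"])       # A24 = site router (hypothetical)
    print(sw.receive("A24", "router", "tenant-a"))       # tenant-a not learned yet: flooded
    print(sw.receive("A1", "tenant-a", "router"))        # reply teaches the switch tenant-a
    print(sw.receive("A24", "router", "tenant-a"))       # now goes out of A1 only
```

In this model, traffic for client A only reaches other tenants' ports while the switch has not yet learned (or has aged out) client A's MAC address, so sustained flooding to every port would point to something stopping that learning.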

 

Could the below be another part of the issue?

 

The core switch does NOT have a public IP from VLAN 10 assigned to it, so we cannot run ARP queries against VLAN 10 devices. In my mind this would stop the switch from being able to route directly to clients (i.e. because it cannot ARP the devices on VLAN 10, it has to broadcast every packet to every device until the destination is reached).

 

**********

 

Appreciate this is like a book, but I am really struggling on this one, so any advice would be very much appreciated.

 

I can post screen snaps of config etc or any other detail that people need if they think it'll help!

Feel free to ask questions too if I haven't explained myself!