BladeSystem Virtual Connect

Tens of millions of discards on Virtual Connect

Sindall
Visitor


For some reason we are getting tens of millions of discards, mainly on the x8 port, though with no apparent knock-on effect; these are all being reported by SolarWinds.

 

I have checked the switches and all the blades, and none of them are showing discards.

 

We run two VC Flex-10 modules in Bays 1 and 2, internally stack-linked through ports x7 and x8 (x7 shows no discard issue). From there we run an active/active configuration. Ports x5 and x6 are trunked and EtherChannelled together on both modules.

 

We do not use Shared Uplink Sets; we prefer to connect our Ethernet Networks directly to the uplink ports x5 and x6. These are then presented to the blades as vm_trunk_01 for the left module and vm_trunk_02 for the right module, which allows VMware to direct traffic to either module as it sees fit.

 

Any ideas as to why we are getting these discards?

 

 

8 REPLIES
Hongjun Ma
Trusted Contributor

Re: Tens of millions of discards on Virtual Connect

I have seen a couple of instances where customers reported a similar problem. Every time, the problem turned out to be somewhere else on the network; for example, VC may be receiving unicast-flooded packets from the upstream switch.
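To make the unicast-flooding mechanism concrete: an upstream switch floods a unicast frame out every port in a VLAN when it has not learned the destination MAC, and a downstream device then drops any frame not addressed to one of its own MACs, incrementing its discard counter. A minimal illustrative sketch (all names and MAC values are made up, this is not Virtual Connect internals):

```python
# Sketch: why unicast flooding shows up as discards on a downstream port.
# The upstream switch floods unicast frames whose destination MAC it has not
# learned; the receiving module drops any frame not addressed to a local MAC.

def is_multicast(mac: str) -> bool:
    # The low bit of the first octet marks group (multicast/broadcast) addresses.
    return int(mac.split(":")[0], 16) & 1 == 1

def count_flood_discards(frames, local_macs):
    """Count unicast frames destined to MACs we do not own."""
    discards = 0
    for dst in frames:
        if not is_multicast(dst) and dst not in local_macs:
            discards += 1
    return discards

local = {"00:17:a4:77:00:01", "00:17:a4:77:00:02"}
frames = [
    "00:17:a4:77:00:01",  # addressed to us: delivered
    "00:1b:2c:3d:4e:5f",  # flooded unicast for another host: discarded
    "ff:ff:ff:ff:ff:ff",  # broadcast: processed, not a flood discard here
    "00:1b:2c:3d:4e:60",  # another flooded unicast: discarded
]
print(count_flood_discards(frames, local))  # 2
```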

 

My suggestion is to clear the counters on these ports (you can do it in the GUI, or SSH into the VC module and run reset statistics enc0:1:x5, and so on for the other ports).

 

After clearing the counters, monitor these ports to get an idea of how fast the counters climb.
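To quantify "how fast", you can sample the interface discard counter twice (IF-MIB ifInDiscards is the standard counter SolarWinds polls) and compute a per-second rate. A minimal sketch, assuming 32-bit counters that may wrap; the sample values are made up:

```python
# Sketch: compute a discards-per-second rate from two samples of a 32-bit
# SNMP counter (e.g. IF-MIB::ifInDiscards, the value SolarWinds graphs).
# Counter wrap at 2^32 is handled for a single wrap between polls.

COUNTER32_MAX = 2**32

def discard_rate(old, new, interval_s):
    """Rate per second between two counter samples, tolerating one wrap."""
    delta = new - old
    if delta < 0:                      # counter wrapped past 2^32
        delta += COUNTER32_MAX
    return delta / interval_s

# e.g. two polls 60 seconds apart
print(discard_rate(4_200_000, 4_800_000, 60))   # 10000.0 discards/s
# wrapped counter: old value near the top, new value just past zero
print(discard_rate(COUNTER32_MAX - 100, 400, 60))
```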

 

If they climb really fast, it may be worth setting up a sniffer trace at the upstream switch to see what it is sending to us. You may also want to open an HP support case so support can look at your VC support dump file and see what is going on inside VC.

 

I haven't seen any performance impact with these counters.

 

 

My VC blog: http://hongjunma.wordpress.com



Sindall
Visitor

Re: Tens of millions of discards on Virtual Connect

Thank you for your reply.

 

The strange thing is, if we change the configuration to a Shared Uplink Set and associate the exact same VLAN networks to the SUS, all the discards disappear.

 

We have also configured the Cisco switches to allow only certain VLANs down to the Flex-10, and this did nothing.

 

From all the tests I have been running, this seems to be an issue within either ESXi 5 or the Flex-10, which does not like VLAN tunnelling, or something internal sending traffic out to an external network.

 

We had our vMotion network running internally only, within the chassis. We thought this could be the issue, so we removed it and put it back onto the main management port with a VLAN that was not routable outside the switches, and this also did not help.

 

We have upgraded to the latest version of ESXi 5 with all the patches, and applied the HP patches, but it makes no difference.

 

We wanted to keep the VMware environment dynamic and only have to change or add VLANs within VMware, not in the Flex-10 or the Cisco tunnel.

 

We also disabled CDP on the Cisco switches and allowed only LLDP, to see if that was the issue; again, same problem, millions of discards.

 

We would prefer our main trunk to the switches to be defined as Ethernet Networks with the uplinks attached there, not in a SUS, to keep management to a minimum so we only have to change VLANs in one place.

 

I have no idea why this would happen, as we followed the HP guides; I think it was based on Scenario 7 for an active/active uplink with a SUS network for the DMZ.

Sindall
Visitor

Re: Tens of millions of discards on Virtual Connect

Forgot to attach screenshots of the result of changing our Bay 2 Flex-10 from an Ethernet Network to a Shared Uplink Set.

 

We have HP looking into this as well, and they are just escalating it higher and higher, so I thought it quicker to post here in case anyone has seen or had this issue before and can share what they did.

Hongjun Ma
Trusted Contributor

Re: Tens of millions of discards on Virtual Connect

Sorry for the late reply; I have been very busy.

 

It's good that you have opened an HP support case. With tunnel vNets you definitely increase the amount of flooded traffic, because of the nature of tunnel vNets. I'll do some internal checking to see if I can come up with anything, and I'll let you know if I do. But the HP support case is the way to get this escalated at this time.




Hongjun Ma
Trusted Contributor

Re: Tens of millions of discards on Virtual Connect

BTW, what's the VC version?




Sindall
Visitor

Re: Tens of millions of discards on Virtual Connect

This is on version 3.51.

 

We have not got anywhere with HP at the moment; they do not seem to understand that the problem is internal to the chassis, as we have blocked all unused VLANs on the Cisco side and we still got discards at the same rate.

 

We limited the VLANs on the internal side of the VC and the discards stopped right away.

 

Is there a way to look at the traffic leaving the VC so that we can tell what is causing the problem? This may be the easiest way forward, as HP support has dried up; we have tried to get it escalated to third line, but nothing. They say they have escalated, but we are still getting first-line support.

Hongjun Ma
Trusted Contributor

Re: Tens of millions of discards on Virtual Connect

Could you try one thing?

 

In VC 3.51 there is a server-side loop-control feature, introduced to protect against a misconfiguration on the server side causing a loop. It's called active loop control: VC actively sends out probe frames (multicast frames) and monitors whether the same frames come back.
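The idea behind this kind of active loop detection can be sketched as follows (an illustration of the general technique only, not VC's actual implementation): inject a probe frame carrying a unique ID on one port; if that same ID is later received on a different port, those ports are bridged somewhere downstream and a loop exists.

```python
# Sketch of active loop detection (illustrative only, not VC internals):
# send a probe with a unique ID out of each port; receiving our own probe
# back on a *different* port means the fabric loops those ports together.
import uuid

class LoopDetector:
    def __init__(self):
        self.sent = {}  # probe id -> port it was sent on

    def send_probe(self, port):
        probe_id = uuid.uuid4().hex
        self.sent[probe_id] = port
        return probe_id  # in reality this would be carried in a multicast frame

    def on_receive(self, port, probe_id):
        """Return True if this received frame proves a loop."""
        origin = self.sent.get(probe_id)
        return origin is not None and origin != port

det = LoopDetector()
pid = det.send_probe("d1")             # probe sent on downlink d1
print(det.on_receive("d2", pid))       # True: our own probe came back -> loop
print(det.on_receive("d2", "other"))   # False: not one of our probes
```

A side effect relevant to this thread: any device in the path that does not recognise these probe multicasts may simply count them as discards.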

 

Could you turn OFF this feature? By default it should be enabled. The setting should be under Ethernet -> Advanced Settings.

 

There is a chance that VC discards these frames on interfaces like the cross-links and uplinks.

 




gazJones
Occasional Collector

Re: Tens of millions of discards on Virtual Connect

Hi Sindall, did you ever get to the bottom of this? We're seeing exactly the same issue, i.e. a few million discards per hour. We use VLAN tunnelling to pass all tagged VLANs down from a Cisco 10GE port and then carve the VLANs up at a VMware distributed switch. Our throughput between hosts seems reasonable; however, we've had a few issues with poor VoIP performance, which could be explained by this.
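For context on "carving the VLANs up" downstream: with VLAN tunnelling, frames cross the tunnel with their 802.1Q tags intact and the VLAN is only interpreted at the distributed switch. The tag sits right after the source MAC: EtherType 0x8100, then a 16-bit TCI whose low 12 bits are the VLAN ID. A minimal parser sketch:

```python
# Sketch: extract the 802.1Q VLAN ID from a raw Ethernet frame.
# Layout: dst MAC (6) | src MAC (6) | TPID 0x8100 (2) | TCI (2) | payload...
# The low 12 bits of the TCI are the VLAN ID.
import struct

def vlan_id(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if untagged."""
    tpid = struct.unpack("!H", frame[12:14])[0]
    if tpid != 0x8100:
        return None                    # no 802.1Q tag present
    tci = struct.unpack("!H", frame[14:16])[0]
    return tci & 0x0FFF

# Build a tagged frame for VLAN 120 (priority bits 0); MACs zeroed for brevity
frame = bytes(12) + struct.pack("!HH", 0x8100, 120) + b"payload"
print(vlan_id(frame))                                   # 120
print(vlan_id(bytes(12) + struct.pack("!H", 0x0800)))   # None (plain IPv4)
```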

 

Thanks,

 

Gaz.

 

Update: just spotted this post which pretty much answers my question: http://h30499.www3.hp.com/t5/HP-BladeSystem/High-Number-of-Discards-on-VirtualConnect-Flex10-shared-uplinks/m-p/4715278#M11985