HPE OneView

VMs intermittently lose network connectivity

Eugen Rodekuhr
Frequent Advisor


 

 

We are experiencing intermittent VM disconnects that we have to resolve by doing a vMotion of the VM to another host. We have also seen packet drops on some VMs in the past, but we are not sure how common that is right now.

 

Our platform is HP ProLiant BL460c G9 servers in C7000 enclosures.

 

The main issue is that we can't nail down the root cause because it is intermittent. It is happening in our production environment, so we can't do any deep troubleshooting, and in most cases we need to vMotion the VM to get it working again.

 

We have involved VMware and HPE, but both think the issue is a layer 2 problem in the data centre network switches, somehow related to MAC address handling or duplicate MAC addresses.
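The duplicate-MAC theory can at least be checked from the server side without touching the switches. Below is a minimal Python sketch, assuming you can export a list of VM names and MAC addresses from your inventory. The function name `find_duplicate_macs` and the input format are assumptions made for illustration, not part of any VMware or HPE tool, and the sample data is invented.

```python
from collections import defaultdict


def find_duplicate_macs(records):
    """Return {mac: sorted VM names} for every MAC used by more than one VM.

    `records` is an iterable of (vm_name, mac) pairs. This input format is
    an assumption for illustration, e.g. rows exported from an inventory.
    """
    owners = defaultdict(set)
    for vm_name, mac in records:
        # Normalise case and separators so that
        # 00-50-56-AA-BB-01 and 00:50:56:aa:bb:01 compare equal.
        key = mac.strip().lower().replace("-", ":")
        owners[key].add(vm_name)
    return {mac: sorted(vms) for mac, vms in owners.items() if len(vms) > 1}


# Invented sample data, not taken from the environment described here.
sample = [
    ("vm-app01", "00:50:56:AA:BB:01"),
    ("vm-app02", "00:50:56:aa:bb:02"),
    ("vm-db01", "00-50-56-AA-BB-01"),
]
duplicates = find_duplicate_macs(sample)
print(duplicates)
```

An empty result only rules out duplicates among the VMs in the export. It says nothing about MAC addresses assigned elsewhere, for example by Virtual Connect profiles or by devices outside the enclosure.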

 

We are not in control of the DC network switches. For that reason we decided to tackle the issue from the server side onwards and eliminate the components involved one by one.

 

We could not tie the start of the problem to any one of the actions we took to keep the VMware environment in a supported state.

 

It seems that the first time we saw it was with the following setup:

 

 

  • OA: 4.70
  • VC: 4.60
  • Blade BIOS: I36 - 02-17-2017
  • FLB650 Firmware: 11.2.1226.20
  • FLB650 Driver: 11.2.1149.0
  • ESXi-Build 6.0.U3d: 5572656

 

 

We did some research and updated, one step at a time, to the versions below.

 

 

Current Setup:

  • OA: 4.70
  • VC: 4.60
  • Blade BIOS: I36 - 10-25-2017
  • FLB650 Firmware: 11.2.1263.19
  • FLB650 Driver: 11.2.1149.0
  • ESXi-Build 6.0.U3d: 6921384

 

 

It looks like the number of failures has decreased, but we are not 100% sure, because the failure is intermittent.

 

So far nobody has been able to give us a final solution. For that reason we decided to do a step-by-step upgrade to the versions below to try to finally solve the issue.

 

 

Future Setup:

  • OA: 4.70
  • VC: 4.62
  • Blade BIOS: I36 - 01-22-2018
  • FLB650 Firmware: 11.4.1223.x
  • FLB650 Driver: 11.4.1210.0
  • ESXi-Build 6.0.U3d: 6921384
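Because several components change at once between setups, it helps to see exactly what differs at each step. The following is a small Python sketch that diffs the three version lists from this post. The dictionary layout and the helper `changed_components` are illustrative assumptions, and the BIOS dates are rewritten in ISO format (YYYY-MM-DD) so they compare consistently.

```python
# Versions copied from the three lists in this post.
# BIOS dates are rewritten as ISO dates (an editorial choice for this sketch).
SETUPS = {
    "first seen": {
        "OA": "4.70",
        "VC": "4.60",
        "Blade BIOS": "I36 2017-02-17",
        "FLB650 firmware": "11.2.1226.20",
        "FLB650 driver": "11.2.1149.0",
        "ESXi build": "5572656",
    },
    "current": {
        "OA": "4.70",
        "VC": "4.60",
        "Blade BIOS": "I36 2017-10-25",
        "FLB650 firmware": "11.2.1263.19",
        "FLB650 driver": "11.2.1149.0",
        "ESXi build": "6921384",
    },
    "future": {
        "OA": "4.70",
        "VC": "4.62",
        "Blade BIOS": "I36 2018-01-22",
        "FLB650 firmware": "11.4.1223.x",
        "FLB650 driver": "11.4.1210.0",
        "ESXi build": "6921384",
    },
}


def changed_components(old, new):
    """Return {component: (old_value, new_value)} for entries that differ."""
    return {
        name: (old[name], new.get(name))
        for name in old
        if old[name] != new.get(name)
    }


for before, after in [("first seen", "current"), ("current", "future")]:
    print(f"{before} -> {after}")
    for name, (old, new) in changed_components(SETUPS[before], SETUPS[after]).items():
        print(f"  {name}: {old} -> {new}")
```

The diff shows that the first round changed the BIOS, the NIC firmware and the ESXi build together, and the planned round changes the VC firmware, BIOS, NIC firmware and NIC driver together. An improvement after either round therefore cannot be attributed to a single component.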

 

 

HPE has released VC firmware 4.62 and SPP 03.2018, but HPE has not confirmed that there is an issue, and the release notes do not mention a fix or solution.

 

To keep the setup in a supported condition we do not really have any other choice than to proceed this way.

 

By the way, the BIOS/firmware listed as supported is not available for download on any of the sites we searched, and in any case it is a very old version.

 

The newer released versions should fix the issue anyway.

 

Our next steps are:

 

 

  1. Verify whether the number of failures has really decreased
  2. Update to VC firmware 4.62 and check whether the failure is gone
  3. Update to SPP 03.2018 and check whether the failure is gone
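For step 1, a simple way to compare periods is to normalise the incident count to a rate per week before and after each change. Here is a minimal Python sketch, assuming the dates of the disconnect incidents are recorded somewhere (tickets, vMotion history or similar). The function `incidents_per_week`, the cut-over date and all incident dates are invented for illustration and are not data from this environment.

```python
from datetime import date


def incidents_per_week(incident_dates, start, end):
    """Average incidents per 7 days in the half-open window [start, end)."""
    days = (end - start).days
    if days <= 0:
        raise ValueError("end must be after start")
    count = sum(1 for d in incident_dates if start <= d < end)
    return count * 7 / days


# Invented incident dates and cut-over date, for illustration only.
incidents = [
    date(2018, 1, 3), date(2018, 1, 5), date(2018, 1, 9), date(2018, 1, 12),
    date(2018, 1, 16), date(2018, 1, 19), date(2018, 1, 23), date(2018, 1, 26),
    date(2018, 2, 6), date(2018, 2, 20),
]
change_date = date(2018, 1, 29)

before = incidents_per_week(incidents, date(2018, 1, 1), change_date)
after = incidents_per_week(incidents, change_date, date(2018, 2, 26))
print(f"before: {before:.1f}/week, after: {after:.1f}/week")
```

With an intermittent fault and only a handful of incidents, a lower rate over a short window is weak evidence, so it is worth keeping the observation window after each update as long as the change schedule allows.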

 

 

We will keep you posted about the outcome.

 

 

Any feedback and suggestions are welcome.

 

 

Best Regards

 

 

Eugen

EuRo
4 REPLIES
Eugen Rodekuhr
Frequent Advisor

Re: VMs intermittently lose network connectivity

Hello All,

 

Sorry for any inconvenience caused, but we need to postpone our upgrade activities for one week.

 

Our customer has imposed a change freeze, and because of that we can't move forward right now.

 

Regards

 

Eugen

 

We will keep you posted about our progress.

EuRo
Eugen Rodekuhr
Frequent Advisor

Re: VMs intermittently lose network connectivity

We will start our update activity on the 20th of March 2018. We will begin by updating to SPP 2018-03 to eliminate the onboard NIC (FLB650) as the root cause. The second step will be to update the VC firmware to 4.62 to see whether things improve further.

We will keep you posted about the outcome.

EuRo
Eugen Rodekuhr
Frequent Advisor

Re: VMs intermittently lose network connectivity

Hello All,

We were able to create a custom SPP from the HPE SPP site. This bundle was accepted by OneView, and we were able to update our environment to the versions below:

  • HP ProLiant BL460c G9 BIOS I36 01-22-2018
  • HP ProLiant BL460c G9 FlexFabric 20Gb 2-Port 650FLB firmware 11.4.1231.6
  • HP ProLiant BL460c G9 HP QMH2670 16Gb FC HBA firmware v2.1.57.1
  • HP ProLiant BL460c G9 iLO 4 firmware 2.55
  • C7000 Onboard Administrator firmware 4.80
  • C7000 Virtual Connect firmware 4.62 (Ethernet 4.62 / FC 8Gb 20-port 2.15 / FC 8Gb 24-port 3.09)
  • VMware ESXi version 6.0 Update 3, build 6921384

It seems that our original issue is solved now: after roughly one week we have not seen any further problems.

We have not been able to clearly identify the root cause of the issue, but it seems to be solved now.

From our point of view, the issue was caused by an incompatibility between the server NIC firmware, the VC firmware and the VMware ESXi version (build).

This combination of versions seems to be running as needed.

We will keep you posted about the outcome.

Regards

 

Eugen

EuRo
JKytsi
Honored Contributor

Re: VMs intermittently lose network connectivity

Thank you for these findings... we have had exactly the same issues, ongoing... well, ever since Gen9 + FLB650 + FlexFabric VC modules.

We'll test the same combination also =)

Remember to give Kudos to answers! (click the KUDOS star)

You can find me from Twitter @JKytsi