BladeSystem Virtual Connect

Understanding link state/DCC issues with ESX4 and Virtual Connect flex-10.

chuckk281
Trusted Contributor


There have been a lot of questions about using the latest Broadcom drivers with the Dynamic Control Channel (DCC) feature under the VMware ESX4 operating system. Brian wanted to set things straight:

 

*************************************************************************************************************

 

Hi All.

 

I’m writing this email as I think there is a lot of confusion about this issue.

 

So I really hope this email will help out internally, so we can help our customers fix or work around this issue.

 

 

Description of issue:

 

When using Virtual Connect Flex-10 with Virtual Connect firmware 2.3x together with newer Broadcom firmware on a FlexNIC, you can experience issues.

 

Common issues include, but are not limited to:

 

1. When installing or running ESX4, all NICs are disconnected.

 

 

 

2. When installing ESX4U1, the NICs are not recognized and the installation will not continue.

 

 

3. If the server is unplugged, or the reset server X command is issued via the OA CLI, on a system running ESX4U1, the server starts in the recovery shell.

 

 

4. If a Virtual Connect profile is changed on an ESX4/ESX4U1 host, you might be asked to power down the server!

 

 

 

A bit of information about DCC & a Q&A section

 

 

Why did we introduce DCC if it can cause all these issues?

The idea behind DCC is actually good :)

 

Before DCC, one issue was that changing a Virtual Connect profile required shutting down the host.

Another issue was that SmartLink did not work with a FlexNIC.

 

What functions does DCC give us, and why would I upgrade to enable DCC?

For example, you can now change all relevant settings in the VC profile without having to shut down the server (even the bandwidth on the NICs).

SmartLink also now works on a FlexNIC.

 

Which firmware versions support DCC?

Virtual Connect firmware needs to be 2.3x (currently 2.30, 2.31, 2.32).

Broadcom NIC firmware needs boot code 5.0.11 or the newer 5.2.7 (Linux firmware component 2.2.3, 2.2.4, 2.2.2.6; Windows firmware component 2.1.5.7, 2.1.5.5, 2.1.5.1).
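
As a rough illustration of the version rules above, here is a minimal Python sketch that checks whether a firmware combination falls in the DCC range. The function names and the dotted-version parsing are assumptions made for this example; they are not part of any HP or Broadcom tool, and the check only encodes the versions named in this post.

```python
def parse_version(text):
    """Turn a dotted version such as '5.0.11' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def vc_firmware_has_dcc(vc_firmware):
    """True for Virtual Connect firmware 2.3x (2.30, 2.31, 2.32, ...)."""
    version = parse_version(vc_firmware)
    return len(version) >= 2 and version[0] == 2 and 30 <= version[1] <= 39


def boot_code_has_dcc(boot_code):
    """True for Broadcom boot code 5.0.11 or newer."""
    return parse_version(boot_code) >= (5, 0, 11)


def dcc_in_play(vc_firmware, boot_code):
    """DCC is only in play when both the VC and the NIC firmware support it."""
    return vc_firmware_has_dcc(vc_firmware) and boot_code_has_dcc(boot_code)
```

For example, `dcc_in_play("2.30", "5.2.7")` is true, while a VC 2.1x enclosure or a NIC still on boot code 4.8.0 falls outside the DCC range.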

 

So what is the real problem with the current DCC implementation, and why are people seeing all these issues?

The problem with DCC is that some of these functions have moved into the driver layer (link state, for example).

So if the correct driver isn’t installed, you can encounter several issues.

 

 

I have created a Q&A section that covers the most common issues.

 

 

Q: Does the bnx2x driver version 1.45 or 1.48 support link state/DCC?

A: No. You need version 1.52 for link state/DCC support:

http://downloads.vmware.com/d/details/esx_40_broadcom_bnx2xu1_dt/ZHcqYmRqcGhiZGVqdA
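
To make the driver rule concrete, here is a small Python sketch that checks a bnx2x version string against the 1.52 threshold. The function name is hypothetical, and treating anything newer than 1.52 as DCC-capable is an assumption; the post itself only confirms 1.52.

```python
def driver_is_dcc_capable(driver_version):
    """True when the bnx2x driver is 1.52 or newer (newer is an assumption)."""
    numeric = []
    for part in driver_version.split("."):
        if not part.isdigit():
            break  # stop at suffixes such as 'v40' in '1.52.12.v40.3'
        numeric.append(int(part))
    # Compare major.minor only; an empty or malformed string counts as not capable.
    return tuple(numeric[:2]) >= (1, 52)
```

With this, `driver_is_dcc_capable("1.48")` is false and `driver_is_dcc_capable("1.52.12.v40.3")` is true.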

 

Q: If the links are down on my ESX4U1 host when using VC firmware 2.3x and NIC firmware 2.2.4, will updating the NIC firmware to 2.2.6 fix the issue?

A: No, it will not fix the issue.

 

Q: If the links are down on my ESX4U1 host and I’m not able to upgrade the driver because it hangs or boots into recovery mode, what should I do?

A:

  1. Boot the FDT 1.70; you will regain link on the NICs and ESX4U1 will see the NICs correctly again.
  2. Download the 1.52 driver and update the host with that specific driver:
    http://downloads.vmware.com/d/details/esx_40_broadcom_bnx2xu1_dt/ZHcqYmRqcGhiZGVqdA

 

PS: If you don’t want to wait until the FDT loads all the packages and reboots automatically, you can just reset the server after “Activating available network interfaces” appears.

 

Q: If I use VMware Update Manager and patch the host with all current patches, will the driver be updated to 1.52?

A: At the time of writing, VMware Update Manager will not update the bnx2x driver to 1.52!

 

Q: I’m using non-DCC VC firmware 2.1x and NIC boot code 5.0.11, but my NICs are not enumerated correctly in ESX. What should I do?

A: Update the boot code to 5.2.7 via the latest Broadcom firmware package, as this firmware resolves an issue where some NC532 devices may not enumerate correctly.

 

Q: Can FDT 1.70 cause my NICs to be unlinked?

A: Yes, I have seen this at multiple sites and in our lab. When the NIC firmware is updated, the server can enter a state where it needs to communicate with VC via a DCC-capable driver to regain link. If you encounter such a situation, just boot the FDT again and you will regain link.

Remember to update the driver afterwards.

 

Q: What if I have an unsupported OS that does not have a DCC-capable driver?

A: Well, that’s really a big issue, because if the server is unplugged from the enclosure, or if the customer encounters an enclosure power loss, you will not get link on those NICs again until a DCC-capable driver is loaded.

Examples of such operating systems are ESX 3.5 and XenServer 5.5.

 

Q: I have an OS without a DCC-capable driver and my links are down. How do I regain link?

A: I know of two ways.

 

  1. Just boot FDT 1.70 (as it will load a DCC-capable driver). The server will remain linked until it is unplugged or a reset server x command is used.
  2. Downgrade the NIC boot code to version 4.8.0 and you will not have the issue again. (Please be aware that if your customer uses the firmware CD without removing
    the Broadcom online NIC firmware, you will have the same issue again.)

 

Q: Is there an easy way to reproduce this issue in my environment, so we can see what the customer is experiencing?

A: The easiest way is to install ESX4U1 and not upgrade the bnx2x driver to 1.52 (the native driver is 1.48). After installation, do a reset server X from the OA CLI and you will see the issue when the server is booting.
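
The reproduction recipe boils down to one condition: DCC-capable firmware on both sides plus a driver older than 1.52. A minimal Python sketch of that check follows; the helper names are hypothetical and the thresholds are simply the ones quoted in this post, so treat it as an illustration rather than a support tool.

```python
def numeric_prefix(version):
    """Leading numeric components of a dotted version, e.g. '1.52.12.v40.3' -> (1, 52, 12)."""
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def link_at_risk(vc_firmware, boot_code, driver_version):
    """True when DCC firmware (VC 2.3x, boot code >= 5.0.11) meets a driver older than 1.52."""
    vc = numeric_prefix(vc_firmware)
    dcc_firmware = (
        vc[:1] == (2,) and len(vc) > 1 and 30 <= vc[1] <= 39
        and numeric_prefix(boot_code) >= (5, 0, 11)
    )
    return dcc_firmware and numeric_prefix(driver_version)[:2] < (1, 52)


# The reproduction case: ESX4U1 with its native 1.48 driver on DCC firmware.
print(link_at_risk("2.30", "5.2.7", "1.48"))  # prints True
```

The same check returns false once the driver is at 1.52, or when the boot code has been downgraded to 4.8.0 as described in the earlier answer.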

 

If you have any questions or comments please let us know. Join the conversation.

 

******************************************************************************************************************

5 REPLIES
RonKuper
Occasional Visitor

Re: Understanding link state/DCC issues with ESX4 and Virtual Connect flex-10.

Hi,

 

Thanks for the info!

 

Citrix made a new Broadcom driver available for XenServer 5.5:

http://support.citrix.com/article/CTX124995

 

"The new driver includes various improvements, including support for the latest Broacom NIC firmware, and HP's Virtual Connect firmware version 2.31."

 

Do you know if this driver supports DCC?

Will it be able to identify link state correctly on a flex nic?

 

Thanks,

 

Ron Kuper

Bank Hapoalim Group.

 

StanO
Occasional Contributor

Re: Understanding link state/DCC issues with ESX4 and Virtual Connect flex-10.

The link provided for the drivers (http://downloads.vmware.com/d/details/esx_40_broadcom_bnx2xu1_dt/ZHcqYmRqcGhiZGVqdA) includes a note that explicitly states Flex-10 functionality is not supported with this version:

 

While the 1.52.12.v40.3 version of the driver for Broadcom NetXtreme II Ethernet Network Controllers supports Flex-10 capable devices, this version of the driver is not supported in an HP Virtual Connect environment with Flex-10 enabled. Testing of this feature is underway and, once complete, we will update this download location with further details. In the meantime, customers should continue to use the driver shipped with ESX/ESXi, which does not support the DCC/SmartLink functionality.

 

This appears to be in conflict with this posting.  Has the testing mentioned in the note been completed and the VMware site is simply out-of-date, or is this to say that full functionality of Virtual Connect/Flex-10 is not supported by VMware with this driver version?

 

Thanks,

 

Stan

StanO
Occasional Contributor

Re: Understanding link state/DCC issues with ESX4 and Virtual Connect flex-10.

Never mind. The note explicitly stating that Flex-10 functionality is not supported by VMware has been removed as of 6/8/2010.

bb4vm
Advisor

Re: Understanding link state/DCC issues with ESX4 and Virtual Connect flex-10.

I have been dealing with all of this with ESX and Flex-10.  I have a thread going on the VMware Communities that you can read here:

 

http://communities.vmware.com/thread/273033?

 

There was also talk that this issue would be corrected with ESX 4.1, which would have a new driver and new Virtual Connect and NIC firmware. Well, that is not the case. An HP customer advisory was recently published and you can read it here:

 

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c02476622&jumpid=reg_R1002_USEN

costi1010
Visitor

Re: Understanding link state/DCC issues with ESX4 and Virtual Connect flex-10.

Hi ,

 

I ran into the same issue. See this post:

http://h30499.www3.hp.com/t5/HP-BladeSystem-Virtual-Connect/bl490c-G6-Broadcom-driver-crash-bnx2x-panic-dump-in-ESXi-4-1/td-p/1159751

I opened support calls with both vendors. I experienced a ping-pong game here: HP said VMware should solve the issue, while VMware said they would think about it but needed more feedback from customers as well as from HP.

I think both VMware and HP should pay more attention to these VC Flex-10 issues; otherwise VC (Flex-10) with VMware risks major instability, and this combination would no longer make any sense for customers.

 

Regards