

 
chuckk281
Trusted Contributor

Virtual Connect update: online or offline?

Andrea was looking for some experience and best practices when updating firmware:

 

****************************

 

Statistically or methodologically speaking, when you upgrade Virtual Connect from version 2.xx to 3.xx, do you plan a maintenance window during which network services are down, or do you prefer to do it online?

 

I’m asking because in recent firmware upgrades, especially to 3.15, I’ve noticed some network downtime (10-20 seconds), enough, for example, to trigger an ESX host isolation response. This downtime happens when you activate (by resetting, power cycling, or removing and reinserting) the last module, the active one.

 

Also, in the Virtual Connect logs after an update, you see messages showing that external uplinks are shut down and brought back up for a while by the VCM.

 

******************************

 

Vincent provided his "Best Practices" advice:

 

*********************************

 

I recommend doing it online only if the config has been checked and tested for redundancy, i.e. during a maintenance window, reboot each module, one after the other, and check that the servers stay online.

In your case, 10-20 seconds is indeed abnormally long for a failover. If this also happens when you reset a module outside a firmware upgrade, that would seem to indicate some other problem.

 

********************
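
 

The redundancy test described above (reboot each module in turn and watch whether the servers stay online) is easier to judge with a number than by eye. The following is only a minimal sketch, assuming you have logged one ping result per interval as (timestamp, reply received) pairs; the function name and the input format are illustrative and do not come from any HP tool.

```python
from typing import List, Optional, Tuple


def longest_outage(samples: List[Tuple[float, bool]]) -> float:
    """Return the longest outage in seconds found in a ping log.

    samples: (timestamp_seconds, reply_received) pairs, sorted by time.
    An outage runs from the first lost ping to the next successful reply
    (or to the last sample if the log ends while replies are still lost).
    """
    longest = 0.0
    outage_start: Optional[float] = None
    for timestamp, reply_received in samples:
        if not reply_received and outage_start is None:
            # First lost ping of a new outage.
            outage_start = timestamp
        elif reply_received and outage_start is not None:
            # Connectivity is back: close the outage and record its length.
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    if outage_start is not None:
        # The log ended during an outage; count it up to the last sample.
        longest = max(longest, samples[-1][0] - outage_start)
    return longest


# Hypothetical log: one ping per second, replies lost from t=5 through t=22.
example = [(float(t), not (5 <= t < 23)) for t in range(30)]
print(longest_outage(example))
```

Run it once per module reboot and compare the result with whatever isolation or heartbeat timeout your hosts use. A healthy failover should show up as only a couple of lost pings.

 
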

 

Carl also chimed in:

 

************************

 

I would have to throw my opinion on the “misconfigured” side for systems that see an extended outage during an update cycle. With a properly designed/configured environment that should not be happening. Any one module or uplink should be able to take a hit with perhaps the loss of only a couple of pings.

 

*********************

 

Anyone else with help for Andrea?

3 REPLIES
NMiller
New Member

Re: Virtual Connect update: online or offline?

I saw an outage of exactly 18 seconds during a Flex-10 upgrade.

The problem seems to occur when the Flex-10 is activating. The Flex-10 doesn't issue a "link down" to the hosts until about 18 seconds after it stops forwarding traffic. Hosts relying on link up/down for failover detection therefore don't fail over during that time. Once the Flex-10 reboots, the server sees the link drop and comes back online.

 

I've worked around this by staging the firmware upgrade and then powering the Flex-10 off and on.

There has been some issue with a connected VCM failover causing an outage during this upgrade process too, though. I'm working my way through that.
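
 

To make the timing above concrete, here is a rough sketch of the arithmetic, not a model of actual Virtual Connect behaviour. It assumes a host that fails over only when it sees link down; the function name and the 1-second failover time in the example are hypothetical.

```python
def outage_seconds(forwarding_stops_at: float,
                   link_down_at: float,
                   failover_time: float) -> float:
    """Traffic loss seen by a host that fails over only on link down.

    The host keeps sending to the dead path until link down is signalled,
    then needs failover_time seconds to move traffic to the other NIC.
    All arguments are in seconds.
    """
    if link_down_at < forwarding_stops_at:
        raise ValueError("link down cannot precede the stop in forwarding")
    if failover_time < 0:
        raise ValueError("failover_time must not be negative")
    undetected = link_down_at - forwarding_stops_at
    return undetected + failover_time


# Link down signalled 18 s after forwarding stops (the delay reported above),
# with a hypothetical 1 s failover on the host.
print(outage_seconds(0.0, 18.0, 1.0))
# Link down signalled immediately, same hypothetical failover time.
print(outage_seconds(0.0, 0.0, 1.0))
```

The point is that almost all of the outage is the window in which the module has stopped forwarding but the host still sees the link as up.
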

 

JL Martinez
Advisor

Re: Virtual Connect update: online or offline?

My experience tells me that many problems can arise on a VC firmware update, so the best advice is to plan a maintenance window if you can.

 

It seems that the latest VCSU versions and the latest firmware packages are less error-prone than the old versions, but I personally prefer to do things with caution. A long unplanned outage is one of the worst things that can happen to any IT professional.

 

HEM_2
Honored Contributor

Re: Virtual Connect update: online or offline?

VCSU 1.6.0 has an enhancement that eliminates the possibility of a 17-20 second failover time when a VC Ethernet module is being shut down as part of the activation process. I definitely recommend using VCSU 1.6.0.

 

Blade NIC and firmware need to be at the latest versions to ensure that DCC (Device Control Channel) is working properly between the NIC and the VC downlink port. If DCC is not available, Smartlink may not disable FlexNICs when the associated SUS/network is down.

 

Also, the default odd-even activation order in VCSU works well on VC domains that have a stacking link mesh topology but will NOT work well for a simple stack topology. See the following doc for descriptions of mesh versus simple stack topologies:

http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c02102153/c02102153.pdf