10-21-2015 07:09 AM
Does OneView solve Virtual Connect upgrade outage?
This article discusses the outage that occurs during a VC upgrade using VCSU automatic mode, even if the OS/hypervisor is configured with active/active NICs. To upgrade the VCs with VCSU without an outage, manual steps need to be taken.
Does OneView intelligently upgrade the VCs so as to eliminate outages during upgrade?
10-21-2015 04:22 PM
Re: Does OneView solve Virtual Connect upgrade outage?
The author of the blog is a bit mistaken. A few key corrections need to be made (I have contacted the blogger you linked to about them):
Notice the bit about the reboots above? That’s when network connectivity can be lost. On page 12 the document talks about how network outages can be avoided via redundant configuration and NIC bonding but then on page 13 it clarifies that because the reboot is a graceful one there is a possibility that there could be a 20 second network outage because the blade hardware (and the OS running on it) might not be notified that the VC module is down. You see, something called the SmartLink and DCC protocol are responsible for informing the blades that the VC modules are down and so the NICs they map to are down – and so they should fail over to another NIC using the backup VC – but because the firmware is being upgraded the SmartLink and DCC protocol are unavailable, no one informs the blades. So it only when the OS in the blades realize that it has lost network connectivity and must take corrective action, does the OS fail over to using the backup NIC – leading to a potential 20 second outage.
On page 13 of the referenced whitepaper, it clearly states (emphasis is mine):
“In VC firmware v3.18 and earlier there was the potential for a network outage of up to ~20 sec due to a physical link on the NICs staying up even though the forwarding path was being blocked by VC during graceful module shutdown for firmware activation. Both VCSU v1.6.0 and VC v3.30 resolved this issue by forcing the physical link down on all VC Enet module interfaces prior to activating firmware for the module.”
Since we introduced VCSU 1.6.0 and Virtual Connect 3.30 firmware, downlink ports are put into a “Link Down State” before firmware activation, which triggers the NIC on the server side to report that the link is down (i.e. “cable unplugged”). SmartLink is not in the equation here, because the Virtual Connect downlink port itself is being programmed into that state.
SmartLink is a technology we implemented within Virtual Connect to help with Uplink Port state tracking. Meaning, if all Uplink Ports associated with a given Virtual Connect Ethernet Network lose link, all associated Downlink Ports and Server Connections are put into a “down” state, so that North/South traffic is not “black holed.”
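To make the SmartLink rule concrete, here is a minimal Python sketch of the decision as I described it above. It is only an illustration: the function name and parameters are hypothetical, and this is not Virtual Connect code or an API.

```python
def downlinks_should_be_up(uplink_states, smartlink_enabled=True):
    """Decide whether the server-facing downlink ports of one network stay up.

    uplink_states: list of booleans, one per uplink port assigned to the
    network (True = link up). Hypothetical input, for illustration only.
    """
    if not smartlink_enabled:
        # Without uplink state tracking, downlinks ignore uplink status.
        return True
    # Downlinks go down only when ALL uplinks are down; a single live
    # uplink keeps them up. Simplification: an empty list is treated as
    # "no uplink has link".
    return any(uplink_states)
```

For example, `downlinks_should_be_up([False, True])` keeps the downlinks up, while `downlinks_should_be_up([False, False])` takes them down so the host teaming driver fails over.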
As long as you are following these rules, you should never experience a 20 second network outage:
- Use VCSU 1.6.0 or newer
- Virtual Connect 3.30 or newer firmware
- NIC Teaming is properly set up on the host (LACP, Mode 4, Switch-Assisted are not supported with Virtual Connect)
- HBA MPIO software is properly installed and configured
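The rules above can be expressed as a quick pre-upgrade checklist. The Python sketch below is only an illustration under a few assumptions: the function names, the way versions and teaming modes are passed in as strings, and the set of mode spellings are all hypothetical, and nothing here queries real hardware or VCSU.

```python
def parse_version(text):
    """Turn a dotted version such as '3.30' or '1.6.0' into a 3-part tuple."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad so '3.30' compares as (3, 30, 0)
    return tuple(parts)


# Teaming modes listed in the thread as not supported with Virtual Connect.
UNSUPPORTED_TEAMING = {"lacp", "mode 4", "switch-assisted"}


def meets_upgrade_prereqs(vcsu_version, vc_firmware_version,
                          teaming_mode, mpio_configured):
    """Return True when all four rules from the list are satisfied."""
    return bool(
        parse_version(vcsu_version) >= (1, 6, 0)
        and parse_version(vc_firmware_version) >= (3, 30, 0)
        and teaming_mode.lower() not in UNSUPPORTED_TEAMING
        and mpio_configured
    )
```

For example, `meets_upgrade_prereqs("1.6.0", "3.18", "active/active", True)` fails because the firmware is older than 3.30.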
Now, to your question @MattGG. HP OneView requires a minimum of VC 4.10 for Interconnects to be managed. Interconnect Firmware is delivered similarly to how VCSU delivers firmware. During the Activation process (which today in HP OneView 1.20 and older does not provide the same Odd/Even Activation order control that VCSU does), the same command sequence is programmed on the Interconnects prior to "Activating" (aka rebooting) modules. So, to mimic VCSU behavior in HP OneView, you must first stage the Logical Interconnect Firmware. After the firmware has been staged, you can then manually Activate the interconnects.
I hope this helps answer your question.
I am an HPE employee