
ISSU Upgrade issue on A5500-HI

Unison
Occasional Contributor


Hi All

We are currently completing acceptance testing on a pair of A5500HI switches, stacked using IRF.

 

The switches were delivered running software version r5101p01. We were able to upgrade them to r5101p05 using ISSU in force mode, and the result was acceptable: minimal packet loss, etc.

 

From r5101p05 we tried to upgrade to r5105-us, which appears not to be a supported ISSU upgrade. For us this is a fundamental issue, as we need every upgrade to be as near to hitless as possible (an outage of a couple of seconds is acceptable; anything exceeding 10 seconds would be considered a failure).

 

Is it common that a software version cannot be installed using ISSU, and if so, what options do I have for a hitless upgrade?

 

dis version comp-matrix file flash:/a5500hi-cmw520-r5105-us.bin

Number of Matrices in Table = 1

Matrix for A5500-HI

 

 

Running Version:Release 5101P05

Version Compatibility List:

Unknown (Unknown)

 

Thanks

 

 

3 REPLIES
Peter_Debruyne
Honored Contributor

Re: ISSU Upgrade issue on A5500-HI

Yes, the ISSU process is not always straightforward.

AFAIK:

* patch releases (R__PXX) for the same main release are a compatible ISSU. This means that you can reboot unit by unit in the IRF system, and the original and patch release can live together in the same operational IRF. The only interruption is the link-agg failover, typically 100-400 ms.

* major release updates (new R___ number) are typically an incompatible ISSU. This means that the updated unit cannot join the IRF system after the reboot. It will be in standby (external ports shut down, IRF ports UP, but IRF protocol not operational) until the "switchover" command is used. At that moment, the original production system will reboot and the standby unit will enable its interfaces, which causes a brief interruption, typically 1-3 sec.

Since the updated switch did not synchronize the control plane (it did not join the IRF), it requires MAC relearning and STP setup (if you use STP), all of which happens within these 1-3 seconds. For L3 routing, however, be careful with OSPF, since new adjacencies must be established. I recommend setting 1 second OSPF timers and a point-to-point interface config, if that is possible in your setup. OSPF failover will take between 5 and 15 sec.

* ISSU Unknown releases: it is unknown to me why they even have this, since the previous method should work for any release (after all, it is just a quick shut of v1 and activation of v2, not really ISSU IMHO).
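
As a rough illustration of the three cases above, the sketch below parses `display version comp-matrix` output and checks the worst-case outage against an acceptance budget. It is a minimal sketch under assumptions: the "<version> (<status>)" entry format is inferred only from the single "Unknown (Unknown)" line in the original post, the outage ranges are the estimates quoted in this reply rather than vendor figures, and the function names (`parse_comp_matrix`, `within_budget`) are hypothetical.

```python
import re

# Outage estimates in seconds, taken from the figures quoted in this thread
# (not vendor-guaranteed values).
EXPECTED_OUTAGE = {
    "compatible": (0.1, 0.4),    # link-agg failover only
    "incompatible": (1.0, 3.0),  # switchover plus MAC/STP relearning
}
OSPF_OUTAGE = (5.0, 15.0)        # new adjacencies after an incompatible switchover

# ASSUMPTION: each compatibility entry looks like "<version> (<status>)".
ENTRY = re.compile(r"^\s*(?P<version>.+?)\s*\((?P<status>[^()]+)\)\s*$")


def parse_comp_matrix(text):
    """Return (running_version, [(version, status), ...]) from comp-matrix output."""
    running = None
    entries = []
    in_list = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Running Version:"):
            running = line.split(":", 1)[1].strip()
        elif line.startswith("Version Compatibility List:"):
            in_list = True
        elif in_list:
            match = ENTRY.match(line)
            if match:
                entries.append((match.group("version"), match.group("status").lower()))
    return running, entries


def worst_case_outage(status, uses_ospf=False):
    """Worst-case outage in seconds, or None when no estimate exists (e.g. "unknown")."""
    if status not in EXPECTED_OUTAGE:
        return None
    worst = EXPECTED_OUTAGE[status][1]
    if uses_ospf and status == "incompatible":
        worst = max(worst, OSPF_OUTAGE[1])
    return worst


def within_budget(status, budget_s=10.0, uses_ospf=False):
    """True only when an estimate exists and its worst case fits the budget."""
    worst = worst_case_outage(status, uses_ospf)
    return worst is not None and worst <= budget_s


# Output as pasted in the original post.
SAMPLE = """\
Number of Matrices in Table = 1
Matrix for A5500-HI
Running Version:Release 5101P05
Version Compatibility List:
Unknown (Unknown)
"""

if __name__ == "__main__":
    running, entries = parse_comp_matrix(SAMPLE)
    for version, status in entries:
        print(running, "->", version, status, within_budget(status))
```

With the 10 second limit from the original post, an incompatible ISSU fits the budget for pure L2 traffic but not once the 5-15 sec OSPF estimate is included, and an "unknown" entry gives no estimate at all, so it has to be tested.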

So I struggled a while with this and then considered this as a solution:

- You basically want to update 1 unit while the other unit remains online. When the updated unit comes online, you want the other unit to shut down and reboot.

- So I figured I would call this the "MAD assisted firmware update". It is critical to configure MAD first, and to start the update process on the unit that would "win" the MAD process, the unit with the lowest member ID, typically unit1.

1/ Note the 10G ports which are used by IRF (dis irf configuration)

2/ Copy the firmware images to both units and configure the boot-loader to use the new images. Do not reboot yet.

3/ Reboot unit1 (reboot slot 1); unit2 will take over the master role. While unit1 reboots, connect to unit2 (use the console if possible) and shut down the 10G IRF interfaces (do not save the config). This ensures that unit1 will not discover unit2 during the reboot; if it did, it would either downgrade its firmware again to match the master or fail to join the stack, depending on your IRF firmware update policy config. Unit1 will come online with the new version. Once it is online, MAD will come into action and unit2 will shut down all its external interfaces. At this moment the link-agg failover has occurred (like the "switchover" command for the normal ISSU).

4/ Reboot unit2 (reboot slot 2). It will boot into the new firmware and join the existing master. Update done.
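
The sequence in steps 1/ to 4/ can be traced with a toy model that records which unit is forwarding traffic after each step. This is only a sketch under assumptions: it models MAD as "the running unit with the lowest member ID keeps its external ports", as described above, it ignores the short failover interval itself, and the `Unit`, `mad_resolve` and `simulate` names are hypothetical and not part of any Comware tooling.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    member_id: int
    version: str
    running: bool = True        # booted and operational
    ports_enabled: bool = True  # external (non-IRF) ports are up

    @property
    def forwarding(self):
        return self.running and self.ports_enabled


def mad_resolve(units):
    """Toy MAD: among running units, only the lowest member ID keeps its ports."""
    active = [u for u in units if u.running]
    winner = min(active, key=lambda u: u.member_id)
    for u in active:
        u.ports_enabled = u is winner
    return winner


def simulate(old="R5101P05", new="R5105"):
    """Walk through the MAD-assisted update and record the forwarding units."""
    u1, u2 = Unit(1, old), Unit(2, old)
    trace = []

    def record(step):
        trace.append((step, [u.member_id for u in (u1, u2) if u.forwarding]))

    record("start: both units in one IRF")

    # Step 3/: unit1 reboots; unit2 carries traffic and its IRF ports are shut.
    u1.running = False
    record("3/ unit1 rebooting, IRF ports shut on unit2")

    # Unit1 returns on the new version; the IRF is split, so MAD decides.
    u1.running, u1.version = True, new
    mad_resolve([u1, u2])
    record("3/ unit1 up on new version, MAD disables unit2 ports")

    # Step 4/: unit2 reboots into the new version and rejoins unit1.
    u2.running = False
    record("4/ unit2 rebooting")
    u2.running, u2.version, u2.ports_enabled = True, new, True
    record("4/ unit2 rejoined the IRF")

    return trace, (u1.version, u2.version)


if __name__ == "__main__":
    trace, versions = simulate()
    for step, forwarding in trace:
        print(f"{step}: forwarding units {forwarding}")
    print("final versions:", versions)
```

The point of the trace is that, in this model, at least one unit is forwarding after every step, which is why the procedure has to start on the unit that wins MAD.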

 

Note1: I recommend using MAD BFD for this procedure, since it uses a standards-based protocol for the MAD detection. MAD LACP is proprietary and may be subject to change. I once had an issue with this procedure where the MAD LACP packet format seemed to have changed between versions, so MAD detection between v1 and v2 did not work. (After all, MAD is supposed to detect a split brain between units running the same version, so I cannot blame them for this; hence the MAD BFD recommendation.)

 

Note2: The link-agg failover times were slightly improved with "lacp period short" configured (1 sec LACP instead of 30 sec) on the interfaces to the peer devices, and with "link-aggregation lacp traffic-redirect-notification enable" active. The latter notifies the peer LACP device that the link should no longer be used, right BEFORE a unit reboots, giving the peer a chance to redirect traffic to the surviving link BEFORE the link fails due to the reboot.

 

I hope this helps in your validation,

 

Best regards, Peter

manuel.bitzi
Trusted Contributor

Re: ISSU Upgrade issue on A5500-HI

Hi all

 

On Comware v5, ISSU is not implemented very well. With very small changes, a compatible ISSU upgrade (no interruption) is possible. With small changes, an incompatible ISSU upgrade (short interruption) is usually possible. But if they make changes in the kernel or another central component, ISSU is not possible, and there is no option other than a normal upgrade that reboots the whole switch at once.

 

If you need a switch with reliable ISSU, you need Comware v7. In Comware v7, a hitless ISSU upgrade is promised for each step. For example, have a look at the HP 5900.

 

br

Manuel

H3CSE, MASE Network Infrastructure [2011], Switzerland
Howiedoit
Frequent Advisor

Re: ISSU Upgrade issue on A5500-HI

Hi all, 

 

OK, I have this same sort of issue and cannot get any clear instructions on the method. I am currently on the same release as above, R5101P01.

 

I have 4 of these A5500 switches, also in an IRF stack, and I also need to upgrade them for various reasons. I was told ISSU is NOT available in the release he mentioned, R5101P01. So how was he able to do this? I was told that it is only available in R5203 and later.

 

After reading the above comment, I really do not want to update mine at all; it just sounds like trouble waiting to happen. I do not have 4 of these in a LAB to test, so everything I need to do is on a LIVE system. And it scares the heck out of me.

 

There is a good chance I will lose my job if something goes south. What a scary proposition.

 

The fact that ISSU is implemented poorly is not very reassuring at all.

 

Who has the definitive guide for doing ISSU on A5500 switches in an IRF Fabric??? Anyone??