
HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

 
ar1y
Occasional Contributor


IRF has a nice feature: it can automagically update the firmware on new slave members to the same version as the master switch. This is the first time I've seen it fail on a 5130.

IRF member #1, running 3208P16, failed to send firmware to a new out-of-the-box switch running 3113P05. Member #2 had to be updated manually on its own. Is this because the two firmware releases are too far apart? Juniper switches have limits on which JUNOS version can be upgraded to which, but I haven't seen any mention of Comware 7 having this kind of limitation.

%%10STM/4/STM_AUTO_UPDATE_FAILED: Slot 2 auto-update failed. Reason: Timeout when loading. This message repeated 30 times in last 10 minutes.

parnassus
Honored Contributor

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

Pretty strange...what was the IRF status before you started the upgrade process? I mean, did you perform all the usual checks as per the Release Notes/Best Practices? If so, what was the IRF health?


I'm not an HPE Employee
Ivan_B
HPE Pro

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

Hi Ar1y!

The documentation for 3208P16 does not mention any compatibility issues. Since it was a timeout error (from the perspective of the IRF master), we have a few possibilities:

- There was not enough space on the standby slot to hold the binaries of the new software release
- There was an error during the copy operation
- Some incompatibility issue

Normally in such cases we collect all files from the "logfile" directories on each IRF member and analyze what exactly happened during that time. When a device leaves a stack, even temporarily, it starts logging its own events to a local logfile. So I would inspect the logfile from the standby slot that failed to update to see what the situation was from its own perspective. Opening a case with our Support may also be a good idea in this situation.
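As a rough illustration of that kind of logfile review, here is a minimal Python sketch that counts auto-update failures per slot and reason. It assumes the log lines have already been copied off the switch into a list of strings; the function name `summarize_failures` is hypothetical, and the pattern is based only on the two message formats quoted in this thread, so other releases may word the message differently.

```python
import re
from collections import Counter

# Pattern built from the STM_AUTO_UPDATE_FAILED lines quoted in this thread.
# Assumption: the reason text ends at the first period followed by
# whitespace or end of line.
FAILED = re.compile(
    r"STM_AUTO_UPDATE_FAILED: Slot (\d+) auto-update failed\. "
    r"Reason: (.+?)\.(?:\s|$)"
)


def summarize_failures(lines):
    """Count auto-update failures, keyed by (slot, reason)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            slot = int(match.group(1))
            reason = match.group(2)
            counts[(slot, reason)] += 1
    return counts


# The two log lines quoted in this thread, plus one unrelated line.
sample = [
    "%%10STM/4/STM_AUTO_UPDATE_FAILED: Slot 2 auto-update failed. "
    "Reason: Timeout when loading. This message repeated 30 times in last 10 minutes.",
    "%Jun 23 16:30:36:317 2020 some-sw STM/4/STM_AUTO_UPDATE_FAILED: "
    "Slot 2 auto-update failed. Reason: Timeout when loading.",
    "some unrelated log line",
]

for (slot, reason), count in summarize_failures(sample).items():
    print(f"slot {slot}: {reason} ({count} matching lines)")
```

Grouping by reason makes it easier to see whether a slot only ever reports a timeout or also reports some other cause at a different point in time.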

 

 

I am an HPE employee


parnassus
Honored Contributor

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

Yes. It's worth adding that if the IRF standby really did not have enough storage space to hold the binaries the IRF master was trying to push, the STM message generated for Slot 2 should have given a different failure reason ("Disk full when writing to disk" instead of "Timeout when loading"). So at first sight (I could be wrong) it looks like the IRF links probably went down before the IRF master was able to initiate the file transfer (reason?).


I'm not an HPE Employee
Ivan_B
HPE Pro

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

Yes, in theory you are right, and a full disk should produce the appropriate error message. But life is full of surprises, and a particular order of events may produce error messages that point in one direction while the root cause lies in another.

I totally agree that the IRF links going down before the IRF master could initiate the file transfer should be investigated as a possible cause. Let's also not forget that something could have happened during the file transfer, not before it...

There are many possibilities, which is why it is critically important to inspect the diag taken after this event plus all the files I mentioned previously. Even this might not be enough to conclude what happened, but without these details it is certainly not possible.

I am an HPE employee


parnassus
Honored Contributor

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

I absolutely agree with you...providing all available diagnostic evidence is the essential first step in letting pro support investigate.

I'm not an HPE Employee
VoIP-Buddy
HPE Pro

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

Ar1y,

That is not how it works.  Before you can merge a new member into an existing IRF stack it must be running the same code as the stack.  Once it is running that code, it will merge into the stack and then using the boot-loader command you can upgrade them all to a newer version.  It doesn't work the way you suggest.

Regards,

David

I work for HPE in Aruba Technical Support
ar1y
Occasional Contributor

Re: HPE 5130 JG937A IRF auto update fails from 3113P05 to 3208P16

Got hit by this broken feature again; I had forgotten that IRF auto-update no longer works with the 5130. Adding one more member to make a 2-member stack is not something we need to do often.

IRF autoupdate used to work with 3116P05, but last year we started upgrading to 3208 and now we're mainly on 3506, some 3506P02.

Again, this is a formerly standalone 5130 without IRF members, and adding member #2 with too-old firmware gets stuck at:

%Jun 23 16:30:36:317 2020 some-sw STM/4/STM_AUTO_UPDATE_FAILED: Slot 2 auto-update failed. Reason: Timeout when loading.

The IRF cable used is an original HPE 10G SFP+ stacking cable.

<some-sw>display irf
MemberID    Role    Priority  CPU-Mac         Description
 *+1        Master  2         00e0-fc0f-8c02  ---
   2        Loading 1         00e0-fc0f-8c03
--------------------------------------------------
 * indicates the device is the master.
 + indicates the device through which the user logs in.

 The bridge MAC of the IRF is: 2c23-3a9b-cd59
 Auto upgrade                : yes
 Mac persistent              : 6 min
 Domain ID                   : 0
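
For anyone who wants to watch for this state from a script, here is a minimal Python sketch that parses captured `display irf` output and lists members still in the Loading role. It assumes the column layout shown above; the names `IrfMember`, `parse_display_irf` and `members_still_loading` are hypothetical, and the output layout may differ on other Comware releases.

```python
import re
from typing import NamedTuple


class IrfMember(NamedTuple):
    member_id: int
    role: str
    priority: int
    cpu_mac: str
    is_master: bool


# Matches member rows such as " *+1   Master  2   00e0-fc0f-8c02  ---".
# Assumption: the optional "*" and "+" flags come directly before the member ID.
ROW = re.compile(
    r"^\s*([*+]*)(\d+)\s+(\S+)\s+(\d+)\s+((?:[0-9a-f]{4}-){2}[0-9a-f]{4})",
    re.IGNORECASE,
)


def parse_display_irf(text):
    """Return one IrfMember per member row found in the output."""
    members = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            flags, member_id, role, priority, mac = match.groups()
            members.append(
                IrfMember(int(member_id), role, int(priority), mac, "*" in flags)
            )
    return members


def members_still_loading(members):
    """Member IDs whose role column reads 'Loading'."""
    return [m.member_id for m in members if m.role == "Loading"]


# Output copied from this post.
SAMPLE = """\
MemberID    Role    Priority  CPU-Mac         Description
 *+1        Master  2         00e0-fc0f-8c02  ---
   2        Loading 1         00e0-fc0f-8c03
--------------------------------------------------
 * indicates the device is the master.
 + indicates the device through which the user logs in.

 The bridge MAC of the IRF is: 2c23-3a9b-cd59
 Auto upgrade                : yes
"""

members = parse_display_irf(SAMPLE)
print(members_still_loading(members))
```

A member that keeps showing up in this list across repeated captures is the one stuck in the auto-update loop described here.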