
IRF experts needed

 
vzwnickf
Occasional Visitor


Hello,

I have an issue with an IRF fabric between two OfficeConnect 1950 switches that HPE support has been absolutely no help with so far. We had a power issue last week and lost our master switch (complete power loss). The secondary took over as master, and we just had the failed switch replaced today. The problem is, I don't know how to restore the IRF fabric without bringing the entire network down.

So far HPE support was able to load the management IP onto the new switch, and I was able to load the backup config through the web GUI. The way the IRF is set up, physical port 2/0/50 on Switch 02 is bound to IRF port 2 (this is the CURRENT master, the one that never went down). After loading the backup config onto Switch 01 (the new, replacement switch), I can confirm that IRF port 1 on this switch is bound to 1/0/49. I have an SFP cable connection between 1/0/49 and 2/0/50; however, the moment I connect the two together, Switch 02 (which has the majority of my network on it right now) immediately goes down (red sys light) and will not boot back up until I remove the SFP again. I am trying to get this IRF fabric back to operational, but I have no clue how this technology works on these HP switches. I have no previous experience with these switches.

HPE support has not been able to give me any help with this. Are there any experts in here who may be able to shed some light? I can post more details if needed. Thanks!

parnassus
Honored Contributor

Re: IRF experts needed

I don't have time right now to write more, but IMHO you shouldn't restore any configuration backup (a backup of what, exactly?) onto the brand-new replacement switch. The whole IRF stack configuration is always stored on the Master member and replicated to all other IRF stack members. You only need to prepare the new switch (which I presume is identical to the one that had the power failure) to join the existing IRF stack as the old Switch 1 did. The current Master is, or should be, your physical Switch 2, following the Master election triggered by the IRF split when Switch 1 went offline. Don't focus on the role.

First, connect to Switch 2 (the only running IRF stack member) with the new switch still disconnected, and issue these commands on its CLI:

display irf
display irf configuration
display irf topology
display interface brief

to examine the general status of the current (single-member) IRF stack.


I'm not an HPE Employee
HP-Browniee
Respected Contributor

Re: IRF experts needed

Hello

As parnassus said, you don't have to push your backup config onto the new switch. Just prepare the switch to be added to the stack: configure the priority, domain, and IRF ports, and don't forget the command irf-port-configuration active.

Make sure that the IRF priority on your replacement switch is lower than that of your current master switch. Otherwise the replacement switch will win the master election, and your current master switch will reboot and take its configuration from the replacement. And you don't want that :)

Also make sure you are working with the same type of switches and that both run the same software version.

And this topic should go under Comware switches, not ProVision :)
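The preparation steps above could look roughly like the following on the replacement switch, entered while the IRF cable is still unplugged. This is a sketch, not a verified procedure for this site: the interface type (ten-gigabitethernet) and the domain ID 10 are assumptions, the member ID 1 and port 1/0/49 come from the original post, and the lines starting with # are annotations rather than commands. On the OfficeConnect 1950 some of these commands may not be reachable from the default restricted CLI.

```text
system-view
# Lowest priority, so the running fabric keeps the master role
irf member 1 priority 1
# Placeholder value: use the domain ID that display irf reports on the running master
irf domain 10
# Shut the physical port down before binding it to the IRF port
interface ten-gigabitethernet 1/0/49
 shutdown
 quit
irf-port 1/1
 port group interface ten-gigabitethernet 1/0/49
 quit
interface ten-gigabitethernet 1/0/49
 undo shutdown
 quit
# Save before activating, so the settings survive the reboot that follows the merge
save
irf-port-configuration active
```

Only after this is saved should the cable between 1/0/49 and 2/0/50 be connected; the replacement switch is then expected to reboot and join as a subordinate member.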

vzwnickf
Occasional Visitor

Re: IRF experts needed

Hi everyone, thanks for the responses; it means a lot to be able to get help with this. I understand your perspective, and I can try wiping the new switch and giving that a try. However, this is the way we have set up all of our stacks with the same models of switch, and they always work. We typically load configs with management addresses, etc., then activate the IRF fabric, and it comes right up. This is why I'm so confused.

Maybe I can clarify one thing: how long should a reboot take to complete after an IRF merge? I gave the second switch at least a good ten minutes before taking out the SFP. Could a firmware mismatch cause the second switch to fail to reboot? Just to reiterate, the second switch fails to reboot after I attach the IRF cable to Switch 1, which is set to be the master. I understand Switch 2 *should* reboot under these circumstances, but the fact that it's not running again after ~10 minutes is concerning. Can anyone clarify what I should be seeing after an IRF merge / reboot? Thanks again

HP-Browniee
Respected Contributor

Re: IRF experts needed

Hi

What is the point of configuring the switch with management addresses when it will lose them anyway once you add it to the IRF stack?

Take a close look at the log messages of the switch that isn't rebooting after the IRF merge. Maybe the problem is mentioned there. Normally the reboot time shouldn't be any different from a normal reboot, maybe a minute longer.

I suggest running the same software version on both switches to avoid any problems related to that. And don't forget to enable IRF on both switches with the command irf-port-configuration active.
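As a rough illustration of those checks, the following display commands can be run on each switch separately (IRF cable unplugged) and the output compared side by side. The choice of commands is a suggestion, not something prescribed in this thread, and the exact output layout depends on the software release.

```text
# Running software release and uptime; compare between the two switches
display version
# Software images loaded now and configured for the next startup
display boot-loader
# Messages held in the in-memory log buffer
display logbuffer
```

Because the log buffer is held in memory, messages from a failed boot may no longer be there after the next restart; watching the console port of the affected switch during the merge is likely to show more.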

parnassus
Honored Contributor
Solution

Re: IRF experts needed

Hi @vzwnickf, in addition to what @HP-Browniee already mentioned, I would suggest giving this [*] interesting document a read. It clearly doesn't replace reading an IRF Configuration Guide, but it is really useful for resolving issues quickly. Sections 3.1 and 3.2 "Switch x Config" explain how to prepare a switch, in general terms, to form or join an IRF stack. Section 4.5 "Replace a Failed Switch" explains the steps to replace a failed IRF member switch (Switch 2 is used as the failure example) with an identical unit (which is your case) or with one of the same family with more ports (which is not your case).

In your case Switch 2 is OK and Switch 1 failed. You now have a new switch that should be prepared to take over the membership held by your failed unit, so you need to know the IRF member numbering ({1, 2} in the example) and priorities ({32 = highest, 1 = lowest} in the example) your two units had before the IRF split. Note how Switch 2, the second unit added to the IRF, is configured in Section 3.2 compared with the very first unit added to the IRF in Section 3.1. If I were you, in addition to the standard IRF checks written above, I would start by checking the following on your Switch 2:

  • IRF Member priority status (is it lower/higher than what was assigned to your failed Switch 1 unit?)
  • IRF Member numbering status (it should be 2 since the Switch 2 was the 2nd added member to your IRF Stack)
  • Software version in use on Switch 2 (to find out whether your new replacement Switch 1 already matches it: probably not, so upgrade both units to the latest available version to start from a known, up-to-date common software version).

Only then would I start preparing the new Switch 1 for its addition to the IRF stack, adapting the procedure to what you found above. Pay attention: the IRF ports should be physically connected AFTER you set them up logically; see the configuration steps described in the linked document.

[*] Credit to @Richard Litchfield for creating that document (shared on this discussion).


vzwnickf
Occasional Visitor

Re: IRF experts needed

Hi all, 

I was finally able to resolve this by allowing Switch 2 to remain the master rather than trying to force the election onto the new switch. Normally Switch 1 (the one that failed) would be the master, but I determined it would be more efficient to adjust the priorities of the two switches and amend our documentation to reflect that. After setting Switch 2 to the higher priority, I wiped the IRF config from Switch 1, lowered its priority, and then opened its ports and made the physical connection. This let me add Switch 1 as a member/slave and avoid any more network downtime. Switch 2 was carrying the entire network after the first switch failed, and by staying master it did not require a reboot.

I am marking parnassus as the correct answer because it was their documentation/link that finally helped me understand the proper order of operations: the master must be configured first, then the member(s) are connected with a physical link, and you wait for the restart.
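For anyone repeating this, the master-side part of that fix might look like the sketch below, run on Switch 2 before the cable goes in. The value 32 is an assumption (the exact priorities used were not recorded in this thread), and the lines starting with # are annotations rather than commands.

```text
system-view
# Highest priority, so Switch 2 wins the election and keeps running
irf member 2 priority 32
quit
save

# After Switch 1 has been prepared, cabled, and has finished rebooting:
display irf
display irf topology
```

If the merge went as intended, display irf should list both members, with member 2 as master and member 1 as a subordinate.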

 

Thanks!