BladeSystem Virtual Connect

Virtual Connect -Fibre Channel flogin failure - RESOLVED

chuckk281
Trusted Contributor


A long-running problem that Thomas had with Virtual Connect Fibre Channel logins has been resolved:

 

*************************

It took nine weeks, but we finally sorted out this VC-FC fabric login (FLOGI) failure. It turns out to be an issue with Cisco Fibre Channel switches connected to Virtual Connect FC modules when you use VC-assigned World Wide IDs together with QLogic HBAs. I have included the details below, which we got from Cisco. The behavior is evidently described in some of their release notes but is not documented at all on the HP side (I asked that that be changed). This can bite you at any site that has the combination of products above, including Virtual Connect FlexFabric modules connected to Cisco MDS or Nexus switches. It occurs on a per-switch, per-VSAN basis, because the Cisco switch runs out of available Fibre Channel IDs due to the anomaly. The workaround is listed below as well.

 

This workaround corrected the issue at my account, although it took two months to fix – ugh.

 

Thanks, Tom K.

 

 

 

From Cisco PDF: (http://www.cisco.com/en/US/docs/switches/datacenter/mds9000/sw/5_0/configuration/guides/fabric/nxos/cli_fabric.pdf )

Fibre Channel ID Allocation for HBAs

Fibre Channel standards require a unique FC ID to be allocated to an N port attached to a Fx port in any switch. To conserve the number of FC IDs used, Cisco MDS 9000 Family switches use a special allocation scheme.

Some HBAs do not discover targets that have FC IDs with the same domain and area. Prior to Cisco SAN-OS Release 2.0(1b), the Cisco SAN-OS software maintained a list of tested company IDs that do not exhibit this behavior. These HBAs were allocated with single FC IDs, and for others a full area was allocated.

 

The FC ID allocation scheme available in Release 1.3 and earlier allocates a full area to these HBAs. This allocation isolates them to that area, and they are listed with their pWWN during a fabric login. The allocated FC IDs are cached persistently and are still available in Cisco SAN-OS Release 2.0(1b) (see the “FC ID Allocation for HBAs” section on page 12-11).

 

To allow further scalability for switches with numerous ports, the Cisco NX-OS software maintains a list of HBAs exhibiting this behavior. Each HBA is identified by its company ID (also known as Organizational Unique Identifier, or OUI) used in the pWWN during a fabric login. A full area is allocated to the N ports with company IDs that are listed, and for the others a single FC ID is allocated. Regardless of the kind (whole area or single) of FC ID allocated, the FC ID entries remain persistent.
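To make the company ID idea concrete, here is a minimal Python sketch of how the OUI can be read out of a pWWN. It is not taken from the Cisco guide: the function names are made up for illustration, and it assumes the common IEEE name formats only (NAA 1 and 2, where the OUI is the third through fifth bytes, and NAA 5, where the OUI directly follows the first hex digit).

```python
def oui_from_pwwn(pwwn: str) -> int:
    """Return the 24-bit company ID (OUI) embedded in a 64-bit pWWN.

    Hypothetical helper; handles NAA types 1, 2 and 5 only.
    """
    digits = pwwn.replace(":", "")
    if len(digits) != 16:
        raise ValueError(f"expected 16 hex digits, got {pwwn!r}")
    value = int(digits, 16)          # also rejects non-hex characters
    naa = value >> 60                # top 4 bits select the name format
    if naa in (1, 2):
        # 2 bytes of format/vendor bits, then a 48-bit address whose
        # first 3 bytes are the OUI.
        return (value >> 24) & 0xFFFFFF
    if naa == 5:
        # 4-bit NAA, then the 24-bit OUI, then 36 vendor-assigned bits.
        return (value >> 36) & 0xFFFFFF
    raise ValueError(f"unsupported NAA type {naa} in {pwwn!r}")


def format_oui(oui: int) -> str:
    """Format the OUI the way the switch CLI examples write it."""
    return f"0x{oui:06X}"


qlogic_pwwn = "21:01:00:1b:32:a3:c0:2e"   # pWWN from the sample output in this post
print(format_oui(oui_from_pwwn(qlogic_pwwn)))
```

For the QLogic pWWN that appears later in this post, this yields the same 0x001B32 value that the workaround command uses.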

 

 

“It appears that the reason certain HBAs reserve an area in the first place is that they can't see targets when the first 4 digits of the FCID are the same as their own FCID. If HP is saying that isn't the case, that helps a lot. Of course, it would then raise the question of why an area is reserved in the first place.”

 

Actually, HP doesn’t do anything with the first 4 digits. It is how the Cisco switch reacts to the QLogic HBAs and assigns area IDs. I found the information below in the Cisco 5K troubleshooting guide, which explains what is occurring and how to deal with it on the Nexus 5K. I don’t know whether this workaround can be used on the 9205 in the current FOS instead of the command presented in the Cisco firmware release notes or the HP advisory. I would say these are more questions for Cisco, since they concern how NPIV and FCID allocation are implemented in relation to the hardware the switch sees presented.

 

From Cisco: (http://www.cisco.com/en/US/docs/switches/datacenter/nexus5000/sw/troubleshooting/guide/n5K_ts_sans.html)

 

However, because of an old issue with QLogic HBAs, the Cisco Nexus 5000 domain server assigns a separate area for each QLogic HBA that matches a certain OUI by default. Therefore a conflict between legacy gateway requirements and the Cisco domain allocation scheme exists. Cisco still implements this to support old existing QLogic HBAs in the field.
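A rough back-of-the-envelope sketch in Python shows why this default can exhaust a VSAN. It relies on the standard 24-bit FC ID layout (8-bit domain, 8-bit area, 8-bit port) and deliberately ignores any addresses the switch reserves, so the numbers are upper bounds rather than what a given switch will actually hand out. The names are illustrative only.

```python
AREAS_PER_DOMAIN = 256   # 8-bit area field
PORTS_PER_AREA = 256     # 8-bit port field


def max_logins_per_domain(full_area_per_hba: bool) -> int:
    """Upper bound on logins one domain (switch, per VSAN) can address."""
    if full_area_per_hba:
        # Every matching HBA consumes a whole area, however few IDs it uses.
        return AREAS_PER_DOMAIN
    # Flat allocation: each login takes a single FC ID.
    return AREAS_PER_DOMAIN * PORTS_PER_AREA


print(max_logins_per_domain(True))    # area-per-HBA allocation
print(max_logins_per_domain(False))   # flat allocation
```

With persistent FC ID entries, areas stay consumed even after a blade logs out, which is consistent with the per-switch, per-VSAN exhaustion described at the top of this post.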

 

Solution

 

The solution is to configure "no fcid-allocation area company <oui>" for all used QLogic OUIs (ensuring flat fcid allocation in the future), force all affected blades to log out of the fabric, delete the already created persistent fcid entry from the Nexus 5000 switch configuration, and allow the blade to log in again.

 

In the following "show flogi database" sample output, all devices obtain a unique area id (x01, x08, x0c).

 

Sample:

 

Fc2/1 2 0x620104 20:10:00:05:1e:5e:6a:85 10:00:00:05:1e:5e:6a:85

Fc2/1 2 0x620800 21:01:00:1b:32:a3:c0:2e 20:01:00:1b:32:a3:c0:2e

Fc2/1 2 0x620c00 21:01:00:1b:32:33:8b:8e 20:01:00:1b:32:33:8b:8e
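To read the area out of each FC ID in the sample above, the 24-bit value can be split into its domain, area, and port bytes. This is a small illustrative Python sketch; the type and function names are not from any Cisco tool.

```python
from typing import NamedTuple


class FcId(NamedTuple):
    domain: int
    area: int
    port: int


def split_fcid(fcid: str) -> FcId:
    """Split an FC ID such as '0x620104' into domain, area and port bytes."""
    value = int(fcid, 16)
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError(f"FC ID must fit in 24 bits: {fcid!r}")
    return FcId(value >> 16, (value >> 8) & 0xFF, value & 0xFF)


# FC IDs from the "show flogi database" sample in this post.
sample_fcids = ["0x620104", "0x620800", "0x620c00"]
areas = [split_fcid(f).area for f in sample_fcids]
print([f"x{a:02x}" for a in areas])
```

All three entries share domain 0x62 but sit in different areas, which is the condition the procedure below corrects.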

 

Because of the specific area id requirement of the legacy switch, the last 2 blades must also have area x01. Use the following procedure to force the QLogic adapters to log in again and obtain an fcid in the 0x6201xx range.

 

Step 1 Configure (force) the future FCID allocation scheme to be flat for all WWNs matching the OUIs that are in this situation:

 

switch(config)# no fcid-allocation area company 0x001B32

 

Step 2 Force the fcid under reconfiguration to log out of the fabric.

 

Note It is not enough to simply shut down the Nexus 5000 interface that serves as the primary uplink for that server, since it will simply log in through another one. The appropriate way is to shut down the affected blade and ensure that the FLOGI for the WWN is gone.

 

Step 3 Delete the automatically created configuration entry for persistent fcid allocation as shown in the following sample:

 

Sample:

 

switch(config)# fcdomain fcid database

switch(config-fcid-db)# no vsan 2 wwn 21:01:00:1b:32:a3:c0:2e fcid 0x620800 area dynamic

 

Step 4 Bring up the blade and ensure that it gets a proper fcid.

 

Sample:

 

Fc2/1 2 0x620104 20:10:00:05:1e:5e:6a:85 10:00:00:05:1e:5e:6a:85

Fc2/1 2 0x620123 21:01:00:1b:32:a3:c0:2e 20:01:00:1b:32:a3:c0:2e
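When many blades are affected, typing the Step 1 and Step 3 commands by hand gets tedious. The following Python sketch builds them from pasted "show flogi database" rows. It is only a helper idea, not a Cisco or HP tool: the function names are hypothetical, it assumes rows with the five whitespace-separated columns shown in the samples, and it assumes the OUI sits in the third through fifth bytes of the pWWN, as it does for the QLogic entries here. It only generates text; the blades still have to be shut down (Step 2) before the entries are deleted.

```python
def oui_as_wwn_bytes(oui: int) -> str:
    """Render a 24-bit OUI as it appears inside a WWN, e.g. '00:1b:32'."""
    text = f"{oui:06x}"
    return ":".join(text[i:i + 2] for i in (0, 2, 4))


def cleanup_commands(flogi_rows, oui: int, wanted_area: int):
    """Build the config-mode commands for Step 1 and Step 3.

    Rows whose pWWN carries the OUI and whose FC ID is outside
    wanted_area get a 'no vsan ... area dynamic' line.
    """
    marker = oui_as_wwn_bytes(oui)
    commands = [
        f"no fcid-allocation area company 0x{oui:06X}",   # Step 1
        "fcdomain fcid database",                         # Step 3 sub-mode
    ]
    for row in flogi_rows:
        _interface, vsan, fcid, pwwn, _nwwn = row.lower().split()
        area = (int(fcid, 16) >> 8) & 0xFF
        if pwwn[6:14] == marker and area != wanted_area:
            commands.append(f"no vsan {vsan} wwn {pwwn} fcid {fcid} area dynamic")
    return commands


rows = [
    "Fc2/1 2 0x620104 20:10:00:05:1e:5e:6a:85 10:00:00:05:1e:5e:6a:85",
    "Fc2/1 2 0x620800 21:01:00:1b:32:a3:c0:2e 20:01:00:1b:32:a3:c0:2e",
    "Fc2/1 2 0x620c00 21:01:00:1b:32:33:8b:8e 20:01:00:1b:32:33:8b:8e",
]
commands = cleanup_commands(rows, oui=0x001B32, wanted_area=0x01)
print("\n".join(commands))
```

Review the generated lines against the switch's actual persistent FCID database before pasting anything into a configuration session.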

 

 

**********************

 

Any comments on, or additions to, what Thomas found and the solution?