
MSA1000, Linux & Multi Pathing

Nicole Angelis
Occasional Advisor

Hi Admins,

We installed a new MSA1000 this weekend. Here is the configuration:

MSA1000 with 1 enclosure and 10 disks
Dual controllers
Dual MSA SAN switches
Storage is shared among 8 blade bl20ps running RH ES3 update 4.

I am very familiar with zoning so I know it is correct, but it is as follows (we're testing on only 3 blades right now):

Fabric A: (3) Zones made up of the following aliases:
blade1 "1,0; 1,1"
blade2 "1,0; 1,2"
blade3 "1,0; 1,3"

Fabric B: (3) Zones made up of the following aliases:
blade1b "2,0; 2,1"
blade2b "2,0; 2,2"
blade3b "2,0; 2,3"

The active config shows all zones correctly on both fabrics.

Here are the questions I have:

1. When we did the zoning on the switches, no fibre channel cables were connected; since we used port zoning, that should be OK. We reset the MSA1000 and everything looks good: the controllers show "STARTUP COMPLETE". However, when we connect to the fabric A switch we can see the active config with all the zoning information (cfgShow), but on the second switch (fabric B) it says there are no active configurations. Is this normal? The MSA CLI command SHOW CONNECTIONS shows all 6 WWPNs, which leads me to believe the zoning is OK.

2. We are using the latest QLogic driver kit in order to take advantage of the failover features without Secure Path. The driver installation went fine, and the 3 blades can see the MSA and the LUNs presented to them. However, one of the blades sees the LUN via the backup controller and not the active one. We rebooted the server and still saw the LUN via the backup controller. We then unpresented all LUNs from all servers and presented them again, and the LUN showed up via the active controller. Does this point to some sort of hardware failure on one of the FC HBAs, or perhaps the FC cable?

3. When using ACU and SSP, should the SSP screen show only the (3) WWPNs on the active controller (fabric A), or should it show all 6 (2 per blade)?
We had a situation in which (5) WWPNs appeared, one of them connected to fabric B; after a reboot we saw 4 and then 6. What should the default behaviour be?

4. What is the best way to test failover: disconnecting a cable? Pulling out a controller? When using the QLogic driver in multipath failover mode, the process should be transparent, just as it would be with Secure Path, right?

5. When we ran lssg or probe-lun, we saw only a single path per LUN. Does this mean the QLogic driver does some sort of LUN masking the way Secure Path did, or is something not configured right and we should see two paths?

Has anyone had any similar situations, or any advice on how to address these issues? Our goal is to get these servers and storage in production by Sunday. Any help or advice is greatly appreciated.

Thanks!

Nicole.
1 REPLY
Uwe Zessin
Honored Contributor

Re: MSA1000, Linux & Multi Pathing

1.) On Brocade switches you can do port- and WWN-based zoning without having the devices attached. It sounds like the zoning was not activated on fabric B. In that case all ports can talk to each other. Do a "switchShow" and you should see whether zoning is enabled or not.
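
If you capture the "switchShow" output from both switches to text, the check can be sketched in a few lines of Python. This is only a rough sketch: it assumes the output contains a line of the form "zoning: ON (cfgname)" or "zoning: OFF", the helper name zoning_state is made up for illustration, and the sample text is invented rather than taken from a real switch.

```python
import re

# Assumed line format: "zoning:  ON (cfg_name)" or "zoning:  OFF".
ZONING_LINE = re.compile(r"\s*zoning:\s*(ON|OFF)\b\s*(?:\(([^)]*)\))?", re.IGNORECASE)

def zoning_state(switchshow_text):
    """Return (enabled, cfg_name) from captured switchShow text.

    Returns (None, None) when no 'zoning:' line is present.
    """
    for line in switchshow_text.splitlines():
        m = ZONING_LINE.match(line)
        if m:
            return m.group(1).upper() == "ON", m.group(2)
    return None, None

# Illustrative samples only, not captured from a real switch.
SAMPLE_A = "switchName:\tfabA\nswitchState:\tOnline\nzoning:\t\tON (cfg_a)\n"
SAMPLE_B = "switchName:\tfabB\nswitchState:\tOnline\nzoning:\t\tOFF\n"

if __name__ == "__main__":
    for name, text in (("fabric A", SAMPLE_A), ("fabric B", SAMPLE_B)):
        print(name, zoning_state(text))
```

A result of (False, None) for fabric B would match the "no active configuration" symptom described in question 1.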

2.) I wonder if it is possible that a boot has caused a failover to the second MSA1000 controller...

3.) You need to see all six connections so that you can assign the correct profile and make the correct SSP setup. There are sometimes problems (not limited to the MSA1000) where a storage array does not recognize a fibre channel adapter. I have found that it often helps to toggle the adapter's switch port ("portDisable", "portEnable") or to poll the storage array from the adapter BIOS.

4.) You can pull a cable or manually disable the switch port. Pulling a controller can cause certain MSA1000s to lock up. The failover should be transparent, but it can take a little while.
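
To put a number on "a little while", one option is to keep small synchronous writes going to a file on the MSA LUN while you pull the cable, and note how long the slowest write takes. The following is a minimal sketch, assuming a Python 3 interpreter is available on the blade; the function names, the 5-second threshold and any file path you pass in are placeholders, not part of the QLogic kit.

```python
import os
import time

def timed_write(path, payload=b"x" * 4096):
    """Write one block to path, fsync it, and return the elapsed seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # force the write down to the device, not just the page cache
    finally:
        os.close(fd)
    return time.monotonic() - start

def watch(path, iterations, interval=1.0, stall_threshold=5.0):
    """Repeat timed writes; return (iteration, seconds) for each stall.

    A failed write is recorded as (iteration, None).
    """
    stalls = []
    for i in range(iterations):
        try:
            elapsed = timed_write(path)
        except OSError as exc:
            print("write %d failed: %s" % (i, exc))
            stalls.append((i, None))
        else:
            if elapsed >= stall_threshold:
                print("write %d stalled for %.1f s" % (i, elapsed))
                stalls.append((i, elapsed))
        time.sleep(interval)
    return stalls
```

Start something like watch("/mnt/msa/failover_probe", 300) on a filesystem that lives on the LUN (the mount point is hypothetical), then pull the cable or disable the switch port. An empty result means no write took longer than the threshold; any I/O errors mean the failover was not transparent.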

5.) I have yet to play with the QLogic driver, but I expect that is normal behaviour. The driver works similarly to Secure Path and shields all paths from the upper layers in the operating system.

Technically speaking, it is not called LUN masking - that would be if the driver blocks all access to one or more LUNs from the storage array completely.
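
To see how many paths the operating system itself is shown, you can count the entries per LUN in /proc/scsi/scsi. The sketch below does that in Python 3; the function name paths_per_lun and the model filter are illustrative choices, and the sample text is made up in the usual /proc/scsi/scsi layout, not taken from this system.

```python
import re
from collections import Counter

# One /proc/scsi/scsi entry: a "Host: ..." line followed by a "Vendor: ..." line.
ENTRY = re.compile(
    r"Host:\s*(?P<host>\S+)\s+Channel:\s*(?P<chan>\d+)\s+"
    r"Id:\s*(?P<id>\d+)\s+Lun:\s*(?P<lun>\d+)\s+"
    r"Vendor:\s*(?P<vendor>.*?)\s+Model:\s*(?P<model>.*?)\s+Rev:"
)

def paths_per_lun(proc_scsi_text, model_contains=""):
    """Count entries per LUN number, optionally only for matching models."""
    counts = Counter()
    for m in ENTRY.finditer(proc_scsi_text):
        if model_contains in m.group("model"):
            counts[int(m.group("lun"))] += 1
    return dict(counts)

# Illustrative sample: one HBA (scsi0) showing two LUNs.
SINGLE_PATH = """Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: COMPAQ   Model: MSA1000 VOLUME   Rev: 4.32
  Type:   Direct-Access                    ANSI SCSI revision: 04
Host: scsi0 Channel: 00 Id: 00 Lun: 01
  Vendor: COMPAQ   Model: MSA1000 VOLUME   Rev: 4.32
  Type:   Direct-Access                    ANSI SCSI revision: 04
"""

# Same LUN 0 also visible through a second HBA (scsi1).
DUAL_PATH = SINGLE_PATH + """Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: COMPAQ   Model: MSA1000 VOLUME   Rev: 4.32
  Type:   Direct-Access                    ANSI SCSI revision: 04
"""

if __name__ == "__main__":
    print(paths_per_lun(SINGLE_PATH, "MSA1000"))
    print(paths_per_lun(DUAL_PATH, "MSA1000"))
```

On the real host you would pass in open("/proc/scsi/scsi").read(). If the failover driver hides the second path as expected, every LUN should come back with a count of 1.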