
Strange SAN problem.

 
Leif Halvarsson_2
Honored Contributor

Strange SAN problem.

Hi,
I have two identical A400 servers (server1 and server2), both with A5158A FC-HBAs. Both run HP-UX 11.11 with identical patch levels.

The SAN contains one Vixel 1000 FC-HUB and one Qlogic Sanbox2 FC-Switch.

If I connect server1 to the switch there is no problem: the link is up and the devices on the SAN are listed with ioscan.

But if I connect server2 to the switch, the link does not come up. What seems strange to me is that the link does come up if I connect server2 to the HUB.

The switch port settings are identical, and I have also tried changing the cables and the ports.

I can't remember changing any HBA settings from their defaults on either server.

When booting server2 with the HBA connected to the switch, I get the following message in syslog.log:

Jul 6 19:58:15 c169 vmunix: 0/2/0/0: Fibre Channel Driver detected a NOS/OLS storm. Frame Manager Status Register = 0x800800D0.

Summary:

- All components (HBAs, Switch, HUB) seem to be OK.
- Both servers get link up if connected to the HUB.
- Only server1 gets link up if connected to the switch.
- I can't find any differences between the two servers.
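
To check whether the same message shows up on the other server or at other times, the log can be scanned for it. The following is a minimal Python sketch, not an HP tool: the pattern is derived only from the single log line quoted above, so other driver messages may need a different pattern, and the function name `find_storms` is made up for illustration. It extracts the hardware path and the register value but does not try to interpret the register bits.

```python
import re

# Pattern based only on the one syslog line quoted in this thread;
# other Fibre Channel driver messages may be formatted differently.
STORM_RE = re.compile(
    r"vmunix: (?P<hw_path>[\d/]+): Fibre Channel Driver detected a NOS/OLS storm\."
    r"\s+Frame Manager Status Register = (?P<reg>0x[0-9A-Fa-f]+)"
)


def find_storms(lines):
    """Return (hardware_path, register_value) for each NOS/OLS storm line."""
    hits = []
    for line in lines:
        match = STORM_RE.search(line)
        if match:
            hits.append((match.group("hw_path"), int(match.group("reg"), 16)))
    return hits


# The message from the post, joined back into a single line.
sample = [
    "Jul 6 19:58:15 c169 vmunix: 0/2/0/0: Fibre Channel Driver detected a "
    "NOS/OLS storm. Frame Manager Status Register = 0x800800D0.",
]

for hw_path, reg in find_storms(sample):
    print(hw_path, hex(reg))
```

In practice the lines would come from reading syslog.log instead of the hard-coded `sample` list.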



9 REPLIES
Michael Tully
Honored Contributor

Re: Strange SAN problem.

Hi Leif,

The HBA cards themselves don't carry any type of firmware, but they do need driver software to run. Are you running the same driver version on both systems? There should be no need to change the settings for the HBA cards, since they are auto-sensing.

Have a look here, if you haven't already.

http://www.software.hp.com/cgi-bin/swdepot_parser.cgi/cgi/displayProductInfo.pl?productNumber=FibrChanl00&oper=install

Also note the patch dependency.

HTH
Michael
Anyone for a Mutiny ?
Trevor Roddam_1
Valued Contributor

Re: Strange SAN problem.

Just wondering if there is any zoning set on the switch?

I know you mentioned using different ports etc but worth checking.

Trevor.
Baldric, I have a plan so cunning you could pin a tail on it and call it a weasle.
Massimo Bianchi
Honored Contributor

Re: Strange SAN problem.

Hi,
another thing you can check is whether you have software like SecureManager and have set up some security using the WWN.


But what hit me is the message:

Jul 6 19:58:15 c169 vmunix: 0/2/0/0: Fibre Channel Driver detected a NOS/OLS storm. Frame Manager Status Register = 0x800800D0.


Usually I see these "storm" messages when there is a problem with the card itself, or an incompatibility between switch and card (settings?).

Did you try a check with stm or, better, fcmsutil, to see if you changed some standard parameter?

Like FC-AL settings....

A switch usually has some better logic in it. I don't know the Vixel ones, but maybe there is some special setting needed to have more than one server attached.

HTH,
Massimo


Massimo Bianchi
Honored Contributor

Re: Strange SAN problem.

Hi,
another hint:
take a look at this thread:

http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x17c9a2db8513d6118ff40090279cd0f9,00.html


It had exactly the same error message, even the same frame register value.

Massimo
Leif Halvarsson_2
Honored Contributor

Re: Strange SAN problem.

Hi,

I still haven't had much luck with this problem.

What I could find was that fabric mode was not supported in earlier versions of the A5158A driver. But since I have autodetection set for the switch ports, I think the port should change to loop mode if that were the case. And the port for Server1 comes up in fabric mode.

I also found a difference in patching which is related to FC, but the problem solved by this patch only applies to V-class servers.

And, more strangeness: the fcmsutil command gives the following output for Server1 (the server connected to the switch):

fcmsutil /dev/td0

Vendor ID is = 0x00103C
Device ID is = 0x001028
PCI Sub-system Vendor ID is = 0x00103C
PCI Sub-system ID is = 0x000006
Topology = Topology is Unknown
Local N_Port_id is = 0x010100
Memory fault(coredump)

But, for Server2:

fcmsutil /dev/td0

Vendor ID is = 0x00103C
Device ID is = 0x001028
PCI Sub-system Vendor ID is = 0x00103C
PCI Sub-system ID is = 0x000006
Topology = IN_LOOP
Local N_Port_id is = 0x000001
Local Loop_id is = 125
N_Port Node World Wide Name = 0x50060B00000AC8E9
N_Port Port World Wide Name = 0x50060B00000AC8E8
Driver state = ONLINE
Hardware Path is = 0/2/0/0
Number of Assisted IOs = 24722
Number of Active Login Sessions = 1
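
To see at a glance which fields differ between two fcmsutil outputs like the ones above, they can be parsed into key/value pairs and compared. This is a rough Python sketch that assumes only the "name is = value" / "name = value" layout visible in this post; the helper names `parse_fcmsutil` and `diff_fields` are made up for illustration, and lines without an "=" (such as the "Memory fault(coredump)" line) are simply skipped.

```python
def parse_fcmsutil(text):
    """Parse 'Name is = value' / 'Name = value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # blank lines, or e.g. "Memory fault(coredump)"
        key = key.strip()
        if key.endswith(" is"):
            key = key[:-3].rstrip()  # "Vendor ID is" -> "Vendor ID"
        fields[key] = value.strip()
    return fields


def diff_fields(a, b):
    """Return {key: (a_value, b_value)} for keys that differ or are missing."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# The two outputs exactly as quoted in the post above.
FIRST_OUTPUT = """\
Vendor ID is = 0x00103C
Device ID is = 0x001028
PCI Sub-system Vendor ID is = 0x00103C
PCI Sub-system ID is = 0x000006
Topology = Topology is Unknown
Local N_Port_id is = 0x010100
Memory fault(coredump)
"""

SECOND_OUTPUT = """\
Vendor ID is = 0x00103C
Device ID is = 0x001028
PCI Sub-system Vendor ID is = 0x00103C
PCI Sub-system ID is = 0x000006
Topology = IN_LOOP
Local N_Port_id is = 0x000001
Local Loop_id is = 125
N_Port Node World Wide Name = 0x50060B00000AC8E9
N_Port Port World Wide Name = 0x50060B00000AC8E8
Driver state = ONLINE
Hardware Path is = 0/2/0/0
Number of Assisted IOs = 24722
Number of Active Login Sessions = 1
"""

differences = diff_fields(parse_fcmsutil(FIRST_OUTPUT), parse_fcmsutil(SECOND_OUTPUT))
for name, (first, second) in differences.items():
    print(f"{name}: {first!r} vs {second!r}")
```

A value of None on one side means the field was missing from that output, which here is the case for everything after the point where the first command crashed.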




I use the "FibrChanl-00 B.11.11.03" software; the latest version seems to be B.11.11.09. It may be worth trying to install the latest software. I




Brian King
Advisor

Re: Strange SAN problem.

Hi Leif,

I'd like to get some more information about your design. From your description, it seems the problem could be a configuration issue more than anything else. Which SAN disk array are you using, and are the ports configured for Arbitrated Loop or Fabric Login (or a mix)? How are you using the FC-Hub: is it connected to the Qlogic switch for future expansion? Is it your intention to have the pair of A400 servers connected directly to the Qlogic switch, or to the FC-Hub?

The Vixel 1000 FC-HUB, by nature, supports only Arbitrated Loop, while the Qlogic Sanbox2 supports both Arbitrated Loop and Fabric Login. You need to ensure that all components are configured for one or the other on a given path. There are more configuration options to consider, such as zoning and security (HP SecureManager, etc).

Have you tried connecting server2 to the same port that server1 is working on? Let's assume so. Then, at first glance, it appears to me that server2, the port that it is connected to on the Qlogic switch (possibly all of the ports), and the SAN disk array port(s) are all configured with Fabric Login turned on, and server1 is configured as Arbitrated Loop (hence the lack of link light on the Qlogic switch and the presence of the link light on the Vixel FC-Hub).

Another design consideration is to upgrade the FC cards currently installed in the A400 servers to the more current A6795A fibre channel card, which allows for up to 2 Gb/s port speed (supported by the Qlogic Sanbox2 FC-Switch).
Brian King
Advisor

Re: Strange SAN problem.

Leif,

This is a modification to my original message to you:

Have you tried connecting server2 to the same port that server1 is working on? Let's assume so. Then, at first glance, it appears to me that server1, the port that it is connected to on the Qlogic switch (possibly all of the ports), and the SAN disk array port(s) are all configured with Fabric Login turned on, and server2 is configured as Arbitrated Loop (hence the lack of link light on the Qlogic switch and the presence of the link light on the Vixel FC-Hub).

I had server1 and server2 swapped.

Brian
Leif Halvarsson_2
Honored Contributor

Re: Strange SAN problem.

Hi,

This small SAN was primarily intended for a backup solution. All components were connected to the HUB: there were only 3 hosts (2 HP-UX and one Windows), one library (SureStore 2/20) and one disk array (CMD). There were plans to expand with more backup clients in the future.

I know that it is not recommended to connect libraries to a HUB, but this configuration has worked well for about one year.

When we needed to connect more backup clients, my idea was to move the two HP-UX hosts, the library and the disk to a switch, connect the HUB to one switch port, and then connect the new backup clients to the HUB.

I suspect the problem is the software "FibrChanl-00 B.11.11.03". I upgraded Server1 to B.11.11.09 and it solved the fcmsutil problem described in my previous message. I will try to upgrade Server2 tomorrow and see whether it is then possible to connect to the switch.
Leif Halvarsson_2
Honored Contributor

Re: Strange SAN problem.

Hi,
It seems that "FibrChanl-00 B.11.11.03" was the problem. After installing B.11.11.09 there was no problem connecting to the switch.
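
For anyone checking several hosts against this result, the version comparison can be sketched in a few lines of Python. The only facts taken from this thread are that B.11.11.03 showed the problem and B.11.11.09 did not; treating B.11.11.09 as the reference version, the `KNOWN_GOOD` constant, and the function names are assumptions made for illustration. The versions in between were not tested here.

```python
# Assumption: B.11.11.09 is used as the reference only because it is the
# version reported to work in this thread, not a documented minimum.
KNOWN_GOOD = "B.11.11.09"


def parse_version(version):
    """Split e.g. 'B.11.11.09' into ('B', (11, 11, 9))."""
    prefix, _, rest = version.partition(".")
    return prefix, tuple(int(part) for part in rest.split("."))


def is_older(installed, reference=KNOWN_GOOD):
    """True if 'installed' is numerically older than 'reference'."""
    installed_prefix, installed_nums = parse_version(installed)
    reference_prefix, reference_nums = parse_version(reference)
    if installed_prefix != reference_prefix:
        raise ValueError("version prefixes differ; cannot compare")
    return installed_nums < reference_nums


for version in ("B.11.11.03", "B.11.11.09"):
    print(version, "older than", KNOWN_GOOD, "->", is_older(version))
```

The installed version string itself would have to be read from the host's software inventory; that step is left out because its output format is not shown in this thread.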