
Re: FC connection drop to MSL 6030

 
Havard_3
Occasional Advisor

FC connection drop to MSL 6030

We're experiencing problems with loss of connection to an MSL 6030 tape library from a Windows 2003 server running Backup Exec.
The same problem appears with ntbackup, so I'm fairly certain the problem lies in the hardware configuration, and I'm hoping someone can shed some light on a fix :)

Some key system information:
HP c7000 Enclosure with blades
HP EVA 6000 SAN with both F-SCSI and FATA disks
HP Brocade 4/12 SAN switches
HP FC 2243 (Emulex) controller card (backup server)
HP MSL 6030 Tape library with two Ultrium 960 drives

The library firmware is 5.20, and the drives are on G65W.
The e1200-320 has been updated to 5944.

As far as the controller card on the backup server is concerned it's running firmware version 2.72A2, with driver version 5-2.00A12 (driver name: elxstor).
Boot bios version is 5.02a1 and Remote Manager Server Version is 31.0a23

The backup server is a non-blade server connected to the SAN switches with two fibre cables.

For some reason, we're now experiencing intermittent loss of connection to the library and one of the fibre links always ends up being dropped shortly after reboot. The link that is being dropped is not the one on which the MSL is zoned, but this configuration has worked for several months.

The MSL connection drop occurs at a random interval after the server and MSL have been rebooted.
Sometimes the backups run for a few minutes, and sometimes for a few hours.
I swapped the fibre cables on the controller card around, but it is still the same link that fails.

According to our logs, no maintenance or updates were performed for about three weeks before the error appeared, so I'm at a loss as to the cause. Anyone got any ideas?
16 REPLIES
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
is the library correctly zoned/isolated in the separate zone sets with the appropriate servers?
the pain is one part of the reality
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

I'm not 100% sure any more, as I didn't originally create the zoning for this system and I've been staring myself a bit blind at the problem.


In the alias list for the second Brocade we have the following under WWNs, along with the HSVs and SAN disks.

CROSSROADS SYSTEMS, INC 10:00:00:E0:02:03:AA:45
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:23:AA:45
[28] "HP NS E1200-320 5944"

EMULEX CORPORATION 20:00:00:00:C9:5E:36:D6
EMULEX CORPORATION 10:00:00:00:C9:5E:36:D6
[50] "Emulex FC2243 FV2.72A2 DV5-2.00A12 MAKING-MANAGER"



The zone defined for the MSL <-> Backup Server contains two aliases;
Making_Manager_Right
Making_MSL6000


Making_MSL6000 contains;
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:23:AA:45

Making_Manager_Right contains;
EMULEX CORPORATION 10:00:00:00:C9:5E:36:D6
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:03:AA:45


On the first Brocade we also have a zone defined for the MSL <-> Backup Server that was created to test FC cables recently (with no luck).
It contains two aliases;
Making_Manager_Left
Making_MSL6000


Making_MSL6000 contains;
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:23:AA:45

Making_Manager_Left contains;
EMULEX CORPORATION 10:00:00:00:C9:5E:36:D7


None of these WWNs are available on the first Brocade, as the FC cable from the MSL 6030 goes to the second Brocade, and the link to the backup server's FC2243 card is down.
However, when the zone was created as a test, the FC cables were moved over to this Brocade, and the link failure remained.
Also worth noting is that if we swap FC cables, the link still fails between the connected Brocade and the same port of the FC controller in the backup server.

So, should I remove the zone for MSL <-> Backup Server on the first Brocade, and remove CROSSROADS SYSTEMS, INC 10:00:00:E0:02:03:AA:45 as a member of Making_Manager_Right?
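
The alias membership described above can be checked mechanically. The following is a minimal Python sketch, not Brocade tooling: it uses only the WWNs and alias names quoted in this post, and the role assigned to each WWN is an assumption inferred from the vendor string the switch reports. The helper name `mixed_aliases` is hypothetical.

```python
# Role of each WWN, inferred from the vendor string (an assumption).
ROLES = {
    "10:00:00:E0:02:03:AA:45": "router",  # Crossroads (e1200-320)
    "10:00:00:E0:02:23:AA:45": "router",  # Crossroads (e1200-320)
    "10:00:00:00:C9:5E:36:D6": "hba",     # Emulex FC2243
    "10:00:00:00:C9:5E:36:D7": "hba",     # Emulex FC2243
}

# Alias membership exactly as listed in the post.
ALIASES = {
    "Making_MSL6000": {"10:00:00:E0:02:23:AA:45"},
    "Making_Manager_Right": {
        "10:00:00:00:C9:5E:36:D6",
        "10:00:00:E0:02:03:AA:45",
    },
    "Making_Manager_Left": {"10:00:00:00:C9:5E:36:D7"},
}


def mixed_aliases(aliases, roles):
    """Return the names of aliases whose members span more than one role."""
    return sorted(
        name
        for name, members in aliases.items()
        if len({roles[wwn] for wwn in members}) > 1
    )


print(mixed_aliases(ALIASES, ROLES))
```

This only flags the alias that holds both an HBA port and a router port; whether that membership has anything to do with the link drop is not established here.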
Robin T. Slotten
Trusted Contributor

Re: FC connection drop to MSL 6030

http://www11.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c00659626-6

Check the above link for an alert about SCSI cable paths and termination. After I corrected my MSL6060, things improved greatly.

Rob...
IF you do it more than twice, write a script.
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Thanks for that, but the cabling is already as described in the document.
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
1. what are the brocade firmware versions?
2. did you try to replace the fc cables and/or port SFPs?
3. what are the NSR firmware versions?
the pain is one part of the reality
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
you can troubleshoot the library and host ports with porterrshow, sampled over a reasonable time frame (e.g. 1 hour), to see whether the error counters are growing. If they are, it's an indication of a marginal link (a faulty host HBA port, host/library cable, switch port SFP, or switch mainboard).
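
The "are the counters growing" check can be done by comparing two samples taken some time apart, which also avoids having to reset anything. A minimal Python sketch follows, assuming the k/m/g suffixes in the porterrshow output are decimal multipliers; the counter names and sample values are hypothetical, chosen only to illustrate the comparison.

```python
def parse_count(text):
    """Convert an abbreviated counter such as '3.8k' or '19m' to a number.

    Assumes decimal multipliers for the k/m/g suffixes.
    """
    scale = {"k": 1e3, "m": 1e6, "g": 1e9}
    suffix = text[-1].lower()
    if suffix in scale:
        return float(text[:-1]) * scale[suffix]
    return float(text)


def growing_counters(before, after):
    """Return {counter: increase} for counters that grew between samples."""
    growth = {}
    for name, old in before.items():
        delta = parse_count(after[name]) - parse_count(old)
        if delta > 0:
            growth[name] = delta
    return growth


# Hypothetical samples for one port, taken an hour apart.
before = {"link_fail": "22k", "loss_sync": "3.8k", "crc_err": "0"}
after = {"link_fail": "23k", "loss_sync": "3.8k", "crc_err": "0"}

print(growing_counters(before, after))
```

Because the switch abbreviates large values, small increases on an already large counter will not show up in this comparison.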
the pain is one part of the reality
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Thanks for your reply.

1. Fabric OS version is v5.0.5c on both brocades
2. I've only swapped around the FC cables that go from the two ports on the controller in the backup server to the Brocades. Even with the cables swapped, it's the same port on the controller that drops its link. Resetting the controller port can temporarily bring the link back up, but it never stays up for long.

3. On the e1200 it's 5944 if that's the one you're thinking of?
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Just to clarify, do you mean manually run the command every five minutes or so?
Out of interest, is there a way to reset the current count values? The reason is that I see some errors listed, but these could be related to pulling FC cables in and out during cable testing earlier.


Current output on both is as follows; pardon the wacky formatting, but I've attached it in a text file as well:

sw1fab2:admin> portErrShow
          frames   enc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy
        tx    rx    in   err  shrt  long   eof   out    c3  fail  sync   sig
========================================================================================
  0:  3.7g  2.6g     0     0     0     0     0  454k     0   22k  3.8k  7.5k     0     0
  1:  2.8g  797m     0     0     0     0     0   19m     0     1   116     0     0     0
  2:  165m   26m     0     0     0     0     0   48k     0     0    14     0     0     0
  3:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  4:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  5:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  6:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  7:   10m  5.8m     0     0     0     0     0  123k     0     0    16     0     0     0
  8:   10m  5.6m     0     0     0     0     0   13k     0     0    16     0     0     0
  9:  115m  239m     0     0     0     0     0  974k     0     0    26     0     0     0
 10:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 11:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 12:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 13:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 14:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 15:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 16:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 17:  665m  2.8g     0     0     0     0     0     1     0     0     3     2     0     0
 18:  707m  522m     0     0     0     0     0     0     0     0     3     2     0     0
 19:  2.8g  4.0g     0     0     0     0     0  172k   297   182   855   447     0     0
 20:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
 21:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
 22:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
 23:     0     0     0     0     0     0     0     0     0     0     2     2     0     0


sw1fab1:admin> portErrShow
          frames   enc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy
        tx    rx    in   err  shrt  long   eof   out    c3  fail  sync   sig
========================================================================================
  0:  659m  2.3g     0     0     0     0     0  4.8k    33    27    68    73     0     0
  1:  2.8g  796m     0     0     0     0     0  243k     0     0    42     0     0     0
  2:  164m   26m     0     0     0     0     0     5     0     0    14     0     0     0
  3:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  4:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  5:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  6:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
  7:  4.0g  4.1g     0     0     0     0     0  6.4k     0     0    16     0     0     0
  8:  579m  2.6g     0     0     0     0     0  187k     0     0    16     0     0     0
  9:  116m  239m     0     0     0     0     0  455m     0     0   249     0     0     0
 10:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 11:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 12:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 13:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 14:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 15:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 16:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 17:  3.1g  3.1g     0     0     0     0     0     0     0     0     3     2     0     0
 18:  693m  500m     0     0     0     0     0     0     0     0     3     2     0     0
 19:  2.5g  919m     0     0     0     0     0  2.4m   294   158   530   453     0     0
 20:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
 21:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
 22:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
 23:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
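
To pick out the ports worth a closer look, the rows above can be split into named columns. This is a minimal Python sketch: the column names are my reading of the two-line header (an assumption), the helper names are hypothetical, and the sample lines are three rows copied from the sw1fab1 output.

```python
# Column names, read off the two-line portErrShow header (an assumption).
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "too_shrt", "too_long",
    "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync", "loss_sig",
    "frjt", "fbsy",
]


def parse_row(line):
    """Split one 'port: v1 v2 ...' row into (port, {column: raw text})."""
    port, values = line.split(":", 1)
    fields = values.split()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(fields)}")
    return int(port), dict(zip(COLUMNS, fields))


def ports_with_link_errors(lines):
    """Return ports whose link_fail or loss_sig counter is non-zero."""
    flagged = []
    for line in lines:
        port, row = parse_row(line)
        if row["link_fail"] != "0" or row["loss_sig"] != "0":
            flagged.append(port)
    return flagged


# Three rows copied from the sw1fab1 output.
SAMPLE = [
    "0: 659m 2.3g 0 0 0 0 0 4.8k 33 27 68 73 0 0",
    "1: 2.8g 796m 0 0 0 0 0 243k 0 0 42 0 0 0",
    "19: 2.5g 919m 0 0 0 0 0 2.4m 294 158 530 453 0 0",
]

print(ports_with_link_errors(SAMPLE))
```

Run against the full listing, this would also flag ports 17, 18 and 20 to 23, which show a loss-of-signal count of 2, so the absolute values still need to be read in context.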