FC connection drop to MSL 6030

Havard_3
Occasional Advisor


We're experiencing problems with loss of connection to an MSL 6030 tape library from a Windows 2003 server running Backup Exec.
The same problem appears with ntbackup, so I'm fairly certain that the problem is at the hardware configuration end of things, and I am hoping someone can possibly shed some light on a fix :)

Some key system information:
HP c7000 Enclosure with blades
HP EVA 6000 SAN with both F-SCSI and FATA disks
HP Brocade 4/12 SAN switches
HP FC 2243 (Emulex) controller card (backup server)
HP MSL 6030 Tape library with two Ultrium 960 drives

The library firmware is 5.20, and the drives are on G65W.
The e1200-320 has been updated to 5944.

As far as the controller card on the backup server is concerned it's running firmware version 2.72A2, with driver version 5-2.00A12 (driver name: elxstor).
Boot bios version is 5.02a1 and Remote Manager Server Version is 31.0a23

The backup server is a non-blade server which is connected to the SAN switches with two fibre cables.

For some reason, we're now experiencing intermittent loss of connection to the library and one of the fibre links always ends up being dropped shortly after reboot. The link that is being dropped is not the one on which the MSL is zoned, but this configuration has worked for several months.

The MSL connection drop occurs at a random interval after the server and MSL have been rebooted.
Sometimes the backups can run for a few minutes, and sometimes a few hours.
I swapped the fibre cables on the controller card around, but it is still the same link that fails.

According to our logs, no maintenance or updates were performed for about three weeks prior to the error appearing, so I'm kind of lost as to the cause of this. Anyone got any ideas?
16 REPLIES
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
is the library correctly zoned/isolated in the separate zone sets with the appropriate servers?
the pain is one part of the reality
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

I'm not 100% sure any more, as I didn't originally create the zoning for this system and I've been staring myself a bit blind on the problem now.


In the alias list for the second Brocade we have the following under WWNs, along with the HSVs and SAN disks.

CROSSROADS SYSTEMS, INC 10:00:00:E0:02:03:AA:45
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:23:AA:45
[28] "HP NS E1200-320 5944"

EMULEX CORPORATION 20:00:00:00:C9:5E:36:D6
EMULEX CORPORATION 10:00:00:00:C9:5E:36:D6
[50] "Emulex FC2243 FV2.72A2 DV5-2.00A12 MAKING-MANAGER"



The zone defined for the MSL <-> Backup Server contains two aliases;
Making_Manager_Right
Making_MSL6000


Making_MSL6000 contains;
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:23:AA:45

Making_Manager_Right contains;
EMULEX CORPORATION 10:00:00:00:C9:5E:36:D6
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:03:AA:45


On the first Brocade we also have a zone defined for the MSL <-> Backup Server that was created to test FC cables recently (with no luck).
It contains two aliases;
Making_Manager_Left
Making_MSL6000


Making_MSL6000 contains;
CROSSROADS SYSTEMS, INC 10:00:00:E0:02:23:AA:45

Making_Manager_Left contains;
EMULEX CORPORATION 10:00:00:00:C9:5E:36:D7


None of these WWNs are available on the first Brocade, as the FC cable from the MSL 6030 goes to the second Brocade, and the link to the backup server's FC2243 card is down.
However, when it was created as a test, the FC cables were moved over to this Brocade - the link failing issue remained when doing this.
Also worthy of note is that if we swap FC cables, the link still fails between the connected Brocade and the same port of the FC controller from the backup server.

So, remove the zone for MSL <-> Backup Server on the first Brocade, and remove CROSSROADS SYSTEMS, INC 10:00:00:E0:02:03:AA:45 as a member of Making_Manager_Right?
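
The alias membership listed above can be sanity-checked by grouping each alias's WWNs by the vendor label the switch prints next to them. The Python sketch below is only an illustration: the alias contents are copied from this post, `ALIASES` and `mixed_vendor_aliases` are hypothetical names, and the rule that an alias should not mix HBA and router WWNs is an assumption about the intended layout, not a Brocade requirement.

```python
# Alias members as posted for the second Brocade: (vendor label, WWN).
ALIASES = {
    "Making_MSL6000": [
        ("CROSSROADS SYSTEMS, INC", "10:00:00:E0:02:23:AA:45"),
    ],
    "Making_Manager_Right": [
        ("EMULEX CORPORATION", "10:00:00:00:C9:5E:36:D6"),
        ("CROSSROADS SYSTEMS, INC", "10:00:00:E0:02:03:AA:45"),
    ],
}


def mixed_vendor_aliases(aliases):
    """Return {alias: sorted vendor labels} for aliases that mix vendors."""
    mixed = {}
    for name, members in aliases.items():
        vendors = sorted({vendor for vendor, _wwn in members})
        if len(vendors) > 1:
            mixed[name] = vendors
    return mixed


if __name__ == "__main__":
    for name, vendors in mixed_vendor_aliases(ALIASES).items():
        print(f"{name} mixes WWNs from: {'; '.join(vendors)}")
```

With the data as posted, only Making_Manager_Right is flagged, which is the alias the question above is about.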
Robin T. Slotten
Trusted Contributor

Re: FC connection drop to MSL 6030

http://www11.itrc.hp.com/service/cki/docDisplay.do?docLocale=en&docId=emr_na-c00659626-6

Check the above link for an alert about SCSI cable paths and termination. After correcting my MSL6060, things improved greatly.

Rob...
IF you do it more than twice, write a script.
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Thanks for that, but the cabling is already as described in the document.
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
1. what are the brocade firmware versions?
2. did you try to replace the fc cables and/or port SFPs?
3. what are the NSR firmware versions?
the pain is one part of the reality
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
you can troubleshoot the library and host ports via porterrshow over a reasonable time frame, e.g. 1 hour, to see if the error counters are growing there. If yes, it's an indication of a marginal link problem (faulty host HBA port, host/library cable, switch port SFP, or switch mainboard).
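
One thing to keep in mind when watching the counters: portErrShow abbreviates large values (for example 454k, 3.8k, 2.8g), so a small increase on a busy port may not change what is printed. The Python sketch below illustrates the idea. It assumes the suffixes are decimal (k = 1,000, m = 1,000,000, g = 1,000,000,000), which is an assumption and not something confirmed in this thread, and `parse_counter` is a hypothetical helper name.

```python
from decimal import Decimal

# Assumed decimal multipliers for the suffixes seen in portErrShow output.
SUFFIXES = {"k": 10**3, "m": 10**6, "g": 10**9}


def parse_counter(text):
    """Return (approximate value, display step) for a counter such as '3.8k'.

    The step is roughly the smallest increase that must change the
    printed value, given the number of digits shown.
    """
    text = text.strip().lower()
    if text[-1] in SUFFIXES:
        scale = SUFFIXES[text[-1]]
        number = Decimal(text[:-1])
        # Digits after the decimal point decide how coarse the display is.
        decimals = -number.as_tuple().exponent
        step = scale // 10**decimals
        return int(number * scale), step
    return int(text), 1


if __name__ == "__main__":
    for shown in ("340", "3.8k", "22k"):
        value, step = parse_counter(shown)
        print(f"{shown}: about {value}, changes smaller than {step} may be hidden")
```

Under that assumption, a counter shown as 3.8k could grow by several dozen events without the displayed value changing, so unabbreviated counters are the easier ones to compare between runs.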
the pain is one part of the reality
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Thanks for your reply.

1. Fabric OS version is v5.0.5c on both brocades
2. I've only swapped round the FC cables that go from the two ports on the controller in the backup server to the Brocades. Even with the cables swapped round, it's the same port on the controller that drops its link. Resetting the controller port can temporarily bring the link back up, but it never stays up for long.

3. On the e1200 it's 5944 if that's the one you're thinking of?
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Just to clarify, do you mean manually run the command every five minutes or so?
Out of interest, is there a way to reset the current count values? Reason is I see some errors listed, but this could be related to pulling FC cables in and out during cable testing earlier


Current output on both is as follows; pardon the wacky formatting, but I've attached it in a text file as well:

sw1fab2:admin> portErrShow
      frames      enc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy
       tx    rx    in   err  shrt  long   eof   out    c3  fail  sync   sig
=======================================================================================
 0:  3.7g  2.6g     0     0     0     0     0  454k     0   22k  3.8k  7.5k     0     0
 1:  2.8g  797m     0     0     0     0     0   19m     0     1   116     0     0     0
 2:  165m   26m     0     0     0     0     0   48k     0     0    14     0     0     0
 3:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 4:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 5:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 6:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 7:   10m  5.8m     0     0     0     0     0  123k     0     0    16     0     0     0
 8:   10m  5.6m     0     0     0     0     0   13k     0     0    16     0     0     0
 9:  115m  239m     0     0     0     0     0  974k     0     0    26     0     0     0
10:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
11:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
12:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
13:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
14:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
15:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
16:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
17:  665m  2.8g     0     0     0     0     0     1     0     0     3     2     0     0
18:  707m  522m     0     0     0     0     0     0     0     0     3     2     0     0
19:  2.8g  4.0g     0     0     0     0     0  172k   297   182   855   447     0     0
20:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
21:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
22:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
23:     0     0     0     0     0     0     0     0     0     0     2     2     0     0


sw1fab1:admin> portErrShow
      frames      enc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy
       tx    rx    in   err  shrt  long   eof   out    c3  fail  sync   sig
=======================================================================================
 0:  659m  2.3g     0     0     0     0     0  4.8k    33    27    68    73     0     0
 1:  2.8g  796m     0     0     0     0     0  243k     0     0    42     0     0     0
 2:  164m   26m     0     0     0     0     0     5     0     0    14     0     0     0
 3:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 4:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 5:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 6:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
 7:  4.0g  4.1g     0     0     0     0     0  6.4k     0     0    16     0     0     0
 8:  579m  2.6g     0     0     0     0     0  187k     0     0    16     0     0     0
 9:  116m  239m     0     0     0     0     0  455m     0     0   249     0     0     0
10:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
11:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
12:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
13:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
14:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
15:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
16:     0     0     0     0     0     0     0     0     0     0     2     0     0     0
17:  3.1g  3.1g     0     0     0     0     0     0     0     0     3     2     0     0
18:  693m  500m     0     0     0     0     0     0     0     0     3     2     0     0
19:  2.5g  919m     0     0     0     0     0  2.4m   294   158   530   453     0     0
20:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
21:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
22:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
23:     0     0     0     0     0     0     0     0     0     0     2     2     0     0
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi, the most important thing is to repeat the command and compare the error counter values on the impacted ports
the pain is one part of the reality
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Ok, here's the process done following your input.

Ran a short backup just to produce the error and start with a clean sheet.
Rebooted backup server only, e1200 & library didn't show up and link to first brocade down.
Rebooted library and server, e1200 & library shows up and link to first brocade down

Then started portErrShow, noted values.
Initiated ntbackup job, ran for 56 minutes and 42 seconds then failed.
During backup job portErrShow was re-run about every 5 to 10 minutes, but values didn't change.
Once the error appeared I again ran portErrShow, and the only value that had changed was under disc c3, which went up from 298 to 340 for fc connection to the controller card as seen below.

0 = e1200
19 = controller card in server


      frames      enc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy
       tx    rx    in   err  shrt  long   eof   out    c3  fail  sync   sig
=======================================================================================
 0:  3.7g  2.6g     0     0     0     0     0  454k     0   22k  3.8k  7.6k     0     0
19:  2.8g  4.0g     0     0     0     0     0  172k   298   182   867   459     0     0


      frames      enc   crc   too   too   bad   enc  disc  link  loss  loss  frjt  fbsy
       tx    rx    in   err  shrt  long   eof   out    c3  fail  sync   sig
=======================================================================================
 0:  3.7g  2.6g     0     0     0     0     0  454k     0   22k  3.8k  7.6k     0     0
19:  2.8g  4.0g     0     0     0     0     0  172k   340   182   867   459     0     0


I then checked HBAnyware on the backup server;
Link to second brocade up and HSV 200's visible, but drives & e1200 not listed.
Link to first brocade down, still.

Verified this in turn by going into the web mgmt interface for the second brocade, and here the port connection to the controller card was still alive (19) - but the link to the e1200 was down (0).
As mentioned above, the only place where a value changed was the port connecting the controller card on the backup server - not the one to the e1200. Yet that is the one that drops.
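
The manual comparison described above can also be done mechanically. As a rough Python sketch, the snippet below takes the two port 19 rows quoted in this post and lists the columns whose printed value changed. The column names are shorthand for the two-line portErrShow header, and `parse_row` and `changed_columns` are hypothetical helpers, not part of any Brocade tool.

```python
# Shorthand names for the 14 portErrShow columns, in printed order.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "too_shrt", "too_long",
    "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync", "loss_sig",
    "frjt", "fbsy",
]


def parse_row(line):
    """Split a row such as '19: 2.8g 4.0g 0 ...' into (port, {column: text})."""
    port, _, rest = line.partition(":")
    values = rest.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(port), dict(zip(COLUMNS, values))


def changed_columns(before, after):
    """Return {column: (old text, new text)} for columns whose text differs."""
    port_before, old = parse_row(before)
    port_after, new = parse_row(after)
    if port_before != port_after:
        raise ValueError("rows belong to different ports")
    return {c: (old[c], new[c]) for c in COLUMNS if old[c] != new[c]}


# Port 19 rows as posted: before the backup job and after it failed.
BEFORE = "19: 2.8g 4.0g 0 0 0 0 0 172k 298 182 867 459 0 0"
AFTER = "19: 2.8g 4.0g 0 0 0 0 0 172k 340 182 867 459 0 0"

if __name__ == "__main__":
    for column, (old, new) in changed_columns(BEFORE, AFTER).items():
        print(f"port 19 {column}: {old} -> {new}")
```

For these two rows only disc_c3 differs (298 to 340, an increase of 42), which matches the manual reading. The comparison is on the printed text, so it cannot see changes hidden by abbreviated values such as 172k.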
IBaltay
Honored Contributor

Re: FC connection drop to MSL 6030

Hi,
can you connect the library to the second ("good") fabric to see whether the problem persists there, in order to isolate it?
the pain is one part of the reality
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

-Removed the zoning configuration that was related to the backup server and MSL on both switches.
-Moved the MSL onto the other SAN switch, along with the backup server (using the "good" port on the controller card).
-Verified that they appeared on the switch, and recreated aliases and a zone for them.
-Ran the backup job again.

Same error after a while. MSL/e1200 drops out of the SAN switch, but the backup server remains.
Same results as before on portErrShow.

I've placed an order for a new fibre cable just to see if I can get the controller card to connect to both SAN switches and rule out a cable error, but it does look like the error is related to the library since it gets disconnected from them both?
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Replaced the fibre cable running from the second port on the server's controller card to the switch; no change.

Put the same new fibre cable between the e1200 and the switch, and backups have now been running for over two hours. So if all goes well I'll let a job run over night and see what happens.


Rhoderick B. Boral
Occasional Visitor

Re: FC connection drop to MSL 6030


Hi,

Do you have the solution already?
This is the same case for us right now.
MSL 8096, Brocade switch, Windows clients, HBA (Emulex).

Thanks for your reply!
Havard_3
Occasional Advisor

Re: FC connection drop to MSL 6030

Hi,

Yes, more or less. The last fibre cable that we replaced has fixed the backups. Still only have one connection up from the emulex board, but we're replacing the server soon so once the backups got running fine we just left it as is.