Stuart Powell
Super Advisor

A6685A failure

I have a K580 system running HP-UX 11i v1 with two A6685A HBAs. This machine provides our development environment. After a recent patch application and reboot, I noticed that one of the HBAs was not listing any of the LUNs on the available arrays. We still have connectivity, but we have lost our contingency path. The driver state reported for that HBA by fcmsutil /dev/td1 is:
Driver state = AWAITING_LINK_UP
Almost everything else on the "faulted" HBA matches the active card. I double-checked the zoning in the SAN, and it is correct. I also found this note in OLDsyslog.log:
Dec 12 09:15:04 th9kkux vmunix: ct_query failed. hw_path = 8/8/1/0
So my question is: what other troubleshooting steps can be suggested before I break down and swap out the HBA?

Oh, yes; the system is no longer on a support contract, but I have a donor system waiting to have parts harvested.

Thanks in advance,

Stuart
Sometimes the best answer is another question
Sandman!
Honored Contributor

Re: A6685A failure

Check the fibre channel cable for problems; do a software reset of the HBA with the fcmsutil tool. You could also try physically reseating the HBA.

~hope it helps
Steven E. Protter
Exalted Contributor

Re: A6685A failure

Shalom,

It might be the fabric network.

Trying the donor system will help you determine if the issue is the HBA. HBAs have no moving parts and don't often fail, but it's not unheard of.

SEP
Stuart Powell
Super Advisor

Re: A6685A failure

Thanks for the thoughts.

I have checked the cable. I see light on each end of each fiber.

I may try reseating the card before I do a transplant, but I need to make sure I've exhausted all uptime tests first. (Our developers don't like to have the development system down when they are around.)

Everything else in the SAN and in this particular zone is functioning properly. I've tried two other ports on the switch: one in the same GBIC and one in another GBIC.

Stuart
Sandman!
Honored Contributor

Re: A6685A failure

Could you post the output of the following:

# ioscan -H 8/8/1/0
# fcmsutil /dev/td1 ns_query_ports

~thanks
Stuart Powell
Super Advisor

Re: A6685A failure

Sandman;

Here they are:
$ sudo ioscan -H 8/8/1/0

H/W Path  Class  Description
=======================================
8/8/1/0   fc     HP Tachyon TL/TS Fibre Channel Mass Storage Adapter

$ sudo fcmsutil /dev/td1 ns_query_ports
ERROR: Topology should be TD_FABRIC or TD_PUBLIC_LOOP.


Stuart
Sandman!
Honored Contributor

Re: A6685A failure

As a follow-up, could you post the output of the following? The fix might be as simple as changing the topology of the HBA instead of replacing it:

# ioscan -funC fc
# fcmsutil /dev/td0
# fcmsutil /dev/td1

This assumes the device special file for your other HBA is /dev/td0.
Denver Osborn
Honored Contributor

Re: A6685A failure

Have you looked at the HBA to see if it's transmitting light?

Before replacing the HBA, connect a loopback cable and see if it syncs.

-denver
Stuart Powell
Super Advisor

Re: A6685A failure

Denver;
HBA is transmitting light and it is visible on the switch end of the fiber.

Sandman;
See outputs below:
$ sudo fcmsutil /dev/td1

Vendor ID is = 0x00103c
Device ID is = 0x001028
TL Chip Revision No is = 3.0
PCI Sub-system Vendor ID is = 0x00103c
PCI Sub-system ID is = 0x000006
Previous Topology = UNINITIALIZED
Local N_Port_id is = 0x000000
Local Loop_id is = 126
N_Port Node World Wide Name = 0x50060b0000135a4d
N_Port Port World Wide Name = 0x50060b0000135a4c
Driver state = AWAITING_LINK_UP
Hardware Path is = 8/8/1/0
Number of Assisted IOs = 0
Number of Active Login Sessions = 0
Dino Present on Card = YES
Maximum Frame Size = 992
Driver Version = @(#) libtd.a HP Fibre Channel Tachyon TL/TS/XL2 Driver B.11.11.12 PATCH_11.11 (PHSS_31326) /ux/kern/kisu/TL/src/common/wsio/td_glue.c: Sep 5 2005, 10:14:40

$ sudo fcmsutil /dev/td0

Vendor ID is = 0x00103c
Device ID is = 0x001028
TL Chip Revision No is = 3.0
PCI Sub-system Vendor ID is = 0x00103c
PCI Sub-system ID is = 0x000006
Topology = PTTOPT_FABRIC
Local N_Port_id is = 0x030000
N_Port Node World Wide Name = 0x50060b0000135875
N_Port Port World Wide Name = 0x50060b0000135874
Driver state = ONLINE
Hardware Path is = 8/4/1/0
Number of Assisted IOs = 92739952
Number of Active Login Sessions = 2
Dino Present on Card = YES
Maximum Frame Size = 992
Driver Version = @(#) libtd.a HP Fibre Channel Tachyon TL/TS/XL2 Driver B.11.11.12 PATCH_11.11 (PHSS_31326) /ux/kern/kisu/TL/src/common/wsio/td_glue.c: Sep 5 2005, 10:14:40

$ sudo ioscan -funC fc
Class  I  H/W Path  Driver  S/W State  H/W Type   Description
=====================================================================
fc     0  8/4/1/0   td      CLAIMED    INTERFACE  HP Tachyon TL/TS Fibre Channel Mass Storage Adapter
                            /dev/td0
fc     1  8/8/1/0   td      CLAIMED    INTERFACE  HP Tachyon TL/TS Fibre Channel Mass Storage Adapter
                            /dev/td1

As far as I can tell, everything looks the same on the two HBAs.
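For anyone comparing two fcmsutil listings like these by eye, a field-by-field diff can make the differences easier to spot. The following is a minimal sketch, not a vendor tool: it assumes each listing has been saved to a file (the names td0.out and td1.out are hypothetical) and that fields appear as "name = value" lines, as in the output above. The function name fc_diff is made up for illustration.

```shell
#!/bin/sh
# fc_diff FILE1 FILE2  (hypothetical helper, not part of HP-UX)
# Prints each "name = value" field of FILE2 whose value differs from
# FILE1, or which FILE1 does not contain at all. Lines without " = "
# are ignored. Fields that exist only in FILE1 are not reported.
fc_diff() {
    awk '
    {
        # Split the line at the first "=" that has blanks on both sides.
        if (!match($0, /[ \t]+=[ \t]+/)) next
        key = substr($0, 1, RSTART - 1)
        val = substr($0, RSTART + RLENGTH)
        sub(/^[ \t]+/, "", key)   # drop leading blanks
        sub(/ is$/, "", key)      # "Vendor ID is" -> "Vendor ID"
        sub(/[ \t]+$/, "", val)   # drop trailing blanks

        if (FILENAME == ARGV[1]) { first[key] = val; next }

        if (!(key in first))
            printf "%s: (absent) -> %s\n", key, val
        else if (first[key] != val)
            printf "%s: %s -> %s\n", key, first[key], val
    }' "$1" "$2"
}

# Example (file names are hypothetical):
#   fcmsutil /dev/td0 > td0.out
#   fcmsutil /dev/td1 > td1.out
#   fc_diff td0.out td1.out
```

Run against the two listings above, this would flag the driver state (ONLINE versus AWAITING_LINK_UP), the N_Port_id, and the login session count, among other fields. Topology appears only in the td0 listing, so it would not be reported when td0.out is given first.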

Thanks,

Stuart
Denver Osborn
Honored Contributor

Re: A6685A failure

I'd still suggest a loopback cable. If the HBA stays in AWAITING_LINK_UP with the loopback attached, then I would swap it out with your other HBA. If it links up, then check the FC cable, GBIC or switch port (which you already checked). Remember that seeing light on the switch end of the cable doesn't necessarily mean the cable is good.
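The loopback test above comes down to one check of the driver state. Here is a minimal sketch, assuming the fcmsutil output layout shown earlier in this thread. The function names driver_state and loopback_verdict are hypothetical, and treating any state other than AWAITING_LINK_UP as "the link came up" is an assumption, not documented driver behaviour.

```shell
#!/bin/sh
# driver_state: read fcmsutil output on stdin and print the value of
# the first "Driver state = ..." line (prints nothing if there is none).
driver_state() {
    awk -F= '/^[ \t]*Driver state[ \t]*=/ {
        v = $2
        gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding blanks
        print v
        exit
    }'
}

# loopback_verdict: read fcmsutil output captured while the loopback
# cable is attached and suggest the next step from the advice above.
# ASSUMPTION: any state other than AWAITING_LINK_UP means the link came up.
loopback_verdict() {
    state=$(driver_state)
    case "$state" in
        AWAITING_LINK_UP)
            echo "no link on loopback: suspect the HBA" ;;
        "")
            echo "no Driver state line found"
            return 1 ;;
        *)
            echo "state is $state: suspect cable, GBIC or switch port" ;;
    esac
}

# Example, with the loopback cable attached:
#   fcmsutil /dev/td1 | loopback_verdict
```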

-denver
Cem Tugrul
Esteemed Contributor

Re: A6685A failure

Stuart,

Could you try the command below?
fcmsutil /dev/td1 reset
Stuart Powell
Super Advisor

Re: A6685A failure

Denver,

Valid point. I'll see if I have a cable that will work.

Cem,

I have tried that, but I didn't note it when I created the thread.

Stuart
Stuart Powell
Super Advisor

Re: A6685A failure

Thanks to all for the troubleshooting input. I tried a loopback and the HBA did not link up. I'll do a reseat of the card, and if that doesn't produce results I'll install a new card.

Stuart
Stuart Powell
Super Advisor

Re: A6685A failure

Closing thread.

Stuart