Operating System - OpenVMS
HBA CARD NOT RESPONDING ON I64

 
Mario Dhaenens
Frequent Advisor

Hi

My VMS box has two HBA cards.
Both are online (see the first output below), but on every disk the paths via FGB0 are in the "not responding" state (see the second output).

How can I fix this without rebooting the server? Is there a command for it?

NVJ$ sho dev fg

Device                  Device           Error
 Name                   Status           Count
FGA0:                   Online               0
FGB0:                   Online               0
NVJ$ sho dev dsa100/ful

Disk DSA100:, device type HSV300, is online, mounted, file-oriented device,
shareable, available to cluster, error logging is enabled, device supports
bitmaps (bitmaps active).

Error count 0 Operations completed 267579797
Owner process "" Owner UIC [SYSTEM]
Owner process ID 00000000 Dev Prot S:RWPL,O:RWPL,G:R,W
Reference count 20408 Default buffer size 512
Total blocks 419430400 Sectors per track 128
Total cylinders 25600 Tracks per cylinder 128
Logical Volume Size 419430400 Expansion Size Limit 2147475456

Volume label "I64VMS" Relative volume number 0
Cluster size 16 Transaction count 2238
Free blocks 79256224 Maximum files allowed 16711679
Extend quantity 5 Mount count 8
Mount status System Cache name "_DSA100:XQPCACHE"
Extent cache size 64 Maximum blocks in extent cache 7925622
File ID cache size 64 Blocks in extent cache 516240
Quota cache size 0 Maximum buffers in FCP cache 8261
Volume owner UIC [1,1] Vol Prot S:RWCD,O:RWCD,G:RWCD,W:RWCD

Volume Status: ODS-5, subject to mount verification, protected subsystems
enabled, write-through caching enabled, special files enabled.
Volume is also mounted on NVR, NVC, NVL, NVS, NVG, NVE, NVD.

Disk $1$DGA100:, device type HSV300, is online, device has multiple I/O paths,
member of shadow set DSA100:, served to cluster via MSCP Server, error
logging is enabled.

Error count 0 Shadow member operation count 192353471
Current preferred CPU Id 6 Fastpath 1
WWID 01000010:6001-4380-024D-1182-0000-5000-0046-0000
Host name "NVJ" Host type, avail HP rx6600 (1.59GHz/12.0MB), yes
Alternate host name "NVR" Alt. type, avail HP BL860c (1.67GHz/9.0MB), yes
Allocation class 1

I/O paths to device 5

Path FGB0.5001-4380-025A-A808 (NVJ), primary, not responding
Error count 0 Operations completed 184053785
Last switched to time: Never Count 0
Last switched from time: 2-DEC-2010 12:06:40.57

Path FGA0.5001-4380-025A-A809 (NVJ), current
Error count 0 Operations completed 8198165
Last switched to time: 2-DEC-2010 12:06:40.57 Count 1
Last switched from time: Never

Path FGA0.5001-4380-025A-A80D (NVJ)
Error count 0 Operations completed 49852
Last switched to time: Never Count 0
Last switched from time: Never

Path FGB0.5001-4380-025A-A80C (NVJ), not responding
Error count 0 Operations completed 51669
Last switched to time: Never Count 0
Last switched from time: Never

Path MSCP (NVR)
Error count 0 Operations completed 0
Last switched to time: Never Count 0
Last switched from time: Never

Disk $1$DGA200:, device type HSV300, is online, device has multiple I/O paths,
member of shadow set DSA100:, served to cluster via MSCP Server, error
logging is enabled.

Error count 1 Shadow member operation count 13046124
Current preferred CPU Id 6 Fastpath 1
WWID 01000010:6005-08B4-0008-EAE8-0000-7000-0036-0000
Host name "NVJ" Host type, avail HP rx6600 (1.59GHz/12.0MB), yes
Alternate host name "NVR" Alt. type, avail HP BL860c (1.67GHz/9.0MB), yes
Allocation class 1

I/O paths to device 5

Path FGA0.5001-4380-025A-A499 (NVJ), primary, current
Error count 0 Operations completed 4100201
Last switched to time: 2-DEC-2010 12:09:38.33 Count 2
Last switched from time: 12-NOV-2010 09:47:02.80

Path FGA0.5001-4380-025A-A49D (NVJ)
Error count 0 Operations completed 51188
Last switched to time: 30-OCT-2010 17:53:18.46 Count 1
Last switched from time: 30-OCT-2010 17:53:22.97

Path FGB0.5001-4380-025A-A498 (NVJ), not responding
Error count 1 Operations completed 8841740
Last switched to time: 12-NOV-2010 09:47:02.80 Count 1
Last switched from time: 2-DEC-2010 12:09:38.33

Path FGB0.5001-4380-025A-A49C (NVJ), not responding
Error count 0 Operations completed 52995
Last switched to time: Never Count 0
Last switched from time: Never

Path MSCP (NVR)
Error count 0 Operations completed 0
Last switched to time: Never Count 0
Last switched from time: Never

All help is welcome.

/Dirk
3 REPLIES
Jon Pinkley
Honored Contributor

Re: HBA CARD NOT RESPONDING ON I64

Dirk,

I am not yet convinced that the problem is the HBA.

Path FGB0.5001-4380-025A-A498 (NVJ), not responding
Error count 1 Operations completed 8841740
Last switched to time: 12-NOV-2010 09:47:02.80 Count 1
Last switched from time: 2-DEC-2010 12:09:38.33

What happened at 2-DEC-2010 12:09:38.33 (the time the path switched from FGB to FGA)?

There are many possibilities. One is that someone was configuring the FC switch and inadvertently deleted the HBA WWID from the zone config. The switch is the first place I would look, since there you can see whether there are errors on the port and you can verify the zone config. There could also be a loose cable, a bad GBIC or, less likely, a bad switch port.

If the problem is not the HBA, then you should not need to reboot. You just need to fix what caused the error and the path should come back. You may need to force a switchover to verify that the original path is operating correctly.
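To see at a glance which paths are affected, and to prepare the switchover once the fault is fixed, the pasted SHOW DEVICE/FULL output can be scanned with a small script. What follows is only a sketch in Python, run off-box on a saved copy of the output. The function names are made up for illustration, and the generated DCL assumes the /SWITCH and /PATH qualifiers of SET DEVICE, so check HELP SET DEVICE on your version before using it. A switch to a path that is still not responding should be expected to fail.

```python
import re
from collections import defaultdict

# "Disk $1$DGA100:, device type HSV300, ..." -> captures $1$DGA100
DISK_RE = re.compile(r"^\s*Disk\s+(\S+?):,")
# "Path FGB0.5001-4380-025A-A808 (NVJ), primary, not responding"
PATH_RE = re.compile(r"^\s*Path\s+(\S+)\s+\((\w+)\)(.*)$")

def parse_paths(text):
    """Return {disk: [(path, [flags]), ...]} from SHOW DEVICE/FULL text."""
    disks = defaultdict(list)
    current = None
    for line in text.splitlines():
        m = DISK_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = PATH_RE.match(line)
        if m and current:
            flags = [f.strip() for f in m.group(3).split(",") if f.strip()]
            disks[current].append((m.group(1), flags))
    return dict(disks)

def not_responding(disks):
    """Return {disk: [paths flagged 'not responding']}."""
    return {d: [p for p, flags in paths if "not responding" in flags]
            for d, paths in disks.items()}

def switch_commands(disks, adapter):
    """Build one DCL command per disk, targeting its first path on `adapter`.

    Assumption: SET DEVICE accepts /SWITCH /PATH=<path> on this VMS version.
    """
    cmds = []
    for disk, paths in disks.items():
        for path, _flags in paths:
            if path.startswith(adapter + "."):
                cmds.append(f"$ SET DEVICE {disk}: /SWITCH /PATH={path}")
                break
    return cmds

# Excerpt of the output posted above (only the lines the parser looks at).
SAMPLE = """\
Disk $1$DGA100:, device type HSV300, is online, device has multiple I/O paths,
Path FGB0.5001-4380-025A-A808 (NVJ), primary, not responding
Path FGA0.5001-4380-025A-A809 (NVJ), current
Path FGA0.5001-4380-025A-A80D (NVJ)
Path FGB0.5001-4380-025A-A80C (NVJ), not responding
Path MSCP (NVR)
"""

if __name__ == "__main__":
    disks = parse_paths(SAMPLE)
    print(not_responding(disks))
    for cmd in switch_commands(disks, "FGB0"):
        print(cmd)
```

If every FGB0 path on every disk shows up in the not-responding list while the FGA0 paths do not, that points at something common to the FGB0 side (fabric, zoning, cable) rather than at an individual disk.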

Jon
it depends
Mario Dhaenens
Frequent Advisor

Re: HBA CARD NOT RESPONDING ON I64

Hello,

A SAN switch rebooted on 2 Dec 2010. The SAN switch has been replaced, but we still see this problem.


/Dirk
Jon Pinkley
Honored Contributor

Re: HBA CARD NOT RESPONDING ON I64

>>A SAN switch rebooted on 2 Dec 2010. The SAN switch has been replaced, but we still see this problem.

Dirk,

If the SAN switch that rebooted was the one connected to the same fabric that the FGB device was connected to, then it is almost certainly a zoning problem. My guess is that someone enabled a config but it wasn't saved. When the switch rebooted, it used the last saved configuration, and that probably didn't have the WWN of the FGB port of the HBA.

Do you control the switches? If so, can you log into the switch and verify that the current (effective) config has the WWN of the HBA port at FGB? (In my last post I said WWID, but that's something different that is associated with a specific device like a vdisk.)
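One practical snag when comparing: VMS prints WWNs as hyphen-separated groups of four hex digits (5001-4380-025A-A808), while Brocade switches print them as colon-separated lowercase bytes. Note also that the WWNs in the path names above belong to the storage controller ports, not to the HBA; the HBA's own port WWN should be listed as the FC Port Name by SHOW DEVICE/FULL FGB0. Below is a rough Python sketch for normalizing both forms and checking a list of zone members copied from the switch. The function names are made up for illustration, and the HBA WWN in the usage note is a placeholder, not a value from this system.

```python
import re

def normalize_wwn(wwn):
    """Return a WWN as lowercase, colon-separated byte pairs."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", wwn).lower()
    if len(digits) != 16:
        raise ValueError(f"expected 16 hex digits, got {len(digits)}: {wwn!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))

def zone_wwns(zone_members):
    """Collect the WWN members of a zone, skipping anything else."""
    found = set()
    for member in zone_members:
        try:
            found.add(normalize_wwn(member))
        except ValueError:
            pass  # e.g. a "domain,port" member, which is not a WWN
    return found

def missing_from_zone(wwns, zone_members):
    """Return the WWNs (as given) that are not among the zone members."""
    members = zone_wwns(zone_members)
    return [w for w in wwns if normalize_wwn(w) not in members]

if __name__ == "__main__":
    # Storage port WWN taken from the first FGB0 path in the original post.
    print(normalize_wwn("5001-4380-025A-A808"))
```

For example, with a placeholder HBA port WWN of 1000-0000-C912-3456, calling missing_from_zone(["1000-0000-C912-3456", "5001-4380-025A-A808"], members) on the member list of the effective zone returns whichever of the two is absent. An empty list means both the HBA port and the storage port are zoned together.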

How was the replacement switch configured?

Some reading material found with Google (this assumes you have a Brocade switch). If you have a different switch, you can more than likely find useful info the same way.

Brocade Guide to Understanding Zoning

http://www.brocadejapan.com/resources/tl/pdf/HOWTO_53-0000213-01.pdf

How to Zone a Brocade SAN Switch - Ryan's Tech Notes

http://technotes.twosmallcoins.com/?p=16

Brocade Secure SAN Zoning Best Practices

http://www.brocade.com/downloads/documents/white_papers/Zoning_Best_Practices_WP-00.pdf

SAN Switch cheat sheet (command line commands for several types of FC switches)

http://www.datadisk.co.uk/html_docs/emc/emc_switch_cs.html

Jon
it depends