Operating System - HP-UX

NO_HW causing problem on tape san

 
Jonathan McCormick
New Member

Hi there,

I'm having a problem with some LTO2 drives we have shared between 4 HP-UX servers.

The drives are all zoned and configured correctly on the switch side, and our Technical Support here installed the drivers successfully. At that point we had no issues with the drives and everything appeared fine.

Since then I have been informed that, due to how HP-UX works (regularly probing devices), we've had some problems. Essentially, when some config work has been done, the server has lost connection to the drive, hence the NO_HW state. When it comes back up, however, it has created a new driver for the drive which is set CLAIMED. So we have two drivers linked to 1 drive: one NO_HW and one CLAIMED. At this point we can no longer use the drive.

Anyone have any idea why losing connection to the drive (for probably all of 1-2 secs) would cause this to happen and any suggestions how to avoid this problem?

Thanks

Jonathan
Steven E. Protter
Exalted Contributor

Re: NO_HW causing problem on tape san

Shalom Jonathan,

I would first be certain that there is not an actual intermittent hardware problem.

Then I'd do some analysis to see if this has to do with a particular piece of cable/fiber or particular system.

I would look for SCSI device ID conflicts between the systems, go over the design of the SAN and try to isolate the problem further.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Jonathan McCormick
New Member

Re: NO_HW causing problem on tape san

Thanks for the reply Steven,

I am as certain as I can be that this issue is not hardware related: this is a shared infrastructure, and the same 10 drives are being used without issue by the Solaris master server. I feel it's solely OS related, as the 10 drives are shared between 5 servers in total; the other 4 are HP-UX and all show the same symptoms, yet the drives are still fully usable on the Solaris box.

Thanks

Jonathan
Sp4admin
Trusted Contributor

Re: NO_HW causing problem on tape san

Is there a switch involved? If so, you may want to check whether the switch (SFP) port is bad. Also, if you use insf -e, the hardware should come back as CLAIMED. If not, I would start by checking the fibre cable. If you are using NetBackup, you can change the device with tpconfig.

Nyland
Solution

Re: NO_HW causing problem on tape san

Jonathan,

If the tape drive is re-appearing with a different device file name, then that means something in the SAN is changing. HP-UX tracks storage devices in a SAN by FCID, so if the FCIDs for the tape drives are changing, then the HP-UX device file will change. The only situations where an FCID might change are:

a) on Brocade and McDATA FC switches, if the storage is moved to a different port or to a different switch.

b) on a Cisco FC switch, if 'persistent FCIDs' hasn't been enabled and the switch is rebooted.
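
A quick way to see whether this is what happened is to list every tape entry with its state and hardware path, and compare the NO_HW entry against the CLAIMED one. The sketch below is only an illustration: it assumes the usual `ioscan -f` column order (Class, I, H/W Path, Driver, S/W State, ...), and the function names `list_tape_states` and `no_hw_tapes` are made up for this example. On legacy hardware paths the fields after the HBA path are generally understood to reflect the FC domain, area and port, so a changed FCID should show up as a different path.

```shell
#!/bin/sh
# list_tape_states (hypothetical helper): read `ioscan -fnC tape`-style text
# on stdin and print "state instance hw_path" for every tape entry.
# Assumes the standard `ioscan -f` column order; header lines and the
# indented device-file lines are skipped because field 1 is not "tape".
list_tape_states() {
    awk '$1 == "tape" { print $5, $2, $3 }'
}

# no_hw_tapes (hypothetical helper): keep only entries in the NO_HW state.
no_hw_tapes() {
    list_tape_states | awk '$1 == "NO_HW"'
}

# On an HP-UX host the input would come from ioscan itself, for example:
#   ioscan -fnC tape | list_tape_states
#   ioscan -fnC tape | no_hw_tapes
```

If the NO_HW and CLAIMED lines for the same physical drive show different hardware paths, that points at the fabric address having changed rather than at the drive itself.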

As for HP-UX regularly probing tape devices on a SAN - it shouldn't if it is set up correctly:

a) ensure the kernel parameter st_san_safe is set to 1

b) ensure that POLL_INTERVAL is set to 0 in /var/stm/config/tools/monitor/dm_stape.cfg
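
Both settings can be checked from the shell. The sketch below is a rough illustration, not a definitive procedure: the helper names `poll_interval` and `polling_disabled` are invented for this example, and the parser assumes the setting appears on a line of the form `POLL_INTERVAL <n>` (it also tolerates `POLL_INTERVAL=<n>`), which may not match every release of the file. The kernel parameter query is shown only as a comment because it runs on HP-UX only.

```shell
#!/bin/sh
# poll_interval (hypothetical helper): print the POLL_INTERVAL value found in
# the config file given as $1. Assumes "POLL_INTERVAL <n>" or
# "POLL_INTERVAL=<n>"; lines starting with # are treated as comments.
poll_interval() {
    awk '
        /^[ \t]*#/ { next }
        /^[ \t]*POLL_INTERVAL/ {
            line = $0
            sub(/^[ \t]*POLL_INTERVAL[ \t=]*/, "", line)  # drop the key
            sub(/[ \t#].*$/, "", line)                    # drop trailing text
            print line
            exit
        }
    ' "$1"
}

# polling_disabled (hypothetical helper): succeed only if the value is 0.
polling_disabled() {
    [ "$(poll_interval "$1")" = "0" ]
}

# On an HP-UX host:
#   kctune st_san_safe        # 11i v2 and later
#   kmtune -q st_san_safe     # 11i v1 (11.11)
#   polling_disabled /var/stm/config/tools/monitor/dm_stape.cfg \
#       && echo "tape polling is off"
```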

HTH

Duncan


Re: NO_HW causing problem on tape san

Jonathan,

see also my comments in this thread:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=861459&admit=-682735245+1142847120216+28353475

HTH

Duncan

Florian Heigl (new acc)
Honored Contributor

Re: NO_HW causing problem on tape san

There are logs for low-level SCSI (FC) debugging, and I'd ask you to check them to find the reason for the hiccups, be it tape or OS (see bottom).

Why a second (different?) driver instance is being bound, I do not know, and it is not normal behaviour. I can just hope someone else here has an explanation for this.

The only reason I could think of would be a multipathing or WWN issue.

I have to say one more thing:
You have problems due to some issue with your SAN (which might as well include HP-UX as the root cause) that makes the tape connectivity get lost, not due to HP-UX probing whether the tapes are still there.
We're presenting >300 VTL devices to HP-UX hosts here and I have yet to hear complaints from the backup admins.

Checking driver logs:
Issue the command 'xstm' or 'mstm' and then select the 'logtool activity log' to view it.

http://docs.hp.com/en/diag/stm/help/utility/logtoolc.htm
yesterday I stood at the edge. Today I'm one step ahead.
Jonathan McCormick
New Member

Re: NO_HW causing problem on tape san

Thanks for all the replies so far. Plenty to investigate there. I'll update as to my progress when I have some.
Jonathan McCormick
New Member

Re: NO_HW causing problem on tape san

Setting the kernel param to one seems to have done the trick. Thanks Duncan, and thanks to everyone else who replied.
Jonathan McCormick
New Member

Re: NO_HW causing problem on tape san

Closing Thread