StoreEver Tape Storage

EBS problem - hanged NSR and dropped off line library in Windows OS:

 
SOLVED
Jolanta Sulima
Occasional Advisor

I have two separate EBS installations with similar backup problems: the NSR hangs and the library drops offline in Windows.

After some tests, the problems look like this:

1. The NSR hangs from time to time (every 1.5 to 3 weeks), regardless of the NSR and library firmware (tested with NSR 4-3.13 and library 4.14, and with NSR 5-1.04 and library 4.21). The NSR Ethernet NIC was set to 10/100 Mbit auto mode (on both the NSR side and the LAN switch); I have now changed it to 100 Mbit full duplex (on both the NSR side and the LAN switch). When the NSR hangs, it hangs permanently: the IP console hangs, the serial console hangs, and it looks like the second SCSI port hangs too (its LED stays on). The library Ethernet connection was 10 Mbit, connected to a LAN switch port in 10/100 Mbit auto mode. After the NSR hangs, the library can hang as well while unloading a tape (command from the LCD panel).

2. The second problem:
"cannot open exchanger control devices (details unknown)" or "the request could not be performed because of an I/O device error"; DP cannot see the library, and Windows cannot see it either.
I think this is directly connected with another fault that I can see in the Windows System Event log: The device, \Device\Scsi\CPQKGPSA2, did not respond within the timeout period.
- It happens from time to time (every 1.5 to 2 weeks in normal operation), but more often in the tested configuration.
Tested configuration: short parallel backups (about 1.5 GB per tape device) to 2 LTO1 tape devices. The CPQKGPSA fault usually happens when, for one of the backups, the disk agent has completed and medium header verification starts before the medium is unloaded.

I find that with NSR firmware 4-3.13, library firmware 4.14 and tape device firmware E33W, the fault is less frequent than with the newer firmware, but it is permanent: the session is aborted, but it hangs and other sessions cannot start.
With NSR firmware 5-1.04, library firmware 4.21 and tape device firmware E38W, the CPQKGPSA faults are more frequent (though not at the end of every backup), but it looks like the sessions are not aborted. Instead they can hang for varying periods (about 10 minutes in short test backups, about 2 hours in longer 1-hour backups) and then finish with OK results. During that time the second parallel session can keep working.
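To compare how often the CPQKGPSA timeouts occur under each firmware combination, the events can be counted from an exported copy of the System event log. The following is only a minimal sketch: it assumes a hypothetical export format with one event per line, a timestamp in `YYYY-MM-DD HH:MM:SS` form, a tab, and then the message text. The function names and the sample lines (including their dates) are made up for illustration; a real Event Viewer export would need its own parsing.

```python
from datetime import datetime

# Message fragment quoted from the System event log entry above.
TIMEOUT_TEXT = "did not respond within the timeout period"


def timeout_events(lines, device="CPQKGPSA"):
    """Return sorted timestamps of timeout events that mention the device.

    Assumes the hypothetical format "<timestamp>\t<message>" per line.
    Lines that do not match this format are skipped.
    """
    stamps = []
    for line in lines:
        parts = line.rstrip("\n").split("\t", 1)
        if len(parts) != 2:
            continue
        stamp, message = parts
        if device in message and TIMEOUT_TEXT in message:
            try:
                stamps.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
            except ValueError:
                continue
    return sorted(stamps)


def gaps_in_hours(stamps):
    """Return the time between consecutive events, in hours."""
    return [(b - a).total_seconds() / 3600 for a, b in zip(stamps, stamps[1:])]


# Made-up sample lines, only to show the expected input shape.
sample = [
    "2004-03-01 02:10:00\tThe device, \\Device\\Scsi\\CPQKGPSA2, did not respond within the timeout period.",
    "2004-03-01 03:00:00\tSome other event",
    "2004-03-01 05:10:00\tThe device, \\Device\\Scsi\\CPQKGPSA2, did not respond within the timeout period.",
]

events = timeout_events(sample)
print(len(events), gaps_in_hours(events))
```

Running the same count on the log from each test period gives a rough number of faults per firmware combination and the spacing between them.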

When the library is connected directly to a dedicated 2-channel SCSI controller in the backup server, everything looks OK.

Questions:
1. How can I solve the problem with the NSR hangs?
2. How can I solve the problem with the CPQKGPSA faults?


SAN configuration:
- SAN storage with a few servers connected to it through 2 SAN switches (Compaq SAN switch 2/16).
- MSL5060 with 2 LTO1 drives and an external NSR (N1200) connected to one of the 2/16 SAN switches. The library, NSR, backup server and SAN storage ports are in a separate, dedicated zone.
- Backup server: in the first configuration with 2 HBAs, FCA2101 (LP950), and in the second configuration with 2 HBAs, StorageWorks 64-Bit/33-MHz PCI-to-Fibre Channel Host Bus Adapter for Microsoft Windows NT (PN: 176479-B21), known as DS-KGPSA-CB (LP8000). Operating system: W2K with SP3 and Secure Path 4.0C. The server is a dedicated backup server in the SAN with Data Protector 5.1 (Cell Manager and Installation Server at the same time). Data Protector was patched (patches applied in this order: DPWIN_00038, DPWIN_00022, DPWIN_00027 and DPWIN_00037).

HBA firmware: functional 3.91a1, boot BIOS 163a1; HBA drivers 5-4.82a16; W2K registry configuration for CPQKGPSA: Topology=1; ResetTPRLO=2; Emulex option = 0xBA00; other settings at default.
NSR N1200 configuration:
- Custom indexed map: CTRL (protocol AF, LUN 0); CHGR (protocol PSCSI, target 0, LUN 0); Tape (protocol PSCSI, target 1, LUN 0); Tape2 (protocol PSCSI, target 3, LUN 0).
- SCSI bus config settings: Discovery Delay = 30 sec.
- FC port config: Port Mode = N Port; Hard AL-PA = disabled; Discovery Mode = auto discovery on reboot; Buffered tape writes = enabled; Buffered tape queue depth = 1; default map = indexed; performance mode = 2 Gbit; Force FCP response code = disabled.
- Ethernet config: static IP address; 100 Mbit (full duplex). It was 10/100 Mbit auto sense, but I decided to change it to avoid NSR hangs; the results are not known yet (2 days of testing so far).
- NSR firmware: tested with 4-3.13, 4-3.19 and 5-1.04.

Library: firmware tested 4.14 and 4.21; LTO1 drive firmware tested E33W and E38W.

In W2K: the Removable Storage service is disabled; the Windows driver for the changer is disabled; the Windows drivers for the tape devices do not seem to matter (tested with the Windows driver and with the Data Protector driver).

5 REPLIES
Johan Magnusson
Occasional Advisor
Solution

Re: EBS problem - hanged NSR and dropped off line library in Windows OS:

For the second problem:

Make sure that you have enabled persistent binding on your HBA (Lputilnt.exe). This will keep your connection to the NSR permanent.

Another thing is that the HP agent "Fibre Array Information" must be disabled. This agent can cause problems since it will "talk and listen" on the path.

These changes made life a lot easier for me with my NSR and DP 5.1.

Regards
Jolanta Sulima
Occasional Advisor

Re: EBS problem - hanged NSR and dropped off line library in Windows OS:

Hi Johan.
Thank you for the reply.
I checked persistent binding and I have the following:
Automap All Targets: enabled
Automap All Luns: enabled
Unmask ALL Luns: enabled

I do not have the HP agent "Fibre Array Information" in the W2K services list. The only HP or Compaq management agents I have are:
Compaq remote monitor service
HP Insight: Event Notifier, Foundation agent, NIC agent, Server agent, Storage agent, WEB agent.

Regards,
Jola
David Ruska
Honored Contributor

Re: EBS problem - hanged NSR and dropped off line library in Windows OS:

> Another thing is that the HP agent "Fibre Array Information" must be disabled. This agent can cause problems since it will "talk and listen" on the path.

Here are the details on this issue:

Compaq Insight Management Agents for Windows version 6.40 (and 6.20/6.30 with the Fibre Channel Information Agent SoftPaq update) may cause the robotics to stop responding. The timeout used by these agents for log sense commands was too short, which can result in a timeout and abort sequence that may hang the robot.

The long term solution is to update to Compaq Insight Management Agents 7.0.0.0 or better. The short term workaround is as follows:

1. Click on Start > Settings > Control Panel.
2. Double-click on HP Management Agents.
3. From the Services tab, click on "Fibre Array Information" under the Active Agents list.
4. Click on Remove (Fibre Array Information will move to the Inactive Agents list).
5. Click on OK (agents will shut down and then restart).
The journey IS the reward.
Jolanta Sulima
Occasional Advisor

Re: EBS problem - hanged NSR and dropped off line library in Windows OS:

Hi Johan and David,

You are right.
After disabling the HP agent "Fibre Array Information", everything looks better in tests. I need more time to be sure whether it solves my problem completely.
After Johan's first reply, it was my fault that I couldn't find that agent: I was looking for it in Windows services instead of in Control Panel > HP Management Agents.

Thank you.
Regards,
Jola
Johan Magnusson
Occasional Advisor

Re: EBS problem - hanged NSR and dropped off line library in Windows OS:

Hi Jolanta,

In the lputilnt application, under persistent bindings, you will find your mappings to various subsystems. Highlight the address of the NSR (it should begin with 1000) and click Add. This makes that binding persistent, and the binding will now be marked with a PB symbol.

If you have 2 HBAs, you will have to repeat this step for the second HBA as well.

Regards
Johan