StoreEver Tape Storage

Re: Sure Store 2/20 Problems

 
Aaron Tatone
Frequent Advisor

Sure Store 2/20 Problems

All,

We're using an HP SureStore E 2/20 LTO library hooked up via a SAN to an L2000 and N4000. The library is configured correctly and can be seen and operated by every device on the SAN (provided two servers don't use it at the same time). The environment is a mixed Windows 2000 and HP-UX environment.

Every once in a while, Omniback cannot proceed after mounting the tape. We have Omniback 4.0 configured correctly for SAN devices, and the lock files are set up correctly.

It usually starts with a backup mounting a tape into a drive. Nothing else happens. Omniback then times out and puts in the next tape. This proceeds until we stop it. We can't even do an mt from the command-line when this happens. We don't know what is causing it, but a reboot of the library solves the problem (albeit sometimes causing more problems on the servers).
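To tell a hung drive apart from one that merely returns an error, a small probe with a timeout can help. This is only a sketch: it assumes a host with Python 3 available, and the `probe` helper and its 30-second default are made up for illustration. Pass it whatever mt status command you normally run against the drive.

```python
import subprocess
import sys


def probe(cmd, timeout_s=30):
    """Run cmd and classify the result as 'ok', 'error', or 'hung'.

    'hung' means the command did not exit within timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if result.returncode == 0 else "error"


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the
    # mt status command for the drive being checked.
    print(probe([sys.executable, "-c", "pass"]))
```

One caveat: if the command is stuck in an uninterruptible device wait, the operating system may not be able to kill it, and the probe itself can then block past the timeout.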

I've heard that EMS is not supposed to be configured for the stape devices, but nobody can tell me why. I believe EMS is turned off on the servers (at least the EMS Agent is).

Anyone have any ideas on what is causing this problem and how we can resolve it?

Thanks

Aaron
6 REPLIES
Deshpande Prashant
Honored Contributor

Re: Sure Store 2/20 Problems

Hi,
I'm going through a similar kind of problem with my library, which is shared between 5 servers.

The explanation I have got so far for EMS is that if EMS polls the drives while a backup is running, it might cause problems and timeouts. I was advised to turn off only the tape portion of EMS.
We also have HP bridges in our configuration and were required to change the kernel parameter "st_ats_enabled" to make backups more successful.
What kind of SAN devices (switches/bridges) are involved in your configuration?

Thanks.
Prashant.
Take it as it comes.
Aaron Tatone
Frequent Advisor

Re: Sure Store 2/20 Problems

Harry,

Yes, all the latest OB2 patches are installed. I am getting this problem from both my HP-UX server and Windows 2000 servers.


Deshpande,

Thanks for the info on EMS; it is basically what I was told too. Can you elaborate on st_ats_enabled? Is that supposed to be set to 1 or 0 on the UNIX servers, and what repercussions does that have? Was it helpful in your case, or do you still have the intermittent problem?


The HP-UX and Windows 2000 servers are connected through a Brocade 2800 (16 port) switch, and zoned by WWNs (to separate disk and tape).


Aaron
Michael Tully
Honored Contributor
Solution

Re: Sure Store 2/20 Problems

Hi Aaron,

The tape libraries are a real pain. The
answer to your questions lies in the
*firmware*, the *st_ats_enabled* kernel parameter,
and the disabling of EMS for *all* SCSI
tape related reporting. I am not sure
of the firmware level for your tape library,
but get the latest. The kernel parameter
must be configured in your kernel as below.

st_ats_enabled 0

Attached is a document that was given to me
by HP regarding the Tape monitor in
EMS.

One other thing (as I'm not familiar with
your particular tape library unit): if you
have a bridge installed in it (FC to SCSI
converter), you should be running it as
fabric. On the models I have (A4688A and A4689A
series), any sort of loop is not supported,
so you *must* use fabric.

HTH
-Michael
Anyone for a Mutiny ?
Deshpande Prashant
Honored Contributor

Re: Sure Store 2/20 Problems

Aaron
The parameter st_ats_enabled needs to be 0 on UNIX systems.
After changing this parameter I have had more success with backups (95%), and I'm still observing this. My main backup failures are with longer backups (>3 hrs; >500GB data with all 6 drives).
The backups so far are successful with fewer drives.
Changing the segment size in the drive definition to 1GB also helped me here, both in reducing the backup time and in getting consistently successful backups.

Do you have any HP bridges involved in your configuration? If so, make sure their BIOS is also current: "2006y" for the A4688A bridge.

Thanks.
Prashant.
Take it as it comes.
Aaron Tatone
Frequent Advisor

Re: Sure Store 2/20 Problems

All,

Thanks for the help. It looks like all your suggestions helped. For the record, I've attached the information from HP (I've opened a call for further clarification on these issues). Thanks again...

-------------

Hello Aaron,

Here is what I was able to find in regard to your questions:

Is there a need to disable EMS monitoring for tapes, and why?

The answer is yes. A backup application may take longer to run and/or the Tape Devices Monitor (dm_stape) may log events resulting from SCSI bus conflicts. If the Event Monitoring System (or "EMS") has been installed on a system, the dm_stape monitor is running in a background process. While it is running, it sends 3 or 4 SCSI commands at regular intervals (usually 1 hour) to tape drives and/or libraries to monitor them for possible problems. This is referred to as "polling". Unfortunately, some customers have seen cases where this polling action by the monitor interferes with a running backup application. In addition, the monitor may log events whenever it detects a SCSI bus conflict. The monitor generates 2 types of events: SCSI I/O error events and Tape Alert events. This would apply to all tape drives including the 10/180.
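To get a feel for how often the monitor can touch a drive during one backup, here is a rough sketch of the arithmetic. It assumes the polls are strictly periodic, uses the "usually 1 hour" interval and the "3 or 4 SCSI commands" figure quoted above, and the function name and minute-based units are invented for illustration. It says nothing about whether any given poll actually disturbs a backup.

```python
def polls_during_backup(start_min, duration_min, interval_min=60, first_poll_min=0):
    """Return the minute marks at which a poll lands inside the backup window.

    Polls are assumed to fire at first_poll_min, first_poll_min + interval_min,
    and so on. The backup window is [start_min, start_min + duration_min).
    """
    end_min = start_min + duration_min
    if start_min <= first_poll_min:
        t = first_poll_min
    else:
        # Ceiling division: number of whole intervals needed to reach start_min.
        k = -(-(start_min - first_poll_min) // interval_min)
        t = first_poll_min + k * interval_min
    hits = []
    while t < end_min:
        hits.append(t)
        t += interval_min
    return hits


if __name__ == "__main__":
    # A 200-minute backup that starts 10 minutes after a poll.
    hits = polls_during_backup(start_min=10, duration_min=200)
    print(hits)           # [60, 120, 180]
    print(len(hits) * 4)  # 12, the upper bound on SCSI commands at 4 per poll
```

Under these assumptions, any backup longer than one polling interval is guaranteed to overlap at least one poll, while a short backup may miss the polls entirely.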

What does the st_ats_enabled parameter do, and does it need to be disabled in a SAN environment?

"st_ats_enabled" is a kernel parameter with a default value of 1 (Enabled). Setting st_ats_enabled to 0 (Disabled) will ensure that Omniback has full control over which hosts use which drives and "alleviate the risk" of having hosts 'locking' tape drives with a SCSI reserve command, leaving the drive inaccessible to other hosts.
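
For checking several hosts, a small script can report the value each kernel configuration would give this parameter. This is a sketch under assumptions: it expects "name value" lines in the form shown earlier in the thread (st_ats_enabled 0), falls back to the default of 1 stated above when the parameter is absent, and the helper names are made up. Where the configuration text comes from (for example a copy of each host's kernel system file) is left to you.

```python
def tunable_value(config_text, name, default):
    """Return the last integer value set for name in 'name value' lines.

    Lines whose first field is not exactly name are ignored, so a commented
    or unrelated line does not count. Returns default if name is never set.
    """
    value = default
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == name:
            value = int(fields[1])
    return value


def ats_disabled(config_text):
    """True if st_ats_enabled is explicitly set to 0 (default assumed to be 1)."""
    return tunable_value(config_text, "st_ats_enabled", default=1) == 0


if __name__ == "__main__":
    sample = "maxusers 64\nst_ats_enabled 0\n"  # illustrative content only
    print(ats_disabled(sample))                 # True
    print(ats_disabled("maxusers 64\n"))        # False
```

This only reads configuration text; it does not tell you what the running kernel is using, which still needs to be confirmed on the host itself after the kernel is rebuilt and the system rebooted.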