SCSI Reset on SAN

We have implemented the following SAN at our customer's site:

2 fabrics, each with 2 SAN Switch 16
1 Compaq ESL9000 tape library connected via Compaq MDR
6 NetWare Servers attached each via 2 Compaq FC-AL HBAs
10 Win2k Servers attached each via CPQ KGPSA HBAs
3 AIX Servers attached via Cambex HBA

The following zoning is configured:
1 zone for Win2K Servers
1 zone for NetWare Servers
1 zone for UNIX Servers
1 zone for Backup. The backup zone has a mix of NetWare, AIX and Win2k servers that back up via the SAN.

The backup software is SyncSort BackupExpress.

The following problem comes up:

During backup, a SCSI reset causes the drives to rewind. As the backup continues to write data to the tape mounted in the drive, the existing data on the tape is overwritten. The problem is only noticed by the backup operator once the volume serial has been overwritten by the backup software. At that point the BackupExpress server cannot mount the tape, as the volume serial no longer exists.
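
To make the failure mode concrete, here is a toy Python model of it. It is only a sketch: ToyTape, blind_backup and guarded_backup are made-up names, the "tape" is just a list of blocks with the volume serial in block 0, and it assumes the reset rewinds to the beginning of the tape, as described above. It says nothing about how BackupExpress actually works internally.

```python
class ToyTape:
    """Toy model of a tape: a list of blocks plus a head position."""

    def __init__(self, label):
        self.blocks = [label]   # block 0 holds the volume serial
        self.position = 1       # head sits just after the label

    def write(self, block):
        # Writing at the head position discards everything from there on,
        # then appends the new block.
        del self.blocks[self.position:]
        self.blocks.append(block)
        self.position += 1

    def scsi_reset(self):
        # Assumption: the reset rewinds the drive to beginning of tape.
        self.position = 0


def blind_backup(tape, data, reset_before=None):
    """Write every block without checking where the head is."""
    for i, block in enumerate(data):
        if i == reset_before:
            tape.scsi_reset()   # simulated reset in the middle of the job
        tape.write(block)


def guarded_backup(tape, data, reset_before=None):
    """Track the expected position and abort if the head has moved."""
    expected = tape.position
    for i, block in enumerate(data):
        if i == reset_before:
            tape.scsi_reset()
        if tape.position != expected:
            raise RuntimeError("tape position changed unexpectedly")
        tape.write(block)
        expected += 1
```

With a reset injected before the third block, blind_backup leaves a tape whose first block is backup data instead of the volume serial, while guarded_backup stops and the label survives.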

We have already been in contact with HP, as the problem was first seen several months ago. HP replaced the SDLT drives with new ones. The new drives have firmware rev 5.1; the library has firmware rev 3.31.

The driver for the KGPSA HBA is 5.4.53a7 with firmware 3.82a1
The drivers used on NetWare and AIX are at their latest level.

Has anyone experienced these problems, and how can we get rid of them?

Thank you for your help



Yes, we have and, to be honest, still do.

I understand Windows is great for sending SCSI resets when you least expect it, so we no longer have our Windows boxes attached to the library and instead back them up via the network.

Also, you need to make sure you have turned off tape discovery within stm.

You need to make sure you have the latest patches for stape, SCSI core, and Fibre Channel. Make sure the "st_san_safe" kernel parameter is also set.

One other thing you can try is to enable the "st_ats_enabled" kernel parameter. This attempts to reserve the tape drive (in the bridge) so that it is immune to SCSI resets. The only problem with this last one is that if DP is not shut down correctly, you might be left with a SCSI reservation on a device, which you will then need to release with "mt" or "st".

We have significantly reduced our SCSI resets but haven't managed to eliminate them completely.

Hope this helps.

Likewise here. We do our Windows backups over the network and all HP-UX backups over the fibre.

We did have an issue where, if a server holding a drive was rebooted or there was a SCSI reset, the server would start querying all the SCSI drives as it came back up, thereby failing all the backups.

That has since been resolved by a SCSI patch from HP. I just can't seem to recall which patch.

First of all, please check that everything is in fabric mode (in the switchshow output you should see all ports in F-port mode).
Second, please upgrade the driver to 4.82a16 and the firmware to the latest, and modify registry parms to:
Please note that Topology=1 will force the card to work in fabric mode.
Please do not use Emulex drivers if you have FCA adapters. Download the KGPSA driver from the HP site.
Also check that the unix (I hope HP-UX) hosts run the latest FC driver and are patched appropriately.
Please move everything to fabric. If a reset is issued on an FC-AL loop, all devices on the loop get reset.
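
If you have many ports to go through, the check for loop-mode ports can be scripted. The Python sketch below is only an illustration: the sample text is invented, and the line layout it expects ("port N: ... X-Port ...") is an assumption about what switchshow prints on your firmware, so adjust the pattern to your real output. port_modes and loop_ports are made-up helper names.

```python
import re

# Assumed layout of a port line, e.g. "port  1: id N1 Online  L-Port 1 public".
PORT_LINE = re.compile(r"^port\s+(\d+):.*?\b([A-Z]+)-Port\b")


def port_modes(switchshow_text):
    """Map port number -> mode string ("F-Port", "L-Port", ...)."""
    modes = {}
    for line in switchshow_text.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            modes[int(match.group(1))] = match.group(2) + "-Port"
    return modes


def loop_ports(switchshow_text):
    """Return the ports that logged in as loop (L-Port) devices."""
    modes = port_modes(switchshow_text)
    return sorted(port for port, mode in modes.items() if mode == "L-Port")


# Invented sample, not captured from a real switch.
SAMPLE = """\
port  0: id N2 Online         F-Port 50:06:0b:00:00:aa:bb:01
port  1: id N1 Online         L-Port 1 public
port  2: id N2 No_Light
port  3: id N2 Online         E-Port 10:00:00:60:69:aa:bb:02
"""

if __name__ == "__main__":
    print("Ports still in loop mode:", loop_ports(SAMPLE))
```

Any port this reports is a candidate for the Topology change above; ports with no device logged in are simply skipped.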

Here are a couple of places to look for information on backups in a SAN environment.

EBS (Enterprise Backup Solutions) is focused on making sure that SAN backups work and work well. The goal of this team is to provide documentation as to what we know works and the best way to configure the systems and software.

Here are a couple of things I would suggest you do to make your config more stable.
1) Delete the backup zone that has all servers and libraries in a single zone.
2) If the existing zones for each OS type contain only servers that should see the library, then use those zones and add the library to each OS-specific zone.
3) If the existing zones won't work, then create a zone for each OS that will contain the library and any server that should see the library.
4) Follow the instructions already mentioned about setting ResetTPRLO=2. This will get rid of any bus resets from Windows, as stated in the EBS guide.
5) Definitely upgrade the drivers and firmware on the HBAs.
6) Download L&TT from: It will allow you to update the library and tape drives to the latest firmware over the SAN.
7) READ the EBS design guide; it will help.
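
As an illustration of points 1) to 3), the Python sketch below works out one backup zone per OS from a host inventory. Everything in it is hypothetical: the host names, the "mdr1"/"mdr2" member names for the library side, the Backup_<OS> naming, and the build_backup_zones helper. It only plans the membership; creating the zones on the switches is still up to you.

```python
def build_backup_zones(hosts, library_ports):
    """Plan one backup zone per OS.

    hosts: dict of host name -> (os_name, uses_san_backup)
    library_ports: members representing the library/router side
    Returns a dict of zone name -> list of members.
    """
    zones = {}
    for host, (os_name, uses_san_backup) in sorted(hosts.items()):
        if not uses_san_backup:
            continue  # hosts that back up over the network stay out
        # Every per-OS zone starts with its own copy of the library members.
        zone = zones.setdefault("Backup_" + os_name, list(library_ports))
        zone.append(host)
    return zones


if __name__ == "__main__":
    inventory = {
        "nw1": ("NetWare", True),
        "w2k1": ("Windows", True),
        "w2k2": ("Windows", False),
        "aix1": ("AIX", True),
    }
    for name, members in build_backup_zones(inventory, ["mdr1", "mdr2"]).items():
        print(name, members)
```

The point of the layout is that no zone ever contains hosts of two different OS types, so one OS misbehaving does not hit the HBAs of the others.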

I have been working in EBS for three years now as a Windows guy and haven't seen any bus resets from Windows -- when the driver parameters are set correctly.

We don't directly test BackupExpress, but they do come into our Lab and run our tests using their software to get onto the EBS compatibility matrix. Oh, the matrix is a very useful tool located at

Let me know if any of this helps.

Regarding the zoning: each OS has its own backup zone containing the 2 MDRs and the HBAs used to perform SAN backup.

I changed ResetTPRLO to 1 and checked whether Removable Storage (RSM) is enabled on the Win2K servers in the Backup_Windows zone. RSM is disabled on both servers.
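
Note that the earlier replies suggested Topology=1 and ResetTPRLO=2, while the value set here is 1. A quick way to spot such differences across several servers is sketched below in Python. It assumes you have copied each server's driver parameters out by hand as a "Name=Value;Name=Value" string; that format, and the parse_params and mismatches helpers, are assumptions for illustration, not something taken from the driver documentation.

```python
def parse_params(text):
    """Parse 'Name=Value;Name=Value' into a dict of strings."""
    params = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return params


def mismatches(current_text, wanted):
    """Return {name: (current_value, wanted_value)} for every difference.

    A parameter missing from current_text shows up with None as its value.
    """
    current = parse_params(current_text)
    return {
        name: (current.get(name), value)
        for name, value in wanted.items()
        if current.get(name) != value
    }


if __name__ == "__main__":
    # Values suggested earlier in this thread.
    suggested = {"Topology": "1", "ResetTPRLO": "2"}
    print(mismatches("Topology=1; ResetTPRLO=1", suggested))
```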

I also checked whether the AIX server uses the rmt device without rewinding. It does (rmt2.1 - no rewind on close, no retension on open).

How can I disable the reset of the SCSI bus during router boot cycles on a Compaq MDR?

To Mark Grant: Can you explain the abbreviations you are using? I do not understand some of them! Where do you set these parameters?

Thank you

Zotto, unfortunately the abbreviations I used all relate to HP-UX, and they are kernel parameters. On their own, they probably won't help much in your mixed environment.

If any of you are still having SCSI resets from your Windows hosts, one thing to check is whether any of the Windows boxes that can see the library are running Insight Management Agents 6.40 (or 6.20/30 with softpaqs). The timeout for polling was set too low, and it could cause a reset if the library was executing a command that took a while (e.g. move or inventory).

If you have the Agents mentioned above, the solution is:

Either upgrade the agents (see the downloads for your server), or disable the "fibre array information" agent. You do that by going to Control Panel, selecting "HP Management Agents", and then disabling this particular agent.
