Tape Libraries and Drives

MSL8096 LTO5 - CRC and cleaning requests after backup.

Grzegorz_Karnas
Occasional Visitor

We have a problem with an MSL8096 (4x LTO5 FC drives) directly connected to a DL380 G5 with 2 x HP 82Q 8Gb Dual Port HBAs.
We have a lot of failed backups with CRC errors and a Clean Tape Drive status at the end.

Some technical data:
MSL firmware: 10.00 (9.90 was used for testing too)
Tape firmware: I3BW (I25W was used for testing too)
HBA 82Q: Firmware 5.03.06 - driver STORport 9.1.8.28 - BIOS 2.15
OS: Windows 2003 R2 SP2 x86
Backup: Networker (with LTO-5 patch)

Topology:
The library is directly connected to the HBA ports: Tape 1 to HBA 1 Port 1, Tape 2 to HBA 1 Port 2, etc., with OM3 LC-LC cables. We tried both the Loop and Automatic settings on the HBA ports and the tape drives.
We also tried connecting through an FC switch, with zoning set up so that each HBA port can see only one tape drive. The port setting (both sides) was Automatic, which negotiates as Fabric (Point-to-Point).

Backup procedure: a full or incremental backup, followed by tape cloning.

Errors (from System Event Analyser - SEA):

A change in the current health status has been reported by the MSL 8096 Tape Library controller. The current status is now at WARNING.
Failure information:
Drive tape alert Check the TapeAlert log in LTT.
FRU Position number: 03
Supporting failure information:
Error code: (0x0084=Drive Warn or Crit Tape Alert flag )
FRU code: (Event Code: =0x84 - tape alert )
Location code: (;Module error code: =0x25, 37;Current command: =0x12 - rescan)

A change in the current health status has been reported by the MSL 8096 Tape Library controller. The current status is now at WARNING.
Failure information:
Drive requires cleaning. Insert a cleaning cartridge and perform cleaning on this drive.
FRU Position number: 01
Supporting failure information:
Error code: (0x0082=Drive Cleaning request )
FRU code: (Event Code: =0x82 - cleaning request )
Location code: (;Module error code: =0x25, 37;Current command: =0x12 - rescan)
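
For anyone collecting these SEA events in bulk, here is a minimal Python sketch that pulls the FRU position and error code out of event text shaped like the two examples above. The function name and the regular expressions are assumptions based only on the text quoted in this post; other SEA versions may format these lines differently.

```python
import re

# Patterns follow the SEA event text quoted in this post (an assumption,
# not a documented SEA format).
FRU_RE = re.compile(r"FRU Position number:\s*(\d+)")
CODE_RE = re.compile(r"Error code:\s*\((0x[0-9A-Fa-f]+)=([^)]*)\)")


def parse_sea_event(text):
    """Return (fru_position, error_code, description), or None if no match."""
    fru = FRU_RE.search(text)
    code = CODE_RE.search(text)
    if not (fru and code):
        return None
    return int(fru.group(1)), int(code.group(1), 16), code.group(2).strip()


sample = """Drive requires cleaning. Insert a cleaning cartridge and perform cleaning on this drive.
FRU Position number: 01
Supporting failure information:
Error code: (0x0082=Drive Cleaning request )
"""

print(parse_sea_event(sample))
```

The result is a plain tuple, so a list of parsed events can be grouped by FRU position or error code with ordinary Python tools.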

What we have tried:
- tests on different tapes (LTO4 and LTO5)
- physically swapping the tape drives
- changing firmware/driver versions
- changing topology
- changing FC speed (8Gb, 4Gb)
- disabling SCSI reset on the QLogic HBA in the Windows registry, under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ql2300\Parameters\Device, by changing
"DriverParameters"="UseSameNN=0;"
to:
"DriverParameters"="UseSameNN=0;rstbus=2;tapereset=0"
This was recommended by HP, and Legato agrees with it.
- changing the block size on the HBA (from 512k to 128k; Windows and Legato have the block size set to 128k)
- and of course cleaning the tape drives many times
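
The DriverParameters change in the list above can be sketched as a small string merge. This is only an illustration: `merge_driver_parameters` is a hypothetical helper, it assumes the value is a semicolon-separated list of key=value pairs as shown in the post, and it only builds the new string. It does not touch the registry.

```python
def merge_driver_parameters(current, updates):
    """Merge key=value settings into a semicolon-separated parameter string.

    Hypothetical helper: the format is inferred from the values in this post.
    """
    settings = {}
    for item in current.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing semicolon
        key, _, value = item.partition("=")
        settings[key.strip()] = value.strip()
    settings.update(updates)  # new keys are appended, existing keys overwritten
    return ";".join(f"{key}={value}" for key, value in settings.items())


new_value = merge_driver_parameters(
    "UseSameNN=0;", {"rstbus": "2", "tapereset": "0"}
)
print(new_value)  # UseSameNN=0;rstbus=2;tapereset=0
```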

Errors still appear randomly on different tape drives. It does not matter whether it is a backup or a cloning job, but they are more frequent during tape cloning (maybe it depends on the data flow speed).

Does anybody have any clue what we can do to solve this problem?
3 REPLIES
Curtis Ballard
Honored Contributor

Re: MSL8096 LTO5 - CRC and cleaning requests after backup.

The drive sets the cleaning request based on results from a read-after-write scan. If the drive is requesting cleaning, then the read-back detected that it wasn't able to read the written data with sufficient margin.

If the problems happen "randomly" I would recommend spending a bit more time looking for a pattern. Does it happen more frequently on one drive? How about with a few tapes or a tape brand?
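
One simple way to look for such a pattern is to tally failures per drive and per tape. The sketch below is only an illustration: the record layout and the sample entries are made-up placeholders, not data from this library.

```python
from collections import Counter


def tally_failures(events):
    """Count failures per drive and per tape from a list of event records."""
    by_drive = Counter(event["drive"] for event in events)
    by_tape = Counter(event["tape"] for event in events)
    return by_drive, by_tape


# Placeholder records for illustration only (not real log data).
events = [
    {"drive": 1, "tape": "TAPE-A", "error": "cleaning request"},
    {"drive": 3, "tape": "TAPE-B", "error": "tape alert"},
    {"drive": 1, "tape": "TAPE-B", "error": "cleaning request"},
]

by_drive, by_tape = tally_failures(events)
print(by_drive.most_common())  # drives ordered by failure count
print(by_tape.most_common())   # tapes ordered by failure count
```

If one drive or one tape clearly dominates the counts, that points at hardware or media; an even spread points somewhere else.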

Are there any environmental issues that might be causing the drive or media to become contaminated? (smoke, dust, printers nearby, salt or chlorinated water nearby)
Grzegorz_Karnas
Occasional Visitor

Re: MSL8096 LTO5 - CRC and cleaning requests after backup.

In the past I have seen tape cleaning requests that were actually caused by FC/SAN problems. They were usually false alarms, and the real problem was somewhere completely different.

3 of the 4 tape drives were replaced during the tests and we still get errors. LTO is usually very resistant to dust and other dirt.

At the beginning most occurrences of this error were on Tape1, but that was because most cloning sessions used this drive as the target. After we changed the cloning target to other tape drives, the errors spread to them as well.

Environment: totally clean. It is a datacenter with restricted access.

We have now installed an MSL4048 (2 x LTO4) in this environment in place of the MSL8096, with the same server, backup software, connections, etc. We are still using the same tapes (LTO4) and it has worked fine so far (this is the 4th day).
Eemans Dany
Honored Contributor

Re: MSL8096 LTO5 - CRC and cleaning requests after backup.

Hi,

Perhaps this is a stupid question, but if you use an MSL8096 with 4 FC drives connected to a DL380 with 2 dual-port FC HBA cards (four ports in total) and you have broken connections, is the backplane (PCI riser) able to sustain the data transfer?
4 x 140 MB/sec native transfer.

You say it works with an MSL4048 with 2 FC drives; I suspect that each drive is connected to a port on a different HBA card.

Try the same with the MSL8096, but connect only 2 drives (for example 1 and 3) and run the same backups as on the MSL4048.

In this case I think that the backplane of the server is not able to sustain the load generated by the library.
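
The arithmetic behind this can be sketched in a few lines of Python. The 140 MB/s per-drive native rate is the figure quoted above; the host limit of 400 MB/s is a placeholder assumption and would have to be measured or looked up for the actual server and riser.

```python
LTO5_NATIVE_MB_S = 140  # per-drive native rate quoted in this post


def aggregate_mb_s(drive_count, per_drive_mb_s=LTO5_NATIVE_MB_S):
    """Total native throughput when all drives stream at the same time."""
    return drive_count * per_drive_mb_s


def can_sustain(drive_count, host_limit_mb_s, per_drive_mb_s=LTO5_NATIVE_MB_S):
    """True if the assumed host limit covers the combined drive throughput."""
    return aggregate_mb_s(drive_count, per_drive_mb_s) <= host_limit_mb_s


ASSUMED_HOST_LIMIT_MB_S = 400  # placeholder, not a measured DL380 G5 value

print(aggregate_mb_s(4))  # 560
print(aggregate_mb_s(2))  # 280
print(can_sustain(4, ASSUMED_HOST_LIMIT_MB_S))  # False
print(can_sustain(2, ASSUMED_HOST_LIMIT_MB_S))  # True
```

With compressible data the drives can take in more than the native rate, so the real load may be higher than these numbers.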

Dany