Dat24i Failure

Ron Reaugh
Occasional Contributor

I have a customer with repeated Veritas backup (BENT 8.5) failures to an HP Dat24i (fast narrow SCSI-2). It's the only device on the properly terminated SE bus segment of a 2940U2 under NT4 Server SP6.

Backups failed about 30% of the time before I identified that the drive was inaccurately reporting the tape cartridge as write protected (the tab) when the tab was in the write-enabled position. That BENT message is gone by the time you get there in the morning; by then the alert just wants a tape inserted. I got lucky one day and found the Veritas ~"tape cartridge is write protected" message when I caught an actual failure in progress. Flashing the drive from L105 to L111 firmware caused backups to succeed at over a 90% rate.

That brings up the second problem, which the customer reported but I'd never seen myself until today. The customer is on their third HP Dat24i drive now because of this problem. Backup is hung/waiting and the tape gets ejected just like the above-described (misreporting write tab) case. The tape is ejected in the morning as it's supposed to be. By the next morning the alert is just like the one in the previous case (insert tape). In this case, however, if one inserts a tape, the tape goes into the drive normally and normal drive operation seems to follow. Yet BENT continues to report that no cartridge is present (empty). Now the eject button on the drive does not work, and a power down is required to regain tape drive functionality.

When I caught this case this morning I tried a number of things to knock the tape drive loose. What finally did it was pulling the 4-pin Molex power plug on the drive on the fly. Backup took off running normally and instantly on reconnection.

Two LVD HDs, including the system HD, are on the LVD bus segment of the 2940U2, so generally SCSI remained functional at all times.

My conclusion is that the HP Dat24i is a flawed model. No SCSI device should ever enter a hung state whereby power interruption is required to regain functionality. This model does it across three physically different drives and two firmware versions.

Does anyone have any experience/observations on this issue?

Anyone?



2 REPLIES
Vincent Farrugia
Honored Contributor

Re: Dat24i Failure

Hello,

Make sure you do not have pending jobs in the BENT software; clear all jobs.

Generally, a "tape is write-protected" error has one of the following causes:

1. You inserted a 60m tape cartridge.
2. Firmware issue.
3. Bad tape.
4. Bad drive.
5. Tape is write-protected :-)

Have you tried different cartridges? Maybe the cartridge is at fault here.

Also, old 60m tape cartridges will ALWAYS bring up that message when inserted, no matter how the slider on the write-protect notch is set.

You also have a slightly bad SCSI design to be honest. You never put LVD devices with SE devices. Dat24i is a Single-Ended (SE) device. Doing so, the WHOLE bus goes to SE speed, thus your nice fast SCSI-2 LVD disk drives run at a considerably slower speed.

Here are also documents about what to do when media is stuck in the tape drive.

http://h20000.www2.hp.com/bizsupport/TechSupport/SupportTaskIndex.jsp?locale=en_US&taskId=8413&taskName=troubleshoot+a+problem&prodSeriesId=42846&prodTypeId=12169&prodSeriesName=hp+surestore+dat24+drives&supportTaskId=78253&supportTaskName=media+load%2Funload

I know you wanted a more black-on-white answer, but I hope I made things slightly clearer to you.

HTH,
Vince
Tape Drives RULE!!!
Ron Reaugh
Occasional Contributor

Re: Dat24i Failure

"2. Firmware issue."

Yes. As I already reported in my original post, the false "write protect tab" report was fixed (well, made an order of magnitude less likely at least) by flashing from L105 to L111.

"Maybe the cartridge is at fault here."

Not likely. All 11 are HP DDS-3, and the cartridge that is in when a false report occurs works just fine before and after the false report with the tab untouched and in the write-enable position. The false write-protect reports seem to hit any one of the 11 at random.

"You also have a slightly bad SCSI design to be honest. You never put LVD devices with SE devices. "

Nope. Study the design of the Adaptec single-channel, dual-bus-segment controllers like the 2940U2W (2940U2) and 29160. They are designed for exactly the purpose of allowing SE and LVD on the same SCSI channel but on different bus segments, which is how I have them configured and how I described it in my original post.

"Doing so, the WHOLE bus goes to SE speed, thus your nice fast SCSI-2 LVD disk drives run at a considerably slower speed. "

Nope. SE devices on the SE bus segment run at fast (wide) SCSI-2 SE speeds, and LVD devices on the LVD bus segment run at LVD speeds, all while on the same SCSI channel but on different bus segments of a 2940U2[W] or 29160. For exactly these reasons the term "SCSI bus" is outdated and ambiguous and should be avoided. The correct current terminology is a SCSI channel, which is the scope over which SCSI IDs must be unique, and a bus segment, which is contiguous copper and a subset of a SCSI channel. Two SCSI bus segments on a card like the 2940U2[W] are connected by a bridge chip to make a single SCSI channel.
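
To make the channel vs. bus segment distinction concrete, here is a minimal Python sketch of that terminology. It is only an illustration of the definitions above, not anything taken from Adaptec documentation; the class names and the SCSI IDs used in the example (host adapter at 7, tape at 4, disks at 0 and 1) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BusSegment:
    """One run of contiguous copper with a single signalling type."""
    signalling: str                      # "SE" or "LVD"
    device_ids: List[int] = field(default_factory=list)


@dataclass
class ScsiChannel:
    """A set of bus segments joined by a bridge; IDs are unique channel-wide."""
    host_adapter_id: int
    segments: List[BusSegment]

    def id_conflicts(self) -> Set[int]:
        """Return every SCSI ID that is used more than once on the channel."""
        seen = {self.host_adapter_id}
        conflicts: Set[int] = set()
        for segment in self.segments:
            for dev_id in segment.device_ids:
                if dev_id in seen:
                    conflicts.add(dev_id)
                seen.add(dev_id)
        return conflicts


# Hypothetical layout resembling the setup described in this thread:
# one SE tape drive on the SE segment, two LVD disks on the LVD segment.
channel = ScsiChannel(
    host_adapter_id=7,
    segments=[
        BusSegment("SE", [4]),
        BusSegment("LVD", [0, 1]),
    ],
)
print(channel.id_conflicts())
```

The point of the model is that the uniqueness check runs over the whole channel, while the signalling type is a property of each segment.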

The primary point of my post is that over a 90 day period I have isolated an apparent endemic design flaw in the Dat24.

The HP DAT manual: http://h200002.www2.hp.com/bc/docs/support/SupportManual/lpg29150/lpg29150.pdf

says on page 8: "emergency unload
If you press the Eject button when the drive is busy, the drive may take some time to respond because it will finish the task it is performing first. This ensures that no data is lost. On rare occasions, however, a system or software fault may cause the tape drive not to respond to an Unload request. In this situation, you can force ejection.
There are two ways of doing this:
Press the Eject button three times within 5 minutes.
Hold the Eject button down for at least 15 seconds.
Following either of these actions, the drive waits until 35 seconds have passed from the time of the first press, to give the normal eject procedure a chance to proceed. After this period, it immediately releases the tape and ejects the cartridge, regardless of what operation it was performing. The drive is then reset as though you had turned the power off and then on again.
Caution: You may lose data if you force ejection of a cartridge. The tape may also become unreadable because an EOD (End of Data) mark may not be properly written."
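
For reference, the timing rules in that excerpt can be sketched in a few lines of Python. This is only my reading of the manual text, not HP firmware logic: the function name, the `(start, held)` press representation, and the assumption that the 35 second wait counts from the first press of the triggering sequence are all mine.

```python
from typing import List, Optional, Tuple

HOLD_SECONDS = 15.0             # "Hold the Eject button down for at least 15 seconds"
TRIPLE_WINDOW_SECONDS = 300.0   # "three times within 5 minutes"
GRACE_SECONDS = 35.0            # wait counted from the first press


def forced_eject_time(presses: List[Tuple[float, float]]) -> Optional[float]:
    """Return when a forced eject should happen, or None if never triggered.

    Each press is (start_time_seconds, seconds_held).
    """
    presses = sorted(presses)
    candidates = []
    for i, (start, held) in enumerate(presses):
        # Rule 1: a single press held long enough.
        if held >= HOLD_SECONDS:
            candidates.append(max(start + HOLD_SECONDS, start + GRACE_SECONDS))
        # Rule 2: this press is the third within the five minute window.
        if i >= 2:
            first = presses[i - 2][0]
            if start - first <= TRIPLE_WINDOW_SECONDS:
                candidates.append(max(start, first + GRACE_SECONDS))
    return min(candidates) if candidates else None


# Three quick presses: the eject is still held back until the grace period ends.
print(forced_eject_time([(0.0, 1.0), (10.0, 1.0), (20.0, 1.0)]))
```

A healthy drive should eject at the returned time; the hang described here is the case where the drive never does, whatever the button sequence.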


Three drives across two different firmware versions arrive at a state whereby the above "emergency unload" fails to operate. A drive power down fixes such a hang. Further, if one searches these forums, one finds an undercurrent of reports of these hard eject-button hangs.
For me these failures occur in about 1 of every 15 backups, which makes the failure low incidence but still way too high.

I suspect that this one went by under the radar because of the low incidence.