
Re: HP Dat24 DDS-3 Drive Model Flawed?

 
Ron Reaugh
Occasional Contributor

HP Dat24 DDS-3 Drive Model Flawed?

I have a customer that has repeated Veritas backup (BENT 8.5) failures to an
HP Dat24i (fast narrow SCSI-2). It's the only device on the properly
terminated SE bus segment of a 2940U2 under NT4 Server SP6.

Backups failed about 30% of the time before I identified that the drive was
incorrectly reporting the tape cartridge as write protected (the tab)
when the tab was in the write-enabled position. That Veritas message is
gone by the time you get there in the morning. I got lucky one day and found
the Veritas ~"tape cartridge is write protected" message when I caught an
actual failure. Flashing the drive from L105 to L111 firmware raised the
backup success rate to over 90%.

That brings up the second problem that the customer reported but I'd never
seen myself until today. The customer is on their third HP Dat24i drive now
because of this problem. Backup is hung/waiting and the tape gets ejected
just like the above described case. The tape is ejected in the morning as
it's supposed to be. The alert is just like those you see in the previous
case by the next morning. In this case, however, if one inserts a tape, the
tape goes into the drive normally and normal drive operation seems to
follow. Yet Veritas continues to report that no cartridge
is present (empty). Further, the eject button on the drive does not work and
a power down is required to regain tape drive functionality.

When I caught this case this morning I tried a number of things to knock the
tape drive loose. What finally did it was pulling the 4-pin Molex power
plug on the drive on the fly. Backup took off running normally the
instant power was reconnected.

Two LVD HDs including the system HD are on the LVD bus segment of the 2940U2
so generally SCSI remained functional at all times.

My conclusion is that the HP Dat24i is a flawed model. No SCSI device
should ever enter a hung state whereby power interruption is required to
regain functionality. This model does it over three physically different
drives and two firmware versions...three strikes and you're out in this arena
is my rule.

Does anyone have any experience/observations on this issue? Is the Seagate
STD224000N Scorpion 24 4mm DAT (DDS-3) a good alternative which maintains
DDS3 media compatibility??

Anyone?



4 REPLIES
Steve W
Trusted Contributor

Re: HP Dat24 DDS-3 Drive Model Flawed?

Three drives ALL with the exact same problem??? *Highly* unlikely. I would look at other common factors... power supply, cabling, faulty terminator, term power and grounding just for starters. The problems all sound like they have the same root cause anyway. The BU software probably tries to write to the drive halfway through a backup and can't, so it erroneously decides that the cart is write protected.
Rothery Harris
Trusted Contributor

Re: HP Dat24 DDS-3 Drive Model Flawed?

Dear Ron,
SteveW is quite correct. This is a scenario that is often seen where multiple drives are changed. The customer usually blames the drive. Repair centre analysis confirms this, as many drives are found to be fault free. The only answer is to carefully carry out a problem investigation. It's often worth listing what is not working and also listing what is working. Process of elimination is really worthwhile. Try backing up with NTBackup and see if that is OK. I would be careful about switching to another drive without identifying the problem.
If you run HP Library & Tape Tools you can produce a support ticket (you may have done this already). This accesses some internal logs within the drive.

Regards
Rothery
Ron Reaugh
Occasional Contributor

Re: HP Dat24 DDS-3 Drive Model Flawed?

"Three drives ALL with the exact same problem??? *Highly* unlikely."

Nope, highly likely if a fundamental design flaw is present as I suspect.

"The problems all sound like they have the same root cause anyway."

Yep.

"The customer usually blames the drive."

Yep but in my case careful analysis shows substantial justification.

"Process of elimination is really worthwhile."

Quite correct and I've carefully done that if you read my original post and what is included below.

HP DAT manual:
http://h200002.www2.hp.com/bc/docs/support/SupportManual/lpg29150/lpg29150.pdf
page 8 "emergency unload
If you press the Eject button when the drive is busy, the drive may take some time to respond because it will finish the task it is performing first. This ensures that no data is lost. On rare occasions, however, a system or software fault may cause the tape drive not to respond to an Unload request. In this situation, you can force ejection.
There are two ways of doing this:
Press the Eject button three times within 5 minutes.
Hold the Eject button down for at least 15 seconds.
Following either of these actions, the drive waits until 35 seconds have passed from the time of the first press, to give the normal eject procedure a chance to proceed. After this period, it immediately releases the tape and ejects the cartridge, regardless of what operation it was performing. The drive is then reset as though you had turned the power off and then on again.
Caution: You may lose data if you force ejection of a cartridge. The tape may also become unreadable because an EOD (End of Data) mark may not be properly written."
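
For reference, the timing rules in the manual text quoted above can be written down as a small model. This is only a sketch of the documented behavior, not the drive's firmware: the function name and the (start time, hold duration) event format are made up for illustration, and it assumes "first press" means the first press of the qualifying sequence.

```python
from typing import List, Optional, Tuple

HOLD_SECONDS = 15.0       # "hold the Eject button down for at least 15 seconds"
PRESS_COUNT = 3           # "press the Eject button three times..."
PRESS_WINDOW = 5 * 60.0   # "...within 5 minutes"
GRACE_SECONDS = 35.0      # forced eject waits until 35 s after the first press


def forced_eject_time(presses: List[Tuple[float, float]]) -> Optional[float]:
    """Return when the manual says a forced eject should happen, or None.

    presses: (start_time, hold_duration) pairs in seconds, sorted by start.
    Hypothetical event format, assumed for this sketch only.
    """
    candidates = []
    for i, (start, held) in enumerate(presses):
        # Way 2: one press held for at least 15 seconds.
        if held >= HOLD_SECONDS:
            candidates.append(start + GRACE_SECONDS)
        # Way 1: this press plus the next two, all inside the 5 minute window.
        last = i + PRESS_COUNT - 1
        if last < len(presses):
            third_start = presses[last][0]
            if third_start - start <= PRESS_WINDOW:
                # Never earlier than 35 s after the first press of the trio.
                candidates.append(max(third_start, start + GRACE_SECONDS))
    return min(candidates) if candidates else None
```

By this reading, three quick presses or one long hold should both produce an eject and reset within about 35 seconds, so a drive that sits there indefinitely after either sequence is not following its own documented behavior.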


The "emergency unload" FAILS and therefore I conclude the drive is hard hung at the hardware (firmware) level. Three drives and two firmware versions exhibit this behavior at a rate of about one in fifteen backups.

If one does a search of these forums then one finds an undercurrent of reports of these exact same dead eject button hangs. There is substantial reason to believe that there remains an outstanding endemic, low-incidence Dat24 hardware/firmware failure.

Is it EVER reasonable for the drive to be in a dead eject button (dead emergency unload) state?? Doesn't that drive state automatically mean a flawed drive??

I've spent about 90 days slowly herding this issue into a corner. I think I've got it well described and analyzed.
Ron Reaugh
Occasional Contributor

Re: HP Dat24 DDS-3 Drive Model Flawed?

" I would look at other common factors... power supply, cabling, faulty terminator, term power and grounding just for starters. The problems all sound like they have the same root cause anyway."


All the SCSI cabling and terminations have been changed at least once.

The list of possible hoops that you'd ask a customer to go through to do HP's debugging for them is endless. The NT4 system in question has run without a single glitch for weeks except of course for the Dat24i hang. The same SCSI card (2940U2) has run the two LVD HDs without a glitch. The same power supply has run all other system components without a glitch.


"The only answer is to carefully carry out a problem investigation."


Exactly. A competent consultant must consider the cost to his client. The question is what is the most cost effective next step in the analysis and correction from my customer's standpoint. There is an HP Dat24i tape drive that is entering a hung state and refuses its own emergency eject operation. There is an undercurrent of such reports in HP's own forums. The obvious next debugging step is to try a different mfg's DDS-3 drive, don't you think? If such a different mfg's DDS-3 drive solves the problem then what would you conclude?
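
One caveat on reading the result of such a swap: with a hang rate of roughly one in fifteen, a few clean nights prove little. A rough sketch of the arithmetic follows, assuming each backup fails independently at a fixed rate (a simplification, and the function name is just for illustration).

```python
import math


def clean_runs_needed(failure_rate: float, max_chance: float = 0.05) -> int:
    """Smallest n where n clean backups in a row would happen by luck
    with probability at most max_chance, if the fault were still present.

    Assumes independent backups with a constant failure rate.
    """
    # Solve (1 - failure_rate) ** n <= max_chance for n.
    return math.ceil(math.log(max_chance) / math.log(1.0 - failure_rate))


# Hang rate of about one in fifteen backups, as reported above.
print(clean_runs_needed(1 / 15))
```

Under those assumptions the replacement drive would need on the order of 44 consecutive clean backups before luck could be ruled out at the 5% level.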