Operating System - OpenVMS

Problem with SDLT III (160/320 GB) tape drive

nipun_2
Regular Advisor

Hi,
I have OpenVMS 7.3-2 running on one DS25 that serves the system disk, plus three other nodes (two DS25s and one XP1000).


The server also has an external SDLT tape drive attached by a SCSI cable. We recently noticed the following error:

Note: TP0 is a logical name

===========================
%BACKUP-F-POSITERR, error positioning TP0:[000000]00001387_IMA.SAV;
-SYSTEM-F-OPINCOMPL, operation is incomplete
starting... done

FATAL-ERROR: copying files
MESSAGE: %BACKUP-F-POSITERR, error positioning !AS
CODE: 279150868
FACILITY: RTL
Program stopped
%DCL-W-SKPDAT, image data (records not beginning with "$") ignored

=====================================
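
For readers who want to check the numeric CODE in the log above, here is a minimal sketch that splits it into the standard OpenVMS condition-value fields (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27, control bits in bits 28-31). The function name decode_condition is hypothetical and only for illustration; it is not part of any VMS tool.

```python
def decode_condition(value):
    """Split a 32-bit OpenVMS condition value into its fields."""
    # Severity names for the low three bits; values 5-7 are reserved.
    severity_names = {
        0: "warning",
        1: "success",
        2: "error",
        3: "informational",
        4: "severe (fatal)",
    }
    severity = value & 0x7
    return {
        "hex": f"%X{value:08X}",
        "severity": severity,
        "severity_name": severity_names.get(severity, "reserved"),
        "message_number": (value >> 3) & 0x1FFF,   # bits 3-15
        "facility_number": (value >> 16) & 0xFFF,  # bits 16-27
        "control_bits": (value >> 28) & 0xF,       # bits 28-31
    }


if __name__ == "__main__":
    # CODE value taken from the log above.
    for name, field in decode_condition(279150868).items():
        print(f"{name}: {field}")
```

The value works out to %X10A38114 with severity 4, which agrees with the -F- in %BACKUP-F-POSITERR.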

We have replaced the tape drive, yet the error has not changed. In the past the terminator on the back of the drive was missing; it has recently been fitted, but the problem still appears.

The interesting aspect is that as soon as we remove the current tape and insert a "new" tape, the error disappears 9 times out of 10. The error usually occurs while data is being written to the tape from the hard drive.

Any thoughts and comments on where the problem might be?
11 REPLIES
Jan van den Ende
Honored Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

Nipun,

>>>
and insert a "new" tape 9/10 times the error disappears.
<<<

That means to me, most likely, you need to apply a cleaning tape.

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Hoff
Honored Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

Is there anything relevant logged in the system error log?

Is the error specific to certain SDLT cartridges?

I'd definitely look to apply the file system and related mandatory ECO kits for OpenVMS Alpha V7.3-2.

Is this SDLT drive configured on a shared multi-host SCSI bus? (That's not something HP supports, and it can certainly lead to weird and transient errors when a remote system tosses out a bus reset.)

The bus and bus termination and bus length would be other areas to troubleshoot. Bad cables and loose connections can cause weird and transient errors, for instance.

Loading and using a cleaning cartridge is a reasonable approach, but if this happened immediately after swapping the drive itself with another I'd wonder if there's another trigger lurking.

The HP recommendations for BACKUP process quota settings -- probably not the trigger here, but worth a look if for no other reason than performance -- are available at http://64.223.189.234/node/49
Guenther Froehlin
Valued Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

SYSTEM-F-OPINCOMPL, operation is incomplete

This indicates that BACKUP tried to read or skip into a not-yet-written portion of the tape. The underlying SCSI error is called "blank check".

It seems from the log that there is some kind of DCL script/program involved. One that tries to pre-position the tape for BACKUP?

/Guenther
Jon Pinkley
Honored Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

nipun,

Does the error light come on like it did when you reported this problem before?

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1124155

Does the error occur before data starts to be written to the tape, or well into the backup?

We do not have SDLT III drives, but we have SDLT1 (110/200) drives and other DLT drives back to DLT2000 (TZ87).

We have seen these errors under the following conditions:

1. Someone tries to append a saveset to a tape that has an incomplete saveset at the end, e.g. something caused BACKUP to abort while writing to tape, and at a later time someone tries to append another saveset to that tape.

I know of no way to allow any more data to be appended to such a tape. You can reinit it, you just can't append to it.

2. Errors while the backup is writing well into a saveset. When this happens, we generally mark the tape as "do not reuse", because it is cheaper to use another new tape than to try to determine if the tape is bad. We don't throw the tape away, what was successfully backed up to the tape is still readable, and until the "expiration date" of the data, it goes back into our tape library.

Cleaning the tape drive shouldn't be needed if you just replaced the drive. If it does need cleaning, I would wonder about the quality of the tapes you are using, the environment the tape drive is operating in, the replacement drive (was it replaced by field service? If so, you may have inherited someone else's problem), or even where you are storing tapes.

I would also "reseat" the SCSI cable at both the HBA and the external drive. As Hoff stated, do not connect the tape drive to a shared SCSI bus. I will go one step further and recommend a dedicated SCSI port (i.e. a PKx0: device; one half of a dual-port SCSI adapter is OK) for your tape drive. Then you can more easily do things like replacing the drive without taking the system down. (I know that this may not be considered "best practice", but it is often a requirement. Having only one device on the SCSI bus makes it easy to unload the tape from the drive, shut the power off to the tape drive, and then remove the connectors. I have done this many times with no problems.)

Good luck,

Jon
it depends
Jiri_5
Frequent Advisor

Re: Problem with SDLT III (160/320 GB) tape drive

When the last backup did not complete successfully and you try to write to the same tape, the backup finishes with an error. Did you try a new tape?
Anton van Ruitenbeek
Trusted Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

Nipun,

Still a question about the connection.
Is this SDLT III directly attached by SCSI to the Alpha, and using which kind of controller?
Has it worked in the past without errors?
Was something changed prior to these errors?
If directly attached, does it work with another tape unit (older version/model)? Also be aware of the SE/DIFF/LV etc. differences.

AvR
NL: Meten is weten, maar je moet weten hoe te meten! - UK: Measurement is knowledge, but you need to know how to measure!
nipun_2
Regular Advisor

Re: Problem with SDLT III (160/320 GB) tape drive

Hello all,
thank you for your response.

This is my Response1 (please mention this before you reply as I will be addressing the rest in following post):

I am not a full time administrator and am not very well versed with analyzing problems using the tools so I might need more help in this direction.

Currently, I am considering swapping the SCSI cable with a drive (DLT IV) that is not giving any problem and see how things go from there on.

I also have a service contract with HP, and the support has not been very helpful in this regard. This problem has been ongoing since June 2007.




Is there a way for me to contact the HP hardware experts so that they can guide me efficiently over the phone based on my specific case? I have tried calling the number provided through the contract; however, I just cannot seem to get through to someone who is an expert in hardware and well versed in OpenVMS.


Hoff
Honored Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

>>>I also have a service contract with HP and the support has not been very helpful in this regards. This problem is on going since June 2007.<<<

There's no direct telephone number into HP as you seek here; the appropriate approach is to ask for the manager on duty when you call into support, and to continue asking and escalating and contacting increasingly higher-level management until you get satisfaction.

The other approach is to bring in outside help. There are various folks here in ITRC and elsewhere that provide such hardware and software services, or that can act as a liaison with HP support for you.

>>>I am not a full time administrator and am not very well versed with analyzing problems using the tools so I might need more help in this direction.<<<

If you should seek to continue diagnosing this yourself, do continue swapping the gear around (drive, cable, media, etc.) as best you can. The key here is to determine what causes the problem to move or to resolve, and what does not. Standard "divide and conquer" troubleshooting techniques apply. That the error survived swapping the tape drive tends to indicate there's a problem upstream: in the cable, in the host software, or in the command processing.

Also take a look at the error logs with ANALYZE/ERROR/ELV, DIAGNOSE (DECevent), or WSEA (HP SEA) to see what (if anything) gets logged here.

Stephen Hoffman
HoffmanLabs LLC
nipun_2
Regular Advisor

Re: Problem with SDLT III (160/320 GB) tape drive

Response 2:

Jan:
Thanks for the response. I did use a cleaning tape, and today I got the same error again.

Hoff:
I have replaced the cable. Let's see if this works.

Maybe the problem is with the controller. I will keep this in mind if the problem resurfaces.

Guenther:
Yes, we use a program; however, the program has been in place for 2+ years and we have had no problems in the past. The same goes for the other tape drive (DLT IV), where I don't see any problems. I have also discussed this with the vendor, and he doesn't suspect the program, as he has other customers using it.

Jon:
First of all, thank you for your detailed check and the detailed information. Yes, it's the same problem.
My drive is external, so the SCSI cable runs directly from the DS25 AlphaServer to the tape drive.

Anton:
Good question about the controller. I do not know the name of the controller at this time,
but I do intend to find out.


======

At this point I am installing the LTTE software which will allow more detailed analysis of the problem.
Guenther Froehlin
Valued Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

nipun,

what is the name of this product? What is it for? An archiving utility? In DCL(-scripts)?

If this 'product' is using some form of skip-file to position on tape, try 'SET MAGTAPE my_tape_drive /FAST_SKIP=NEVER'. In some cases the SCSI tape drive and the VMS SCSI magtape driver could get out of sync with a fast filemark skip.

Can you try a 'BACKUP/LIST my_tape_drive:*.*' for a tape which produced the OPINCOMPL error? Did it list the last save set on tape successfully?


Others,

to recover a tape with a partial save set at the end just use BACKUP/LIST and list the last good save set on the media. BACKUP leaves the tape positioned after the last filemark for this file. Then do a '$ SET MAGTAPE my_tape_drive/END_OF_FILE' which creates a double filemark indicating the logical end of tape. Another BACKUP without rewind should then append another save set.

/Guenther
Jon Pinkley
Honored Contributor

Re: Problem with SDLT III (160/320 GB) tape drive

After responding to the somewhat related thread
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1174970

I did some testing, and can verify that Guenther's method of "deleting" a partial saveset at the end of a tape works.

When I had tried it before, I was using set magtape/skip=file:n to position the tape. And that command left the tape positioned just prior to the EOF1 and EOF2 records. When a set magtape/end_of_file tape: is issued there, you will not be able to append to the tape, as you will get a message similar to the following:

%BACKUP-F-LABELERR, error in tape label processing on MKE400:[000000]ITRCSHAD.F071107;
-SYSTEM-W-ENDOFVOLUME, end of volume

Using backup list leaves the tape positioned after the two EOF records, and that is where the set magtape/end_of_file tape: must be done.

If you are using set magtape/skip=file=n tape: ![where n = (save_set_number*3)-4] to position the tape, follow that with a dump /block=c:2 tape:, which then leaves the tape in the same position as if the backup/list had been done. The two blocks that are dumped should be 80 bytes each, and start with EOF1 and EOF2. Alternatively, you can use set magtape/skip=block=2 tape:, but then you lose the ability to verify that the two tape blocks were the correct ones.
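
As a worked example of the skip-count formula above, here is a minimal sketch in Python. It assumes the formula reflects three tape marks per saveset on a labeled tape (after the header labels, after the data, and after the EOF labels), which is an interpretation of the formula and not something stated outright. The function name skip_count is hypothetical.

```python
def skip_count(save_set_number):
    """Return n for 'set magtape/skip=file=n', using n = (save_set_number*3)-4.

    Assumption: each earlier saveset contributes three tape marks, and the
    skip should stop just before the EOF1/EOF2 records of the preceding
    saveset, i.e. one tape mark short of the start of the target saveset.
    """
    if save_set_number < 2:
        # For the first saveset the formula goes negative; no skip applies.
        raise ValueError("formula only applies from the second saveset onward")
    return save_set_number * 3 - 4


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(f"saveset {k}: skip {skip_count(k)} files")
```

For the second saveset this gives n = 2, and each later saveset adds three more.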

When using set magtape to position the tape prior to a saveset before a restore, it is not necessary to skip past the two EOF records; BACKUP will ignore them.

Jon
it depends