

 
The Brit
Honored Contributor

Problem with replacement LTO4 tapedrive.

Yesterday I replaced an LTO4 tape drive in an MSL4048 library.

After installing the drive, I loaded a scratch tape, initialized it, mounted it, and all seemed well.

Last night I ran a normal system backup (batch job) which appeared to be running fine until the last saveset, at which point the process appeared to hang.

Show Proc/Continuous showed the job was still running the Backup image; however, there was no IO, Buffered or Direct, for over an hour. At that point, I killed the job.

It took the system ~5 minutes to clean up the process and exit.

The problem now is that the tape drive keeps returning "medium is offline" messages whenever I try to initialize or mount a tape.

I am going to go onsite and cycle the power on the Library and drive to see if that helps.

If anyone has any other suggestions, I would appreciate them.

Dave.
22 REPLIES
P Muralidhar Kini
Honored Contributor

Re: Problem with replacement LTO4 tapedrive.

Hi Brit,

>> Last night I ran a normal system backup (batch job) which appeared to be
>> running fine until the last saveset, at which point the process appeared
>> to hang.
I guess you mean VMS Backup itself and not ABS/MDMS.

>> The problem now is that the tapedrive keeps returning "medium is offline"
>> messages whenever I try to initialize or mount a tape.
I guess you have already tried mounting different volumes on the tape drive,
just to rule out a bad volume.

Also, is the volume you used compatible with the drive?

If MRU (Media Robot Utility) is installed, you can use commands like
ROBOT SHOW DRIVE and ROBOT LOAD/UNLOAD to check whether MRU
also has problems accessing the drive.

If the problem persists for a while, you should consider cleaning the
tape drive.

Regards,
Murali
Let There Be Rock - AC/DC
Hoff
Honored Contributor

Re: Problem with replacement LTO4 tapedrive.

Based upon the description, this could be inferred to be a third-party or otherwise unsupported LTO tape drive, and the behavior here would then imply a compatibility issue.

Or this is a supported drive with incorrect firmware, or with a failure of some sort.

A bad SCSI connection.

Or bad media.

Or a command error within the procedure. Unfortunately, the last known latent bug has not yet been identified.

Cycling the drive might or might not clear the underlying error, though it may well clear the "medium is offline" stuff.

Check the error logs.

Check the batch log.

Check the OPCOM log.

Having a stuck process doesn't mean all that much.
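
As a rough sketch of the checks listed above, the DCL might look something like the following. The device name $2$MGA5: and the batch log path are taken from the SDA display posted further down this thread. The ELV form of ANALYZE/ERROR_LOG is an assumption about the OpenVMS version (V7.3-2 or later), since the thread never states the version; older systems would need a different error log tool.

```dcl
$ ! Device status and error count for the tape drive
$ SHOW DEVICE/FULL $2$MGA5:
$ ! Devices with a non-zero error count
$ SHOW ERROR
$ ! Translate recent error log entries (assumes ELV is available)
$ ANALYZE/ERROR_LOG/ELV TRANSLATE/SINCE=YESTERDAY SYS$ERRORLOG:ERRLOG.SYS
$ ! Look for operator messages that mention the drive
$ SEARCH SYS$MANAGER:OPERATOR.LOG "MGA5"
$ ! Tail of the batch log from the hung backup job
$ TYPE/TAIL=50 DSA10:[TESSCO.LOG_FILES.BACKUP]BKUP_PHASE1.LOG
```

A rising error count on the drive, or error log entries timestamped around the hang, would point toward hardware, media, or the connection rather than the backup procedure itself.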
Shriniketan Bhagwat
Trusted Contributor

Re: Problem with replacement LTO4 tapedrive.

Hi,

Is the BACKUP going to a single tape or to multiple tapes? Does it span over to another tape during the BACKUP?
Did you try the BACKUP with the /IGNORE=LABEL qualifier? What is the exact BACKUP command?

Regards,
Ketan
P Muralidhar Kini
Honored Contributor

Re: Problem with replacement LTO4 tapedrive.

Hi Brit,

>> You can use the MRU (Media Robotic Utility) commands like
When I said this, I assumed you had MRU installed. If MRU is not installed,
these commands cannot be used.

Also, please provide the output of "$ SHOW DEVICE/FULL" for the tape drive.
Does it show the status as ONLINE or OFFLINE?

Once the backup failed and you killed the job, were you able to unload
that volume from the drive successfully, or did the unload fail with the
medium offline error?
Maybe the volume got stuck in the drive for some reason.

Regards,
Murali
Let There Be Rock - AC/DC
The Brit
Honored Contributor

Re: Problem with replacement LTO4 tapedrive.

When I arrived on site, the clean drive light was on and the Attention LED was flashing.

I inserted a cleaning tape into the mailslot and used the front panel to initiate a clean on the new drive. The Clean LED subsequently went out.

The Attention LED was still flashing; however, when I moved a cassette into the drive, the Attention LED went out (although I can't say for sure that the two events were related).

Anyway, the good news is that I was able to initialize a tape and mount it. (Previously, the initialize was giving a parity error.)

I seem to be back at the point I was at after installing the replacement yesterday.

I will now run a test of the backup I tried last night to see if it is truly OK, or if I get the same outcome as last night.

I would like to thank the contributors for being connected on a Saturday. I will close the thread if all works out OK.

thanks

Dave.


P Muralidhar Kini
Honored Contributor

Re: Problem with replacement LTO4 tapedrive.

Hi Brit,

>> When I arrived on site, the clean drive light was on and the Attention
>> LED was flashing.
>> Anyway, the good news is that I was able to initialize a tape and mount it.
Cool. So it looks like cleaning the drive did the trick.

>> I will now run a test of the backup I tried last night to see if it is
>> truly OK, or if I get the same outcome as last night.
Yes, and you might want to use the same volume (or set of volumes, if it's a
multi-tape backup) for the test.

>> I would like to thank the contributors for being connected on a Saturday.
You are welcome. This forum is always ON !!

Good luck with your backups.

Regards,
Murali
Let There Be Rock - AC/DC
Shriniketan Bhagwat
Trusted Contributor

Re: Problem with replacement LTO4 tapedrive.

Hi,
>> previously, the initialize was giving a parity error
A parity error indicates there is some problem with the tape. Please check the online help for the message: $ HELP/MESSAGE PARITY. You may want to check whether the same tape gives further parity errors by initializing it several times. If the parity error recurs, then it's time to retire the tape.

Regards,
Ketan
The Brit
Honored Contributor

Re: Problem with replacement LTO4 tapedrive.

Problem not resolved.

Although I was able to initialize and mount the tape cassette, and all seemed well, when I ran my backup test (using the same tape cassette as in the above test) I got the result shown in the attachment.

Notice that after the error occurs, the script jumps to an ERROR handling subroutine which remounts the tape and dismounts it. When the error routine mounts the tape, it no longer shows a label, although the tape was initialized and mounted OK at the start.

I am going to run another test using a different cassette.

A couple of other observations. I ran the LTT utility while the batch job was running and did a scan of MGA5. It said there was no media in the drive. ???

I entered SDA and did a "show proc /id=nnn /chan" and it indicated that MGA5 was in fact Open, and "Busy" even though the Process was showing no CPU use or IO.

SDA> show proc/id=2027CB7E/chan

Process index: 037E Name: BKUP_PHASE1 Extended PID: 2027CB7E
--------------------------------------------------------------------


Process active channels
-----------------------

Channel  CCB       Window    Status  Device/file accessed
-------  ---       ------    ------  --------------------
0010     7FF08000  00000000          DSA10:
0020     7FF08020  8DC81A00          DSA101:[VMS$COMMON.SYSEXE]VMOUNT.EXE;1
0030     7FF08040  8A9A6E40          DSA101:[VMS$COMMON.SYSLIB]LIBOTS.EXE;1 (section file)
0040     7FF08060  8A9A6DC0          DSA101:[VMS$COMMON.SYSLIB]LIBRTL.EXE;1 (section file)
0050     7FF08080  8A9B6280          DSA101:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0060     7FF080A0  8A9A6C40          DSA101:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;140 (section file)
0070     7FF080C0  8B488780          DSA10:[TESSCO.LOG_FILES.BACKUP]BKUP_PHASE1.LOG;556
0080     7FF080E0  8D6E3EC0          DSA10:[TESSCO.EON_COM_FILES]BKUP_PHASE1.COM;39
0090     7FF08100  8A9AB640          DSA101:[VMS$COMMON.SYSLIB]DECC$SHR.EXE;1 (section file)
00A0     7FF08120  8A9AAEC0          DSA101:[VMS$COMMON.SYSLIB]DPML$SHR.EXE;1 (section file)
00B0     7FF08140  8A9A9740          DSA101:[VMS$COMMON.SYSLIB]CMA$TIS_SHR.EXE;1 (section file)
00C0     7FF08160  8A9A58C0          DSA101:[VMS$COMMON.SYSLIB]MOUNTSHR.EXE;1 (section file)
00D0     7FF08180  00000000  Busy    $2$MGA5:

Total number of open channels : 13.



Bob Blunt
Respected Contributor

Re: Problem with replacement LTO4 tapedrive.

Dave, the usual questions and recommendations:
VMS Version
Connection method (noted that it's a SAN-cnx drive)
Patches?

I presume that you swapped the drive and then performed the other steps required to bring it online. The WWID should have changed when the drive was replaced, and VMS needs you to update the device structures for the new drive to work properly. Power cycling doesn't usually reset the device connection in the operating system.

bob
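
For reference, a minimal sketch of the WWID update described above, assuming the drive is Fibre Channel attached, keeps the device name $2$MGA5: (taken from the SDA display earlier in the thread), and that the OpenVMS version in use provides the SYSMAN IO REPLACE_WWID command. None of this is confirmed in the thread, so check the SYSMAN documentation for your version before running it.

```dcl
$ ! Assumption: the replacement drive sits at the same path as the old one
$ ! and should keep the existing device name $2$MGA5:
$ MCR SYSMAN
SYSMAN> IO REPLACE_WWID $2$MGA5
SYSMAN> EXIT
$ ! Confirm the device now shows as online
$ SHOW DEVICE/FULL $2$MGA5:
```

In a cluster, the same update would probably be needed on every node that accesses the drive, for example by using SET ENVIRONMENT/CLUSTER inside SYSMAN first.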