Operating System - OpenVMS
Directory listing of large files on large tapes

Advisor

I've been creating backup savesets on tape, then mounting the tape and getting a directory listing to see how much of the tape was being used. As disk and tape capacities have grown, I've discovered a problem: directory listings fail to show the size of very large backup savesets.

For example, an image backup of a rather full 72GB drive results in a 65GB saveset, but a Directory listing shows the file to be 6 blocks in size. I should note the saveset is retrievable; the file is there, but the directory listing is wrong. The sizes of smaller savesets, such as those made before the disk got this full, are shown correctly.

Haven't found any topic/patch that might shed light on this, so I thought I'd post this query. Any solutions or alternatives would be appreciated.

Oh yes, OpenVMS V7.3-2.

\bill
13 REPLIES
Honored Contributor

Re: Directory listing of large files on large tapes

Advisor

Re: Directory listing of large files on large tapes

Thanks for the pointers Hoff. Not sure why none appeared during my original search. While awaiting a response to my post, I posited the problem might be a limitation in the tape labels and had confirmed it by dumping a few headers. The pointers showed I wasn't alone.

The pointers also confirm there's not going to be an easy resolution; it is unlikely tape labelling will change after all this time. Since this arose from long-standing procedures, I'm still left trying to find a way of getting the usage on a tape. I'd welcome any solutions or alternatives anyone might have...

\bill
Honored Contributor

Re: Directory listing of large files on large tapes

Short of querying the device, there's not a good way to get the remaining storage on a tape device when data compression is in play; compression efficiency trumps source file size.

Here's the LTO version of the discussion:

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1249719

And for TZ8x:

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1143795

There's the "How do I check for free space on a (BACKUP) tape?" section in the FAQ; that covers older tape technologies. And there's the article "Tape Tips: Compression, Remaining Capacity Estimates" at:

http://64.223.189.234/node/965

I picked a DLT 8000 drive out of a dumpster not too long ago, more so I could get at older media. Might a DLT drive upgrade trump the capacity discussion? Or a DAS or NAS array with a couple of terabyte SATA spindles?

Advisor

Re: Directory listing of large files on large tapes

Thanks Hoff, for the additional pointers; adds to the discussion, though perhaps not to the solution. A couple of additional comments before closing...

I agree that compression makes it impossible to determine true capacity of a cartridge. I think it is worth noting that I'm interested in finding the actual "usage" rather than the "remaining storage". The (automated) procedures then use a theoretical capacity to decide whether to risk adding a saveset or moving on to another volume. A semantic distinction to help clarify the methodology ;-)

(Of course the saveset retention periods, tape retention periods, frequency of tape use all weigh in on how the procedure makes its decisions. A bit of science, a bit of common sense, but apparently a lost art considering how often I see what happens out there.)

Yes, a "DLT upgrade" would trump the capacity discussion. However, in this particular instance, the problem became evident after a move from DLT to LTO. With DLT, the procedures never checked tape usage since they knew from the theoretical capacity of the tape that the large saveset required an "empty" tape. Of course, going back is out of the question since the LTO does in one hour what used to take ten hours with the DLT. ;-)

And lastly, the backup is already disk to disk to tape. In future the backup disks may (should) become high capacity hot-pluggable SATA or SAS. But real world considerations make tape an easy and effective way to extend the retention periods of the backups.

Resolution? I'll just have to experiment a bit to come up with a new way to guess at the tape usage.

\bill
Honored Contributor

Re: Directory listing of large files on large tapes

[[[I agree that compression makes it impossible to determine true capacity of a cartridge. I think it is worth noting that I'm interested in finding the actual "usage" rather than the "remaining storage".]]]

AFAIK, OpenVMS in the form of BACKUP and the magtape drivers have no clue about this remaining storage. BACKUP (or COPY for that matter) tosses blocks until it sees a response from the drive that causes it to alter its behavior. (eg: one of various errors, or EOT) AFAIK, there's no "reservation" mechanism with tapes; with disks and CDs and such, you can. But tapes can see bad spots and differences in compression efficiencies; factors which make estimates more interesting.

[[[The (automated) procedures then use a theoretical capacity to decide whether to risk adding a saveset or moving on to another volume. A semantic distinction to help clarify the methodology ;-) ]]]

Ayup. I used that strategy eons ago. More recently, I let BACKUP roll over onto the next volume, or I use a one saveset per cartridge scheme, or use a library or loader. Or I punt, and catch the overflow through exception means; the operation blows up (eg: due to a lack of tape operators), and I deal with it in the morning.

More commonly, I watch the input size, too. But keeping up with the particular compression efficiency of the input data always made my head hurt.

If you want to pursue this, you have to ask the drive what's left; the mileage.

[[[(Of course the saveset retention periods, tape retention periods, frequency of tape use all weigh in on how the procedure makes its decisions. A bit of science, a bit of common sense, but apparently a lost art considering how often I see what happens out there.)]]]

With tapes? Science? Art? You forgot to include "magic" and "luck" in that list. But then, maybe my own head needs alignment. :-)

Honored Contributor

Re: Directory listing of large files on large tapes

Bill,

when you're just interested in the 'actual usage' of the tape, you could use the 'Object count' (UCB$L_RECORD) in the tape device UCB.

Capture this value after the last backup to tape operation has finished and before dismounting the tape:

$ ANAL/SYS
SDA> SHOW DEV MKAx:
...
Look for Object count nnnnnn

If you know the BACKUP saveset block size, you can multiply it by the Object count value to get the approximate number of bytes on your tape.
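As a rough sketch of that arithmetic (the function names and the sample numbers are made up for illustration; the Object count would be read from the SDA display by hand):

```python
# Approximate tape usage from the UCB "Object count" and the BACKUP
# block size. The sample values are illustrative, not measured.

def estimate_tape_bytes(object_count: int, block_size: int) -> int:
    """Approximate bytes written: blocks on tape times bytes per block."""
    if object_count < 0 or block_size <= 0:
        raise ValueError("object_count must be >= 0 and block_size > 0")
    return object_count * block_size


def as_gb(size_bytes: int) -> float:
    """Convert bytes to decimal gigabytes for display."""
    return size_bytes / 10**9


used = estimate_tape_bytes(object_count=2_015_130, block_size=32256)
print(f"approx. {as_gb(used):.1f} GB written")
```

The result is only an estimate of the data sent to the drive; with compression enabled it says nothing certain about how much physical tape was consumed.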

This field in the UCB counts the position of the tape, i.e. the blocks written. It would be used by VMS to re-position an ANSI-mounted tape during mount verification if something happens that causes the tape to rewind.

Volker.
Advisor

Re: Directory listing of large files on large tapes

Volker,

Interesting. I will keep this in mind. Unfortunately, with the established procedures, the process is unlikely to have sufficient privileges to invoke SDA. Nor would I want it to have those privileges. It is OpenVMS after all: you don't want to give an account any more privileges than it needs ;-)

\bill
Honored Contributor

Re: Directory listing of large files on large tapes

Bill.

>>>
Unfortunately, with the established procedures, the process is unlikely to have sufficient privileges to invoke SDA.
<<<

No need. And better not. You watch SDA DURING the activity. And for that, ANY suitably priv'd account will do, with NO relation to the account using the tape drive.

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Advisor

Re: Directory listing of large files on large tapes

From the discussion and links, I summarize as follows:

It is extremely difficult to determine tape usage, especially for an arbitrary tape (i.e. one of which the current process has no knowledge). A Directory listing of a Backup tape provides a ready list of savesets, but with larger savesets, sizes are not displayed properly. So, to estimate tape usage, there are two options:

1. Scan the tape to measure usage
2. When writing to the tape, remember sizes

With either option, one compares the usage with the theoretical capacity to "guess" how much capacity remains on the tape. And I do mean "guess".

The first option is too expensive. The Directory approach was fast and easy, but any other method to scan the tape uses significant resources (IO, CPU, etc.) and takes a long time.

The second option requires that the procedures that create the savesets remember, and make available for later use, the sizes of what is written to the tape. This is likely the approach I will take.
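A minimal sketch of what such bookkeeping might look like, written in Python rather than DCL purely for illustration. The JSON ledger file, the function names, and the 90% safety margin are all assumptions, not part of the existing procedures:

```python
import json
from pathlib import Path

# Hypothetical ledger: {volume label: {saveset name: size in bytes}}.


def record_saveset(ledger_path, volume, saveset, size_bytes):
    """Add one saveset's size to the ledger file and return the ledger."""
    path = Path(ledger_path)
    ledger = json.loads(path.read_text()) if path.exists() else {}
    ledger.setdefault(volume, {})[saveset] = size_bytes
    path.write_text(json.dumps(ledger, indent=2))
    return ledger


def volume_usage(ledger, volume):
    """Total bytes recorded as written to the given volume."""
    return sum(ledger.get(volume, {}).values())


def should_append(ledger, volume, next_size, capacity, margin=0.9):
    """Guess whether next_size still fits, keeping a safety margin
    below the theoretical capacity (the margin is an assumed value)."""
    return volume_usage(ledger, volume) + next_size <= capacity * margin
```

For example, with a theoretical capacity of 800 * 10**9 bytes and one 65GB saveset already recorded for a volume, `should_append` would allow a second 65GB saveset but refuse a 700GB one. The answer is still a guess, since compression and bad spots change what the cartridge really holds.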

Fortunately I don't need to rush on this; it is amazing how much "elbow room" one has when moving from 80GB cartridges to 800GB cartridges...

I will leave the thread open for a couple of days in case anyone wants to leave some parting thoughts or suggestions. Many thanks for the help and the discussion.

\bill