Operating System - OpenVMS

Korendyk
Advisor

Directory listing of large files on large tapes

I've been creating backup savesets on tape, then mounting the tape and getting a directory listing to see how much of the tape is in use. As disk and tape capacities have grown, I've discovered a problem: directory listings fail to show the correct size of very large backup savesets.

For example, an image backup of a rather full 72GB drive results in a 65GB saveset, but a Directory listing shows the file to be 6 blocks in size. I should note the saveset is retrievable; the file is there, but the directory listing is wrong. The sizes of smaller savesets, such as those made before the disk got this full, are shown correctly.

Haven't found any topic/patch that might shed light on this, so I thought I'd post this query. Any solutions or alternatives would be appreciated.

Oh yes, OpenVMS V7.3-2.

\bill
13 REPLIES
Hoff
Honored Contributor

Re: Directory listing of large files on large tapes

Korendyk
Advisor

Re: Directory listing of large files on large tapes

Thanks for the pointers Hoff. Not sure why none appeared during my original search. While awaiting a response to my post, I posited the problem might be a limitation in the tape labels and had confirmed it by dumping a few headers. The pointers showed I wasn't alone.
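
As a rough illustration of that label limit (a sketch, not taken from the dumped headers): ANSI-labelled tapes record a file's block count in a fixed-width decimal field. The code below assumes that field is six digits wide and that the saveset was written with 32256-byte blocks. Both values are assumptions for illustration; the thread does not state the block size actually used. Under those assumptions, a 65GB saveset needs more blocks than the field can represent.

```python
# Sketch: does a saveset's block count fit in a fixed-width label field?
# ASSUMPTIONS (not from the thread): a six-digit decimal block-count field
# and 32256-byte BACKUP blocks.
LABEL_FIELD_DIGITS = 6
BLOCK_SIZE_BYTES = 32256


def blocks_needed(saveset_bytes: int, block_size: int = BLOCK_SIZE_BYTES) -> int:
    """Number of tape blocks needed to hold the saveset, rounding up."""
    return -(-saveset_bytes // block_size)


def fits_in_label(block_count: int, digits: int = LABEL_FIELD_DIGITS) -> bool:
    """True if the count can be written in a decimal field of the given width."""
    return block_count <= 10 ** digits - 1


saveset_bytes = 65 * 10**9          # the 65GB saveset from the original post
count = blocks_needed(saveset_bytes)
print(count, fits_in_label(count))
```

The sketch only shows that the count overflows the field; it makes no claim about what value ends up in the label or why the listing shows 6 blocks.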

The pointers also confirm there's not going to be an easy resolution; unlikely tape labelling will change after all this time. Since this arose from long-standing procedures, I'm still left with trying to find a way of getting the usage on a tape. I'd welcome any solutions or alternatives anyone might have...

\bill
Hoff
Honored Contributor

Re: Directory listing of large files on large tapes

Short of querying the device, there's not a good way to get the remaining storage on a tape device when data compression is in play; compression efficiency trumps source file size.

Here's the LTO version of the discussion:

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1249719

And for TZ8x:

http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1143795

There's the "How do I check for free space on a (BACKUP) tape?" section in the FAQ; that covers older tape technologies. And there's the article "Tape Tips: Compression, Remaining Capacity Estimates" at:

http://64.223.189.234/node/965

I picked a DLT 8000 drive out of a dumpster not too long ago, more so I could get at older media. Might a DLT drive upgrade trump the capacity discussion? Or a DAS or NAS array with a couple of terabyte SATA spindles?

Korendyk
Advisor

Re: Directory listing of large files on large tapes

Thanks Hoff, for the additional pointers; adds to the discussion, though perhaps not to the solution. A couple of additional comments before closing...

I agree that compression makes it impossible to determine true capacity of a cartridge. I think it is worth noting that I'm interested in finding the actual "usage" rather than the "remaining storage". The (automated) procedures then use a theoretical capacity to decide whether to risk adding a saveset or moving on to another volume. A semantic distinction to help clarify the methodology ;-)
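
The decision described above could be sketched roughly as follows. This is a minimal illustration, not the actual procedure: the function name, the 10% safety margin, and the example figures are all hypothetical, and the capacity is the cartridge's nominal (theoretical) value rather than anything reported by the drive.

```python
# Sketch: decide whether to risk appending a saveset to the current volume.
# ASSUMPTIONS: the name, the margin and the example sizes are hypothetical.
GB = 10**9


def should_append(used_bytes: int, next_saveset_bytes: int,
                  theoretical_capacity_bytes: int,
                  margin_percent: int = 10) -> bool:
    """True if the estimated usage plus the next saveset stays inside the
    theoretical capacity, less a safety margin for bad spots and guesswork."""
    usable = theoretical_capacity_bytes * (100 - margin_percent) // 100
    return used_bytes + next_saveset_bytes <= usable


# Example: an 800GB cartridge and a 65GB saveset waiting to be written.
print(should_append(650 * GB, 65 * GB, 800 * GB))
print(should_append(700 * GB, 65 * GB, 800 * GB))
```

Integer arithmetic keeps the comparison exact; the margin is the knob that trades wasted tape against the risk of running off the end of the volume.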

(Of course the saveset retention periods, tape retention periods, frequency of tape use all weigh in on how the procedure makes its decisions. A bit of science, a bit of common sense, but apparently a lost art considering how often I see what happens out there.)

Yes, a "DLT upgrade" would trump the capacity discussion. However, in this particular instance, the problem became evident after a move from DLT to LTO. With DLT, the procedures never checked tape usage since they knew from the theoretical capacity of the tape that the large saveset required an "empty" tape. Of course, going back is out of the question since the LTO does in one hour what used to take ten hours with the DLT. ;-)

And lastly, the backup is already disk to disk to tape. In future the backup disks may (should) become high capacity hot-pluggable SATA or SAS. But real world considerations make tape an easy and effective way to extend the retention periods of the backups.

Resolution? I'll just have to experiment a bit to come up with a new way to guess at the tape usage.

\bill
Hoff
Honored Contributor

Re: Directory listing of large files on large tapes

[[[I agree that compression makes it impossible to determine true capacity of a cartridge. I think it is worth noting that I'm interested in finding the actual "usage" rather than the "remaining storage".]]]

AFAIK, OpenVMS, in the form of BACKUP and the magtape drivers, has no clue about this remaining storage. BACKUP (or COPY for that matter) tosses blocks until it sees a response from the drive that causes it to alter its behavior (e.g., one of various errors, or EOT). AFAIK, there's no "reservation" mechanism with tapes; with disks and CDs and such, you can reserve space. But tapes can see bad spots and differences in compression efficiencies; factors which make estimates more interesting.

[[[The (automated) procedures then use a theoretical capacity to decide whether to risk adding a saveset or moving on to another volume. A semantic distinction to help clarify the methodology ;-) ]]]

Ayup. I used that strategy eons ago. More recently, I let BACKUP roll over onto the next volume, or I use a one saveset per cartridge scheme, or use a library or loader. Or I punt, and catch the overflow through exception means; the operation blows up (eg: due to a lack of tape operators), and I deal with it in the morning.

More commonly, I watch the input size, too. But keeping up with the particular compression efficiency of the input data always made my head hurt.

If you want to pursue this, you have to ask the drive what's left; the mileage.

[[[(Of course the saveset retention periods, tape retention periods, frequency of tape use all weigh in on how the procedure makes its decisions. A bit of science, a bit of common sense, but apparently a lost art considering how often I see what happens out there.)]]]

With tapes? Science? Art? You forgot to include "magic" and "luck" in that list. But then, maybe my own head needs alignment. :-)

Volker Halle
Honored Contributor

Re: Directory listing of large files on large tapes

Bill,

If you're just interested in the 'actual usage' of the tape, you could use the 'Object count' (UCB$L_RECORD) in the tape device UCB.

Capture this value after the last backup-to-tape operation has finished and before dismounting the tape:

$ ANAL/SYS
SDA> SHOW DEV MKAx:
...
Look for Object count nnnnnn

If you know the BACKUP saveset block size, you can multiply it by the Object count value to get the approximate number of bytes on your tape.

This field in the UCB tracks the position of the tape, i.e. the blocks written. VMS would use it to re-position an ANSI-mounted tape during mount verification if something happens that causes the tape to rewind.
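
The arithmetic above, as a small sketch. The object count and block size used here are made-up example values, not figures from a real tape. The result is only approximate, since any records on the tape that are shorter than a full data block are counted at the full block size.

```python
# Sketch: estimate bytes written from the UCB 'Object count' and the
# BACKUP block size. ASSUMPTION: both input values below are hypothetical.
def approx_bytes_on_tape(object_count: int, block_size_bytes: int) -> int:
    """Upper-bound style estimate: every counted object is treated as one
    full-size saveset block."""
    return object_count * block_size_bytes


object_count = 2_015_200      # hypothetical value read from SDA
block_size = 32256            # hypothetical BACKUP /BLOCK_SIZE value
estimate = approx_bytes_on_tape(object_count, block_size)
print(f"about {estimate / 10**9:.1f} GB written")
```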

Volker.
Korendyk
Advisor

Re: Directory listing of large files on large tapes

Volker,

Interesting. I will keep this in mind. Unfortunately, with the established procedures, the process is unlikely to have sufficient privileges to invoke SDA. Nor would I want it to have those privileges. It is OpenVMS after all: you don't want to give an account any more privileges than it needs ;-)

\bill
Jan van den Ende
Honored Contributor

Re: Directory listing of large files on large tapes

Bill.

>>>
Unfortunately, with the established procedures, the process is unlikely to have sufficient privileges to invoke SDA.
<<<

No need. And better not. You watch SDA DURING the activity. And for that, ANY suitably priv'd account will do, with NO relation to the account using the tape drive.

hth

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Korendyk
Advisor

Re: Directory listing of large files on large tapes

From the discussion and links, I summarize as follows:

It is extremely difficult to determine tape usage, especially for an arbitrary tape (i.e. one about which the current process has no knowledge). A Directory listing of a Backup tape provides a ready list of savesets, but the sizes of larger savesets are not displayed properly. So, to estimate tape usage, there are two options:

1. Scan the tape to measure usage
2. When writing to the tape, remember sizes

With either option, one compares the usage with the theoretical capacity to "guess" how much capacity remains on the tape. And I do mean "guess".

The first option is too expensive. The Directory approach was fast and easy, but any other method of scanning the tape uses significant resources (IO, CPU, etc.) and takes a long time.

The second option requires that the procedures that create the savesets remember, and make available for later use, the sizes of what is written to the tape. This is likely the approach I will take.
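
One way the second option might look, as a minimal sketch only: a small ledger keyed by volume label that records each saveset's size as it is written. The ledger layout, function names, volume label, and saveset names are all hypothetical. The recorded sizes are the uncompressed input sizes, so with hardware compression the real usage on tape will differ.

```python
import json
from pathlib import Path

# Hypothetical ledger layout: {volume_label: {saveset_name: size_in_bytes}}


def add_saveset(ledger: dict, volume: str, saveset: str, size_bytes: int) -> None:
    """Record the size of a saveset just written to the given volume."""
    ledger.setdefault(volume, {})[saveset] = size_bytes


def volume_usage(ledger: dict, volume: str) -> int:
    """Total bytes recorded for a volume (0 if the volume is unknown)."""
    return sum(ledger.get(volume, {}).values())


def save_ledger(ledger: dict, path: str) -> None:
    """Persist the ledger so that later runs can consult it."""
    Path(path).write_text(json.dumps(ledger, indent=2))


def load_ledger(path: str) -> dict:
    """Load the ledger, or start an empty one if the file does not exist."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


ledger = {}
add_saveset(ledger, "TAPE01", "DISK1_IMAGE.BCK", 65 * 10**9)
add_saveset(ledger, "TAPE01", "DISK2_IMAGE.BCK", 12 * 10**9)
print(volume_usage(ledger, "TAPE01"))
```

The usage figure from the ledger can then be compared against the theoretical capacity of the cartridge, exactly as the Directory-based figure was before.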

Fortunately I don't need to rush on this; it is amazing how much "elbow room" one has when moving from 80GB cartridges to 800GB cartridges...

I will leave the thread open for a couple of days in case anyone wants to leave some parting thoughts or suggestions. Many thanks for the help and the discussion.

\bill
AEFAEF
Advisor

Re: Directory listing of large files on large tapes

IIRC: DIRECTORY is of no help if hardware compression was used. It gives the same size for a given save set whether or not it was compressed (or compacted, whatever). So, the compression is transparent. This makes sense: The tape drive decompresses during read and gives the result to the process. So DIRECTORY is looking at the uncompressed sizes.

I'm curious, though, as to what happens to the resulting block structure on the tape. Are the compressed blocks all separate (and of varying size, of course) or do they span across post-compression fixed-size tape-drive blocks?

You'd still want to keep the block size .LE. 32256 in case you want to copy the save set to a disk.

AEFAEF
GuentherF
Trusted Contributor

Re: Directory listing of large files on large tapes

All modern tape drives use a fixed block size which has nothing to do with the block size a user/program specifies. Within each block there is metadata, and the block is filled with records. A tape mark, for example, is a record with a type of "tape mark" and zero-length data. Most drives keep a directory of where each internal block is on tape and allow quasi-random access to these blocks. So a compressed user block can span more than one on-tape block. Some SCSI tape drives can return an estimate of bytes left on a cartridge. But the returned number is for uncompressed data.

Only encrypting and compressing a file to disk first and then copying to tape gives you reliable numbers.

Unfortunately OpenVMS BACKUP does not compress (undocumented feature) if encryption is used at the same time.

/Guenther
Hoff
Honored Contributor

Re: Directory listing of large files on large tapes

> Only encrypting and compressing a file to disk first an then copying to tape gives you reliable numbers.

Encrypted data doesn't compress.

Always compress before you encrypt.
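
A quick way to see the effect, as a sketch under stated assumptions: zlib stands in for the tape drive's hardware compression, and random bytes stand in for encrypted data (good ciphertext looks statistically random). Neither is what BACKUP or a real drive actually uses; the point is only the difference in ratio.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive, text-like data: a stand-in for typical unencrypted files.
text_like = b"OpenVMS BACKUP saveset record\n" * 4000

# Random bytes: a stand-in (assumption) for the output of an encryption step.
random_like = os.urandom(len(text_like))

print(f"text-like:   {compression_ratio(text_like):.3f}")
print(f"random-like: {compression_ratio(random_like):.3f}")
```

The text-like data shrinks to a small fraction of its size, while the random data does not shrink at all, which is why compression has to come before encryption if it is to do any good.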
Korendyk
Advisor

Re: Directory listing of large files on large tapes

Thanks to all for your comments; the discussion was enlightening. I was able to easily adapt the current procedures to remember saveset sizes, so the procedures and processes used over the last 15 years remain essentially the same: guess at how much of the tape's capacity is being used and decide on whether to move on to the next tape. Older tapes are still handled properly, and so everyone is happy.

\bill