Operating System - OpenVMS
Calculating block & file size of files in multiple savesets
06-06-2007 02:12 AM
I am trying to let the user know when they have selected too many savesets.
I need the total block and file size of the selected savesets so I can compare it to the free blocks on the hard drive.
If the total number of blocks required to expand the multiple savesets exceeds the available blocks on the hard drive, I will alert the user to de-select savesets until the total fits in the space left on the drive.
06-06-2007 02:22 AM
Re: Calculating block & file size of files in multiple savesets
No, because some data compresses a lot and other data hardly at all. You can't say "I have 100 MB compressed, so I will always get 136.3 MB of uncompressed data."
Today, disk space is cheap. Maybe the simplest solution is simply to keep a large amount of free space.
06-06-2007 02:34 AM
Re: Calculating block & file size of files in multiple savesets
You can certainly guess (e.g., 2.0x to 2.5x is an oft-quoted ratio, or you could determine this ratio empirically for your data), but there's no way I'm aware of to know what the input volume was, unless you have some control over the process and record this information yourself. That could be via a metadata file stored on the media in parallel with the compressed saveset, via a metadata file inserted into the saveset, or via tagging the saveset with a site-specific ACE containing this data (this latter case assuming no use of /interchange).
Or you could wing it, restore the saveset, and catch the error when it arises. In most cases, I'd tend to code to catch the allocation error regardless, as -- barring a completely quiescent target disk -- a parallel allocation can arise during the restoration and derail your restore. Even with a careful and correct check, the restoration can still fail.
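A minimal sketch of what such error-catching could look like in a DCL command procedure (the saveset and target specifications are illustrative, not from the original reply):

$! Hedged sketch: attempt the restore and trap any failure, e.g. running
$! out of space mid-restore. Saveset and target specs are illustrative.
$ on warning then goto restore_failed
$ backup DKA100:[SAVESETS]MYDATA.BCK/save_set DKA200:[RESTORE...]
$ write sys$output "Restore completed."
$ exit
$restore_failed:
$ write sys$output "Restore failed -- possibly insufficient free space."
$ write sys$output "Free up space or de-select savesets, then retry."
$ exit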
There is a semi-related discussion in the OpenVMS FAQ around estimating the available capacity of output media when generating a compressed saveset via BACKUP compression or drive-level compression: http://64.223.189.234/node/1 -- that is the logical reverse of your question, but it too comes down to determining compression efficiency.
Stephen Hoffman
HoffmanLabs LLC
06-06-2007 02:40 AM
Re: Calculating block & file size of files in multiple savesets
zip, gzip, bzip2, any other ?
06-06-2007 02:43 AM
Re: Calculating block & file size of files in multiple savesets
I used the backup command for OpenVMS 7.3-2.
06-06-2007 02:57 AM
Re: Calculating block & file size of files in multiple savesets
The tape drive can compress data, and that data can include BACKUP savesets; the same difficulty in estimating compression efficiency exists there.
The BACKUP compression command is BACKUP /DATA_FORMAT=COMPRESSED -- AFAIK this is latent and undocumented, and not (yet?) supported. It's been discussed in various forums around the net.
There's a write-up on BACKUP, compression, and encryption I/O throughput here: http://64.223.189.234/node/85
That write-up doesn't cover compression efficiency, though.
06-06-2007 03:32 AM
Re: Calculating block & file size of files in multiple savesets
> zip, gzip, bzip2, any other ?
> I used the backup command for OpenVMS 7.3-2

So, how is that _compressed_? Or, by "compressed", did you just mean "collected" (into a BACKUP save set)?

Not that it's likely to matter here, but note that a Zip archive includes uncompressed and compressed size data for the files it contains, and zipinfo (unzip -Z) can reveal such information without actually unpacking the archive.

And the size of an (uncompressed) BACKUP save set is at least a fair guide to the size of the data therein.
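Not applicable to a plain BACKUP saveset, but for completeness, the zipinfo query mentioned above might look something like this on VMS (the archive name is illustrative; the quotes preserve the case of the options under DCL):

$ unzip "-Z" "-t" MYDATA.ZIP
$! Prints a one-line summary of the form:
$! "nnn files, xxx bytes uncompressed, yyy bytes compressed: zz.z%"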
06-06-2007 04:25 AM
Re: Calculating block & file size of files in multiple savesets
Did you make a saveset of savesets? If so,
$ back/list yoursaveset.bck/save
will give you the total blocks used.
06-06-2007 05:58 AM
Re: Calculating block & file size of files in multiple savesets
I want to let the end user know when they have selected too many savesets.
06-06-2007 06:01 AM
Re: Calculating block & file size of files in multiple savesets
"$ back/list yoursaveset.bck/save
will give you the total blocks used."
I am looking for the uncompressed block size.
06-06-2007 06:25 AM
Re: Calculating block & file size of files in multiple savesets
If you just used plain BACKUP, then it's not compressed. It's around 10%, plus or minus, larger than the blocks listed with a BACKUP/LIST.
I guess we are not sure how you compressed the savesets -- with BACKUP, which won't do it?
06-06-2007 06:32 AM
Solution
Your answer is much closer than you think!
>>>"$ back/list yoursaveset.bck/save
will give you the total blocks used."
I am looking for the uncompressed block size.
<<<
What that will give you is NOT the number of blocks in the saveset, but the blocks READ in creating the saveset, i.e., the number of blocks you will get upon restore.
(Well, DO allow for rounding up to the cluster size. So, add approximately 1/2 * (number of files in the saveset) * target-volume cluster size.)
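As a rough illustration (not from the original post; the numbers and symbol names are made up), the estimate could be computed in DCL along these lines:

$! Hedged sketch of the estimate described above; all values are illustrative.
$ blocks_listed = 567890    ! total blocks reported by BACKUP/LIST
$ file_count    = 1234      ! number of files in the saveset
$ cluster_size  = 16        ! cluster size of the target volume
$ estimate = blocks_listed + (file_count * cluster_size) / 2
$ write sys$output "Estimated blocks needed on restore: ''estimate'"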
hth
Proost.
Have one on me.
jpe
06-06-2007 06:57 AM
Re: Calculating block & file size of files in multiple savesets
>>>"$ back/list yoursaveset.bck/save
will give you the total blocks used."
<<<
Based on the back/list command above, it could take a while for the listing of an entire saveset to complete. This is especially true in my case, where one saveset can contain as many as 300,000 to 500,000 files.
Is there a quick way to extract the line that contains the number of blocks and files for the entire saveset?
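One hedged way to pull out just that summary line, assuming the listing ends with a line of the form "Total of n files, m blocks" (the file names here are illustrative). Note that this still reads through the whole saveset; it only avoids scrolling the full listing:

$! Hedged sketch: capture only the trailing "Total of ..." line of BACKUP/LIST.
$ pipe backup/list DKA100:[SAVESETS]MYDATA.BCK/save_set | search sys$pipe "Total of" /output=TOTALS.TMP
$ open/read tmp TOTALS.TMP
$ read tmp summary_line
$ close tmp
$ delete TOTALS.TMP;*
$! summary_line now holds e.g. "Total of 1234 files, 567890 blocks"
$ file_count    = f$integer(f$element(2, " ", f$edit(summary_line, "COMPRESS,TRIM")))
$ blocks_listed = f$integer(f$element(4, " ", f$edit(summary_line, "COMPRESS,TRIM")))
$! These can feed the estimate: blocks_listed + (file_count * cluster_size) / 2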
>>> So, add approx 1/2 * (number-of-files-in saveset) * targetvolume clustersize. <<<
Finally, how do I determine the target volume cluster size?
06-06-2007 07:07 AM
Re: Calculating block & file size of files in multiple savesets
$ cluster_size = f$getdvi(devnam, "CLUSTER")
where devnam is the name of the mounted target device
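For example (the device name here is an assumption; any mounted disk, or a logical name pointing at one, would do):

$ devnam = "DKA200:"
$ cluster_size = f$getdvi(devnam, "CLUSTER")
$ free_blocks  = f$getdvi(devnam, "FREEBLOCKS")
$ write sys$output "Cluster size: ''cluster_size', free blocks: ''free_blocks'"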
-- Rob
06-06-2007 07:13 AM
Re: Calculating block & file size of files in multiple savesets
The problem as posed has several hazards:
- There may be files that were marked NOBACKUP in the saveset (and thus not saved) that WILL occupy space when restored
- If the saveset is stored on a sequential device (tape or simulated tape), then there is no way to determine the length of the saveset without reading through the entire set
- The "breakage" factor relating to the disk cluster size and the BACKUP record size
There are probably a few cases that I missed in the above.
The bottom line is that without parsing the output of a BACKUP/LIST of the saveset, I doubt that it is possible to come up with a truly reliable number.
It is important to note that hardware compression is below the user's visibility in this case. The NOBACKUP files, by contrast, are effectively an optimization within the saveset itself, and they are a real issue here.
One option is to do the restore to a scratch volume that has far more space available than the normal volumes. The operation can then be staged onto the actual destination.
As usual, the depth of the response is limited by the details of the target environment.
I hope that the above is helpful.
- Bob Gezelter, http://www.rlgsc.com
06-08-2007 07:25 AM
Re: Calculating block & file size of files in multiple savesets
This is the fourth question about what appears to be the same problem.
PKZIP for VMS vs. backup/log http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1114625
Total Number of Files and Blocks inside savesets http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1121642
Need to speed up expansion of very large savesets http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1122470
Calculating block & file size of files in multiple savesets http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1133933
Can you please provide a few more details about the actual problem you are trying to solve? We are answering the specific questions you ask, but the answers don't seem to be solving your real problem.
This appears to be a data transfer issue, not an archival issue.
Reading between the lines, it seems you have a process that creates a great many files (300,000 to 500,000 individual files containing a total of more than 50 GB of data) on an ongoing basis. This data is delivered to another party periodically. The apparent problem is that the "customer" is complaining about how long it takes to get the data into a form that they can process.
Once the customer "unloads" the data to their disk, what do they do with it? After they process it, do they delete it to make room for the next set of data? Specifically, do they process it multiple times, or do they only need to process it once in its raw form? For example, if they are reading the data and loading it into another database, then once they have processed it, they no longer need the original data.
The reason I ask is that if they are processing the data only once, and it is possible for you to change how you collect the data, then you will be able to provide the data in a form that is usable in a very short period of time from the customer's point of view.
If they are only processing the data once, and you don't need a copy of the original data, you can create the data on a removable disk that you deliver to them once the disk is "full". If you had two disks, you could exchange them (double buffering), but if you need to have a drive available for collected data at all times, you will need to get the previous disk back before your primary disk fills, i.e. you may need more than two drives. In a previous thread I suggested the use of an LD container file as the "drive", but you seemed reluctant to use LDDRIVER.
The modified procedure to transfer data would be:
At your site:
1. Prepare collection/transfer disk. (Connect, Initialize, Mount)
2. Store data to disk until disk nearly full.
3. Remove disk, send to data consumer.
4. goto step 1
At customer site:
1. Ready input disk (Connect, Mount)
2. Process data
3. Remove disk, send to data provider.
4. Goto step 1.
The customer can be processing one set of data while you are collecting/generating the next.
Note that in this scenario no backups or restores are done; it is just mount and go. If you do need to keep a copy of the data, you will need to do a backup of the drive before it is sent, or you will need to use HBVS to shadow it to another drive (which can be an LD device), so the data is copied into two places as it is saved.
The disk can be either a removable SCSI disk or an LD container file; the procedure is essentially the same. The key is that you are providing them with a disk that has the files in a usable state, without the need to do an "unload" (i.e. a restore of a backup saveset or an unzip of a zip file).
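As a hedged sketch only (the unit number, size, and file names are illustrative, and this assumes LDDRIVER and its LD command are installed on both systems), one delivery cycle with a container disk might look like:

$! Create, connect, and mount a container "disk" for one delivery cycle.
$ ld create DKA100:[TRANSFER]TRANSFER.DSK /size=100000000   ! ~50 GB in 512-byte blocks
$ ld connect DKA100:[TRANSFER]TRANSFER.DSK LDA1:
$ initialize LDA1: TRANSFER
$ mount/system LDA1: TRANSFER
$! ... write the collected files directly onto LDA1: ...
$ dismount LDA1:
$ ld disconnect LDA1:
$! Ship TRANSFER.DSK to the consumer, who connects and mounts it the same way.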
Answers to your specific questions:
The only way to get an accurate estimate of the output size of an arbitrary saveset is what is reported by BACKUP/LIST. However, that data does not change, so you can create a listing file at the time of the initial backup (just include /list=file in the backup command that creates the saveset). Once the saveset is created, the time-consuming process of listing the contents does not need to be done again. Deliver the listing file along with the backup saveset (assuming you are not going to use my proposed solution, in which case the listing isn't needed).
In your case, as long as you do not have files that are marked /nobackup, and you are not using data compression, and you are creating the backup savesets on disk, and you specify /group=0 (no redundancy), the size of the saveset will be a good approximation of the size of the restored data if restoring to a disk with a cluster size of 1. But you would want to require more space than the size of the saveset, as you would not want to run out of space on a restore. This problem can be avoided by just exchanging drives.
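A hedged example of such a backup command (all file specifications here are illustrative), producing the listing file at save time and disabling redundancy groups:

$! Create the disk saveset with a listing file and /GROUP_SIZE=0 (no redundancy groups).
$ backup/log/group_size=0/list=DKA100:[SAVESETS]MYDATA.LIS -
        DKA300:[DATA...]*.*;* DKA100:[SAVESETS]MYDATA.BCK/save_set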
Jon