Re: Cluster size of large disks

Gerry Downey · ‎08-17-2005

Hello everyone,
I am installing a 36GB disk in an Alpha 800 running VMS 7.1. The minimum cluster size that I can set on the disk is 69; however, I need a maximum cluster size of 51 due to application restrictions.

Is there any way I can set a lower cluster size? Can the disk be "partitioned" to allow it?

Or, if I upgraded VMS, would that allow me to set a lower cluster size?

Thanks,
Gerry

Ian Miller. · ‎08-17-2005

What application restriction prevents the large cluster size?

The change allowing smaller cluster sizes came in VMS V7.2, along with lots of other good stuff, so upgrade to V7.3-2 or V8.2 if you can.

You could use the LD driver to create smaller disks from container files on the 36GB disk.

____________________
Purely Personal Opinion

Aaron Lewis_1 · ‎08-17-2005

Gerry, I don't remember exactly when these 'INIT' switches became available, but there are:

/limit -- defaults to a cluster size of 8 and allows you to dynamically expand the size

/cluster -- allows you to specify a smaller cluster size

/structure=5 -- specifies an ODS5 disk, instead of ODS2. Provides for upper & lower case file names, support for long file names and a few other nifty tricks. Default cluster size is 3

All of these are in VMS 7.3-2, so an upgrade of VMS will get you what you need if 7.1 doesn't support them.

Gerry Downey · ‎08-17-2005

The application is a third party application that uses RMS and I've been told that the cluster size must be 51 or lower.

I tried init/cluster, but it won't let me set it lower than 69.

Looks like a VMS upgrade.

Thanks for the replies,
Gerry

Antoniov. · ‎08-17-2005

Gerry,
I'm afraid you cannot use a value less than 69. If you carefully read the help for /cluster, you will see that the minimum cluster value is:
(disk size in number of blocks)/(255 * 4096)
With a 36GB HD the minimum value is 69.
So you have one of these alternatives:
1) Mount a 36GB HD
2) Upgrade your VMS to 7.3 and use an ODS-2 disk.

Antonio Vigliotti

Antonio Maria Vigliotti

Uwe Zessin · ‎08-17-2005

Earliest release that supports the larger allocation bitmap (but not dynamic volume expansion) was V7.2. Before that, BITMAP.SYS was limited to about 1 million bits (255 blocks * 512 bytes per block * 8 bits per byte).

In case you're curious:
from the formula that Antonio has mentioned:
4096 = 512 bytes per block * 8 bits per byte
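
As a rough illustration of the formula quoted above, here is a minimal Python sketch of the pre-V7.2 lower bound on cluster size. The block count is an assumption (a typical figure for a nominal 36.4 GB drive, not taken from Gerry's system); the real value is whatever SHOW DEVICE/FULL reports as total blocks. The function name is made up for this example.

```python
# Sketch only: the old BITMAP.SYS limit described in this thread.
BITMAP_BLOCKS = 255          # maximum bitmap size in blocks (pre-V7.2, per the thread)
BITS_PER_BLOCK = 512 * 8     # 512 bytes per block * 8 bits per byte = 4096


def min_cluster_size(total_blocks):
    """Smallest cluster size for which one bit per cluster still fits in the bitmap."""
    max_clusters = BITMAP_BLOCKS * BITS_PER_BLOCK      # 1,044,480 clusters at most
    # Integer ceiling division, so no floating-point rounding is involved.
    return max(1, -(-total_blocks // max_clusters))


# ASSUMPTION: hypothetical block count for a nominal 36.4 GB disk.
ASSUMED_TOTAL_BLOCKS = 71_132_960

print(min_cluster_size(ASSUMED_TOTAL_BLOCKS))
```

With that assumed block count the result is 69, which matches the minimum Gerry was offered. A disk of exactly 1,044,480 blocks or fewer would allow a cluster size of 1.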

.

Hein van den Heuvel · ‎08-17-2005

As the others wrote: upgrade. This is desirable anyway, and now you have a good 'excuse'.
Or you could use the LD driver to partition.

>> The application is a third party application that uses RMS and I've been told that the cluster size must be 51 or lower.

Interesting. That 51 happens to be the number used by Vir for a relative file multi-block-count in another recent thread:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=946349

Please question that 51 max!
Where does it come from?
What is the (suspected) problem with anything larger?

From a pure RMS perspective it makes absolutely no sense at all.
The clustersize is (should be :-) entirely transparent to RMS record IO applications.
The only obvious direct effect from cluster size is the round up during allocations... but that is transparent.
It will also change the EDIT/FDL/NOINTERACTIVE tuning rules, but those effects are minor, and can be overruled by hardcoding a selected clustersize in the input analyze data file.

Please do NOT accept this 51 line at face value. Check into this. It may well turn out to be antiquated baggage which should be thrown out. If it is really true, then please explain because we'd love to learn and we may be able to fix or give alternatives.

hth,
Hein.

Ian McKerracher_1 · ‎08-17-2005

Hein seems to be particularly intrigued by this cluster size value of 51. Have a look at this question from last year. It may or may not be of interest.

http://h71000.www7.hp.com/wizard/wiz_9545.html

Ian

Hein van den Heuvel · ‎08-17-2005

Ian wrote "Hein seems to be particularly intrigued by this cluster size value of 51"

Not me! I am a firm believer in nice, big cluster sizes with larger powers of 2 in there. For example: 240, 256, 512, 600, or 720 or some such high number where acceptable (thousands of files, not millions).

But indeed, that wizard question from March 2004 also appears to have been submitted by Gerry! (I peeked behind the curtain).

Hein.

John Gillings · ‎08-17-2005

Gerry,

Well, I'm particularly intrigued by this:

>The application is a third party
>application that uses RMS and I've
>been told that the cluster size must
>be 51 or lower.

I'm wondering HOW anyone could write an application that cares about the cluster size. Short of calling SYS$GETDVI to check, I can't imagine how any application would be aware of cluster size for a particular file, much less care about it!

Yes you should upgrade to at least a supported version of OpenVMS (currently at least V7.3-2), and yes, that would give you much more flexibility in configuring and managing disk volumes, BUT it shouldn't be necessary.

A crucible of informative mistakes

Antoniov. · ‎08-17-2005

Like Hein, I prefer a cluster size that is a power of two. The value 51 is unusual, strange.

Antonio Vigliotti

Antonio Maria Vigliotti

Gerry Downey · ‎08-17-2005

Ian, I had a look at the link you provided. From reading the documentation with VMS 7.3-2, I see that I can set the cluster size to whatever I want and then do a BACKUP/IMAGE/NOINIT to preserve the cluster size. So I definitely have to upgrade.

I seem to have you all intrigued as to why the cluster size is important. I'll try to find out why and let you know.

Robert Gezelter · ‎08-17-2005

Gerry,

As everybody else has noted, the cluster size is, when all is said and done, likely irrelevant.

As to the genesis of the myth, I would speculate two things:

- the selection of the cluster size at some point may be linked with ancient disk geometry. At some point in the past, someone may have decided that it seemed a good idea to ensure that clusters aligned on disk boundaries (track, cylinder). Today, most people ignore that fact. This was not so in the past. Many other operating systems (e.g., IBM's OS/360 and CDC NOS) did in fact encourage this type of thinking. Provided it did not cause excessive "breakage", it isn't a bad idea. Breakage can be impressive; I have seen 20-30% values at times, which is a strong argument AGAINST large cluster sizes.

- somewhere along the line, someone made the presumption that cluster sizes related to RMS bucket sizes.

In short, before migrating disks or similar action, track down the source of this myth, and set it right.

If the applications firm does not give a solid explanation, consider calling in an HP or solid third party consultant to take a look at the question. It is not uncommon to find such a thing is an artifact of the testing configuration. A small review and some conference calls can literally save tens of thousands of dollars of budget spent on unneeded system management exercises (Yes, to be honest, I have resolved several of these types of situations in the past, but I digress).

- Bob Gezelter, http://www.rlgsc.com

Robert Gezelter · ‎08-17-2005

Gerry,

As to a short term solution, I would recommend a good look at the use of LD.

- Bob Gezelter, http://www.rlgsc.com

Uwe Zessin · ‎08-17-2005

Today, you cannot really 'align on cylinder boundaries', because modern disks don't have a fixed geometry inside. The number of sectors is larger on the outer tracks. Try telling that to an OS. And most of the time a disk drive is behind a RAID controller, anyway.

.

Antoniov. · ‎08-18-2005

Gerry,
I'm happy you are upgrading your OS version. I'm also intrigued by this choice; usually system administrators don't like to upgrade their machines, to avoid compatibility trouble. So the reason for keeping the cluster size at 51 must be very, very strong!
If you need help with the upgrade, post: we are here!

Antonio Vigliotti

Antonio Maria Vigliotti

Willem Grooters · ‎08-18-2005

>The application is a third party
>application that uses RMS and I've
>been told that the cluster size must
>be 51 or lower.

There can be good reasons!
What is meant by "RMS" in this case? It may mean "Record Management Services" - the VMS meaning - but who knows, perhaps it has something to do with memory: "Reusable Memory Storage", perhaps? Just a guess. In that case, it may well be feasible to limit the clustersize on disk, where it is possible to read whole clusters of data directly into memory, bypassing VMS's RMS. A maximum clustersize of 51 means a 25.5K buffer.

Just a question on this: does the application require LOG_IO or PHY_IO privilege? That would be a sign of this behaviour.

Why you would need to do that, rather than use VMS's own facilities, is another matter. Maybe the program was originally written for Unix?

Anyway: you will need to upgrade. Given the nature of the program, I would test it with a higher version of VMS. It is very well possible that it won't work when it is linked against the system!

Willem

Willem Grooters
OpenVMS Developer & System Manager

Peter Barkas · ‎08-19-2005

Mmmm, RMS could mean "Royal Microscopical Society" too.

A somewhat more likely answer is that the application was written in a way that attempted to do, or thought that it was doing, some sort of optimised IO but perhaps had only limited internal storage. If the application is old, a clustersize of 51 might have seemed future-proof at the time.

Robert_Boyd · ‎08-22-2005

Bob Gezelter and anyone else familiar with the inner workings of ODS 2/5 on recent versions of VMS:

I'm confused -- comments about the irrelevance of cluster size compared to sector size -- if you choose a cluster size that is not an integer multiple of the sector size, what happens to the pieces that are left over?

Past descriptions of how the drive gets broken up into useable hunks of media leave me with the impression that randomly picking a cluster size leads to wasted space or inefficient use.

Robert

Master you were right about 1 thing -- the negotiations were SHORT!

Robert Gezelter · ‎08-22-2005

Robert,

Cluster size is independent of the number of sectors on a track. Clusters follow one another; if a cluster does not fit on the track, it continues into the next track (or cylinder, if this occurs on the last track in the cylinder).

At first glance, this may seem inefficient, but the reality is that the benefit of multiblock allocations outweighs the negative effects of the occasional cluster that spans a physical boundary.

In many cases, at least in the past, the number of blocks was a composite number (product of two or more numbers), thus it was possible, without wastage, to keep a cluster completely on a track. I have seen more than a few cases where the number of sectors per track is constant and a prime number (i.e., a number divisible only by one and itself). In that case, the only cluster factors that would not yield track crossing would be 1 (which would be inefficient to allocate space, creating a high degree of fragmentation) or a cluster size equal to the track size (which would yield a tremendous loss of space due to "breakage", the unused and unallocatable space at the end of the file).
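
To make the composite-versus-prime point concrete, here is a small Python sketch. It assumes a fixed number of blocks per track and clusters laid out back to back from the start of a track; the track sizes 32 and 31 are hypothetical, chosen only to show one composite and one prime case.

```python
def track_aligned_cluster_sizes(blocks_per_track):
    """Cluster sizes that never straddle a track boundary.

    Under the stated assumptions these are exactly the divisors of the
    track size, since clusters are packed consecutively from block 0.
    """
    return [c for c in range(1, blocks_per_track + 1)
            if blocks_per_track % c == 0]


# ASSUMPTION: illustrative geometries, not taken from any real drive.
print(track_aligned_cluster_sizes(32))   # composite track size
print(track_aligned_cluster_sizes(31))   # prime track size
```

For the prime case only two sizes remain, 1 and the full track, which is the situation described above.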

The main performance loss is not from spanning tracks or cylinders (of which only cylinders is truly significant), but the overheads from highly fragmented allocations and multiple file headers.

Of course, all of these effects are harder to observe when there are caches at different levels in the hierarchy obscuring what one is trying to measure.

- Bob Gezelter, http://www.rlgsc.com

Uwe Zessin · ‎08-22-2005

Hm, I thought that the 'sector size' of a SCSI disk is usually 512 bytes when working with a server. Some storage arrays use 520 bytes, but that's beyond this thread's topic.

Today (and it's been that way for a long time), disks are addressed by logical block numbers, not physical addresses. Again: expect the inner geometry to be different from what is reported through the interface - see my previous response.

If the total number of blocks on the disk is not a multiple of clusters, then, at least in the past, the last (incomplete) cluster is assigned to BADBLK.SYS. Nice trick to avoid special handling code.
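
The size of that leftover piece is plain remainder arithmetic. The following Python sketch shows it; the block count is a hypothetical figure for a nominal 36.4 GB disk and the function name is invented for the example.

```python
def split_into_clusters(total_blocks, cluster_size):
    """Return (whole_clusters, leftover_blocks) for a volume.

    leftover_blocks is the trailing partial cluster which, per the post
    above, was handed to BADBLK.SYS rather than special-cased.
    """
    return divmod(total_blocks, cluster_size)


# ASSUMPTION: hypothetical total block count, cluster size 69 from this thread.
whole, leftover = split_into_clusters(71_132_960, 69)
print(whole, leftover)
```

The leftover is always smaller than one cluster, so even with large cluster sizes the loss from this effect is negligible.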

.

Hein van den Heuvel · ‎08-22-2005

Robert> "I'm confused -- comments about the irrelevance of cluster size compared to sector size -- if you choose a cluster size that is not an integer multiple of the sector size"

But it is... a sector (in VMS land) is 512 bytes, and a cluster is a whole number of sectors by law.
http://en.wikipedia.org/wiki/Sector

Bob G seems to refer to cylinders. Back when I was young. Ok... way back... disks had a fixed number of sectors in a cylinder, often an odd number, and you could play with allocation matching a cylinder and so on.

But by the time (10+ years ago) we got down to 3.5" disks, disks had become zoned (banded), with the outside zones having twice as many sectors per cylinder as the inside zones. So forget about adapting to or exploiting that.

However... next came smart controllers with stripesets and RAIDsets and CHUNK sizes. I am a firm believer in making clustersize and chunk sizes have large common denominators. For example: cluster size 16, chunk size 128 but for other applications possibly clustersize 512, chunksize 64.
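
One way to check how well a cluster size and a chunk size line up is to look at their greatest common divisor, which is how "large common denominators" is read here (an interpretation, not something stated in the post). This is a minimal Python sketch; the pairing of 51 and 69 with a chunk size of 128 is hypothetical.

```python
import math


def common_alignment(cluster_size, chunk_size):
    """Return (gcd, nests).

    nests is True when the smaller size divides the larger one, so that
    units of one size pack evenly inside units of the other.
    """
    g = math.gcd(cluster_size, chunk_size)
    return g, g == min(cluster_size, chunk_size)


# First two pairs are the examples from the post; the last two are
# ASSUMED pairings using the cluster sizes discussed in this thread.
for cluster, chunk in [(16, 128), (512, 64), (51, 128), (69, 128)]:
    g, nests = common_alignment(cluster, chunk)
    print(f"cluster={cluster:3d} chunk={chunk:3d} gcd={g:3d} nests={nests}")
```

Odd cluster sizes such as 51 or 69 share no factor with a power-of-two chunk size, so cluster boundaries and chunk boundaries rarely coincide.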

hth,
Hein.

Robert Gezelter · ‎08-22-2005

Hein,

I may perhaps have written unclearly (reviewing my post, I said "sectors" instead of "blocks" at one point). The terminology I am using is:

- BLOCK == 512 bytes (one or more sectors)
- TRACK == the set of blocks/sectors on a single heads pass over the media
- CYLINDER == the set of TRACKs which can be accessed without repositioning the heads (note that this definition works both for canonical drives (1 head/surface) and for other variations (some drives have more than one head/surface)).

The point I was trying to get across is that clusters are only file allocation units, they do not (directly) affect whether a disk operation will span track or cylinder boundaries (and yes, I am aware of how RMSDFMBC is affected by cluster size, but that does not affect the basic concept).

Optimal selection of cluster size truly depends on what the file population on the disk is. If you are storing small files (e.g., command files, source listings, emails), then small cluster sizes produce the least breakage. Even modest cluster sizes (e.g., 11) can produce 20-30% breakage.
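
Breakage is easy to estimate for a given file population. The Python sketch below rounds each file up to a whole number of clusters and reports the unused share of the allocated space. The file sizes are invented for illustration (a handful of small files, sizes in blocks); real numbers would come from a directory listing of the volume in question.

```python
def allocated_blocks(used_blocks, cluster_size):
    """Blocks actually allocated: used size rounded up to whole clusters."""
    clusters = -(-used_blocks // cluster_size)   # integer ceiling division
    return clusters * cluster_size


def breakage_fraction(file_sizes, cluster_size):
    """Fraction of allocated space that is unused tail-end padding."""
    allocated = sum(allocated_blocks(s, cluster_size) for s in file_sizes)
    if allocated == 0:
        return 0.0
    return (allocated - sum(file_sizes)) / allocated


# ASSUMPTION: hypothetical population of small files, sizes in blocks.
SMALL_FILES = [1, 3, 5, 12, 20, 40]

for cluster in (1, 3, 11, 51, 69):
    print(f"cluster={cluster:2d} breakage={breakage_fraction(SMALL_FILES, cluster):.1%}")
```

For a population dominated by a few very large files the same calculation gives a breakage close to zero at any of these cluster sizes, which is why the right choice depends on what the volume holds.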

On the other hand, if one is storing a huge RMS file or DBMS database, then a far larger cluster size is appropriate. In this I agree with the point (that I think) you are making.

In summary, I would consider the controller imposed preferences to be a factor similar in import to the physical disk geometry issues, and try for a balance between them. In any event, the nature of the files stored on the volume is tremendously important. Large amounts of breakage reduce the overall efficiency of the storage system in many ways, and are to be avoided if possible.

- Bob Gezelter, http://www.rlgsc.com

Antoniov. · ‎08-22-2005

Uwe gave a full description of basic disk concepts. He's a big expert in this area.
Cluster size is called allocation unit in other lands (e.g. Windoze); it simply means how many records the system I/O reads together.
Read above about performance considerations.

Antonio Vigliotti

Antonio Maria Vigliotti

Robert Gezelter · ‎08-22-2005

Antoniov,

With all due respect, I must disagree with your last post.

In OpenVMS, cluster factor is almost entirely a factor in how the disk is organized, NOT the determinant of how many blocks are read at a time (there are numerous other parameters which directly affect the size of IO operations).

The only exception of which I am aware is the implicit one with respect to the fact that physical IO operations, at the driver level, must be to physically contiguous sections of the disk, but this is a consequence of fragmentation, not specifically tied to cluster size.

- Bob Gezelter, http://www.rlgsc.com
