<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: VMS utility to determine file size distribution? in Operating System - OpenVMS</title>
    <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638476#M99906</link>
    <description>Jim,&lt;BR /&gt;&lt;BR /&gt;Thanks, the performance advisor report is probably what I remember.  We had the DEC performance advisor, but when it was sold to CA we didn't continue with the maintenance, and many of the features of PSDC were VMS version dependent, so it is no longer in our startup.&lt;BR /&gt;&lt;BR /&gt;Since the disk analysis reporting feature just looks at the INDEXF.SYS and BITMAP.SYS files, I decided to try it, and the PSDC V2.2-51 PSDC$DSKANL still works (with 1995 limitations) on VMS 8.3, although it is throwing a warning message.  Since it predated extended bitmaps, perhaps that is what the buffer overflow warning is about.&lt;BR /&gt;&lt;BR /&gt;OT$ anal/image/sel=(id,link,build) sys$system:psdc$dskanl.exe&lt;BR /&gt;SYS$COMMON:[SYSEXE]PSDC$DSKANL.EXE;2&lt;BR /&gt;"PSDC V2.2-51"&lt;BR /&gt;11-OCT-1995 11:47:58.55&lt;BR /&gt;""&lt;BR /&gt;OT$ advise collect report disk disk$user1 /out=scr:t.t&lt;BR /&gt;%PSDC-W-GETMSGWARN, $GETMSG System Service Warning&lt;BR /&gt;-SYSTEM-S-BUFFEROVF, output buffer overflow&lt;BR /&gt;OT$ &lt;BR /&gt;&lt;BR /&gt;I have attached a portion of the output from the above command for anyone who is interested.&lt;BR /&gt;&lt;BR /&gt;This utility is quite efficient, as it gets its info by scanning indexf.sys and bitmap.sys directly (it does not have to traverse all directory files).&lt;BR /&gt;&lt;BR /&gt;So I can use this, but it isn't something that we can expect ITRC users to be able to run.&lt;BR /&gt;&lt;BR /&gt;BTW, are you the same Jim Hintze who presented the paper about Disk Fragmentation at the spring 1981 DECUS symposium?  (I am guessing you are, since you know about RM03 drives.)&lt;BR /&gt;&lt;BR /&gt;         On the Fragmentation of Disk, Jim Hintze, Eric Deaton,&lt;BR /&gt;         Weeg Computing Center, pp. 
1321-1325, Proceedings of the&lt;BR /&gt;         Digital Equipment Computer Users Society, Spring 1981,&lt;BR /&gt;         Volume 7, Number 4.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;Fekko,&lt;BR /&gt;&lt;BR /&gt;If this is something that you can share, then yes, I would be interested.  Especially if it can be released like DIX and ACX, so ITRC folks could use it to gather information.  I didn't see it on the &lt;A href="http://www.oooovms.dyndns.org/" target="_blank"&gt;http://www.oooovms.dyndns.org/&lt;/A&gt; site, but perhaps I didn't know where to look.&lt;BR /&gt;&lt;BR /&gt;I assume that it is getting its file info by going directly to indexf.sys, and not by traversing all directories on the disk.  If so, and it is freely available, this may be the fastest free tool for file size distribution analysis.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;Hein,&lt;BR /&gt;&lt;BR /&gt;I agree with your recommendations here and in &lt;A href="http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1431785" target="_blank"&gt;http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1431785&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Specifically, a cluster size that is a power of 2, and using different disks for small volatile files and large infrequently extended files.  Also, pre-extending the large RMS indexed files makes a lot of sense, especially if you can do so with CBT extensions.  Just curious, do your tools allow you to extend a file while the file is open (with update sharing allowed) by another process?  I assume your tool uses the $EXTEND service.&lt;BR /&gt;&lt;BR /&gt;One reason I like powers of 2 for cluster sizes is that it will prevent files from growing when backing up from one disk to another, even if backup/truncate is not used.&lt;BR /&gt;&lt;BR /&gt;To be nitpicky about your first comment: 
Since VMS 7.2 the file allocation bitmap is no longer limited to 255 blocks (255 blocks * 512 bytes/block * 8 bits/byte = 1044480 bits, which limits a pre-7.2 disk to 1044480 clusters); it can now be up to 65535*512*8 = 268431360 bits, i.e. that many clusters per disk.  Since your comment was in a "performance" related thread, I will concede that using extended bitmaps is not for enhanced performance (it can cause slow CBT creation/extension on fragmented disks); it is to allow better space utilization of large disks.&lt;BR /&gt;&lt;BR /&gt;The perl tool you provided is good because it allows a subset of the files on the disk to be analyzed.  This is also true for the DCL that Hoff posted and for the suggestion by John Gillings.  For a disk with many files on it, it would probably be quite a bit slower than a tool that goes directly to INDEXF.SYS for the info.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;Hoff,&lt;BR /&gt;&lt;BR /&gt;Interesting method to get powers of ten without a log function.&lt;BR /&gt;&lt;BR /&gt;$ sizelen = f$length(size)&lt;BR /&gt;$ count_'sizelen' = count_'sizelen' + 1&lt;BR /&gt;&lt;BR /&gt;Your submission has the advantage of working standalone on a plain vanilla VMS system that can't have any software installed.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;John Gillings,&lt;BR /&gt;&lt;BR /&gt;For detailed analysis, loading into a spreadsheet is a good method.  There may be issues with some spreadsheets not being able to deal with the number of records on a disk with many files.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;All,&lt;BR /&gt;&lt;BR /&gt;Just for completeness, I did find another utility that will supply file size distribution.  Executive Software (makers of the DisKeeper defrag utility) has a "Disk Analysis Utility" available for download "free to qualified System Manager". 
I saw the pointer to this in their online whitepaper "Fragmentation: the Condition, the Cause, the Cure"  &lt;A href="http://www.diskeeper.com/fragbook/FRAGBOOK.HTM" target="_blank"&gt;http://www.diskeeper.com/fragbook/FRAGBOOK.HTM&lt;/A&gt; (infomercial for DisKeeper) where you can see an example of the output.  There is a link to a page to download the "OpenVMS Fragmentation Analysis Utility", but I didn't download it because I didn't want to be on their mailing list, and you must supply info to be able to download.&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://www.diskeeper.com/trialware/trialwareproducts.aspx" target="_blank"&gt;http://www.diskeeper.com/trialware/trialwareproducts.aspx&lt;/A&gt; ! link to page with download form (requires info I didn't want to provide)&lt;BR /&gt;&lt;BR /&gt;Jon</description>
    <pubDate>Fri, 28 May 2010 23:33:18 GMT</pubDate>
    <dc:creator>Jon Pinkley</dc:creator>
    <dc:date>2010-05-28T23:33:18Z</dc:date>
    <item>
      <title>VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638468#M99898</link>
      <description>Does anyone know of a VMS utility that provides information about file size distribution?&lt;BR /&gt;&lt;BR /&gt;There have been several threads asking what cluster size to initialize a volume with, and knowing the file size distribution will help choose an appropriate size.&lt;BR /&gt;&lt;BR /&gt;It's relatively easy to determine the mean file size on a volume, but determining the median size is much harder.&lt;BR /&gt;&lt;BR /&gt;I tried several Google searches, but I haven't been successful.  I thought there was a DECUS utility, but I can't remember for sure that it exists, and if it does, what it was named.&lt;BR /&gt;&lt;BR /&gt;The DFG/DFO defrag utility gives a histogram of free space extents, but not file sizes.  Perhaps that is what I remember.  The command to get that is:&lt;BR /&gt;&lt;BR /&gt;$ defrag show /histogram &lt;DISK&gt;&lt;BR /&gt;&lt;BR /&gt;I did find a UNIX utility, fsstats.  Does anyone know of something similar for VMS?&lt;BR /&gt;&lt;BR /&gt;Description of fsstats&lt;BR /&gt;&lt;A href="http://www.pdsi-scidac.org/fsstats/download.html" target="_blank"&gt;http://www.pdsi-scidac.org/fsstats/download.html&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Sample output from fsstats&lt;BR /&gt;&lt;A href="http://www.pdsi-scidac.org/fsstats/files/sample_output" target="_blank"&gt;http://www.pdsi-scidac.org/fsstats/files/sample_output&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Somewhat interesting paper on file size distributions on a university UNIX system (1984 vs. 2005), and how different block sizes affect file system usage.  Note that it is UNIX centric, and the UNIX and VMS file systems and caches behave differently.&lt;BR /&gt;&lt;A href="http://www.cs.vu.nl/~ast/publications/osr-jan-2006.pdf" target="_blank"&gt;http://www.cs.vu.nl/~ast/publications/osr-jan-2006.pdf&lt;/A&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 27 May 2010 07:35:25 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638468#M99898</guid>
      <dc:creator>Jon Pinkley</dc:creator>
      <dc:date>2010-05-27T07:35:25Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638469#M99899</link>
      <description>I expect that cluster size has more to do with the geometry of the media than RMS allocation settings. The cluster size should be a divisor that produces a quotient with no remainder when using the track size as the dividend.&lt;BR /&gt;&lt;BR /&gt;CA Performance Advisor&lt;BR /&gt;advis coll repo disk sys$sysdevice &amp;gt;&amp;gt;&amp;gt;&lt;BR /&gt;&lt;BR /&gt;Disk Analysis              _$1$DGA4244: (USTPROD_SYS)                   Page   3&lt;BR /&gt;Summary of Allocated Space                                        PSDC V3.1-0805&lt;BR /&gt;                           Thursday 27-MAY-2010 07:24&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;  Space Allocated per Header     No. Headers    Cum % Headers&lt;BR /&gt;  --------------------------     -----------    -------------&lt;BR /&gt;&lt;BR /&gt;  &amp;gt;=        64,  &amp;lt;       128          20442          80.7&lt;BR /&gt;  &amp;gt;=       128,  &amp;lt;       192           1096          85.0&lt;BR /&gt;  &amp;gt;=       192,  &amp;lt;       320            544          87.2&lt;BR /&gt;  &amp;gt;=       320,  &amp;lt;       640            682          89.9&lt;BR /&gt;  &amp;gt;=       640,  &amp;lt;      1280            450          91.6&lt;BR /&gt;  &amp;gt;=      1280,  &amp;lt;      1920            289          92.8&lt;BR /&gt;  &amp;gt;=      1920,  &amp;lt;      3200            292          93.9&lt;BR /&gt;  &amp;gt;=      3200,  &amp;lt;      6400            332          95.2&lt;BR /&gt;  &amp;gt;=      6400,  &amp;lt;     12800            907          98.8&lt;BR /&gt;  &amp;gt;=     12800,  &amp;lt;     19200            103          99.2&lt;BR /&gt;  &amp;gt;=     19200,  &amp;lt;     32000             48          99.4&lt;BR /&gt;  &amp;gt;=     32000,  &amp;lt;     64000             24          99.5&lt;BR /&gt;  &amp;gt;=     64000,  &amp;lt;    128000             37          99.7&lt;BR /&gt;  &amp;gt;=    128000,  &amp;lt;    192000             41          99.8&lt;BR /&gt;  &amp;gt;=    192000,  
&amp;lt;    320000             24          99.9&lt;BR /&gt;  &amp;gt;=    320000,  &amp;lt;    640000              2          99.9&lt;BR /&gt;  &amp;gt;=    640000,  &amp;lt;   1280000              1          99.9&lt;BR /&gt;  &amp;gt;=   1280000,  &amp;lt;   1920000              0          99.9&lt;BR /&gt;  &amp;gt;=   1920000,  &amp;lt;   3200000              4          99.9&lt;BR /&gt;  &amp;gt;=   3200000                           16         100.0&lt;BR /&gt;</description>
      <pubDate>Thu, 27 May 2010 10:26:24 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638469#M99899</guid>
      <dc:creator>Jim Hintze</dc:creator>
      <dc:date>2010-05-27T10:26:24Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638470#M99900</link>
      <description>Hi Jon,&lt;BR /&gt;I do have a program written in Fortran to display the consequences of changing the cluster size. It will report the loss of space for various cluster sizes.&lt;BR /&gt;See example below:&lt;BR /&gt;$clus dsa0:&lt;BR /&gt;Maximum indexfile &lt;BR /&gt; Fileheaders total    1047510 used      39975 free    1007535&lt;BR /&gt;Current size indexfile =     157410 blocks&lt;BR /&gt; Fileheaders total     157142 used      39975 free     117167&lt;BR /&gt;Current clustersize    =      3&lt;BR /&gt;Disk DSA0: has volume SYS_ITV002   Max. Files    1047510 (  4.35%)&lt;BR /&gt;#blocks used    8884089 #blocks allocated    9205746 Waste =       3.62%&lt;BR /&gt;Lost blocks =     269391&lt;BR /&gt;Found  45559 files, total nblock =    8884089 average size =    195&lt;BR /&gt;Found    9 files &amp;gt;= 65536 total size    3755249&lt;BR /&gt; Cluster size    3 tot_blocks =    8936364 waste =   0.59%&lt;BR /&gt; Cluster size    1 tot_blocks =    8884093 waste =   0.00%&lt;BR /&gt; Cluster size    2 tot_blocks =    8910406 waste =   0.30%&lt;BR /&gt; Cluster size    3 tot_blocks =    8936364 waste =   0.59%&lt;BR /&gt; Cluster size    4 tot_blocks =    8965407 waste =   0.92%&lt;BR /&gt; Cluster size    8 tot_blocks =    9084149 waste =   2.25%&lt;BR /&gt; Cluster size   16 tot_blocks =    9346377 waste =   5.20%&lt;BR /&gt; Cluster size   18 tot_blocks =    9418634 waste =   6.02%&lt;BR /&gt; Cluster size   32 tot_blocks =    9916961 waste =  11.63%&lt;BR /&gt; Cluster size   35 tot_blocks =   10030451 waste =  12.90%&lt;BR /&gt; Cluster size   64 tot_blocks =   11150289 waste =  25.51%&lt;BR /&gt; Cluster size   70 tot_blocks =   11388084 waste =  28.19%&lt;BR /&gt; Cluster size  144 tot_blocks =   14473241 waste =  62.91%&lt;BR /&gt;&lt;BR /&gt;It can also report counts for file sizes.&lt;BR /&gt;Is this something you need?&lt;BR /&gt;&lt;BR /&gt;Fekko</description>
      <pubDate>Thu, 27 May 2010 10:50:21 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638470#M99900</guid>
      <dc:creator>Fekko Stubbe</dc:creator>
      <dc:date>2010-05-27T10:50:21Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638471#M99901</link>
      <description>Jon, I've written such tools but I can't readily find them. Mostly I just found that the simple algorithms are best.&lt;BR /&gt;&lt;BR /&gt;The maximum waste is simply the number of files times (cluster size - 1).&lt;BR /&gt;&lt;BR /&gt;The average waste is simply the number of files times half the cluster size.&lt;BR /&gt;&lt;BR /&gt;Now if you had a dominant file allocation usage, say 100,000 files out of the 300,000 files are always 1234 blocks, then you would know that, right? So calculate the exact waste for those files and exclude them from the rest.&lt;BR /&gt;Or just pick the cluster size to match that dominant size.&lt;BR /&gt;&lt;BR /&gt;Jim&amp;gt;&amp;gt; I expect that cluster size has more to do with the geometry of the media than RMS allocation settings. The cluster size should be a divisor that produces a quotient with no remainder when using the track size as the dividend.&lt;BR /&gt;&lt;BR /&gt;The notion of exploiting disk geometry went out with the horse and buggy.&lt;BR /&gt;Disks have had variable geometry for decades (more sectors on the outer bands than the inner bands).&lt;BR /&gt;And mostly folks do not talk to disks directly anyway, but through smart controllers which slice and dice allocation over spindles as they see fit.&lt;BR /&gt;&lt;BR /&gt;OpenVMS simply targets about 1M (1024*1024) clusters per disk, figuring that will allow for up to a million (single-extent) files if need be, with reasonable waste for most usages.&lt;BR /&gt;&lt;BR /&gt;The single suggestion I end up with is to pick a power of 2, and make it large when you anticipate few files (thousands) and smallish, like 16, when expecting many files (hundreds of thousands).&lt;BR /&gt;&lt;BR /&gt;This will help align with the typical RMS sequential file defaults, and the storage folks like it, and the XFC cache likes it. 
&lt;BR /&gt;It is tricky to get it all to work together, but with cluster sizes like 16 or 256 you have the best odds.&lt;BR /&gt;&lt;BR /&gt;Mark Hopkins once produced a histogram of MB/sec through the caches for varying IO sizes. It had significant peaks at 4, 8, 16, 32 and 64, the peaks at 16 and 32 being the highest.&lt;BR /&gt;&lt;BR /&gt;Hope this helps some,&lt;BR /&gt;Hein&lt;BR /&gt;</description>
      <pubDate>Thu, 27 May 2010 12:26:10 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638471#M99901</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2010-05-27T12:26:10Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638472#M99902</link>
      <description>Oh, I thought he was talking about an RM03 on an 11/780. :-)&lt;BR /&gt;</description>
      <pubDate>Thu, 27 May 2010 12:36:07 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638472#M99902</guid>
      <dc:creator>Jim Hintze</dc:creator>
      <dc:date>2010-05-27T12:36:07Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638473#M99903</link>
      <description>Here's some very quick DCL:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://labs.hoffmanlabs.com/node/1582" target="_blank"&gt;http://labs.hoffmanlabs.com/node/1582&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Not pretty, but it gives you a powers-of-10 distribution.&lt;BR /&gt;&lt;BR /&gt;That result could be switched into a percentage graph or various other displays with minimal effort; this stuff isn't rocket science.&lt;BR /&gt;&lt;BR /&gt;FWIW...  Disk geometry is increasingly fictional on current-generation hard disks; VMS eliminated dependence on that a while back.  VMS also expects the older 512-byte sector size.  The current-generation IDEMA-compliant hard disk drive devices are now arriving with 4 KiB sectors; when HP might add that support and might start deploying SSD hardware is not something I'm aware of.&lt;BR /&gt;</description>
      <pubDate>Thu, 27 May 2010 13:32:55 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638473#M99903</guid>
      <dc:creator>Hoff</dc:creator>
      <dc:date>2010-05-27T13:32:55Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638474#M99904</link>
      <description>Jim, nice comeback! Made me smile!&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Jon, &lt;BR /&gt;I rolled some quick perl (see below), which may be good enough.&lt;BR /&gt;It may well have an off-by-one error (or two), but it will help you decide what you need.&lt;BR /&gt;&lt;BR /&gt;I feed it with DFU SEARCH output, or DIR/SIZE=ALL output.&lt;BR /&gt;&lt;BR /&gt;Here is what it gives for the EISNER scratch disk ( DRA2 ) as an example:&lt;BR /&gt;&lt;BR /&gt;64416 files with some allocation, Total allocation = 9146582, Average size = 141.&lt;BR /&gt;&lt;BR /&gt; Zone       Limit      Count.&lt;BR /&gt;    0           1        4586&lt;BR /&gt;    1           2       24275&lt;BR /&gt;    2           7       24790&lt;BR /&gt;    3          20        5496&lt;BR /&gt;    4          54        2783&lt;BR /&gt;    5         148        1288&lt;BR /&gt;    6         403         676&lt;BR /&gt;    7        1096         287&lt;BR /&gt;    8        2980         112&lt;BR /&gt;    9        8103          69&lt;BR /&gt;   10       22026          26&lt;BR /&gt;   11       59874          19&lt;BR /&gt;   12      162754           8&lt;BR /&gt;   13      442413           1&lt;BR /&gt;&lt;SNIP-ALL-ZERO&gt;&lt;BR /&gt;   21  1318815734           0&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Cluster     Waste      Simple&lt;BR /&gt;    1           0           0&lt;BR /&gt;    2       32892       32208&lt;BR /&gt;    3       68845       64416&lt;BR /&gt;    4      100098       96624&lt;BR /&gt;    5      134978      128832&lt;BR /&gt;    6      162088      161040&lt;BR /&gt;    7      191782      193248&lt;BR /&gt;    8      224946      225456&lt;BR /&gt;    9      256105      257664&lt;BR /&gt;   10      293788      289872&lt;BR /&gt;   11      334703      322080&lt;BR /&gt;   12      368014      354288&lt;BR /&gt;   13      415022      386496&lt;BR /&gt;   14      463018      418704&lt;BR /&gt;   15      514318      450912&lt;BR /&gt;   16      559786      483120&lt;BR 
/&gt;:&lt;BR /&gt;  126     7156558     4026000&lt;BR /&gt;  127     7217495     4058208&lt;BR /&gt;  128     7282346     4090416&lt;BR /&gt;&lt;BR /&gt;As you can see, the AVG simple algorithm drifts away from reality a lot when the cluster size becomes larger than most files. The other simple max calculation then becomes closer: files * (cluster_size minus fudge). &lt;BR /&gt;Fudge would be the mean file allocation.&lt;BR /&gt;&lt;BR /&gt;Hein&lt;BR /&gt;&lt;BR /&gt;$! ------ [hein]file_and_cluster_sizes.pl ----&lt;BR /&gt;# File size distribution and cluster size effects&lt;BR /&gt;# Feed this with DFU SEARCH OUTPUT or DIR/SIZE=ALL. Should look like:&lt;BR /&gt;# x$y:[p.q.r]a.b;n    n/m&lt;BR /&gt;&lt;BR /&gt;$min_cluster = 1;&lt;BR /&gt;$max_cluster = 128;&lt;BR /&gt;&lt;BR /&gt;while (&amp;lt;&amp;gt;) {&lt;BR /&gt;  next unless /;\d+\s+(\d+)\//;&lt;BR /&gt;  next unless $1;&lt;BR /&gt;  $allocation = $1;&lt;BR /&gt;  $total_allocation += $1;&lt;BR /&gt;  $files++;&lt;BR /&gt;  $zone[ int(log($allocation)) ]++;&lt;BR /&gt;  for ($i=$min_cluster ; $i&amp;lt;=$max_cluster; $i++) {&lt;BR /&gt;    $used =  $allocation % $i;&lt;BR /&gt;    $waste[$i] += ($used) ? $i - $used : 0 ;&lt;BR /&gt;  }&lt;BR /&gt;}&lt;BR /&gt;$avg = int($total_allocation/$files);&lt;BR /&gt;printf "$files files with some allocation, Total allocation = $total_allocation, Average size = $avg.\n";&lt;BR /&gt;print "\n Zone       Limit      Count.\n";&lt;BR /&gt;for ($i=0; exp($i) &amp;lt; 2**31; $i++) { # VMS will do 2GB files soon&lt;BR /&gt;   printf "%5d%12d%12d\n", $i, int(exp($i)), $zone[$i];&lt;BR /&gt;}&lt;BR /&gt;print "\nCluster     Waste      Simple\n";&lt;BR /&gt;for ($i=$min_cluster ; $i&amp;lt;=$max_cluster; $i++) {&lt;BR /&gt;   printf "%5d%12d%12d\n", $i, $waste[$i], int($files * ($i-1) / 2) ;&lt;BR /&gt;}&lt;BR /&gt;&lt;/SNIP-ALL-ZERO&gt;</description>
      <pubDate>Thu, 27 May 2010 13:55:36 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638474#M99904</guid>
      <dc:creator>Hein van den Heuvel</dc:creator>
      <dc:date>2010-05-27T13:55:36Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638475#M99905</link>
      <description>Jon,&lt;BR /&gt;&lt;BR /&gt;  Rather than try to build a utility to do this, I'd start with some raw data and play with it in a spreadsheet.&lt;BR /&gt;&lt;BR /&gt;$ PIPE DIRECTORY/NOHEAD/NOTRAIL/SIZE=ALL disk:[000000...]*.*;* | search sys$pipe "/" &amp;gt; rawdata.dat&lt;BR /&gt;&lt;BR /&gt;Now edit rawdata.dat, change all "/" into "," and collapse out the spaces. You now have a CSV file with file sizes in the first column and allocations in the second.&lt;BR /&gt;&lt;BR /&gt;Read the data into your favourite spreadsheet and generate histograms of sizes and allocations. You can also experiment with different cluster sizes by generating columns with projected minimum allocations and comparing column totals.&lt;BR /&gt;&lt;BR /&gt;You have many more options for manipulating and visualising the data than you could easily build into a utility.</description>
      <pubDate>Thu, 27 May 2010 21:10:11 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638475#M99905</guid>
      <dc:creator>John Gillings</dc:creator>
      <dc:date>2010-05-27T21:10:11Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638476#M99906</link>
      <description>Jim,&lt;BR /&gt;&lt;BR /&gt;Thanks, the performance advisor report is probably what I remember.  We had the DEC performance advisor, but when it was sold to CA we didn't continue with the maintenance, and many of the features of PSDC were VMS version dependent, so it is no longer in our startup.&lt;BR /&gt;&lt;BR /&gt;Since the disk analysis reporting feature just looks at the INDEXF.SYS and BITMAP.SYS files, I decided to try it, and the PSDC V2.2-51 PSDC$DSKANL still works (with 1995 limitations) on VMS 8.3, although it is throwing a warning message.  Since it predated extended bitmaps, perhaps that is what the buffer overflow warning is about.&lt;BR /&gt;&lt;BR /&gt;OT$ anal/image/sel=(id,link,build) sys$system:psdc$dskanl.exe&lt;BR /&gt;SYS$COMMON:[SYSEXE]PSDC$DSKANL.EXE;2&lt;BR /&gt;"PSDC V2.2-51"&lt;BR /&gt;11-OCT-1995 11:47:58.55&lt;BR /&gt;""&lt;BR /&gt;OT$ advise collect report disk disk$user1 /out=scr:t.t&lt;BR /&gt;%PSDC-W-GETMSGWARN, $GETMSG System Service Warning&lt;BR /&gt;-SYSTEM-S-BUFFEROVF, output buffer overflow&lt;BR /&gt;OT$ &lt;BR /&gt;&lt;BR /&gt;I have attached a portion of the output from the above command for anyone who is interested.&lt;BR /&gt;&lt;BR /&gt;This utility is quite efficient, as it gets its info by scanning indexf.sys and bitmap.sys directly (it does not have to traverse all directory files).&lt;BR /&gt;&lt;BR /&gt;So I can use this, but it isn't something that we can expect ITRC users to be able to run.&lt;BR /&gt;&lt;BR /&gt;BTW, are you the same Jim Hintze who presented the paper about Disk Fragmentation at the spring 1981 DECUS symposium?  (I am guessing you are, since you know about RM03 drives.)&lt;BR /&gt;&lt;BR /&gt;         On the Fragmentation of Disk, Jim Hintze, Eric Deaton,&lt;BR /&gt;         Weeg Computing Center, pp. 
1321-1325, Proceedings of the&lt;BR /&gt;         Digital Equipment Computer Users Society, Spring 1981,&lt;BR /&gt;         Volume 7, Number 4.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;Fekko,&lt;BR /&gt;&lt;BR /&gt;If this is something that you can share, then yes, I would be interested.  Especially if it can be released like DIX and ACX, so ITRC folks could use it to gather information.  I didn't see it on the &lt;A href="http://www.oooovms.dyndns.org/" target="_blank"&gt;http://www.oooovms.dyndns.org/&lt;/A&gt; site, but perhaps I didn't know where to look.&lt;BR /&gt;&lt;BR /&gt;I assume that it is getting its file info by going directly to indexf.sys, and not by traversing all directories on the disk.  If so, and it is freely available, this may be the fastest free tool for file size distribution analysis.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;Hein,&lt;BR /&gt;&lt;BR /&gt;I agree with your recommendations here and in &lt;A href="http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1431785" target="_blank"&gt;http://forums.itrc.hp.com/service/forums/questionanswer.do?threadId=1431785&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Specifically, a cluster size that is a power of 2, and using different disks for small volatile files and large infrequently extended files.  Also, pre-extending the large RMS indexed files makes a lot of sense, especially if you can do so with CBT extensions.  Just curious, do your tools allow you to extend a file while the file is open (with update sharing allowed) by another process?  I assume your tool uses the $EXTEND service.&lt;BR /&gt;&lt;BR /&gt;One reason I like powers of 2 for cluster sizes is that it will prevent files from growing when backing up from one disk to another, even if backup/truncate is not used.&lt;BR /&gt;&lt;BR /&gt;To be nitpicky about your first comment: 
Since VMS 7.2 the file allocation bitmap is no longer limited to 255 blocks (255 blocks * 512 bytes/block * 8 bits/byte = 1044480 bits, which limits a pre-7.2 disk to 1044480 clusters); it can now be up to 65535*512*8 = 268431360 bits, i.e. that many clusters per disk.  Since your comment was in a "performance" related thread, I will concede that using extended bitmaps is not for enhanced performance (it can cause slow CBT creation/extension on fragmented disks); it is to allow better space utilization of large disks.&lt;BR /&gt;&lt;BR /&gt;The perl tool you provided is good because it allows a subset of the files on the disk to be analyzed.  This is also true for the DCL that Hoff posted and for the suggestion by John Gillings.  For a disk with many files on it, it would probably be quite a bit slower than a tool that goes directly to INDEXF.SYS for the info.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;Hoff,&lt;BR /&gt;&lt;BR /&gt;Interesting method to get powers of ten without a log function.&lt;BR /&gt;&lt;BR /&gt;$ sizelen = f$length(size)&lt;BR /&gt;$ count_'sizelen' = count_'sizelen' + 1&lt;BR /&gt;&lt;BR /&gt;Your submission has the advantage of working standalone on a plain vanilla VMS system that can't have any software installed.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;John Gillings,&lt;BR /&gt;&lt;BR /&gt;For detailed analysis, loading into a spreadsheet is a good method.  There may be issues with some spreadsheets not being able to deal with the number of records on a disk with many files.&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;All,&lt;BR /&gt;&lt;BR /&gt;Just for completeness, I did find another utility that will supply file size distribution.  Executive Software (makers of the DisKeeper defrag utility) has a "Disk Analysis Utility" available for download "free to qualified System Manager". 
I saw the pointer to this in their online whitepaper "Fragmentation: the Condition, the Cause, the Cure"  &lt;A href="http://www.diskeeper.com/fragbook/FRAGBOOK.HTM" target="_blank"&gt;http://www.diskeeper.com/fragbook/FRAGBOOK.HTM&lt;/A&gt; (infomercial for DisKeeper) where you can see an example of the output.  There is a link to a page to download the "OpenVMS Fragmentation Analysis Utility", but I didn't download it because I didn't want to be on their mailing list, and you must supply info to be able to download.&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://www.diskeeper.com/trialware/trialwareproducts.aspx" target="_blank"&gt;http://www.diskeeper.com/trialware/trialwareproducts.aspx&lt;/A&gt; ! link to page with download form (requires info I didn't want to provide)&lt;BR /&gt;&lt;BR /&gt;Jon</description>
      <pubDate>Fri, 28 May 2010 23:33:18 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638476#M99906</guid>
      <dc:creator>Jon Pinkley</dc:creator>
      <dc:date>2010-05-28T23:33:18Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638477#M99907</link>
      <description>Jon,&lt;BR /&gt;&lt;BR /&gt;I will make a kit and place it on oooovms.dyndns.org. Give me a week or so.&lt;BR /&gt;I will ask Ian Miller to place an announcement on openvms.org.&lt;BR /&gt;&lt;BR /&gt;Fekko</description>
      <pubDate>Mon, 31 May 2010 10:29:13 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638477#M99907</guid>
      <dc:creator>Fekko Stubbe</dc:creator>
      <dc:date>2010-05-31T10:29:13Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638478#M99908</link>
      <description>Fekko,&lt;BR /&gt;&lt;BR /&gt;I don't see an announcement at openvms.org yet, but do see that DISKSTAT 1.0 is up at &lt;A href="http://oooovms.dyndns.org/diskstat/" target="_blank"&gt;http://oooovms.dyndns.org/diskstat/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;I have downloaded it and done some initial testing on a test Alpha (ES40) running VMS 8.3.&lt;BR /&gt;&lt;BR /&gt;I will provide feedback at diskstatdev@oooovms.dyndns.org when I have done a bit more testing.&lt;BR /&gt;&lt;BR /&gt;Thanks again, this looks useful.&lt;BR /&gt;&lt;BR /&gt;Jon</description>
      <pubDate>Mon, 07 Jun 2010 22:31:16 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638478#M99908</guid>
      <dc:creator>Jon Pinkley</dc:creator>
      <dc:date>2010-06-07T22:31:16Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638479#M99909</link>
      <description>Jon, you are too fast for me. I asked a friend (the webmaster of the site) to place the kit on the website, and this morning I got his mail that it was there. I will send Ian a mail that the tool is available. As always: any feedback is welcome.&lt;BR /&gt;&lt;BR /&gt;Fekko</description>
      <pubDate>Tue, 08 Jun 2010 05:51:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638479#M99909</guid>
      <dc:creator>Fekko Stubbe</dc:creator>
      <dc:date>2010-06-08T05:51:14Z</dc:date>
    </item>
    <item>
      <title>Re: VMS utility to determine file size distribution?</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638480#M99910</link>
      <description>Jon,&lt;BR /&gt;Yes, seems like just yesterday? Not exactly.&lt;BR /&gt;Where in the world did you find that reference?&lt;BR /&gt;</description>
      <pubDate>Wed, 18 Aug 2010 14:13:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/vms-utility-to-determine-file-size-distribution/m-p/4638480#M99910</guid>
      <dc:creator>Jim Hintze</dc:creator>
      <dc:date>2010-08-18T14:13:42Z</dc:date>
    </item>
  </channel>
</rss>

