EdgarZamora_1
Respected Contributor

Weird disk free blocks increase after increasing lun size

I had a 168 GB LUN that was filling up, so I decided to increase it to 200 GB on the backend. After increasing the size, the number of free blocks shows an illogical increase (about 65 GB of additional free space). No user freed up a huge amount of space in the few seconds it took me to increase the size of the LUN. Any ideas?

(See attachment for some logs I cut and pasted)

Thanks in advance for any ideas. This isn't really important. It's just a little puzzling.
John Gillings
Honored Contributor
Solution

Re: Weird disk free blocks increase after increasing lun size

Edgar,

I don't know what your check_diskspace procedure does, but keep in mind that DCL arithmetic uses 32-bit signed integers and these block counts are relatively large. It's entirely possible that DCL arithmetic is overflowing somewhere. Make sure your numbers agree with raw SHOW DEVICE and F$GETDVI.
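
For example, something along these lines would show the raw values (DISK$DATA1: here is only a stand-in for the actual device):

$ ! DISK$DATA1: is assumed below; substitute the real device name
$ free = F$GETDVI("DISK$DATA1:","FREEBLOCKS")
$ maxb = F$GETDVI("DISK$DATA1:","MAXBLOCK")
$ WRITE SYS$OUTPUT "Free blocks = ''free'   Maximum blocks = ''maxb'"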

On the other hand, the algorithms for keeping track of free space on OpenVMS disks have always been a bit hit and miss (for that matter, the very definition of free space is rather hard to pin down). It wouldn't surprise me if expanding a disk caused problems. Consider: you're dealing with disks far larger than the original architects would have imagined, and the concept of a disk changing size on the fly would almost certainly have been assumed impossible.

Maybe SET VOLUME/REBUILD=FORCE?

Yet another possibility is that the reported free space prior to the expansion was incorrect, and the expansion did an implicit, correcting /REBUILD.

If you do this again, capture a SHOW DEVICE/FULL, and execute a /REBUILD=FORCE before the expansion.
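
In other words, something like this before the backend expansion (again, DISK$DATA1: is only an assumed device name):

$ ! Pre-expansion capture; DISK$DATA1: is an assumption, use the real device
$ SHOW DEVICE DISK$DATA1:/FULL
$ SET VOLUME DISK$DATA1:/REBUILD=FORCE
$ SHOW DEVICE DISK$DATA1:/FULL    ! compare free blocks before and after the rebuild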
A crucible of informative mistakes
Hein van den Heuvel
Honored Contributor

Re: Weird disk free blocks increase after increasing lun size

I like the REBUILD suggestion, although it should not be needed.

I like the ORIGINAL free space question, but that's too late to check now. How about turning it around and having DFU REPORT or DIR/GRAND [000000...] (not [*...]) count the actual usage?

I like the DCL 32-bit math question, but that should only come into play around the 2G-block mark, or when multiplying up to bytes, and it would not influence the SHOW DEV/FULL output, which was attached and shows the large number.

FWIW, I did a quick replay with an LD device and the basic operation looks clean / as expected. After the growth, but before SET VOL/SIZE, MAXBLOCK is already increased but FREEBLOCKS, as obtained from the BITMAP, is not yet. After SET VOL/SIZE everything is back in sync and ANALYZE/DISK is happy. See below.

Hein

$ ld create lda9.dsk/size=20000
$ ld connect lda9.dsk lda9:
$ init lda9: hein /limit=100000/clus=1
%INIT-I-LIMITCHANGED, value for /LIMIT increased to 1048576
$ moun lda9 hein
%MOUNT-I-MOUNTED, HEIN mounted on _EISNER$LDA9:
$ pipe show dev lda9:/full | search sys$pipe total,free
Total blocks 20000 Sectors per track 11
Free blocks 19567 Maximum files allowed 262144
$ crea/dir lda9:[temp]
$ copy/allo=1000 nl: lda9:[temp]a.tmp
$ copy/allo=3000 nl: lda9:[temp]b.tmp
$ dir/grand lda9:[*...]/size=allo
Grand total of 1 directory, 2 files, 4000 blocks.
$ pipe show dev lda9:/full | search sys$pipe total,free
Total blocks 20000 Sectors per track 11
Free blocks 15566 Maximum files allowed 262144

$ dism lda9
$ ld disco lda9
$ mcr sys$login:extend -b -e 10000 LDA9.DSK
Old: 20001 blocks in EISNER$DRA3:[DECUSERVE_USER.HEIN]LDA9.DSK;1
New: 30003 blocks. Extended by 10002.
$ set file/end LDA9.DSK;
$ ld connect lda9.dsk lda9:
$ mount lda9: hein
%MOUNT-I-MOUNTED, HEIN mounted on _EISNER$LDA9:
$ pipe show dev lda9:/full | search sys$pipe total,free
Total blocks 30003 Sectors per track 13
Free blocks 15566 Maximum files allowed 262144
$ dir/grand lda9:[*...]/size=allo

Grand total of 1 directory, 2 files, 4000 blocks.
$ set volu/size lda9:
$ pipe show dev lda9:/full | search sys$pipe total,free
Total blocks 30003 Sectors per track 13
Free blocks 25569 Maximum files allowed 262144
$ anal/disk lda9:
Analyze/Disk_Structure for _EISNER$LDA9: started on 28-MAR-2011 17:27:57.82
%ANALDISK-I-OPENQUOTA, error opening QUOTA.SYS

P Muralidhar Kini
Honored Contributor

Re: Weird disk free blocks increase after increasing lun size

Hi Edgar,

Looks like your command procedure "@su:check_diskspace" picks only the relevant fields from the "SHOW DEVICE" output for display.

Like John, I too suspect that the reported free space on the disk disk$data1 was incorrect prior to the size expansion. The expansion might have done an implicit rebuild, as a result of which you are seeing a drastic increase in the free block count.

Note that the free block count displayed by the "SHOW DEVICE" command is not always accurate and can drift away from the actual value. A rebuild operation on the volume would correct it.

The "SHOW DEVICE/REBUID" on the disk might give you a indication as to whether a rebuild is required on the volume or not.

In case you plan to try this again, you could consider the following prior to expanding the volume size:
1) SHOW DEVICE/REBUILD on the disk to check the status of the REBUILD flag
2) SET VOLUME/REBUILD=FORCE on the disk to forcibly trigger a rebuild operation
(just in case the REBUILD flag is not set in some scenario)
3) Try the volume expansion

This way you can rule out the volume rebuild operation as a factor.
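
As a rough DCL sketch of that sequence (DISK$DATA1: below is only a stand-in for the actual device; the SET VOLUME/SIZE step is the one Hein showed to pick up the new size):

$ SHOW DEVICE DISK$DATA1:/REBUILD        ! 1) check the rebuild flag (device name assumed)
$ SET VOLUME DISK$DATA1:/REBUILD=FORCE   ! 2) force a rebuild before expanding
$ ! 3) expand the LUN on the backend, then pick up the new size:
$ SET VOLUME DISK$DATA1:/SIZE
$ SHOW DEVICE DISK$DATA1:/FULL           ! compare free blocks with the earlier values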

Hope this helps.

Regards,
Murali
Let There Be Rock - AC/DC
Malcolm Wade
Valued Contributor

Re: Weird disk free blocks increase after increasing lun size

I see this quite often in clustered environments.

I have never bothered to try and get to the bottom of it; I simply do a $ SET VOLUME/REB=FORCE (via SYSMAN with a SET ENV/CLUSTER in place) to correct the free space counts.
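
For reference, that looks roughly like this (DISK$DATA1: below is just a placeholder device name):

$ ! DISK$DATA1: is a placeholder; substitute the real device
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO SET VOLUME DISK$DATA1:/REBUILD=FORCE
SYSMAN> EXIT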

Don't recall seeing it on standalone systems.

Malcolm
Jan van den Ende
Honored Contributor

Re: Weird disk free blocks increase after increasing lun size

Edgar,

Clustered environment, right?

In a cluster, each node has its own EXTENT CACHE. The contents of a node's own cache are (I think) counted as free blocks, but the other nodes' caches obviously are NOT.
I suspect SET VOLUME simply clears those caches (returning the space) so that the newly added space can start being used.

just my EUR 0.02

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
EdgarZamora_1
Respected Contributor

Re: Weird disk free blocks increase after increasing lun size

Thank you for all the responses. This system is a standalone system. This morning I did a show dev/rebuild (it indicated no rebuild was needed) and a set vol/rebuild=force. It's still reporting the "large" free space amount. I also did an anal/disk and it did report some free space drift, but not by a large amount (see below). I'm going to assume the number I have now is correct and that the free space number prior to the size expansion was incorrect, as a few of you have suggested. Thanks for all the ideas. Appreciate it.

%ANALDISK-W-FREESPADRIFT, free block count of 176192224 is incorrect (RVN 1);
the correct value is 176191856
John Gillings
Honored Contributor

Re: Weird disk free blocks increase after increasing lun size

Edgar,

As you probably already know, but it's worth repeating often...

The "Free Space" on a disk in a multi user operating system is a very nebulous concept. Add caches and multiple cluster nodes and it gets even worse. At best it's an advisory estimate. It's all a matter of expectation. As long as you understand the limitations, and don't make any unrealistic assumptions or dependencies, you won't get burned.

The implementation on OpenVMS uses Lock Value Blocks (LVB) to communicate changes in disk usage. LVBs are not 100% reliable. Normally any algorithm which uses LVBs should have an alternate mechanism to determine the "true" value, in case an LVB is lost by the lock manager. This is not a bug. It's a fundamental consequence of the design.

For disk free space the alternate is ANALYZE/DISK and SET VOLUME/REBUILD[=FORCE]. Note that =FORCE is a relatively new feature, specifically added because free space drift was a fairly hot issue in customer support centres.

Drift isn't necessarily gradual, nor is it proportional to anything in particular. It can be subtly or spectacularly wrong.

For code, don't even bother looking at the reported disk free space. If you want disk space, ask for it. If it's available, you will get it regardless of the reported availability; handle the error if the request is refused.
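
A trivial DCL illustration of that idea (the device, directory, file name and size below are all made up for the example):

$ ! Ask for the space and handle a refusal; names and sizes are illustrative only
$ SET NOON
$ COPY/ALLOCATION=100000 NL: DISK$DATA1:[TEMP]SCRATCH.TMP
$ IF .NOT. $STATUS THEN WRITE SYS$OUTPUT "Allocation refused - handle it here"
$ SET ON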

For interactive use, if the disk space looks wrong, it probably is. Perform a disk analysis or a rebuild.
A crucible of informative mistakes
GuentherF
Trusted Contributor

Re: Weird disk free blocks increase after increasing lun size

To add to what John said...

On each VMS node, for each mounted volume, the free block count is kept in memory (in the Lock Value Block of the volume lock) for quick and easy display by the DCL command "SHOW DEVICE". Only one node in the cluster has the most recent value, which the other nodes can obtain. If that node 'resets' (crash, power failure), an older value from a surviving node is then used...and that is stale. So it is not actually a 'drift' but a plain wrong value. Because it is only used to display free blocks and never used by the file system, it is a (nasty) cosmetic error. Only a DCL "SET VOLUME/REBUILD=FORCE" or a cluster-wide re-mount of the volume corrects it. Thus the stale value can hang around for a long time...unrecognized.

Second, cluster or not, each time a volume is mounted a certain number of free blocks (usually about 10% of the free space) is marked allocated on disk (in [000000]BITMAP.SYS) and the list of those blocks is kept in memory. Every node that mounts the volume does this. If any one node dies, its pre-allocated blocks remain marked allocated. Only a "MOUNT/REBUILD" or a "SET VOLUME/REBUILD" recovers these blocks. Again, this situation can persist for quite some time...unrecognized.

It is good practice to run a DCL "ANALYZE/DISK/LOCK" (don't forget the /LOCK!!! otherwise the report is useless) on a regular basis and take a look at the report.
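
In DCL terms that check, plus the recovery if anything is reported, would look something like this (DISK$DATA1: is only an example device name):

$ ANALYZE/DISK_STRUCTURE/LOCK DISK$DATA1:   ! DISK$DATA1: assumed; use the real device
$ SET VOLUME DISK$DATA1:/REBUILD=FORCE      ! if drift or lost pre-allocated blocks were reported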

/Guenther

(Btw. there is a very easy fix to handle the stale value in the LVB, but it would have required locking the volume until the bitmap on disk had been scanned and the correct free block count obtained. This would be triggered by an invalid LVB (currently ignored in the code) when a node in the cluster "resets". The "lock volume" would have forced ALL nodes in the cluster to flush their file system caches to disk, adding more delays during a cluster transition. But it would have fixed the "free block count drift" once and for all.)