Operating System - HP-UX

Re: Use of HFS inode cache

 
Greg Nowicki
Occasional Contributor

Use of HFS inode cache

On systems with /stand being the only HFS file system, what else is in the inode cache? sar -v reports many more inodes cached than there are in use on /stand. I am trying to figure out how low ninode can be set.
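For reference, the two numbers I am comparing are the inod-sz column from sar -v and the in-use inode count from bdf -i:

sar -v 1 1        # inod-sz column: inodes currently in the cache / ninode
bdf -i /stand     # iused column: inodes actually allocated on /stand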
Steven E. Protter
Exalted Contributor

Re: Use of HFS inode cache

/stand isn't exactly a busy filesystem. The kernel is loaded from there at boot, and there is some activity when patching and when you change the kernel.

I don't think there is a big payoff in lowering that ninode figure, though: if you lower it and an inode the system needs is not cached, then the disk will be read.

This is an excellent performance tuning document, though a more modern version may be hiding out there:

http://www1.itrc.hp.com/service/cki/docDisplay.do?docLocale=en_US&docId=200000077186712

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Sudeesh
Respected Contributor

Re: Use of HFS inode cache

If /stand is your only HFS file system, then ninode can be configured very low (maybe 200). If you configure a small HFS inode cache, be sure the DNLC (Directory Name Lookup Cache) is configured appropriately for other file system types.
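A minimal sketch of what that change would look like with kmtune on 11.0/11.11 (200 is just the illustrative figure above; ninode is a static tunable, so a kernel rebuild and reboot are needed):

kmtune -q ninode       # show the current value
kmtune -s ninode=200   # stage the new value for the next kernel build
# then rebuild the kernel with mk_kernel and reboot for the change to take effect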

The ninode tunable used to configure the HFS inode cache size can also impact the size of the Directory Name Lookup Cache (DNLC).

The dependency on the ninode tunable is reduced with the introduction of two tunables:

ncsize - introduced with PHKL_18335 on 10.20; determines the size of the Directory Name Lookup Cache independent of ninode.
vx_ncsize - introduced in 11.0; used with ncsize to determine the overall size of the Directory Name Lookup Cache.

While you can tune ncsize independently of ninode, its default value is still dependent on ninode and is calculated as follows:

ncsize = (NINODE + VX_NCSIZE) + (8 * DNLC_HASH_LOCKS)
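As a concrete (purely illustrative) example of that formula, and how you might check the inputs with kmtune on 11.x:

kmtune -q ninode
kmtune -q ncsize
kmtune -q vx_ncsize

If, say, ninode=400, vx_ncsize=1024 and dnlc_hash_locks=16 (made-up numbers, not defaults), the default would work out to ncsize = (400 + 1024) + (8 * 16) = 1552.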

Beginning with JFS 3.5 on 11.11, the DNLC entries for JFS files are maintained in a separate JFS DNLC.

Have a look at the following doc for details:
http://docs.hp.com/en/5580/Misconfigured_Resources.pdf

Sudeesh
The most predictable thing in life is its unpredictability
Bill Hassell
Honored Contributor

Re: Use of HFS inode cache

The inode cache sized by ninode is (for historical reasons) linked to other parameters, which makes sizing a lot more complicated. If you do not have NFS on this system, you can lower it to just a few hundred. Neither sar -v nor glance gives an accurate picture of the empty slots in the inode cache. The inode cache 'remembers' or caches current and recently opened files (not the data, just the location), and its purpose is to eliminate unnecessary directory searches by using an in-memory list. The likelihood of opening the same file several times during the day is high, especially for system files, hence the value of having the inode cache.

The formula provided by SAM templates is way too high for most systems. Even for NFS, you can leave ninode between 1000 and 4000. It does not need to be larger unless the majority of your filesystems are HFS rather than VxFS. To find the filesystem type:

mount -p | awk '{print $3"\t",$2}'
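And a quick way to tally how many mounted filesystems you have of each type (same output, just summarized):

mount -p | awk '{print $3}' | sort | uniq -c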

Note that sar and glance will report the cache as full or nearly full most of the time. This is not meaningful because the report does not indicate how many entries can be reused, and there is no metric to determine this number. So leave ninode at a fixed number, certainly not the 10000 or 40000 the formula might specify. The extra-large value just wastes RAM inside the kernel.


Bill Hassell, sysadmin
Greg Nowicki
Occasional Contributor

Re: Use of HFS inode cache

I am familiar with the excellent documents referred to by Steven and Sudeesh. I also understand that, as Bill pointed out, it functions as a cache and will hold closed files. But I cannot reconcile how sar can report 1500 inodes cached when there are fewer than 100 inodes in use on /stand.
Bill Hassell
Honored Contributor
Solution

Re: Use of HFS inode cache

Ahhh, the answer lies in the fact that the inode cache is not one entry per file. If you read a file that is more than 8k in size (the HFS block size), then each additional 8k needs another inode. So you can roughly estimate the number of inodes (not from bdf -i) by dividing the Kbytes used by 8 (bdf reports Kbytes, so divide by 8 for 8k inodes). This isn't very accurate, but a typical /stand with fewer than 100 files is about 50 megs in size, so 50000/8 = 6250. How did all those 6,000-odd inodes get accessed? When was your last backup? How about a recursive grep? Once the files have been read (which is what a backup does), all the inodes for each file are cached and stay in the cache forever, until new inodes are accessed and the old entries are replaced.
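If you want to reproduce that back-of-the-envelope estimate, something like this should be close (assuming bdf does not wrap its output onto two lines; the third field is Kbytes used):

bdf /stand | awk 'NR == 2 { printf "%d Kbytes used -> roughly %d 8k blocks\n", $3, $3/8 }'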

So if you set ninode = 8000 on a typical 11.11 system, it won't get fully used (unless you create a bunch of extra vmunix files).

As a practical matter, I would set it to 1000 and forget it unless you need to tune NFS--then refer to the NFS for HP-UX book by Dave Olker.


Bill Hassell, sysadmin
Sameer_Nirmal
Honored Contributor

Re: Use of HFS inode cache

I assume that you have JFS 3.3 installed.

sar -v shows the system-wide inode cache. This will include both HFS and VxFS inode entries.

I believe the reason is the ninode parameter, which still comes into the picture for both HFS and VxFS in the DNLC calculation. Thus the DNLC is common to HFS and VxFS, even though each has its own inode cache.

In JFS 3.5, a separate DNLC is maintained for VxFS files. I would guess that even there, the sar -v report still shows the system-wide inode-cache figures.

You can check the current number of inodes in the JFS inode cache as follows.

For JFS 3.3, read the vx_cur_inodes counter from the running kernel with adb (the second line is typed at the adb prompt):
adb -k /stand/vmunix /dev/mem
vx_cur_inodes/D

For JFS 3.5:
vxfsstat -v / | grep curino

Greg Nowicki
Occasional Contributor

Re: Use of HFS inode cache

Sameer, I believe you are incorrect about 'sar -v' reporting on both HFS and JFS inodes. The maximum number corresponds to the ninode value, which only controls the HFS inode table.

Bill, if you are right about the actual inode usage being greater than what bdf reports, that would explain the discrepancy. But a search did not turn up anything to back up your assertion. I know that large files exceed the capacity of their initial inode and must use continuation inodes, but I did not think another inode was needed for every 8KB. Your formula would give rather uniform inode usage, but my observation has been that far fewer inodes are needed for a small number of large files.
Bill Hassell
Honored Contributor

Re: Use of HFS inode cache

A bit more explanation on HFS (only) inode usage. When an HFS filesystem is first created, all the inodes that will ever exist are created too. That's one of the reasons the bdf -i option exists. newfs for HFS has a -i option to control inode creation, expressed in bytes per inode. The default is 6144 bytes per inode and is used only to control the quantity of inodes created. For a disk with very few but very large files, bytes per inode might be increased to 5x or 10x that number so fewer inodes will be created (more on the quantity later).
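For illustration, a sketch of how you would apply that at filesystem creation time (the logical volume name here is made up, and 61440 is simply 10x the default bytes-per-inode):

newfs -F hfs -i 61440 /dev/vg00/rlvol9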

At the same time, the filesystem block size (not to be confused with the LVM block size) is established. The default is 8K and the range is 4K through 64K. If a file is smaller than 8K, inode pointers are conserved by assigning it a fragment of a block (typically 1K).

Inside the inode there are 15 pointers. The first 12 point directly to blocks of the file, but the 13th pointer is an indirect pointer, pointing to a special pointer-only block containing 2048 pointers (assuming all HFS defaults, specifically an 8K block size). Once this pointer block has been used up for a large file, the second-level indirect (the 14th slot in the inode) points to a pointer block containing 2048 pointers to more pointer blocks (2048*2048 data pointers). And for really big files, the 15th slot is a third-level indirect: it points to a pointer block whose 2048 entries each point to a pointer block of 2048 entries, each of which points to 2048 8K data blocks (2048*2048*2048 data pointers in all). So the largest file in HFS can be FS_BLOCKSIZE * (12 + 2048 + 2048^2 + 2048^3). That theoretical maximum file size is larger than the largest supported HFS filesystem.
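To put a number on that formula, a quick illustrative calculation with the default 8K block size (bc is just being used as a calculator here):

# 12 direct blocks + 2048 single-indirect + 2048^2 double-indirect + 2048^3 triple-indirect,
# each block being 8 Kbytes
echo '8 * (12 + 2048 + 2048^2 + 2048^3)' | bc

That prints 68753047648 Kbytes, i.e. roughly 64 TB in theory, which is indeed far larger than any HFS filesystem HP-UX actually supports.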

Here's a really useful document from 10 years ago: http://uwsg.ucs.indiana.edu/usail/peripherals/disks/adding/HPfs.html

So technically, there are fewer than 100 active inodes in /stand. I believe (but don't have any docs handy) that the pointers are also loaded into the cache as each file is accessed, which is what fills the cache when backing up /stand. The inode and the pointers remain in the cache (even if the file is no longer open) forever, or until a new file is opened and the entries are reused. The purpose of the inode cache is to avoid going to the disk to discover the address of each block of a file.


Bill Hassell, sysadmin