Operating System - HP-UX

Cranking up "ninode" to improve backup performance

 
Steve Bonds
Trusted Contributor

Cranking up "ninode" to improve backup performance

We have several filesystems, each with over 500,000 files. This can be a real CPU-burner when running backups, or when someone decides this would be a great place to run a "find" command.

Would boosting "ninode" help in this situation? How big can this go? Could I get it big enough to cache several million inodes?

I would think that, given the infrequent and sequential nature of most backups and find commands, boosting the inode cache would only help if ALL the inodes were cached; otherwise any decent least-recently-used algorithm is going to toss out the cache entries before the next backup.
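
For reference, here is how I have been checking the current setting and the table usage on 11.0 - I'm quoting the sar column from memory, so treat this as a sketch (I believe inod-sz shows used/max entries):

  kmtune -q ninode
  sar -v 5 3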

What do you all think?
Stefan Farrelly
Honored Contributor

Re: Cranking up "ninode" to improve backup performance


I've certainly heard of servers with zillions of files having ninode set to a massively high number, so I don't see why you can't give it a go. It's simply a cache size - it's limited by physical RAM, so it can be as big as you want.

However, for backup performance I think you would gain more by adjusting some of your backup parameters - e.g. if using fbackup you can specify all sorts of things like blocks per record, block size, and the number of reader processes (the more the better, as these will buffer up data to keep the backup streaming fully). You can adjust this in Omniback too - the more readers the better. This should also isolate you a little from others doing things like running a find command and slowing down your backup.
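
For example, something along these lines in an fbackup configuration file - the values are purely illustrative and the file path is made up, so check fbackup(1M) for the exact keywords and limits on your release:

  # /etc/fbackup.cfg - illustrative values only
  blocksperrecord 256
  records 32
  checkpointfreq 256
  readerprocesses 6
  maxretries 5
  retrylimit 5
  maxvoluses 100
  filesperfsm 2000

Then point fbackup at it:

  fbackup -c /etc/fbackup.cfg -f /dev/rmt/0m -i /your/filesystem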
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Steve Bonds
Trusted Contributor

Re: Cranking up "ninode" to improve backup performance

Thanks for the reply, Stefan. The more traditional backup optimizations you mentioned would be great if we weren't read-constrained on these massively populated filesystems.

The "find" commands themselves aren't really a problem with backups, they're a problem in their own right. I would like to find (no pun intended) a way to make them run much more quickly and with much less CPU (filesystem I/O) overhead.

Do you know of any practical limits on the inode cache? I.e. could I have a 10GB inode cache if I wanted one and had the physical RAM for it? I should add that this would be for HP-UX 11.0.
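
(I assume that if it can go that big, the change itself is the usual tunable routine on 11.0 - from memory, something like the following, then a reboot into the new kernel; correct me if ninode is handled differently.)

  kmtune -s ninode=2000000
  mk_kernel
  kmupdate /stand/build/vmunix_test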
Stefan Farrelly
Honored Contributor

Re: Cranking up "ninode" to improve backup performance

Hi Steve,

There is a way to make finds go faster - take a look at this:

Bill Hassell wrote on May 20, 2002:

A higher buffer cache size (perhaps 500 megs) will help, but the interactive delays you are seeing (i.e., login) are largely due to sequential I/O such as copying files to another server. HP-UX gives preference to sequential I/O, so much so that it can severely delay random I/O such as a login request.

A new parameter has been introduced with recent patches: disksort_seconds. Note that the SAM context help and a section 5 man page are still missing, but the patch documentation has the details. Here is an excerpt from the 11.00 patch:

PHKL_21768:

The system sometimes takes a very long time to respond to a disk read/write request (could be up to several hundred seconds) while it is busy processing other I/O requests on the same disk, especially when there are sequential file accesses going on.

This is a fairness problem with the disk sort algorithm, which is used to reduce disk head movement. With this algorithm, all I/O requests with the same priority are queued in non-descending order of disk block number before being processed, if the queue is not empty. When requests come in faster than they can be processed, the queue becomes longer, and the time needed to perform one scan (from the smallest block number to the largest block number of the disk) can be very long in worst-case scenarios.

It is unfair to a request which came in early but has been continuously pushed back to the end of the queue because it has a large block number or just missed the current scan. These kinds of unlucky requests can wait in the queue for as long as the time needed to process a whole scan (which could take a few minutes). This situation usually happens when a process tries to access a disk while another process is performing sequential accesses to the same disk.

Resolution:

To prevent this problem from happening, we have to take the time aspect into consideration in the sorting algorithm. We add a time stamp to each request when it is enqueued, which is used as the second sorting key for the queue (1st key: process priority; 2nd key: enqueue time; 3rd key: block number). The granularity of the time stamp value is controlled by a new tunable, "disksort_seconds".

If we set "disksort_seconds" to N (N>0), then for all requests with the same priority, we can guarantee that any given request will be processed earlier than those which come in N seconds later. Within each N-second period (requests have the same time stamp), all requests are sorted in non-descending block number order. By choosing the right "disksort_seconds" value, we can balance the maximum waiting time of requests against the efficiency of disk accesses. The tunable can be set to 0, 1, 2, 4, 8, 16, 32, 64, 128 or 256 second(s). If "disksort_seconds" is 0 (the default value), the time stamp is disabled, which means the time aspect has no effect.
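
If you want to try it, check that the patch is installed and then set the tunable like any other kernel parameter. This is from memory, so verify against the patch documentation - and I believe a kernel rebuild and reboot are still needed on 11.0:

  swlist -l product | grep PHKL_21768
  kmtune -q disksort_seconds
  kmtune -s disksort_seconds=4
  mk_kernel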



I'm from Palmerston North, New Zealand, but somehow ended up in London...
Steve Bonds
Trusted Contributor

Re: Cranking up "ninode" to improve backup performance

I don't see how this would be applicable to the problem I'm having. I'm not seeing any problems with excessively delayed I/Os. Quite the contrary - most of them seem to complete quite quickly.

The I/O overhead is not a result of real disk I/O, but rather is "system" CPU time spent fetching all those inode entries out of our disk array's cache.

Each inode retrieval probably takes only about 5 ms, but when there are half a million of them, it adds up. My thought was that by caching all these inodes, that retrieval time could be reduced to about 100 ns. ;-)
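
Back of the envelope: 500,000 inodes x 5 ms each is about 2,500 seconds - roughly 42 minutes per pass just fetching metadata. At 100 ns each, the same pass would take around 0.05 seconds.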
Frank Slootweg
Honored Contributor

Re: Cranking up "ninode" to improve backup performance

ninode is only relevant for HFS filesystems, i.e. the inode table/cache (set by ninode) is not used for VxFS/JFS filesystems.
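
You can double-check which type each filesystem is: fstyp on the device should report hfs or vxfs (the lvol name below is just an example), or look at /etc/fstab:

  fstyp /dev/vg00/rlvol5
  grep -i vxfs /etc/fstab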

However, to be Frank :-), I don't know where VxFS/JFS inodes are cached. I *assume* in the normal File System Buffer Cache, but I don't know that for a fact.

Anyway, perhaps you should give some more details about your problem, because "a real CPU burner" indicates that the inodes *are* (mostly) in some kind of cache. If they are not in a cache, then it is "a real *disk* burner", not a CPU burner.