Optimum filesize filesystem size for large databases

SOLVED
Ed Sampson
Frequent Advisor

What is the optimum file size and filesystem size for a large (1.2 TB) database? The OS is HP-UX 11.00 64-bit, with Oracle 32-bit (8.0.6) and EMC disk storage on FC connections, using VxFS filesystems.
5 REPLIES
Alexander M. Ermes
Honored Contributor

Re: Optimum filesize filesystem size for large databases

Hi there.
We have a V-class and we use an EMC array as well. We make each filesystem the same size as the physical disk. Do not forget to set the large-files option at the Unix level; then you should be able to set up file sizes as you like.
What we do here is use file sizes of 4 GB per file. Reason: if you want to copy these files to other systems, 4 GB is a size that should not give problems when copying to tape and the like.
If you do an export, be careful with streaming the data into zipped files. If the export file is larger than 2 GB, you have to import it with this command:
cat file | gunzip > .....
So far so good.
Hope it helps.
Rgds
Alexander M. Ermes
.. and all these memories are going to vanish like tears in the rain! final words from Rutger Hauer in "Blade Runner"
Ed Sampson
Frequent Advisor

Re: Optimum filesize filesystem size for large databases

My question was more of a generic performance question about the optimum file size for large files with VxFS and HP-UX 11.00. Does VxFS do indirect indexing of data blocks beyond a certain size, like most other Unix filesystems?
Where are these performance points? If I tell them that the optimum size is 4 GB (which may be true), that would mean they need 300 files to hold 1.2 TB of data. I don't think they want to deal with 300 segments for a database.
Where does double indirect indexing start? What about triple indirect indexing? Are there any performance losses for using 150 logical volumes in a volume group?
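
To put numbers on the file count above, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 GB = 10^9 bytes, 1.2 TB = 1200 GB), which is what the figure of 300 files implies; the function name and the 32 GB and 100 GB comparison sizes are only illustrative.

```python
GB = 10**9  # decimal gigabyte, an assumption made to match the 300-file figure


def files_needed(db_bytes: int, file_bytes: int) -> int:
    """Number of data files of file_bytes each needed to hold db_bytes."""
    if file_bytes <= 0:
        raise ValueError("file size must be positive")
    # Integer ceiling division: a partly filled last file still counts.
    return -(-db_bytes // file_bytes)


db_size = 1200 * GB  # the 1.2 TB database from the question

for size_gb in (4, 32, 100):
    print(size_gb, "GB files:", files_needed(db_size, size_gb * GB))
```

With 4 GB files this gives the 300 segments mentioned above; larger files bring the count down quickly.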
Bruce Regittko_1
Esteemed Contributor

Re: Optimum filesize filesystem size for large databases

Hi,

There should be no significant performance penalties for having 150 logical volumes in a volume group as long as the physical extents don't get fragmented. You should be able to improve performance somewhat by increasing the extent size.

--Bruce
www.stratech.com/training
Trevor Dyson
Trusted Contributor
Solution

Re: Optimum filesize filesystem size for large databases

Hi Ed,

HP's Performance and tuning course may help with understanding the JFS structure.

As regards levels of indirection, I believe JFS has much greater efficiency than HFS. This is mainly because of the extent-based allocation of blocks to a file. A JFS extent is just a number of contiguous blocks on the filesystem. Each file has one or more extents, and each extent is listed in the inode, pretty much like in HFS. The difference is that the inode lists the beginning block number of the extent and the number of contiguous blocks that are in the extent. If the file's data blocks are contiguous on the filesystem then this addressing method is very efficient. If the file is fragmented then there will be many extents and accordingly there will be many extent records in the inode. Once the number of extents exceeds a certain value (I don't know what it is), indirect addressing of the extent records will begin. If you can keep your files contiguous then JFS will be very efficient when addressing data blocks in very large files.
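
The extent idea can be sketched in a few lines of Python. This is a simplified model for illustration only, not the real JFS/VxFS on-disk layout: the `Extent` record, `build_extents`, and `lookup` are hypothetical names, and indirect extent records are left out entirely.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Tuple


class Extent(NamedTuple):
    logical_start: int  # first file-relative block covered by this extent
    disk_start: int     # first filesystem block of the extent
    length: int         # number of contiguous blocks in the extent


def build_extents(runs: List[Tuple[int, int]]) -> List[Extent]:
    """Build an extent list from (disk_start, length) runs in file order."""
    extents, logical = [], 0
    for disk_start, length in runs:
        extents.append(Extent(logical, disk_start, length))
        logical += length
    return extents


def lookup(extents: List[Extent], logical_block: int) -> int:
    """Translate a file-relative block number to a filesystem block number."""
    starts = [e.logical_start for e in extents]
    i = bisect_right(starts, logical_block) - 1
    if i < 0:
        raise IndexError("negative block number")
    e = extents[i]
    offset = logical_block - e.logical_start
    if offset >= e.length:
        raise IndexError("block is beyond the end of the file")
    return e.disk_start + offset


# A contiguous file needs one record no matter how large it is.
contiguous = build_extents([(1000, 500000)])
# A fragmented file needs one record per fragment.
fragmented = build_extents([(1000, 10), (5000, 20), (200, 5)])
```

In this model the cost of addressing depends on the number of extent records, not on the file size, which is why keeping files contiguous matters more than keeping them small.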

To keep your files contiguous you will need to buy and use Online JFS from HP (unless you already have it). It has a defragmentation utility. It also has a tool to preallocate contiguous space to a file so that it does not become fragmented. And you get a set of steak knives... no, hang on... it has additional mount options, for example to bypass the Unix buffer cache when the DB writes to disk (Oracle's SGA already buffers writes to disk, so the Unix buffer cache is a waste of resources for Oracle data files).

For some more specifics I have seen previous posts by Bill Hassel on managing very large files and file systems. You may wish to do a forum search on his name.

I hope this helps.

Regards, Trevor
I've got a little black book with me poems in
Carol Garrett
Trusted Contributor

Re: Optimum filesize filesystem size for large databases


This is a very interesting question. We have some massive databases here, with some db file sizes over several hundred gigabytes. We don't notice any performance differences between those and smaller databases. The only problem is admin/backup. Larger files are much harder to back up. Let's face it, who has a single tape device that can back up more than around 100 GB on one tape (DLT8000)? We broke some of our large database files into smaller 100 GB ones, so now with several DLT8000s we can back up all these files at the same time and at least get the db backed up in a pretty decent time. With one big db file over 1 TB we couldn't use multiple backup drives to back it up.
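
A rough Python sketch of why the split helps. The assumptions are mine and deliberately simple: each drive backs up one whole file at a time, a single file cannot be shared between drives, and the 1200 GB total and four drives are just example figures. The helper names are hypothetical.

```python
def pieces(total_gb: int, piece_gb: int) -> int:
    """Number of files after splitting total_gb into piece_gb chunks."""
    return -(-total_gb // piece_gb)  # integer ceiling division


def drives_busy(n_files: int, n_drives: int) -> int:
    """Drives that can work at once when one file cannot span drives."""
    return min(n_files, n_drives)


def backup_rounds(n_files: int, n_drives: int) -> int:
    """Passes needed if every drive takes one file per pass."""
    return -(-n_files // n_drives)


n_drives = 4                      # example drive count, not from the post
split_files = pieces(1200, 100)   # 1200 GB in 100 GB files
print("one big file: drives busy =", drives_busy(1, n_drives))
print("split files:  drives busy =", drives_busy(split_files, n_drives),
      "rounds =", backup_rounds(split_files, n_drives))
```

With one big file only a single drive is ever busy; with the split files all of the drives work in parallel.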

For performance, the main thing for Oracle is to match the filesystem block size with the block size set in Oracle. JFS is 8k by default, so Oracle should match it. But for bigger db files, set the block size higher. We've gone up to 64k and performance is slightly better. Also use the JFS direct options as someone else mentioned; these let I/O bypass the Unix buffer cache, and should be used for db files only (not indexes or logs). The Kernel performance tuning manual also has something to say about performance:

To improve disk I/O performance:

Distribute the work load across multiple disks. Disk I/O performance can be improved by splitting the work load. In many configurations, a single drive must handle operating system access, swap, and data file access simultaneously.
If these different tasks can be distributed across multiple disks then the job can be shared, providing subsequent performance improvements. For example, a system might be configured with four logical volumes, spread across more than one physical volume. The HP-UX operating system could exist on one volume, the application on a second volume, swap space interleaved across all local disk drives and data files on a fourth volume.

Split swap space across two or more disk volumes. Device swap space can be distributed across disk volumes and interleaved. This will improve performance if your system starts paging.

Enable Asynchronous I/O - By default, HP-UX uses synchronous disk I/O, when writing file system "meta structures" (super block, directory blocks, inodes, etc.) to disk. This means that any file system activity of this type must complete to the disk before the program is allowed to continue; the process does not regain control until completion of the physical I/O. When HP-UX writes to disk asynchronously, I/O is scheduled at some later time and the process regains control immediately, without waiting.

Synchronous writes of the meta structures ensure file system integrity in case of system crash, but this kind of disk writing also impedes system performance. Run-time performance increases significantly (up to roughly ten percent) on I/O intensive applications when all disk writes occur asynchronously; little effect is seen for compute-bound processes. Benchmarks have shown that load times for large files can be improved by as much as 20% using asynchronous I/O. However, if a system using asynchronous disk writes of meta structures crashes, recovery might require system administrator intervention using fsck, and might also cause data loss. You must determine whether the improved performance is worth the slight risk of data loss in the event of a system crash. A UPS device, used in a power failure event, will help reduce the risk of lost data.

Asynchronous writing of the file system meta structures is enabled by setting the value of the kernel parameter fs_async to 1 and disabled by setting it to 0, the default.