buffer cache and I/O

HP-UX Enthusiast · ‎09-28-2005

We are running Oracle on a system with 36GB of memory. We have recently noticed slow I/O from within Oracle.

I checked the buffer cache hit and it's 52%. I understand that the cache hit % should be above 90% on a nice and healthy system.

dbc max is 8 and min is 5, which I understand are the kernel defaults, so these need to be changed in some way since I have read that the default for these parameters is not suitable on high RAM systems.

My question is what should I reduce these values to? I understand that the sweet spot for Oracle is 300MB, but that between 700 and 900 is OK. If dbc_max_pct is 8, does that mean the maximum buffer cache could be 2.88GB? Which is far too high?

Secondly will getting the buffer cache down to a reasonable size affect I/O in any way (i.e for better or worse).

If this is of any use: total memory usage is constantly between 92-96%, with system at 12% and user the remainder.

Thanks!

Jean-Luc Oudart · ‎09-28-2005

Hi,

5% of 36GB is 1.8GB seems quite big to me for buffer cache
On HPUX 11.0 we fixed it to ~300Mb. I believe this should be higher with HPUX 11i

Which version of OS do you run ?
Do you have OnlineJFS ? how do you mount the Oracle database fs (unless you use raw) ?

Do you have (Oracle) statspack reports ?

Regards
Jean-Luc

fiat lux

Sยภเl Kย๓คг · ‎09-28-2005

Try with this settings

dbc_max_pct 5 - 5
dbc_min_pct 5 - 5

Regards,
SK

Your imagination is the preview of your life's coming attractions

Jean-Luc Oudart · ‎09-28-2005

Hi again

find attached the HP-UX Performance Cookbook (in case you don't have it).

Regards
Jean-Luc

fiat lux

RAC_1 · ‎09-28-2005

In your case, it is set at a high value. (As seen from buffer cache hits.)

you can certainly bring it down. min 2% - high 5 % would suffice. (I personally never set it over 800Mb)
Again monitoe usage with sar -b 2 5 and check 5 write cache rate. Should be around 90 % or more.

There is no substitute to HARDWORK

saju_2 · ‎09-28-2005

Hi

Pls find the attached document which covers about the miscofigured parameters..It has chapeter on Buffer cache.

Hope this helps.

Regards
CS

James R. Ferguson · ‎09-28-2005

Hi:

I think you could lower your buffer cache to perhaps 3-5% (min-max). More importantly, leverage *by-passing* double buffering since Oracle, by design, does its own buffering. Make sure that your filesystems (assuming that you are them) are mounted with optimail VxFS (JFS) options: delaylog, nodetainlog, mincache=direct & convosync=direct.

Regards!

...JRF...

Geoff Wild · ‎09-28-2005

Oracle has it's own buffer cache - so it won't use HP's...

That said, with 36GB of ram - you don't need 2.8GB of cache...

I like the 500-800 MB of cache range,,,

I would set top 2 for both dbc_max_pct and dbc_min_pct.

If you are running 11iv2 - changes are dynamic - anything earlier - requires a reboot.

Rgds...Geoff

Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.

Eric Antunes · ‎09-28-2005

Hi,

From Oracle perspective (since you already have HP-UX perspective):

- check this:

select (bbc.total_waits*100/(cg.value+dbg.value)) "Buff busy ratio ind. < 0,007"
from v$system_event bbc,
v$sysstat cg, v$sysstat dbg
where bbc.event='buffer busy waits'
and cg.name ='consistent gets'
and dbg.name='db block gets'

select round(((1-(sum(decode(name, 'physical reads', value,0)) /
(sum(decode(name,'db block gets', value,0))+
(sum(decode(name, 'consistent gets', value, 0))))))*100),2)
|| '%' "Buffer Cache Hit Ratio > 90%"
from v$sysstat;

- what are your db_block_size, db_block_buffers, log_checkpoint_interval (should be twice redolog size) and log_checkpoint_timeout?

- Use a different tablespace for indexes and tables for each module of the application

- avoid fragmentation of tablespaces:

See the fragmentation in Oracle Tablespace Manager or trought this query:

select file_id, block_id, blocks,
owner||'.'||segment_name "Name"
from sys.dba_extents
where tablespace_name = upper(:p_tablespace_name)
UNION
select file_id, block_id, blocks,
'Free'
from sys.dba_free_space
where tablespace_name = upper(:p_tablespace_name)
order by 1,2,3

Rebuild the highly fragmented indexes:

alter index . rebuild
storage( initial next ); -- INITIAL_SIZE and NEXT_SIZE can be 8M or 8K for example

PS: if you don't have Online JFS use options delaylog, nodatainlog for data logical volumes and delaylog,datainlog for binaries.

Best Regards,

Eric Antunes

Each and every day is a good day to learn.

HP-UX Enthusiast · ‎09-28-2005

Thanks for all the help and attachments.

I am running 11i V1.

I will reduce the max and min, perhaps to 3 initially and then to 2.

James - thanks for those mount options, I read the man page for mount_vxfs but it's not exactly easy to understand, can you tell me what the mincache and convosync options do when set to direct? Does it somehow cause reads/writes to bypass the buffer cache altogether, in which case is that safe?

Jean-Luc Oudart · ‎09-28-2005

Hi

part of the answer is in the document attached in previous post.

also from HP3KOracle (Sandy Gruver
Senior Technical Consultant HP/Oracle Advanced TechnologyÂ Center ):
===========================================
To gain the most performance from the JFS file system, the Online JFS product should also be implemented. This will allow the use of enhanced mount options that optimize the interaction of Oracle and HP-UX. These options include:
delaylog â this option allows the file system to delay the writing on non-critical information to the JFS intent log. This improves the performance of the file system by allowing some I/O operations to return before this information is put into the intent log on the disk. An example would be when an I/O operation only entails changing the timestamp on the file. In this case, the contents and structure of the file would not be compromised if a crash were to occur before the intent log information made it to the disk.
nodatainlog â Oracle always uses O_SYNC writes, this option will avoid JFS writing the data to the intent log as well as the file.
mincache=direct â The default read operation for JFS copies data from disk to the HP-UX buffer cache, and then copies data to the Oracle SGA. Setting this mount options causes the data to be moved directly into the Oracle SGA; this may provide a minor improvement in the performance for non-sequential read operations. In 8.x versions of Oracle, this mount option will cause unnecessary physical I/O for sequential I/Os. This mount option should NOT BE USED with Oracle 8.x tablespace files, however it is recommended for Oracle 8.x redo and archive file systems.
convosync=direct â This option changes the behavior of files opened with the Osync flag enable, which Oracle always uses. This will enable Osync I/O operations to operate the same as non-osync file operations and thus use the mincache=direct mount option. In 8.x versions of Oracle, this mount option will cause unnecessary physical I/O for sequential I/Os. This mount option should NOT BE USED with Oracle 8.x tablespace files, however it is recommended for Oracle 8.x redo and archive file systems.
=============================================
I suppose it means you mus know if your data access os mosttly random or sequential to choose the right parameter.

Regards
Jean-Luc

fiat lux

Jean-Luc Oudart · ‎09-28-2005

Also if you have acces to Metalink (yous hould if you have Oracle support) :
check Metalink doc id 10668.996 (forum thread) : Mount file system using -o mincache=direct option

Regards
Jean-Luc

fiat lux

A. Clay Stephenson · ‎09-28-2005

Under 11.11 (unlike previous versions of HP-UX), I've actually found (and measured) the best Oracle performance using fully cooked files (i.e not using the OnlineJFS mount options convosync=direct,mincache=direct to bypass the buffer cache). Moreover, on systems with large amounts of memory, fixing buffer cache at 800-1600MB by setting bufpages to a non-zero value tends to be the optimum value. I wouldn't be concerned about a 52% cache hit rate on boxes mainly doing random i/o. Because, your overall memory usage is always high what may really be happening is that you are sometimes actually paging out. Because the hit you take from pageout's is on the order of 100X worse than that of having to go to disk rather than buffer (whether Oracle SGA or UNIX) cache, you might actually see better performance by reducing the size of the SGA. If you see any significant pageout rates, insist that the DBA's reduce their SGA footprint despite what they say because again, swapping is far, far worse than extra Oracle trips to disk.

If it ain't broke, I can fix that.

Jean-Luc Oudart · ‎09-28-2005

and Clay reminded me I had a thread on this subject (mincache=direct) 2 years ago.

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=193104

Regards
Jean-Luc

fiat lux

HP-UX Enthusiast · ‎09-28-2005

Thanks Jean.

Hi Clay - after what you said I had a look a the pageout rate from my historical MWA data.

Most of the time it is 0.

However during periods of the day lasting not more than 10 minutes the pageout rate rockets to between 2000 and 20000. This may occur about 8 times, but not more than 10.

Should I throw more memory at the system?

I'm a bit reluctant since if I tie it in with logical and physical IO rate then the moment at which the system is paging out, the those I/O rates don't really increase noticeably.

Hence I am thinking the I/O problem is not related to paging.

Fundamentally the problem is that we have had a whole new load of users to SAP which may or may not being using the disk more - and I initially wanted to know if decreasing the buffer cache from 2.8GB to around at least 1GB is going to result in better I/O performance because the disk would be visited less.

What do you think?

Eric Antunes · ‎09-28-2005

Complementing Clay's post, tell your DBA's to read metalink note 100709.1 (this is about Oracle 8 but they may find a note for Oracle 9).

Best Regards,

Eric Antunes

Each and every day is a good day to learn.

Eric Antunes · ‎09-28-2005

"However during periods of the day lasting not more than 10 minutes the pageout rate rockets to between 2000 and 20000. This may occur about 8 times, but not more than 10."

Do you have RAW devices? If no, are the DBA's trying to use async_io??

Best Regards,

Eric Antunes

Each and every day is a good day to learn.

HP-UX Enthusiast · ‎09-28-2005

Eric
No all the datafiles are located on VxFS filesystems.

Not sure if they are using async_io - should they be? If not, why? (so I can go and tell them to check and possibly change)

Thanks!

Eric Antunes · ‎09-28-2005

If you use VxFS filesystems, your DBA's
must have the init.ora disk_asynch_io = false. Otherwise you will have those I/O peaks!

There is a metalink note or bug about that but I don't remember the number...

Best Regards,

Eric Antunes

Each and every day is a good day to learn.

A. Clay Stephenson · ‎09-28-2005

If you are seeing pageout rates between 2K/s and 20K/s for 10 minutes then your performance has got to be terrible. You definitely have a memory bottleneck. You need to either reduce load or add more memory. I would reduce buffer cache to about 1200MB by setting bufpages to 307200. On systems with large amounts of memory, it's much easier to use fixed buffer caches so that the dbc_xxx_pct values don't have to be adjusted each time memory is changed. I don't expect to see much improvement by making such a minor tuning change because 1GB or so of memory is not that much in your scheme of things. Because this is Oracle, I would check to make sure that timeslice has not been set to 1 (rather than a value very near 10).

If it ain't broke, I can fix that.

Jean-Luc Oudart · ‎09-28-2005

Currently async IOs only supported for raw device.

Regards
Jean-Luc

fiat lux

Eric Antunes · ‎09-28-2005

It's Metalink Note 1016262.102 (Excessive CPU Usage by Checkpoint Process)

See also this previous thread: http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=836925

Eric

Each and every day is a good day to learn.

Jeff Lightner_1 · ‎09-29-2005

So far as I know async I/O device only comes into place when one is using raw devices rather than filesystems. You should ignore it if you're using vxfs as you say.

James Ferguson gave you the answer for Oracle on filesystems. You need to disable those filesystems that contain oracle datafiles from using OS buffer cache at all. This is done with the convosync and mincache directives in fstab. Once we enabled these some time back we saw performance improvement. An example:

/dev/vx/dsk/vxconv/convdata /database/conv2data vxfs mincache=direct,convosync=direct,delaylog,suid 0 3

The above all being one line of course.

The reason this helps is that Oracle has its own buffering mechanism using its SGA. If you don't disable OS buffer cache for Oracle filesystems then it tries to double buffer so that the OS buffering is actually impeding the Oracle buffering.

Note that you should only do this for the filesystems that contain datafiles (databases, redologs etc...) but not the ones that contain regular files (archive logs, Oracle binaries/executables).

Categories

Company

Local Language

Forums

Discussions

Forums

Discussions

Forums

Discussions

Forums

Discussions

Forums

Discussions

Discussions

Forums

Forums

Discussions

Forums

Discussions

Forums

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Discussion Boards

Community

Resources

Other HPE Sites

Discussions

Forums

Blogs

buffer cache and I/O

buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O

Re: buffer cache and I/O