Operating System - HP-UX

Phil Corchary
Advisor

disk thrashing diagnosis

(HP K570 3GB RAM, HPUX 11.0, Oracle 7.3.4, Oracle Financials 10x, JFS 3.0, HP AutoRAID, all Oracle on vxfs not raw, 50GB database)

Via Glance and sar reports, I know that I have disk thrashing on this system.

In particular, I noticed today during a peak processing time that one file system showed very high LOGICAL r/w (mostly read), but virtually NO PHYSICAL r/w!

What would cause this? All other filesystems seem to have a 1-1 relationship between logical and physical r/w. I do know that this file system contains one of the main Oracle redo logs.

99% of the game is half mental. - Yogi Bera
Rick Garland
Honored Contributor

Re: disk thrashing diagnosis

Sounds as if Oracle is doing most of the work on this disk. Are the Oracle files spread out on different disks? For example, the Oracle index files on one disk, the data files on another disk, the redo logs on yet another disk, etc. You may need to do some load balancing and move things around.
Brian M. Fisher
Honored Contributor

Re: disk thrashing diagnosis

The buffer cache will cause logical reads instead of physical reads: a read that is satisfied from the cache is counted as logical I/O but never reaches the disk.

Brian
<*(((>< er
Perception IS Reality
Phil Corchary
Advisor

Re: disk thrashing diagnosis

Ok.

1) Rick: everything is on LVM. The Oracle files are spread out where possible across filesystems (e.g. datafile on one FS, index on another FS; some DBs cover 3-5 datafiles). I count 6 redo logs, each on a different FS. Between LVM and HP AutoRAID, I don't see a way to granularly control which datafiles (tablespaces) reside where to eliminate contention .... ?

2) Good point ... how do I determine if Oracle is caching, or HPUX?

TIA

99% of the game is half mental. - Yogi Bera
Phil Corchary
Advisor

Re: disk thrashing diagnosis

... and why only ONE FS showing the discrepancy between logical and physical I/O?
99% of the game is half mental. - Yogi Bera
Tim Malnati
Honored Contributor

Re: disk thrashing diagnosis

Having high LOGICAL utilization (vs. PHYSICAL) is NOT a bad thing in and of itself, and it is NOT an indication of disk thrashing either. In this case read requests are being satisfied by the buffer cache in system memory, which is really the best situation to be in for read activity. Every 30 seconds (by default) the system syncer process comes along, scans the buffer cache, and flushes write activity out to the physical disks. A high percentage of writes can cause problems, though. If your system feels like it periodically 'hiccups', this is a likely cause. In that case you would be better served by some hardware cache (that's part of the reason why disk arrays are becoming more popular). The syncer frequency can be modified, but I don't recommend it unless you know what you are doing. In many cases, reducing the buffer cache may be necessary in order to reduce the overall amount of write data that is hanging around.
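
To put a number on the logical vs. physical point, here is a minimal Python sketch. It assumes you have already pulled per-interval logical and physical read counts out of Glance or sar by hand; the function name and the sample counts are made up for illustration and are not taken from this system.

```python
def read_cache_hit_pct(logical_reads: float, physical_reads: float) -> float:
    """Percentage of logical reads that did not need a physical read."""
    if logical_reads <= 0:
        return 0.0  # no read activity in this interval
    hits = max(logical_reads - physical_reads, 0.0)
    return 100.0 * hits / logical_reads


# Hypothetical counts: (logical reads, physical reads) per interval.
samples = {
    "fs_with_cached_reads": (5000, 50),  # high logical, almost no physical
    "fs_one_to_one": (400, 400),         # every logical read hits the disk
}

for name, (logical, physical) in samples.items():
    pct = read_cache_hit_pct(logical, physical)
    print(f"{name}: {pct:.1f}% of reads served from cache")
```

A filesystem that shows a very high percentage here is being served from memory, which is the healthy case described above; the ones near zero are the ones actually loading the disks.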
Phil Corchary
Advisor

Re: disk thrashing diagnosis

Attached is the filtered output of sar -d, sampled every 10 seconds during a peak period. The filter basically looks for instances where the queue is long, or where the wait time greatly exceeds the service time for I/O.

Comments?

(Again, LVM, AutoRAID)
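
For anyone who wants to reproduce that kind of filter, a rough Python sketch follows. It is not the filter that produced the attachment. It assumes the usual sar -d column order (device, %busy, avque, r+w/s, blks/s, avwait, avserv) with an optional leading timestamp, and the thresholds (queue above 5, wait more than twice the service time) as well as the sample lines are arbitrary placeholders.

```python
from typing import Iterable, Iterator, NamedTuple


class DiskSample(NamedTuple):
    device: str
    busy_pct: float
    avque: float
    rw_per_s: float
    blks_per_s: float
    avwait_ms: float
    avserv_ms: float


def parse_sar_d(lines: Iterable[str]) -> Iterator[DiskSample]:
    """Yield one DiskSample per data line, skipping headers and blanks."""
    for line in lines:
        fields = line.split()
        if len(fields) not in (7, 8):  # 8 when a timestamp leads the line
            continue
        device, *numbers = fields[-7:]
        try:
            values = [float(n) for n in numbers]
        except ValueError:
            continue  # header line such as "device %busy avque ..."
        yield DiskSample(device, *values)


def is_suspect(s: DiskSample, max_queue: float = 5.0,
               wait_to_serv: float = 2.0) -> bool:
    """Flag a long queue, or a wait that greatly exceeds the service time."""
    return s.avque > max_queue or s.avwait_ms > wait_to_serv * s.avserv_ms


# Made-up sample lines in the assumed layout, for illustration only.
sample = [
    "12:00:10   device   %busy   avque   r+w/s  blks/s  avwait  avserv",
    "12:00:20   c3t6d0   92.00   14.20     130    2100   48.00    7.50",
    "           c4t6d0   35.00    0.50      40     640    4.00    6.00",
]

for s in parse_sar_d(sample):
    if is_suspect(s):
        print(f"{s.device}: avque={s.avque} avwait={s.avwait_ms} avserv={s.avserv_ms}")
```

To run it against a real capture, pass an open file instead of the sample list, for example `parse_sar_d(open("sar_d.out"))`, where the file name is again just a placeholder.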
99% of the game is half mental. - Yogi Bera
Andrew Fieldsend
Occasional Advisor

Re: disk thrashing diagnosis

Are your online redo logs on c3t6d0 and c4t6d0, and are these on the AutoRAID?

I've seen a similar system where we had to add a few small (mirrored) JBOD disks to get acceptable performance on the online redo logs.

Also, how large are your online redo logs, and where do you put your archive redos? If they are also on the AutoRAID(s), you can get a situation where every log switch copies the redo log out to the archive area, which is on the same AutoRAID. If the redo log size is relatively small, and therefore switching frequently, this can cause a lot of disk traffic.
CHRIS_ANORUO
Honored Contributor

Re: disk thrashing diagnosis

Phil

From the init***.ora file you can find out the Oracle settings for each instance.
The buffering on HPUX can be seen from glance, or you can use mount -v to check whether the vxfs filesystem is set up for delaylog, nodatainlog, or the default, which will show as log.
You can also use the old sar command, "sar -d".

Cheers!
When We Seek To Discover The Best In Others, We Somehow Bring Out The Best In Ourselves.
Phil Corchary
Advisor

Re: disk thrashing diagnosis

ALL of the database files, redo logs, archive logs, etc. are on the AutoRAID.
The redo logs are 5MB each and switch VERY frequently (on the order of 200-1100 times per day!)
99% of the game is half mental. - Yogi Bera
Andrew Fieldsend
Occasional Advisor

Re: disk thrashing diagnosis

I'm not a DBA, but I would suggest you either move the redo logs off the AutoRAID, or resize them so that they switch much less frequently (ideally both) - a DBA I used to work with preferred log switches to occur about every hour.

We were using 70-100MB redo logs, which sounds as though it would be a good starting point for you.
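
As a rough sanity check on those sizes, the arithmetic can be sketched in Python. It uses the figures quoted in this thread (5MB logs, 200-1100 switches per day, a target of about one switch per hour) and assumes each log is full when it switches and that redo is generated evenly over the day. Neither assumption holds exactly at peak, so treat the results as a lower bound.

```python
def redo_mb_per_day(log_size_mb: float, switches_per_day: float) -> float:
    """Approximate daily redo volume, assuming each log is full at the switch."""
    return log_size_mb * switches_per_day


def log_size_for_interval(redo_per_day_mb: float,
                          minutes_between_switches: float) -> float:
    """Log size needed to switch once per interval at an even redo rate."""
    switches_per_day = 24 * 60 / minutes_between_switches
    return redo_per_day_mb / switches_per_day


for switches in (200, 1100):
    daily = redo_mb_per_day(5, switches)
    size = log_size_for_interval(daily, 60)
    print(f"{switches} switches/day -> about {daily:.0f} MB redo/day, "
          f"about {size:.0f} MB per log for hourly switches")
```

With these inputs the daily redo volume works out to roughly 1000-5500 MB, and the per-log size for hourly switches to roughly 42-229 MB, so 70-100MB sits inside that range as a first step.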