Operating System - HP-UX

Extremely bad avwait and avserv figures

 
Ninad_1
Honored Contributor

Hi,

I have attached an extract from the sar -d output, sorted by avwait and avserv to show the worst entries. A pivot table at the end counts how often each disk appears among those top entries, to give an idea of which disks are most affected.
The setup is a V2500 server running HP-UX 11.00, SG 11.09 and Oracle database 8.1.6, with an OLTP workload plus batch jobs (mainly SQL scripts). The storage is an XP256.
Can anyone tell me, firstly, aren't the avwait and avserv figures extremely bad? How can I find the reason, and any guess as to what it could be?
I found that the top 2 disks, c13t4d4 and c73t5d0, are in the same disk group, and a single volume has been configured on them. It holds the filesystem for the archive redo logs, plus the area where the archived redo logs are kept in compressed form after being backed up to tape every 10 mins.
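
The sort-and-count step described above could be sketched roughly as follows. This is a minimal Python sketch, not the script actually used for the attachment. It assumes each sar -d row has been flattened to eight whitespace-separated columns (time, device, %busy, avque, r+w/s, blks/s, avwait, avserv). The function names parse_rows and top_devices are made up for illustration, and the hypo_a / hypo_b rows are invented values, not measurements from this system; the two c13t14d3 rows are figures quoted later in this thread.

```python
from collections import Counter

# Numeric columns, in the order assumed for each flattened sar -d row.
NUMERIC_FIELDS = ("busy", "avque", "rw_s", "blks_s", "avwait", "avserv")

SAMPLE = """\
time device %busy avque r+w/s blks/s avwait avserv
15:45:01 c13t14d3 71.48 0.5 391 13368 5.04 2.87
17:50:00 c13t14d3 89.24 0.5 390 16368 5.03 5.08
10:00:00 hypo_a 95.0 12.0 40 900 150.0 60.0
10:05:00 hypo_a 97.0 15.0 42 950 180.0 75.0
10:05:00 hypo_b 90.0 9.0 35 800 120.0 55.0
"""  # hypo_a / hypo_b rows are invented for illustration only


def parse_rows(text):
    """Parse flattened sar -d rows, skipping headers and malformed lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 + len(NUMERIC_FIELDS):
            continue
        try:
            values = [float(p) for p in parts[2:]]
        except ValueError:
            continue  # e.g. the "time device %busy ..." header line
        row = dict(zip(NUMERIC_FIELDS, values))
        row["time"], row["device"] = parts[0], parts[1]
        rows.append(row)
    return rows


def top_devices(rows, key="avwait", n=50):
    """Keep the n rows with the highest `key`, then count rows per device."""
    top = sorted(rows, key=lambda r: r[key], reverse=True)[:n]
    return Counter(r["device"] for r in top)


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for device, hits in top_devices(rows, key="avwait", n=3).most_common():
        print(device, hits)
```

Running the same count with key="avserv", or with key="rw_s" to rank by I/O rate instead, makes it easy to see whether the disks with the worst wait times are also the busiest ones.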

What do such huge figures for avwait and avserv indicate ?

Thanks,

Ninad
4 REPLIES
Steven E. Protter
Exalted Contributor

Re: Extremely bad avwait and avserv figures

Shalom Ninad,

The figures are indeed not optimal from any normal standpoint.

I'd be willing to bet that you have heavy writes happening in the redo logs, indexes and data, and all of that is sitting on RAID 5 storage. It might benefit from RAID 1 or RAID 10, though I know that's very hard to get now.

To be precise, your heavy write to storage is holding everything up.

You also might have poor SQL code that is contributing to the problem.

I recommend the following steps:
1) Get management to make a policy that when storage wants to rearrange the part of storage your stuff sits on, you are in on the review process. Surely whatever they did has made things worse.
2) Have a meeting with storage and see about providing faster storage for the areas you have heavy writes to.
3) Review your database statistics and start a SQL code review on anything using a lot of resources.

You will find, though tedious, there will be a great benefit from all three items over the long haul.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Ninad_1
Honored Contributor

Re: Extremely bad avwait and avserv figures

Hi Steven,

Thanks for the inputs.
The storage currently uses RAID 1 for all the volumes. But I am not able to understand the figures. You feel there may be a lot of reads and writes for redo, indexes and data; that's fine, but why would these particular 2 disks have the most occurrences of high avwait and avserv? The database is using raw volumes.
The r+w/s and blks/s figures for these 2 disks are comparatively lower than for some disks used by the database, e.g.:

time      device    %busy  avque  r+w/s  blks/s  avwait  avserv
15:45:01  c13t14d3  71.48  0.5    391    13368   5.04    2.87
17:50:00  c13t14d3  89.24  0.5    390    16368   5.03    5.08
15:00:01  c13t14d3  71.11  0.5    387    12713   5.05    2.83
16:30:13  c13t14d3  67.3   0.5    387    15337   5.04    3.24
17:55:00  c13t14d3  74.99  0.5    383    14229   5.04    3.32
05:00:01  c13t14d3  66.35  0.5    379    12541   5.05    2.71
14:50:01  c13t14d3  70.58  0.5    379    12380   5.05    3.33
17:45:00  c13t14d3  78.45  0.5    373    13252   5.06    4.12
15:40:01  c13t14d3  65.27  0.5    368    12049   5.05    2.91
14:25:01  c13t14d3  82.39  0.5    364    20303   5.04    5.02
15:15:01  c13t14d3  68.17  0.5    364    11858   5.05    2.87
17:30:00  c13t14d3  78.59  0.5    364    12116   5.04    4.74

The only difference I notice is that the database uses raw volumes, whereas the disks I mentioned host filesystems. I feel I may have a breakthrough in this problem; please guide me if this is correct.
I checked each disk in the top avwait and avserv list, down to about the top 70 entries, and observed that all of these disks host filesystems. Also, the buffer cache seems to be configured at the default 10%, and since memory is 24 GB that amounts to 2.4 GB. As I understand from several other posts in the forum, this is a very large cache and can cause problems because of the time spent scanning through the cache for each read/write. (Is that for reads as well as writes, or just one of them? Can you confirm?)
Would reducing the cache size make a big difference, or is it of no use?

Please provide your valuable guidance as ever.

Thanks,

Ninad
Steven E. Protter
Exalted Contributor

Re: Extremely bad avwait and avserv figures

Shalom Ninad,

Oracle has its own cache, so a large buffer cache is double buffering. It does not help performance and could increase the chance of data loss in a sudden system stop.

I would set dbc_max_pct and dbc_min_pct within a few percentage points of each other.

Changing this figure is very expensive in CPU terms.

dbc_max_pct 50 - 50
dbc_min_pct 5 - 5


That's the default from a system I have yet to tune (how embarrassing).

I normally drop dbc_max_pct to 7.

Requires a reboot, can have a massive performance impact.
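
To put the percentages above into absolute terms for the 24 GB machine in this thread, here is a minimal Python sketch. It assumes the cache ceiling is simply dbc_max_pct percent of physical memory; the helper name cache_ceiling_gb is made up for illustration, and the 5% line is only an extra comparison point, not a recommendation from this thread.

```python
def cache_ceiling_gb(ram_gb, dbc_max_pct):
    """Upper bound of the buffer cache in GB, assuming it is a plain
    percentage (dbc_max_pct) of physical memory."""
    return ram_gb * dbc_max_pct / 100.0


RAM_GB = 24  # physical memory stated earlier in the thread

if __name__ == "__main__":
    # 50 = value shown in the listing above, 10 = value reported for the
    # V2500, 7 = value suggested above, 5 = extra comparison point.
    for pct in (50, 10, 7, 5):
        print(f"dbc_max_pct={pct:>2}: up to {cache_ceiling_gb(RAM_GB, pct):.2f} GB")
```

Under that assumption, going from 10 to 7 shrinks the ceiling from about 2.4 GB to about 1.68 GB.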

SEP
Ninad_1
Honored Contributor

Re: Extremely bad avwait and avserv figures

Steven,

As I mentioned, the database is already using raw volumes and thus not the filesystem buffer cache. There are other filesystems as well that are not used by the database (apart from the archive redo logs filesystem). What I would like to know is:
1. Is keeping the buffer cache at 2.4 GB causing the high avserv and avwait figures, in addition to the many reads/writes on the 2 disks?
2. Will reducing the buffer cache solve the problem?
3. Will reducing the buffer cache adversely affect filesystem reads?

Thanks,
Ninad