Operating System - HP-UX

Re: sar avwait what is normal question

 
Tom Geudens
Honored Contributor

sar avwait what is normal question

Hi all,
I should be a very happy SysAdmin, I've recently installed EMC DMX800 and CX600 storageboxes that drastically reduce the IO time.

However, they so drastically reduce it, that avserv is now (almost) always smaller than the avwait (which is +/- 5ms). The "powers that be" now want to know why avwait is "so high".

Can anyone explain (or point me to a document explaining) how avwait is computed and why it is what it is (in my case +/- 5ms) ?

Regards,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
Jean-Louis Phelix
Honored Contributor

Re: sar avwait what is normal question

Hi,

From this doc:

avwait = Average time (in milliseconds) that transfer requests waited idly on queue for the device.

avserv = Average time (in milliseconds) to service each transfer request (includes seek, rotational latency, and data transfer times) for the device.

When average wait (avwait) is greater than average service time (avserv), it indicates that the disk can't keep up with the load during that sample. This is considered a bottleneck.

avwait is similar to the %wio returned by sar -u for the CPU.
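
If you want to spot those samples automatically, here is a minimal awk sketch (it assumes the usual 8-column sar -d layout: time, device, %busy, avque, r+w/s, blks/s, avwait, avserv; check your own header first):

    # flag samples where requests wait longer on the queue than they take to service
    sar -d 5 12 | awk 'NF == 8 && $7+0 > $8+0 { print $1, $2, "avwait=" $7, "avserv=" $8 }'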


Regards.
It works for me (© Bill McNAMARA ...)
Tom Geudens
Honored Contributor

Re: sar avwait what is normal question

Back to top ...
Can anyone tell me what normal avwait times are? Do they depend on the type of machine, type of fibre card, or type of OS?

Regards,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
Paula J Frazer-Campbell
Honored Contributor
Solution

Re: sar avwait what is normal question

Tom

This is a Sun doc, but it is very good and may help.

http://sunsite.uakom.sk/sunworldonline/swol-06-1996/swol-06-perf.html

Paula
If you can spell SysAdmin then you is one - anon
Tom Geudens
Honored Contributor

Re: sar avwait what is normal question

Thanks Paula, it is indeed a very good document (very technical, though). It'll take some reading ...

There are still 10 points available for anyone who can explain, in a couple of simple (but technically sound ;-) steps, the 5 ms avwait time on my HP-UX systems ... while I'm reading through the document Paula referred to.

Regards,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
Jeff Schussele
Honored Contributor

Re: sar avwait what is normal question

Hi Tom,

Depending on the average latency of the drive itself (slower drives will inherently have higher avwait times), I've always believed that anything in the single digits is OK.
Of course, the lower, the better.

From the man page...

avwait - Average time (in milliseconds) that transfer requests waited idly on queue for the device;
avserv - Average time (in milliseconds) to service each transfer request (includes seek, rotational latency, and data transfer times) for the device;
%busy - Portion of time the device was busy servicing a request;
avque - Average number of requests outstanding for the device;
r+w/s - Number of data transfers per second (reads and writes) from and to the device;
blks/s - Number of bytes transferred (in 512-byte units) from and to the device.
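
For reference, two common ways to collect those columns (the binary data file path below is the usual default, with DD being the day of the month; yours may differ):

    # twelve 5-second samples, live
    sar -d 5 12

    # or replay previously collected data for a morning window
    sar -d -f /var/adm/sa/sa15 -s 08:00 -e 12:00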

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Tom Geudens
Honored Contributor

Re: sar avwait what is normal question

Hi Jeff,
I agree about the single digits. However, since our response time (avwait + avserv) is now under 10 ms, I've got to come up with an explanation for those 5 ms (some people I won't name will never be pleased with performance :-).

So I'm looking for the magic formula. Something along the lines of:
- since your drives are so fast,
- since you are doing so many operations,
- since your operating system takes so long to react,
=> the avwait is 5 ms.

Of course, all of the above would have to be measurable ...

Regards,
Tom
A life ? Cool ! Where can I download one of those from ?
Jeff Schussele
Honored Contributor

Re: sar avwait what is normal question

I understand.
A lot of the time the numbers really depend on just what data the system is looking for and where it resides.
If the system keeps requesting data immediately adjacent to the last request, those times will always be low, because the seek will be next to immediate. But if the data is wildly random *every* time, those numbers will always be higher, as the heads seek all over the place. So attention needs to be paid to the request patterns the system (software) is generating: are they consistently sequential or random?
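
One rough way to see the effect is to drive a known-sequential read against a device and watch sar at the same time (the raw device path below is just a placeholder):

    # terminal 1: a purely sequential read of a raw device (placeholder path)
    dd if=/dev/rdsk/c0t6d0 of=/dev/null bs=64k count=10000

    # terminal 2: compare avwait/avserv under this load with your normal random load
    sar -d 5 10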

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Tom Geudens
Honored Contributor

Re: sar avwait what is normal question

Hi Jeff,
I see this avwait (the 5 ms on average) on all our central HP-UX servers (8 of them, all HP-UX 11.0, all fibre-connected to the SAN). Most of these are database servers, so there's a lot of random IO. However, one of them is a file server, with a lot of sequential IO. Also, during database backups (also sequential) the avwait does not change. In fact, if I didn't see the occasional difference, I'd have said this number was hardcoded in sar.

Regards,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
Paula J Frazer-Campbell
Honored Contributor

Re: sar avwait what is normal question

Tom

As already said:-

avwait Average time (in milliseconds) that transfer requests waited idly on queue for the device;

This is not the figure for each request but an average over the collection period; during a collection period, 10 requests could be handled, with the average in your case being 5 ms.

These items are queued, and each step in servicing a request has an inherent latency: electricity is fast (just below light speed), but it still takes a measurable time to carry out each of the tasks the system requests.

Example

1 Item queued.
2 Ready-to-send signal given (time expended).
3 Ready-to-receive signal given (time expended).
4 Data sent (time expended).
5 Data validity check (time expended).
6 All data received (time expended).
7 OK sent (time expended).

Whilst not definitive, the above shows the basics of moving and validating data from one place to another. Each step takes a measurable amount of time, which in your case gives an average wait of 5 ms whilst the data in front is being dealt with.
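
As a toy illustration only (the per-step numbers below are made up; real values depend entirely on the hardware), you can see how small per-step delays add up to a wait of this order:

    # hypothetical per-step delays in ms for steps 2-7 above
    awk 'BEGIN {
        n = split("0.5 0.5 2.5 1.0 0.3 0.2", step)
        for (i = 1; i <= n; i++) total += step[i]
        printf "one transfer occupies the device for %.1f ms\n", total
    }'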


5 ms is small, and whilst 10 ms is twice as big, it is still small.

As long as your servers are not showing bottlenecks, the 5-10 ms is fine.

Out of interest: every server, no matter how good and fast, will have a bottleneck once loaded sufficiently.


HTH

Paula
If you can spell SysAdmin then you is one - anon
Tom Geudens
Honored Contributor

Re: sar avwait what is normal question

Hi Paula,
Yes, I understand your point (and Jeff's as well), and you must be thinking I'm an absolute bore by now.
However, I've been asked by the "Evil Wizard of Numbers" to:
- come up with an exact explanation of how and why avwait is 5 ms;
- list any kernel or other parameters that influence this.
In fact, he told me today he wants it down to a maximum of 2 ms.

So I'll just keep searching ...
Regards,
Tom Geudens
A life ? Cool ! Where can I download one of those from ?
Sridhar Bhaskarla
Honored Contributor

Re: sar avwait what is normal question

Hi,

This is my understanding.

avserv - Average time in milliseconds to process a request. This does not include how long the request waits in the queue; it is how long the disk subsystem takes to process the request once it is out of the queue.

avwait - Average time in milliseconds the request waited in the queue.

The total time it takes for a request to be processed is the sum of the two. There may be a magic formula to arrive at the exact avwait, but for a simple understanding, consider an avserv of 1 ms. If 5 requests arrive in the same interval, only one can be processed at a time and the other 4 remain in the queue, waiting idly to be processed. So you can do the math there.
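
Turned into simple arithmetic (this is just my example above made concrete, not a claim about how sar computes it): with avserv = 1 ms and 5 requests arriving together, the first waits 0 ms, the second 1 ms, and so on:

    awk 'BEGIN {
        svc = 1; n = 5                          # ms per request, requests queued at once
        for (i = 0; i < n; i++) sum += i * svc  # i requests are ahead of request i
        printf "avwait = %.1f ms\n", sum / n    # (0+1+2+3+4)/5 = 2.0 ms
    }'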

There are two factors that can drive avwait up: too many requests, or too high a service time.

A combination of avwait and avserv alone is not an indication of a disk bottleneck. I would look at %busy and avque. If %busy is less than 70 and avque is around 0.5, I would pay little attention to the avwait and avserv values, as they can vary depending on the type of request (read or write).

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
doug mielke
Respected Contributor

Re: sar avwait what is normal question

What sar is able to tell us is often not the speed of the disks but of the cache on the EMC. Unless you are doing synchronous IO on the EMC, we can ignore (for writes) seek and rotational latency and random-access concerns. For reads, physical latency is a factor only for data that hasn't been accessed for a while (check the hit rate on the EMC).
So sar is really measuring the speed of serving data across the fibre.

If requests are waiting in the queue for service, then the bottleneck (if someone really needs to find one) would likely be the fibre controllers.
The fix? Buy more controllers, use more ports on the fibre switch, etc. Mucho dinero$.
Overall, you're likely serving more data per second than your old storage; processes are completing faster, more of them run, and fewer get swapped out waiting for IO ...

BTW: our times are 5 ms wait, 3 ms serv from our SAN right now.
Paula J Frazer-Campbell
Honored Contributor

Re: sar avwait what is normal question

Hi

Another avwait definition:-

The time between the request being handed to the device controller and the end-of-transfer interrupt (the "(device) service time"), typically composed of delays due to request scheduling at the controller, bus arbitration, possible seek time, rotational latency, data transfer, etc.

TUNING

The first thing you are interested in is %busy. The "avwait" column is the average amount of time that processes wait for the disk to give them their data, so that and "avque" (how many processes are trying to use the disk) can give you a clear picture of load. The "avserv" column is a measure of the disk's ability to deliver that data, and it isn't going to change much for the same hardware.
Note that if %busy is small while avque and avwait are high, you are probably seeing buffer cache flushes. Those can affect performance, and there are tunables that control how often and how much is flushed.
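
A rough filter for that pattern (the thresholds are guesses to tune against your own baseline, and the 8-column sar -d layout is assumed):

    # low %busy but high avque/avwait often points at buffer cache flushes
    sar -d 5 12 | awk 'NF == 8 && $3+0 < 20 && $4+0 > 1 && $7+0 > 10 {
        print $1, $2, "%busy=" $3, "avque=" $4, "avwait=" $7
    }'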


Tuning a Disk I/O-Bound System

It is important to note that at this point, there is no way to distinguish between an actual disk mechanism bottleneck and a bus controller bottleneck. That determination would require using devices external to the system. Any steps taken to alleviate disk bottlenecks should be taken with this in mind.

On a given device, you can determine which partition is causing high I/O in glance, most easily on the I/O by Logical Volume screen.

If the bottleneck is in swap partitions (caused by an unavoidable memory bottleneck), use multiple swap areas of equal size and equal priority to balance the I/O across multiple spindles/buses.
Balance disk I/O across multiple spindles/buses.
Tune (increase) the buffer cache size (this helps file systems only). This needs to be balanced against the fact that a very large buffer cache can negatively affect system performance.
If using Online JFS, use fsadm to tune the file systems (see the command sketches after this list).
The more free space in a file system, the more efficiently (in terms of seek times) data can be placed into the structure.
Dedicate a group of disks to a particular application, then balance the I/O for that application across this group of disks.
Use mkfs(1m) options when creating file systems. A file system can be created with a particular use in mind. For example, a file system with lots of cylinder groups and lots of inodes is designed to place files in locations within the file system structure that will provide the most efficient seek times when there are many small files.
Consider using asynchronous I/O (kernel parameter fs_async). Note that asynchronous I/O increases the chances of data loss in the event of a system crash.
Consider using immediate reporting (controlled by the scsictl command, or at 10.x, the kernel parameter default_disk_ir). This is also a very dangerous option which increases chances of data loss in the event of a system crash.
Minimize symbolic links.
When creating a file system, have the file system block size match the size of the files, if possible.
If the file system was created at 9.x, tune it to use a rotational delay of 0 with the tunefs command. This will only affect the placement of files written after the delay is changed. A rotational delay of 0 is the default at 10.x and up.
Increase ninode if long directory paths are commonly used to access files on the system. However, ninode should not be too large; a large inode table can negatively affect system performance (most systems start seeing a performance decrease when ninode is greater than 4000).
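
Command sketches for a few of the steps above (device and mount-point names are placeholders; check the man pages for your OS release first):

    # Online JFS: defragment extents and directories on a mounted file system
    fsadm -F vxfs -e -d /home

    # 9.x-created file system: set the rotational delay to 0 (affects new files only)
    tunefs -d 0 /dev/rdsk/c0t6d0

    # immediate reporting (risky, per the warning above)
    scsictl -m ir=1 /dev/rdsk/c0t6d0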



Also:-

http://docs.rinet.ru:8083/UNIXy/unx39.htm#I22


HTH

Paula
If you can spell SysAdmin then you is one - anon