
Laurie A. Krumrey
Regular Advisor

Drunk Processor - 11i ???

Hi All,

We have two L2000, one on 11.0 and the other
on 11i, all applications the same.

The 11i server is new. My DBA is complaining
about how slow 11i is. Well, I've noticed it too:
it will be fast in the shell or SAM, then
it kind of hiccups and gets slow for a little
bit, then it gets fast again.

It's really like my processor is drunk. The
response time is fast (normal), then slow,
then fast again, then slow. It's a very
inconsistent thing and I can't figure out
why.

This is a brand new system and no production
is even running on it.

The load average is under 0.28.

There are no production processes running yet.
Here is my output from the top command, these
are the top processes running.

vxfsd
dmisp
rep_server
agdbserver
midaemon
ioconfigd
opcctla

Now I did set this system up, so maybe in
setting up 11i I set something up I shouldn't
have, but I don't even know what to look
for. The new box has four 440MHz processors,
so I am not sure what its problem is.

Any ideas?
Laurie Krumrey
Happiness is a choice
John Poff
Honored Contributor

Re: Drunk Processor - 11i ???

Hi Laurie,

Just some ideas. Have you found any clues in your syslog file? It sounds like maybe a SCSI or network card is timing out. Maybe a SCSI device is acting up, or has a bad terminator? Have you tried running STM on the box?

Of course, today is SysAdmin Appreciation Day, so it is possible that one of your CPUs has started drinking! :)

JP
Bill Hassell
Honored Contributor

Re: Drunk Processor - 11i ???

There may be some patches to address this problem, but first check the size of your buffer cache. The default is still a ridiculous 50% of RAM, so after a large number of writes in a short time there will be long delays while the syncer tries to flush all the old writes out of the cache.

However, there's a new tunable just added to 11.0 (and part of 11i) to help with a problem with intense disk I/O for large file activity. The buffer cache prefers large sequential I/O. Copying a large file to a filesystem may force other (low priority) commands that are operating on the same filesystem (e.g. bdf) to wait.

There is a kernel tunable, introduced with patch PHKL_21678 (11.00 Disk sort algorithm fix for slow io response), that changes the so-called "disk sort" algorithm within the buffer cache code.

This is a fairness problem with the disk sort algorithm. The disk sort algorithm is used to reduce the disk head retractions. With this algorithm, all I/O requests with the same priority are queued in non-descending order of disk block number before being processed if the queue is not empty. When requests come in faster than they can be processed, the queue becomes longer, and the time needed to perform one scan (from the smallest block number to the largest block number of the disk) could be very long in the worst case.

This is unfair to a request that came in early but has been continuously pushed back to the end of the queue because it has a large block number or because it just missed the current scan. These kinds of unlucky requests could line up in the queue for as long as the time needed to process a whole scan (which could take a few minutes). This situation usually happens when a process tries to access a disk while another process is performing sequential accesses to the same disk.
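To make the starvation concrete, here is a toy model of a block-ordered, one-directional scan. It is only a sketch under my own assumptions (one request serviced per tick, one new sequential request enqueued per tick, a hypothetical function name); it is not the HP-UX kernel code.

```python
import bisect


def requests_served_before_victim(victim_block, head, writer_start, writer_count):
    """Toy model of a one-directional, block-ordered scan.

    Returns how many requests have been serviced at the moment the
    victim request is finally serviced (the victim itself included).
    All names and the one-request-per-tick timing are illustrative
    assumptions, not the real driver behaviour.
    """
    pending = [victim_block]          # kept sorted by block number
    next_writer_block = writer_start
    written = 0
    served = 0
    while True:
        # The sequential writer enqueues one more request per tick.
        if written < writer_count:
            bisect.insort(pending, next_writer_block)
            next_writer_block += 1
            written += 1
        # Service the smallest pending block at or beyond the head;
        # wrap around to the lowest block only when nothing is ahead.
        i = bisect.bisect_left(pending, head)
        if i == len(pending):
            i = 0
        block = pending.pop(i)
        head = block
        served += 1
        if block == victim_block:
            return served
```

With the head at block 100 and a writer streaming 1000 blocks starting at 100, a single request for block 10 is serviced only after all 1000 writer requests, even though it was in the queue first.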

Resolution: To prevent this problem from happening, we have to take the time aspect into consideration in the sorting algorithm. We add a time stamp for each request when it is enqueued, which is used as the second sorting key for the queue (1st key: process priority; 2nd key: enqueued time; 3rd key: block number). The granularity of the time stamp value is controlled by a new tunable "disksort_seconds".

If we set "disksort_seconds" to N (N>0), for all the requests with the same priority, we can guarantee that any given request will be processed earlier than those which come in N seconds later than this request. Within each N second period (requests have the same time stamp), all requests are sorted by non-descending block number order.
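The three-key ordering described above can be sketched as follows. This is a minimal illustration, assuming that a lower priority number sorts first and that the time stamp is the enqueue time divided into disksort_seconds-wide buckets; the Request class and function names are hypothetical and not taken from the patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    priority: int       # assumption: lower value is serviced first
    enqueued_at: float  # enqueue time in seconds
    block: int          # disk block number


def sort_key(req, disksort_seconds):
    """Key: (priority, time-stamp bucket, block number)."""
    if disksort_seconds > 0:
        # Requests enqueued within the same N-second window share a stamp.
        stamp = int(req.enqueued_at // disksort_seconds)
    else:
        # 0 disables the time stamp: ordering falls back to block number.
        stamp = 0
    return (req.priority, stamp, req.block)


def order_queue(requests, disksort_seconds):
    """Return the requests in the order this model would service them."""
    return sorted(requests, key=lambda r: sort_key(r, disksort_seconds))


queue = [
    Request(priority=0, enqueued_at=0.5, block=900),
    Request(priority=0, enqueued_at=1.0, block=100),
    Request(priority=0, enqueued_at=5.0, block=50),
]
blocks_without_stamp = [r.block for r in order_queue(queue, 0)]
blocks_with_stamp = [r.block for r in order_queue(queue, 4)]
```

With disksort_seconds at 0 the late request for block 50 jumps ahead of both earlier requests; with a 4 second window the two early requests are sorted by block among themselves and the late one waits for the next window.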

By choosing the right "disksort_seconds" value, we can balance the maximum waiting time of requests against the efficiency of disk accesses. The tunable parameter can be set to 0, 1, 2, 4, 8, 16, 32, 64, 128 or 256 second(s). If "disksort_seconds" is 0 (the default value), the time stamp is disabled, which means that the time aspect is not taken into account.

Bill Hassell, sysadmin