Operating System - HP-UX

ATT NIS Division
New Member

Process hang waiting for I-O

On several occasions we've had processes hang, apparently waiting for I/O, typically because a disk is going bad. We can't kill the PID because of the process priority. The logical volumes are mirrored, but the I/O error still prevents the I/O operation from completing. Our last resort has been to reboot. If I know which logical volume is affected and I run lvchange -t 300 on it (I believe the default timeout is 0, meaning retry forever), will this take effect immediately and time out the I/O that is already pending? Or will that particular I/O continue to hang, with only subsequent I/O to the same logical volume timing out? Rebooting is a horrible way out of this predicament, and what is mirroring buying me?
James R. Ferguson
Acclaimed Contributor

Re: Process hang waiting for I-O

Hi:

I'd certainly try setting the timeout of the logical volume. From Technical Knowledge Base document #UNX1030078:

/begin_quote/

To adjust the value of a logical volume's timeout, use lvchange(1M) with the -t option, specifying the number of seconds to try before timing out.

For example, to change the I/O timeout value of a logical volume (LV) to one minute (60 seconds):

lvchange -t 60 /dev/vg01/lvol1

This functionality was introduced at HP-UX 10.30.

LV Timeouts
-----------
Without LV timeouts, the system continues to retry an I/O to a non-responding disk until the disk responds. If the disk does not respond, the I/O never completes and never returns to the caller. In this case, the caller is in a "hung" state waiting for an I/O that will not complete.

If an LV timeout is specified, as described above, I/O to a non-responsive disk will also be retried, but only for a length of time that does not exceed the specified timeout value. If the disk fails to respond within that time, the system will return an I/O error to the caller. In this case, the caller will not be in a "hung" state that lasts longer than the specified timeout value.

The error returned to the user is EIO. This is a generic error signifying that the I/O was not able to complete. There is no way to know precisely why the I/O was unable to complete. For example, it could be due to some other I/O error in addition to exceeding the specified timeout.

The lvdisplay(1M) command can be used to display an LV's timeout value.

/end_quote/
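As a rough sketch of the two commands mentioned in the quote, the following wrapper sets a timeout and prints the value before and after. The function names are made up for illustration, /dev/vg01/lvol1 is only the example path from the quote, and the parsing assumes lvdisplay prints a line beginning with "IO Timeout (Seconds)"; check the output on your own release before relying on it.

```shell
#!/bin/sh
# Print the value of the "IO Timeout" line from lvdisplay output on stdin.
# Assumption: the line looks like "IO Timeout (Seconds)   default" or "... 60".
lv_io_timeout() {
    awk '$1 == "IO" && $2 == "Timeout" { print $NF }'
}

# Hypothetical helper: change the timeout of one LV and show old/new values.
# Usage: set_lv_timeout /dev/vg01/lvol1 60   (run as root on HP-UX)
set_lv_timeout() {
    lv=$1
    secs=$2
    echo "before: $(lvdisplay "$lv" | lv_io_timeout)"
    lvchange -t "$secs" "$lv" || return 1
    echo "after:  $(lvdisplay "$lv" | lv_io_timeout)"
}
```

Nothing runs until set_lv_timeout is called, so the file can be sourced and inspected first.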

Regards!

...JRF...
ATT NIS Division
New Member

Re: Process hang waiting for I-O

Thanks Jim,
If I already have a hung process before setting the LVM timeout, will the new value take effect immediately? That is, will I be able to free the hung process by setting the timeout value, or does it only affect subsequent I/O to the logical volume?
Krishna Prasad
Trusted Contributor

Re: Process hang waiting for I-O

Just a question.

If lvdisplay shows the timeout as "default", I take it that it is set to 0.

Would you change all logical volumes to have some kind of timeout setting?
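Before deciding that, it may help to see what every logical volume is currently set to. The sketch below is one way to list them; the function names are invented here, and the parsing assumes that vgdisplay -v prints "LV Name" lines and that lvdisplay prints an "IO Timeout (Seconds)" line, which should be verified on the system in question.

```shell
#!/bin/sh
# Print LV device paths found in "vgdisplay -v" output read from stdin.
# Assumption: each LV appears on a line of the form "LV Name  /dev/vgXX/lvolN".
list_lvs() {
    awk '$1 == "LV" && $2 == "Name" { print $3 }'
}

# Print the value of the "IO Timeout" line from lvdisplay output on stdin.
lv_io_timeout() {
    awk '$1 == "IO" && $2 == "Timeout" { print $NF }'
}

# Hypothetical helper: report the timeout of every LV in all active VGs.
# Only reads settings; it changes nothing.
audit_timeouts() {
    vgdisplay -v 2>/dev/null | list_lvs | while read -r lv; do
        printf '%s\t%s\n' "$lv" "$(lvdisplay "$lv" | lv_io_timeout)"
    done
}
```

Calling audit_timeouts as root prints one line per logical volume with its path and timeout value.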
Positive Results requires Positive Thinking
ATT NIS Division
New Member

Re: Process hang waiting for I-O

Ron,

It seems to imply that it would prevent you from getting into a hung I/O state. I'm just not sure what the other ramifications are. How would your RDBMS or your application react to the I/O error?
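To illustrate the difference from the caller's side: with a timeout set, a failed read comes back as an error the caller can check instead of blocking forever. This is only a sketch with a made-up function name; from the shell, a non-zero dd exit status covers EIO along with any other failure (missing path, permissions), so it cannot tell you why the read failed.

```shell
#!/bin/sh
# Try to read the first 64 KB of a file or device and report the outcome.
# Returns 0 if the read succeeded, 1 if dd reported any error.
probe_read() {
    if dd if="$1" of=/dev/null bs=64k count=1 2>/dev/null; then
        echo "read ok: $1"
    else
        echo "read failed: $1" >&2
        return 1
    fi
}
```

An application or database would have to do the equivalent check on its own reads and writes; how it recovers from the error is up to that application.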

It may still cause an outage. It may allow you to fail over to an adoptive node in an SG clustered environment. If it does, and you activate the volume group without quorum, it may or may not use the disk that's going bad.

Does anyone else have experience using the logical volume timeout option??