
Re: Processes hang

 
SOLVED
mike mcclernon
Occasional Contributor

Processes hang

I have a K570 running 10.20. I had processes that failed or were interrupted and were not going away. Unsuccessful mounts, backups, finds, etc. were not dying. I couldn't kill the processes and finally had to reboot the server to get rid of them.

I used glance and checked memory and the system tables. The processes were not using CPU, but they were interfering with the mounts and backups. All processes were waiting for I/O, and their parent processes had died. The filesystems had space.

Any thoughts or suggestions on what could have caused the problem, and what I should look for when this happens again?

Thank you in advance for any assistance you will provide.

13 REPLIES
John Palmer
Honored Contributor
Solution

Re: Processes hang

Hi Mike,

This sounds like a hardware problem - processes hanging attempting to access a disk.

It could be a single disk or even a SCSI problem (e.g. termination).

I would expect you to continue having problems after the reboot.

Regards,
John
Jeff Machols
Esteemed Contributor

Re: Processes hang

Mike,

How were you trying to kill the processes? Some processes can trap certain signals (1, 2, 3, 15). Did you try kill -9 PID?
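
To illustrate the difference between a trappable signal and kill -9, here is a minimal Python sketch. It is only a demonstration on a healthy process: the child program, the function name demo_trap_vs_kill and the half-second pause are all made up for this example, and it says nothing about processes that are stuck inside the kernel.

```python
import signal
import subprocess
import sys
import time

# Hypothetical child program: ignores SIGTERM (signal 15), then sleeps.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def demo_trap_vs_kill():
    """Return (alive_after_term, exit_code_after_kill) for the child above."""
    child = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    child.stdout.readline()              # wait until SIGTERM is being ignored
    child.send_signal(signal.SIGTERM)    # same as: kill -15 PID
    time.sleep(0.5)                      # give the signal time to arrive
    alive_after_term = child.poll() is None
    child.send_signal(signal.SIGKILL)    # same as: kill -9 PID, cannot be trapped
    exit_code = child.wait(timeout=5)    # negative value means "killed by signal"
    child.stdout.close()
    return alive_after_term, exit_code


if __name__ == "__main__":
    print(demo_trap_vs_kill())
```

The child survives signal 15 because it chose to ignore it, while signal 9 is handled by the kernel and cannot be trapped or ignored by the process.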
Darrell Allen
Honored Contributor

Re: Processes hang

Hi Mike,

This isn't a technical explanation by any means...

Sometimes it may seem that a kill command doesn't work on a process (such as kill -9 on a tar to a tape). What often happens is that the process gets the signal while blocked on an I/O request. It will seem the signal had no effect, but once the I/O request completes, the signal will have the desired result.

This is regardless of whether or not the process can trap for the signal sent.
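
The idea of a signal that is recorded but not acted on until later can be sketched in Python. This is only an analogy: it uses a signal mask as a stand-in for the I/O wait, because a real blocked I/O request cannot be reproduced on demand, and SIGUSR1 is used because signal 9 itself cannot be masked. The function name demo_pending_signal is made up, and it should be called from the main thread.

```python
import signal


def demo_pending_signal():
    """Show a signal staying pending while blocked, then firing on unblock."""
    events = []
    old_handler = signal.signal(
        signal.SIGUSR1, lambda signum, frame: events.append("delivered")
    )
    # Stand-in for "blocked on I/O": the process cannot take SIGUSR1 right now.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        signal.raise_signal(signal.SIGUSR1)   # the "kill" arrives while blocked
        events.append("sent")
        if signal.SIGUSR1 in signal.sigpending():
            events.append("pending")          # recorded, but no effect yet
    finally:
        # Stand-in for "the I/O request completes": the signal now takes effect.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
    signal.signal(signal.SIGUSR1, old_handler)
    return events


if __name__ == "__main__":
    print(demo_pending_signal())
```

The signal is not lost while the process cannot receive it; it simply waits. If the blocking condition never clears, as with an I/O request that never completes, the signal never gets its turn.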

Darrell
"What, Me Worry?" - Alfred E. Neuman (Mad Magazine)
Jeff Machols
Esteemed Contributor

Re: Processes hang

One other question: do you use NFS mounts? Is it possible an NFS server went down? If you have hard NFS mounts and the server you mount from is gone, any processes touching those mounts will hang, bdf will hang, mount will hang, and the only way to resolve the problem is to bounce the box.
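
Since bdf itself can hang on a dead hard mount, one way to check a mount point without hanging your own shell is to probe it from a child process with a timeout. The following Python sketch assumes that "ls -d" on the path is enough to trigger the hang; the function name path_responds, the 5-second timeout and the probe command are all choices made for this example, not anything HP-UX provides.

```python
import subprocess


def path_responds(path, timeout=5.0, probe_cmd=("ls", "-d")):
    """Return True if `probe_cmd path` exits 0 within `timeout` seconds."""
    child = subprocess.Popen(
        list(probe_cmd) + [path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Request a kill, but do not wait for the child afterwards: a probe
        # stuck in I/O on a dead hard mount may not exit until the I/O returns.
        child.kill()
        return False


if __name__ == "__main__":
    for mount_point in ("/", "/tmp"):
        print(mount_point, "ok" if path_responds(mount_point) else "NOT RESPONDING")
```

The sketch deliberately avoids subprocess.run with a timeout, because that call waits for the child to exit after killing it, which would bring back the very hang it is meant to avoid.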
John Waller
Esteemed Contributor

Re: Processes hang

I've had a similar problem, which turned out to be caused by NFS mounts. A poor network link was causing a problem with the automount, and I was using my HP to back up some remote NFS filesystems on a nightly basis.
John Bolene
Honored Contributor

Re: Processes hang

If a process is waiting on an I/O to complete, it will wait forever if it is a local disk subsystem. If it is a remote NFS drive, then it might timeout.

If an I/O cannot be completed because of a problem somewhere, the machine must be rebooted as the I/O interrupt is still pending.

Sounds like you may have a disk problem starting to happen.
It is always a good day when you are launching rockets! http://tripolioklahoma.org, Mostly Missiles http://mostlymissiles.com
Sridhar Bhaskarla
Honored Contributor

Re: Processes hang

Hi Mike,

-unsuccessful mounts
-failed backups
-'find' hanging through mount points
-waiting on I/O
-couldn't kill

all are symptoms of bad H/W somewhere probably in the SCSI chain or it could be the backplane itself.

Use stm to check the various components on the system, and check the syslog to see if there are any errors such as SCSI lbolt, etc.
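
For the syslog check, a small filter can pull out the lines worth reading. This Python sketch is only a starting point: "lbolt" comes from the suggestion above, the other search terms are guesses, the function names are made up, and the default path assumes the usual HP-UX location /var/adm/syslog/syslog.log.

```python
import re

# Assumed search terms: "lbolt" is from the post above, the rest are guesses.
PATTERN = re.compile(r"lbolt|scsi|i/o error", re.IGNORECASE)


def suspicious_lines(lines, pattern=PATTERN):
    """Return the lines that match the pattern, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_log(path="/var/adm/syslog/syslog.log"):
    """Scan a log file; the default path is an assumption about the system."""
    with open(path, errors="replace") as log:
        return suspicious_lines(log)


if __name__ == "__main__":
    for line in scan_log():
        print(line)
```

An empty result only means none of these particular terms appeared, not that the hardware is healthy.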

If you have EMS installed, enable it for all the subsystems and see if you get any warnings.

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Krishna Prasad
Trusted Contributor

Re: Processes hang

Were the find commands searching mounted directories?

I have seen find commands go bad when looking in a path that was not mounted, i.e. a stale NFS mount or a path that just doesn't exist.

It sounds like you may be using NFS and have some mounts that are not auto-mounting or are going away.

Are all of the mount problems with NFS?

If so, take a look at some of the NFS patches.

Also, did any of your servers change IPs lately? On rare occasions you can have some strange problems with NFS when you change IPs on a server.

If this is the case, look in /var/statmon for old records.

If your mount problems are not NFS related I would make sure you don't have any stale extents on your drives.
Positive Results requires Positive Thinking
Darrell Allen
Honored Contributor

Re: Processes hang

Hi again,

Sounds like the reboot cleared everything. Has it happened any more? Did you find any messages from Predictive, EMS, or in syslog? What type of filesystem was having the problem?

"Curious" Darrell
"What, Me Worry?" - Alfred E. Neuman (Mad Magazine)
mike mcclernon
Occasional Contributor

Re: Processes hang

To all;

I was not using NFS, and the syslogs have not shown any problems. I was trying to mount a CD-ROM when the mount command hung.

I have run the stm exercisers on the disks and all have checked out.

Thanks to all that have replied, I appreciate your input and ideas.
John Palmer
Honored Contributor

Re: Processes hang

Were you using pfs_mount? If so then this uses NFS and the above comments about NFS apply.

Regards,
John
Darrell Allen
Honored Contributor

Re: Processes hang

Thanks for the info, Mike. Years ago I had a similar problem because someone ejected a CD while it was mounted. I just tried it on my test K570 11.0 system. Pressing the eject button was ignored while the CD was mounted, so I don't know how that can happen now (unless the ol' paper clip in the little hole trick is used, and I don't even know if that will work). Anyway, I can see where that could cause problems.

Darrell
"What, Me Worry?" - Alfred E. Neuman (Mad Magazine)
mike mcclernon
Occasional Contributor

Re: Processes hang

My troubles came after a problem with a pfs_mount. My subsequent mount commands used the regular mount command, but I take it that the pfs_mount could have caused the problem. Therefore, the replies mentioning NFS could be valid. I will have to go back through the logs and try to correlate the times.

Thanks
Mike