Operating System - OpenVMS

Can't kill non existent process

 
Mike Smith_33
Super Advisor

We are running OpenVMS 7.3-2 update 9 on an AlphaServer. We have a home-grown app which uses Ingres as the database. For the second time in the past year we have run into a problem that has ended in a reboot. The users connect from a PC to the VMS box and log into a VMS account, which takes them into the app. Apparently, database locking issues and operator frustration sometimes lead to the PC being rebooted while it is still connected to the VMS box. This tends to leave the account the operator was using logged into the box, but not really still there.

A $ show users shows the process, but a stop/id will not remove it. I found a tool called forcex and tried that, but it said the process did not exist. The operators start trying to log in and out, over and over, and we end up with several of these zombie-like processes, all in RWMBX state, along with many other processes in that state. I never get called until they have hosed it as thoroughly as possible and the mill is about to shut down. At that point there is little time for troubleshooting, just screams from the mill about possible production losses and cries of "Reboot, Reboot".

I have the document for troubleshooting processes in RWMBX, but never enough time to go through all the processes to find a culprit. Any suggestions on how to delete these processes when they occur? I am not sure what state these processes (or non-processes) are really in, or whether the doc would help anyway.
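Since time is the constraint here, one option is to capture the process list to a file before anyone reboots, so the RWMBX processes can be reviewed afterwards. This is only a sketch: the file name PROCS.LIS is made up for illustration, and it assumes the account being used can still run these commands while the system is in this state.

```dcl
$ ! Save a snapshot of all processes and their states to a file
$ ! (PROCS.LIS is a hypothetical file name).
$ SHOW SYSTEM /OUTPUT=SYS$LOGIN:PROCS.LIS
$ ! Pull out just the processes that are sitting in RWMBX.
$ SEARCH SYS$LOGIN:PROCS.LIS "RWMBX"
```

The snapshot does not fix anything, but it records which PIDs were stuck and can be handed to support along with a crash dump.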

5 REPLIES
Jur van der Burg
Respected Contributor

Re: Can't kill non existent process

RWMBX can be caused by several things, such as system or application issues, and for an inexperienced person it may take time to find the cause. If it happens again, force a system crash instead of doing a normal reboot; this takes a little longer but saves a crash dump with valuable information. If you can't figure it out yourself and you have a support contract, you can also get HP involved.

Jur.
Hoff
Honored Contributor

Re: Can't kill non existent process

Jur is correct.

What I usually recommend for these situations is to post explicit details on properly performing a system crash sequence directly onto or immediately near the console terminal, and I configure the system to ensure that a valid crashdump is logged into the system dump file.
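As a rough sketch of the configuration check described above, the following looks at the dump-related system parameters and at the dump file. It assumes the dump is written to SYS$SYSTEM:SYSDUMP.DMP; some systems are configured to dump to the page file or to another disk instead, in which case the DIRECTORY step will not apply. The crash itself is forced from the console; on many AlphaServer consoles that is Ctrl/P to halt followed by CRASH at the >>> prompt, but the exact sequence should be checked against the console documentation for the specific model.

```dcl
$ ! DUMPBUG controls whether a dump is written on a bugcheck,
$ ! DUMPSTYLE selects the dump format, and BUGREBOOT controls
$ ! whether the system reboots automatically afterwards.
$ RUN SYS$SYSTEM:SYSGEN
SHOW DUMPBUG
SHOW DUMPSTYLE
SHOW BUGREBOOT
EXIT
$ ! Confirm that a dump file exists and note its size
$ ! (assumes the default dump file location).
$ DIRECTORY /SIZE=ALL SYS$SYSTEM:SYSDUMP.DMP
```

After the system comes back up, ANALYZE/CRASH_DUMP on the dump file opens it in SDA, and the SDA COPY command can save a copy before the next crash overwrites it.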

There can be various causes, usually involving a stuffed-up mailbox somewhere. As this is a production environment, suggestions around reviewing and then modifying the application code to better detect and report (or detect and avoid) these cases may or may not be feasible. This assumes these mailboxes are yours, and not something underneath the database.

Information on resource waits and on RWMBX states can be found in various spots; I posted the following related articles a while back:

http://64.223.189.234/node/231
http://64.223.189.234/node/250

Stephen Hoffman
HoffmanLabs LLC
Mike Smith_33
Super Advisor

Re: Can't kill non existent process

The application people have come out and said that the root cause of the entire problem is something accessing the database. We do have a support contract, so crashing instead of rebooting is a great suggestion as well.
Dean McGorrill
Valued Contributor

Re: Can't kill non existent process

Ask the application people if it's safe to clear the mailboxes. I've cleared a backlog of processes piled up on one process in RWMBX. I'd find the mailbox(es) and what they are waiting for with SDA. For example, if one needs a read, in DCL:

$ open/read xx mbaxxx
$ read xx foo

Another might need a write.

Repeat until the process you did the stop/id on disappears. I'd definitely get the crash dump and get the problem fixed if you can, though. If it's safe, this might keep the mill from screaming at you.
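For the "find the mailbox with SDA" step, a minimal sketch might look like the following. The PID 2040011C and the unit MBA123: are made-up placeholders, not values from this system, and ANALYZE/SYSTEM needs a suitably privileged account (CMKRNL).

```dcl
$ ! Open SDA on the running system, select the stuck process
$ ! (hypothetical PID), and list its I/O channels to see which
$ ! MBA devices it has assigned.
$ ANALYZE /SYSTEM
SET PROCESS /ID=2040011C
SHOW PROCESS /CHANNEL
SHOW DEVICE MBA
EXIT
$ ! Back in DCL, look at one candidate mailbox in detail
$ ! (hypothetical unit number).
$ SHOW DEVICE /FULL MBA123:
```

The channel list only narrows down which mailboxes the process is using; whether one needs a read or a write, and whether it is safe to touch, is still a question for the application people.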
Mike Smith_33
Super Advisor

Re: Can't kill non existent process

It looks like getting a crash dump is the best way to go.