
Bjay
Advisor

Mutex Proc

Hi, yesterday I observed 3 to 4 backup processes going into MUTEX state. Because the application was affected, we tossed the system to clear the MUTEX and RWAST processes. I am not sure what caused them. From some of the posts I read here, it seems a process quota may be the reason. Could someone help?

Thanks
4 REPLIES 4
Willem Grooters
Honored Contributor

Re: Mutex Proc

<< the we toss the system >>

I assume this means: reboot (without dumping the system state).
Which is a pity, because that might have been useful in locating the cause. It's merely a matter of guessing, I'm afraid...
So a lot of questions to start with, to limit possibilities:

VAX / Alpha / Itanium? VMS version?

<< 3 to 4 backup processes going into MUTEX state >>

How many running at any given time?

For each of them:
BACKUP options? Input? Output?
How have these been started: separate process, or SPAWNED from one?
When have they been started? How long have they run?
Willem Grooters
OpenVMS Developer & System Manager
Hoff
Honored Contributor

Re: Mutex Proc

Mutex and RWAST processes can be triggered by a variety of problems, including hardware and software problems, and by kernel and device driver bugs, and down-revision (not current on ECOs) OpenVMS system software. And yes, RWASTs by quota errors. (Can't say I've seen a quota error trigger a MUTEX recently...)

First, place the documentation on how to generate a crashdump onto the system console. This so that the next time the system is "tossed" we have an option available to allow us to figure out what happened. Here, any further research is basically futile; "tossing" the system erased all that detail.

Then, here are some of the recent process quota suggestions for BACKUP derived from materials from HP:

http://64.223.189.234/node/49

As for the question, there is insufficient information included here for anything even approaching a response. We don't know whether this is a VAX, Alpha or Integrity. OpenVMS version? Command(s)? Current on patches? Anything in the system error log? (The ITRC forum software should suggest inclusion or should prompt for that stuff, but that's another discussion.)

With these cases in particular (and without a crashdump), having the system "tossed" means that the details needed are very likely now gone. Which is why I uniformly recommend placing the forced-crash sequence for the particular box (and which box?) on or near the system console, and training the system operator(s) to use that rather than the halt-boot.

With a crashdump, your support organization can see what specifically wedged, and can possibly then determine why the application(s) wedged.

Crashdumps. Don't "toss" without them.

John Gillings
Honored Contributor

Re: Mutex Proc

The "proper" meaning of MUTEX state is a process waiting for a MUTEX, but in practice you'll never see a process "stuck" in a "real" MUTEX state.

The state has been purloined to indicate running out of a shared resource, either BYTLM or TQELM. You can confirm the state by examining JIB$L_FLAGS for the stuck process. A value of 1 means it's run out of BYTLM, 2 means TQELM and (in theory) 3 means both.
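
As a rough illustration of the flag values described above, the sketch below decodes a JIB$L_FLAGS value into the quota(s) that have run out. It only interprets a number you have already read from the JIB by other means; the constant and function names are made up for this example and are not OpenVMS symbols.

```python
# Bit meanings as given in the post above: 1 = out of BYTLM, 2 = out of TQELM.
# The names below are hypothetical, chosen only for this illustration.
FLAG_OUT_OF_BYTLM = 0x1
FLAG_OUT_OF_TQELM = 0x2


def decode_jib_flags(flags):
    """Return the list of quotas that a JIB$L_FLAGS value reports as exhausted."""
    exhausted = []
    if flags & FLAG_OUT_OF_BYTLM:
        exhausted.append("BYTLM")
    if flags & FLAG_OUT_OF_TQELM:
        exhausted.append("TQELM")
    return exhausted


if __name__ == "__main__":
    # Print the interpretation of each value mentioned in the post.
    for value in (1, 2, 3):
        print(value, decode_jib_flags(value))
```

A value of 3 simply has both bits set, which is why it reads as "both".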

If a process in that state has relatives in the same process tree, or it's TQELM, then the state may be self limiting. As timers expire, or related processes return quota, the process will proceed.

If there are no relatives, the process is effectively deadlocked against itself.

It IS possible to free up a process in this state by patching the quota fields in the JIB. Some folk use XDELTA, but my preference is a purpose-written kernel mode program. It's important to increment both the count and the limit fields. So, add some quota to JIB$L_TQCNT and JIB$L_TQLM, or to JIB$L_BYTCNT and JIB$L_BYTLM, then poke the process with something like SHOW PROCESS/CHANNEL.
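
The bookkeeping behind "increment both the count and the limit" can be modelled in ordinary user-mode code. The sketch below is only an arithmetic model, not kernel code: it assumes the count field holds the quota still available and the limit field holds the ceiling, uses a plain dictionary in place of the JIB, and the numbers and the helper name `add_quota` are invented for the example.

```python
def add_quota(jib, count_field, limit_field, extra):
    """Return a copy of the modelled JIB with `extra` added to BOTH fields.

    Raising count and limit together leaves (limit - count), the amount
    currently charged to the job, unchanged.
    """
    patched = dict(jib)
    patched[count_field] += extra
    patched[limit_field] += extra
    return patched


def in_use(jib, count_field, limit_field):
    """Quota currently charged to the job in this model."""
    return jib[limit_field] - jib[count_field]


# Made-up numbers: the job has nothing left (count is 0) against its limit.
jib = {"JIB$L_BYTCNT": 0, "JIB$L_BYTLM": 100000}
patched = add_quota(jib, "JIB$L_BYTCNT", "JIB$L_BYTLM", 20000)

if __name__ == "__main__":
    print("before:", jib, "in use:", in_use(jib, "JIB$L_BYTCNT", "JIB$L_BYTLM"))
    print("after: ", patched, "in use:", in_use(patched, "JIB$L_BYTCNT", "JIB$L_BYTLM"))
```

In this model, adding only to the count would let the count climb above the limit once the outstanding quota was returned, which is the inconsistency that bumping both fields avoids.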

Obviously this is an inherently dangerous thing to do (adding values to kernel mode system cells), but I've done it many times to recover systems that would otherwise require a reboot.

Anyone who can't work out how to do this from the above description probably shouldn't attempt it, but if you're desperate, I can send detailed instructions, just send me mail (my name with @ in the middle and .com on the end).
A crucible of informative mistakes
Bjay
Advisor

Re: Mutex Proc

toss the server = reboot
This is Alpha 7.2-1. To add: these backup processes are submitted via Tapesys.

All the backup processes went to MUTEX, and then a number of Oracle processes went into RWAST.
At present I do not have the BACKUP options; I will post them soon.

The backups were submitted 3 hrs before this situation.