
Can three looping processes, zero priority, affect performance significantly?

 
SOLVED
Jon Pinkley
Honored Contributor

Re: Can three looping processes, zero priority, affect performance significantly?

RE:"will these priority 0 jobs run to completion of their quantum or will they be interrupted by processes at higher priority?"

They will be preempted by any kernel thread (process) whose priority is at least PRIORITY_OFFSET+1 higher than theirs (zero). The default value of the SYSGEN parameter PRIORITY_OFFSET is 0, so in that case a priority 1 process will preempt a priority 0 process.

Reference: page 31 of "OpenVMS Alpha Internals: Scheduling and Process Control"

http://books.google.com/books?id=ydKIsgCiFVsC&pg=PA29&dq=priority_offset#v=onepage&q=priority_offset&f=false
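The preemption rule described in this reply can be written down as a one-line check. This is only a toy model of the rule as stated here, not the scheduler's actual code; the function name `preempts` and its parameters are made up for illustration.

```python
def preempts(candidate_priority: int, running_priority: int,
             priority_offset: int = 0) -> bool:
    """Toy model: a newly computable process preempts the running one only if
    its priority is at least PRIORITY_OFFSET+1 higher (hypothetical helper)."""
    return candidate_priority >= running_priority + priority_offset + 1


# With the default PRIORITY_OFFSET of 0, priority 1 preempts priority 0.
print(preempts(1, 0))                     # True
# With a hypothetical PRIORITY_OFFSET of 2, priority 2 is not enough.
print(preempts(2, 0, priority_offset=2))  # False
```

Under this model, a process looping at priority 0 gives way to anything at a higher priority unless PRIORITY_OFFSET has been raised.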
it depends
EdgarZamora
Trusted Contributor

To elaborate on my earlier response: I had a similar problem with telnet processes not logging off properly and becoming detached processes that could not be deleted through normal means. VMS83A_DCL_V0300 fixed this problem. Below is an excerpt from the patch notes. It may not describe EXACTLY what you're experiencing (it didn't describe my case exactly either), but it sure did fix my problem. The patch fixes something to do with the NODELETE bit being set.

5.2.1.1 Problem Description:

If a parent process with its nodelete bit set spawned a
subprocess, if the subprocess was still active when
terminal attached to it was closed, the parent process
would go into a loop with 100% CPU utilization.

Images Affected:

- [SYSEXE]DCL.EXE


5.2.2.1 Problem Description:

After applying the VMS83A_RMS-V0600 or VMS83A_RMS-V0700
patch kits, a subprocess would go into a compute-bound
loop when its Telnet window on a remote PC was closed.

Images Affected:

- [SYSEXE]DCL.EXE


Clark Powell
Frequent Advisor

I was hopeful that the patch would solve the problem, but when I looked at the patch history I found that it is already installed:
DEC AXPVMS VMS83A_DCL V3.0 Patch Install Val 09-SEP-2008
so no luck there.

I have sent a crash dump to OpenVMS Engineering and I will report any developments.
John Gillings
Honored Contributor

Clark,

One step down from priority 0:

If you're worried about the impact of these processes on your CPUs, or the idle loop page zeroing work, why not suspend them?

$ SET PROCESS/SUSPEND/IDENTIFICATION=pid

Provided they're not holding any locks, all they will do is consume a process slot.
A crucible of informative mistakes
Volker Halle
Honored Contributor

Clark,


> When discussing this problem with Colorado Springs (or whatever it is now), they said that there was a feature in OpenVMS that would identify these processes and lower the priority automatically.

AFAIK, OpenVMS does not do anything like this.

There is one code path in SYS$EXIT where OpenVMS EXPLICITLY lowers the process's own priority to 0. This happens after the final $DELPRC_S (delete self) call in process exit handling, so that, if the $DELPRC fails, the process loops at priority 0 with the $DELPRC return status kept in R0!

If this is what you're seeing, look at the PC values of such a looping process. This loop in SYS$EXIT is a 'BRB .', so the PC value would remain constant. Then go into SDA, set context to that process and issue a SHOW PROC/REGISTER. What's the value stored in R0?

Maybe this application was setting the NODELETE bit and forgot to clear it on some unexpected error path...
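To make the sequence easier to follow, here is a toy Python model of the exit path as described in this reply. It is not OpenVMS code. The class, the function names and the starting priority are invented for illustration, the status values are symbolic strings rather than real condition codes, and the assumption that a set no-delete bit makes the self-delete return SS$_NODELETE is the conjecture under discussion, not a confirmed fact.

```python
from dataclasses import dataclass

# Symbolic stand-ins only; these are not the real numeric status values.
SS_NORMAL = "SS$_NORMAL"
SS_NODELETE = "SS$_NODELETE"


@dataclass
class Process:
    """Hypothetical, heavily simplified process state."""
    pid: int
    priority: int = 4          # arbitrary starting priority for the sketch
    nodelete: bool = False     # models the no-delete bit
    r0: str = SS_NORMAL
    deleted: bool = False
    looping: bool = False


def delprc_self(proc: Process) -> str:
    """Toy self-delete: assumed to fail when the no-delete bit is set."""
    if proc.nodelete:
        return SS_NODELETE
    proc.deleted = True
    return SS_NORMAL


def exit_tail(proc: Process) -> None:
    """Toy version of the tail of process exit described above."""
    status = delprc_self(proc)   # delete self
    if proc.deleted:
        return                   # normal case: control never comes back
    proc.priority = 0            # make the following loop harmless
    proc.r0 = status             # failure status stays visible in R0
    proc.looping = True          # stands in for the 'BRB .' loop


stuck = Process(pid=1, nodelete=True)
exit_tail(stuck)
print(stuck.priority, stuck.r0, stuck.looping)   # 0 SS$_NODELETE True
```

In this model a process with the no-delete bit still set ends up exactly in the state described: priority 0, a constant PC, and the failure status in R0.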

Volker.
Jon Pinkley
Honored Contributor

Clark,

Volker's response is an example of why he gets consistently high points awarded.

My bet is with Volker's conjecture. I am expecting the crash dump to reveal that R0 has value SS$_NODELETE, and that the process is looping in SYS$EXIT.

This is one of the code paths that was never expected to execute, because the process is requesting that it be killed, and does not expect to survive long enough to execute the instructions following the $DELPRC. By having the process loop at priority zero, the context is saved so it can be analyzed to determine why the path was taken.

My opinion is that $DELPRC with pid == self should not honor the no-delete bit, but that isn't the way it is currently coded. In other words, change the meaning of the nodelete bit to mean, "don't allow another process to kill me, but let me kill myself". The way it is now, the process can't be killed, but it can't do anything useful either.

By the way, the $DELPRC documentation does not mention SS$_NODELETE as a possible return status. It also says this about calling $DELPRC to delete the calling process:

"The Delete Process service allows a process to delete itself or another process. If you specify neither the pidadr nor the prcnam argument, $DELPRC deletes the calling process; control is not returned."

In my opinion, the current behavior is not correct, and should be modified to ignore the PCB$M_NODELET bit when the calling process is the target for deletion.
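The difference between the current behavior and the proposed one can be sketched as a small decision function. This is a toy model only: `delprc` here is a hypothetical Python function, the flag `ignore_nodelete_for_self` is invented to switch between the two behaviors, and the status strings are symbolic, following the expectation stated above that a refused delete returns SS$_NODELETE.

```python
def delprc(caller_pid: int, target_pid: int, target_nodelete: bool,
           ignore_nodelete_for_self: bool = False) -> str:
    """Return a symbolic status for a toy delete-process request.

    ignore_nodelete_for_self=False models the behavior described in this
    thread; True models the proposed change (hypothetical flag).
    """
    is_self = caller_pid == target_pid
    if target_nodelete and not (ignore_nodelete_for_self and is_self):
        return "SS$_NODELETE"   # request refused, target survives
    return "SS$_NORMAL"         # request accepted


# Current behavior: a process with the bit set cannot even delete itself.
print(delprc(100, 100, target_nodelete=True))
# -> SS$_NODELETE

# Proposed behavior: self-delete succeeds, deletion by others is still refused.
print(delprc(100, 100, target_nodelete=True, ignore_nodelete_for_self=True))
# -> SS$_NORMAL
print(delprc(200, 100, target_nodelete=True, ignore_nodelete_for_self=True))
# -> SS$_NODELETE
```

The proposal only changes the case where the caller is also the target; protection against other processes is unchanged.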

Can anyone think of a problem that the change would cause?

Jon
it depends
John Gillings
Honored Contributor

re: Jon:

>In my opinion, the current behavior is not
>correct, and should be modified to ignore
>the PCB$M_NODELET bit when the calling
>process is the target for deletion.
>
>Can anyone think of a problem that the
>change would cause?

Well, not really, but then, that's a problem! Setting NODELET isn't something you do without a REALLY good reason. Summarily overriding it when you're in a code path that you didn't expect to be in anyway is definitely NOT the sort of risk that OpenVMS philosophy allows.

You might argue that SUSPend would be more appropriate (fewer resources consumed, and more obvious in a SHOW SYSTEM display), but if I remember the code that Volker's talking about correctly, it's possible the $DELPRC could be asynchronous, so SUSP might block a process deletion that would otherwise complete.

Whatever is done in this "impossible" code path is really just papering over some other problem, so silently deleting the process would just hide it even more. The correct approach is to identify how these processes are getting into an unexpected state and fix the root cause.
A crucible of informative mistakes
Jon Pinkley
Honored Contributor

Even if $DELPRC did not honor the nodelete bit when pid == self, that alone would not solve all problems caused by improper use of PCB$V_NODELET.

For example, consider setting the no-delete bit in a subprocess and then killing the parent. In that case, I think the parent will enter an RSN$_ASTWAIT MWAIT state and deadlock.

So, as John Gillings said, the real problem needs to be found and eliminated. Since setting pcb$v_nodelet requires privilege, making incorrect use apparent will tend to get the real problem fixed sooner than if it is swept under the rug.

This reminds me of the Twilight Zone episode "Escape Clause"

http://en.wikipedia.org/wiki/Escape_Clause

except that the process doesn't have an escape clause once it is in the BRB loop.

Jon
it depends
Clark Powell
Frequent Advisor

Here is the code the process is looping in, and it does exactly what was previously described.

        $DELPRC_S                   ; DELETE SELF
        PUSHL   R0                  ; SAVE ANY ERROR RETURNED
        $SETPRI_S PRI=#0            ; MAKE NEXT LOOP HARMLESS
        POPL    R0                  ; RESTORE THE ERROR FROM DELPRC_S
20$:    BRB     20$                 ; ****** FELL THROUGH DELPRC SOMEHOW

The application, Cache, sets the NODELET bit, and a normal exit would clear it. In this case, however, the telnet session is initiated from a web browser, and when the web browser is stopped by a power outage or by clicking on the "X", the NODELET bit is not cleared. We don't have this problem with normally initiated interactive telnet sessions, but I don't know whether that's because the users of those sessions have been properly trained to use "LOGOUT" or because there is a built-in protection. That question is beyond the scope of this discussion. I think that you all have done a super job of analyzing this problem.

thanks
Clark Powell
Clark Powell
Frequent Advisor

I will talk to GE about their Web Access Flowcast product running on Cache and see if they can come up with a better way than setting the NODELET bit. Thanks for the help.
Clark Powell