
DCE and LIB$WAIT

 
Willem Grooters
Honored Contributor

DCE and LIB$WAIT

I inherited maintenance of a few programs that communicate with external systems using DCE, synchronously, so a wait in some form is implied.
The program uses

V_STATUS := LIB$SPAWN ('WAIT 00:00:37');

where I would have used

V_STATUS := LIB$WAIT (37.0, 1);

There must have been a reason for this. According to the memory of one of my colleagues, LIB$WAIT and the DCE code interfered with each other, causing a number of problems, and the SPAWN solution seemed the only way to solve them. But he didn't remember what was actually happening; it had something to do with event flags, perhaps.

Can anyone tell us more on this issue?

Alas, I cannot say which DCE routines are actually called: we use a library that wraps (and thus hides) the details. But if needed, I could dive into the objects to find out.

Environment:
OpenVMS 7.1 and 7.2 (Code developed and compiled on 7.2, linked and run on 7.1)

DEC VAXVMS VAX_DCEECO_015_1 V1.0
DEC VAXVMS DCE V1.5
Willem Grooters
OpenVMS Developer & System Manager
Ian Miller.
Honored Contributor

Re: DCE and LIB$WAIT

LIB$WAIT uses $HIBER and $SCHDWK so if something else does $WAKE then the wait can be shorter than you asked for.

Use $SETIMR & $WAITFR with a local event flag.
____________________
Purely Personal Opinion
Wim Van den Wyngaert
Honored Contributor

Re: DCE and LIB$WAIT

Had the same problem in 1995 with lib$wait. We had to test whether the wait time had actually elapsed and, if not, go back to sleep. It only happened from time to time and we never found out why. The process wasn't receiving any wake-up calls from the application. Context was COBOL, Oracle 7 (embedded SQL), VAX 5.5 (not sure, was it Alpha already?).
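
The "check and go back to sleep" workaround can be sketched in a portable way. This is a Python illustration of the idea, not VMS code: `wait_full` and its `sleep` and `clock` parameters are hypothetical names, and `sleep` stands in for any wait call that may return early (as LIB$WAIT apparently did here).

```python
import time


def wait_full(seconds, sleep=time.sleep, clock=time.monotonic):
    """Wait until `seconds` have elapsed, even if `sleep` returns early."""
    # Fix the deadline once, so early returns do not stretch the total wait.
    deadline = clock() + seconds
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return
        # The sleep may be cut short; the loop then recomputes what is left.
        sleep(remaining)
```

Computing the deadline once, rather than restarting a full wait after every early return, keeps the total delay close to the requested time no matter how many premature wakeups occur.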

Wim
Willem Grooters
Honored Contributor

Re: DCE and LIB$WAIT

Wim:
The second parameter in LIB$WAIT would prevent that, according to the documentation.
What my colleague did remember is that returning to activity could stretch to minutes, even after the timer had expired; only sometimes, indeed. (He didn't know of this flag.)
We could try, though, using $SETIMR and $WAITFR.
Willem Grooters
OpenVMS Developer & System Manager
John Gillings
Honored Contributor
Solution

Re: DCE and LIB$WAIT

Willem,

There are a few potential issues with the $HIBER/$WAKE mechanism when used for synchronization. Note these aren't "bugs" as such. They're consequences of the intended design features of the mechanism which can cause application misbehaviour if code makes invalid assumptions about what other threads of execution are doing.

The first is that if a $WAKE is already pending, subsequent $WAKEs are lost. This can result in a $HIBER hanging because another thread consumed the $WAKE intended for it. The converse is that a $WAKE isn't "targeted", so a $HIBER may be woken prematurely by a $WAKE meant for another thread.

So, in your case a call to LIB$WAIT may complete early if a $WAKE is triggered from DCE (or any other thread of execution), or, if there was already a $WAKE pending, the LIB$WAIT will consume it, potentially leaving a DCE thread hanging later on.
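
Both effects can be seen in a toy model. The Python sketch below is only an illustration of the behaviour described above, assuming a single pending-wake bit per process; `WakeFlag`, `wake` and `hiber` are hypothetical names, not the VMS system services themselves.

```python
class WakeFlag:
    """Toy model of one process-wide pending-wake bit (not the VMS API)."""

    def __init__(self):
        self.pending = False

    def wake(self):
        # A wake issued while one is already pending changes nothing: it is lost.
        self.pending = True

    def hiber(self):
        """Return True if a hibernate would return at once, False if it would block."""
        if self.pending:
            self.pending = False  # the pending wake is consumed by whoever asks first
            return True
        return False


flag = WakeFlag()
flag.wake()            # wake intended for thread A
flag.wake()            # wake intended for thread B, lost because one is pending
first = flag.hiber()   # True: returns immediately, whoever the wake was meant for
second = flag.hiber()  # False: this caller would hang, its wake no longer exists
```

Nothing in the bit records who a wake was meant for, which is why the first caller gets it regardless and the second one is left waiting.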

Properly written code should at least be immune from unexpected $WAKEs, and it should not consume $WAKEs from other threads. Protecting it from having its own $WAKEs consumed elsewhere is more difficult.

One solution is to make sure every $HIBER/$WAKE pair has a flag to check that the event you're waiting for has really happened. Here's a wait sequence with several mechanisms to eliminate spurious wakeups and spurious hangs (in this case there is some redundancy):

$HIBER side

TimeToWake=0
$WAKE     ; force wakeup
$HIBER    ; consume any pending wakes
WHILE NOT TimeToWake
    $HIBER
ENDWHILE
$WAKE     ; leave a pending wake in case we've consumed one intended for another thread

$WAKE side

TimeToWake=1
$WAKE

Of course, this assumes that any other threads are protected by a TimeToWake flag, otherwise there will be spurious wakeups.
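
The guarded loop can also be tried out off VMS. What follows is a Python sketch under stated assumptions: a `threading.Event` stands in for the process's single pending-wake bit, and `GuardedWait`, `stray_wake`, `signal` and `demo` are hypothetical names. It models the `WHILE NOT TimeToWake` loop and the trailing $WAKE from the sequence above, not the system services themselves.

```python
import threading
import time


class GuardedWait:
    """Hibernate/wake modelled with an Event, guarded by a TimeToWake flag."""

    def __init__(self):
        self._wake_bit = threading.Event()  # stand-in for the pending-wake bit
        self.time_to_wake = False           # set only by the real event
        self.spurious = 0                   # wakeups that were not ours

    def stray_wake(self):
        # A wake from some unrelated thread of execution.
        self._wake_bit.set()

    def _hiber(self):
        self._wake_bit.wait()
        self._wake_bit.clear()

    def wait(self):
        while not self.time_to_wake:
            self._hiber()
            if not self.time_to_wake:
                self.spurious += 1          # woken, but not for our event
        # Leave a wake pending in case we consumed one meant for someone else.
        self._wake_bit.set()

    def signal(self):
        self.time_to_wake = True            # set the flag before waking
        self._wake_bit.set()


def demo():
    g = GuardedWait()
    g.stray_wake()                          # a wake is already pending
    waiter = threading.Thread(target=g.wait)
    waiter.start()
    while g.spurious == 0:                  # let the waiter absorb the stray wake
        time.sleep(0.001)
    still_waiting = waiter.is_alive()       # the stray wake did not end the wait
    g.signal()
    waiter.join(timeout=5)
    return g.spurious, still_waiting, waiter.is_alive()
```

In this model `demo()` returns `(1, True, False)`: one spurious wakeup was absorbed, the waiter kept waiting afterwards, and it finished only once `signal` set the flag. Setting the flag before issuing the wake is what makes the loop safe.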

The SPAWN ensures that the WAIT command is isolated from other $WAKEs floating around the current process. It's a relatively expensive operation since it involves process creation, but if you're waiting 37 seconds it's obviously not in a tight loop and unlikely to be a problem unless you're seriously short of process slots. Crude but effective. Unless you're desperate to conserve CPU and memory I'd recommend leaving it as is.

The other advantage of SPAWN is you're specifying the time as a string, which makes it floating point format independent. That means it will behave correctly on both Alpha and IA64 without modification. As written, the call to LIB$WAIT will fail on IA64 unless you explicitly compile with /FLOAT=F_FLOAT, or add a float-type argument.

As Hoff has mentioned recently, if the problem is hanging in $HIBER, you can use the very blunt instrument of a $SCHDWK to inject regular wakeups to compensate for some thread consuming another thread's $WAKEs. This can even be done externally if the program is run in a detached process (see RUN/INTERVAL).
A crucible of informative mistakes
Willem Grooters
Honored Contributor

Re: DCE and LIB$WAIT

Thanks to Hoff and John for their explanations. There are other issues with DCE, but that's fodder for another discussion (I don't think these are related).
Willem Grooters
OpenVMS Developer & System Manager