
Re: $WAITFR behaviour

Hoff
Honored Contributor

Re: $WAITFR behaviour

To add one other oft-overlooked interprocess communications mechanism that's available on OpenVMS: RMS. RMS can be a silly-fast communications channel for many applications, and with very minimal configuration requirements.
Claus Olesen
Advisor
Solution

Re: $WAITFR behaviour

To me it looks like a problem of who gets the cpu. OpenVMS will put the slaves on the runnable queue when the master sets the ef, but the slaves may not actually get the cpu for a while after that. All the while, the master is free to continue its endless loop of setting and clearing the efs, whether or not the slaves have seen them. You may not be able to use realtime priorities, but if the master could be set at a lower realtime priority than the slaves, then that could ensure that the slaves get to run all the way up to the next event flag wait before the master gets the cpu again.
Hein van den Heuvel
Honored Contributor

Re: $WAITFR behaviour

Absolutely Claus.

That's a much better and simpler way to explain it. Occam's razor.
As presented the master never waits for the slave.
There is no formal synchronization at all.
It's just 'likely' that the slave gets scheduled when the master goes to wait for the timer, but it is not guaranteed.

The $show proc/cont just changed the priorities and scheduling.

Cheers,
Hein.
John Gillings
Honored Contributor

Re: $WAITFR behaviour

Re: Robert "except it is not clear to me that you actually need two locks to accomplish this."

If there are only two processes, you're absolutely correct. One lock will suffice, but it seems to me there may be several "Slaves" here. For maximum control, and to guarantee each slave gets a look in, have one lock for the master, and one for each slave. The slaves synch against master and their own lock. The master controls them all.

re: Hoff's comment about RMS - absolutely true. RMS files can be a simple way of using locks from DCL.
A crucible of informative mistakes
Barry Alford
Frequent Advisor

Re: $WAITFR behaviour

Thanks to all for your replies -- I didn't think that this would elicit such a discussion; I thought I had just misinterpreted how event flags, and especially $WAITFR, work.

Claus solved the problem by mentioning priorities: it works if I run the master at base priority 3 and the slaves at the default of 4 (tried it with two slaves so far). Hein suggested that the $show proc/cont changed the priorities, and this looks like it may be the case - but (like in quantum physics) I can't measure this without changing it!

I have not tried locks/ASTs as suggested by John and Robert; I may explore that later.

By the way, my Windows programming colleagues tell me this is really simple in their world! However, I believe their event-driven code is like the slaves registering with the master, which Hein suggested with his $HIBER/$WAKE, albeit with some more help from the OS.
Hein van den Heuvel
Honored Contributor

Re: $WAITFR behaviour

>> Claus solved the problem by mentioning priorities; if I run the master at base priority 3 and the slaves at the default of 4 (tried it with two slaves so far).

NO NO NO NO NO

Assigning priorities does not solve problems; it merely hides problems.

IF strict synchronization is a requirement, then the master somehow has to listen to the slave.

IF it's ok to miss a cycle or two, then priorities can be good enough, but they are never a solution.
For example, at some point the master may do an IO and get a priority boost from that. Or pixscan kicks in, or whatever.

Since you have a timer going off anyway, this could be acceptable in your case, as long as the slaves look for all possible work, not just 1 item.
The folks with just one event flag and no timer may hang too long waiting for the action if a wakeup is missed.
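
To illustrate "look for all possible work, not just 1 item", here is a minimal single-threaded sketch in C. All names (shared_t, slave_t, publish, on_wake_one, on_wake_drain) are hypothetical and nothing here is an OpenVMS call; it only models a counter of published items that a slave compares with its own progress each time it wakes. With real processes the counter would have to live in shared storage and be read and updated atomically or under a lock, which this sketch does not attempt.

```c
/* Hypothetical model: a counter of items the master has published, and a
   per-slave counter of items that slave has already handled. */
typedef struct { unsigned long produced; } shared_t;
typedef struct { unsigned long consumed; } slave_t;

/* Master side: publish one more work item. */
void publish(shared_t *sh)
{
    sh->produced++;
}

/* Slave side, naive: handle at most one item per wakeup.
   Returns the number of items handled (0 or 1). */
unsigned long on_wake_one(const shared_t *sh, slave_t *sl)
{
    if (sl->consumed < sh->produced) {
        sl->consumed++;
        return 1;
    }
    return 0;
}

/* Slave side, draining: handle everything published since the last wakeup,
   so a missed wakeup only delays work instead of leaving a backlog.
   Returns the number of items handled. */
unsigned long on_wake_drain(const shared_t *sh, slave_t *sl)
{
    unsigned long n = sh->produced - sl->consumed;
    sl->consumed = sh->produced;
    return n;
}
```

If the master publishes three items while a slave is not running, a single call to on_wake_one leaves two items pending, whereas a single call to on_wake_drain catches up completely.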

Cheers,
Hein.
Barry Alford
Frequent Advisor

Re: $WAITFR behaviour

Hein, maybe I should have said that Claus had solved the problem of the slaves not seeing the setting of the event flag -- I don't mean to say that this is a solution for my application.

It /could/ be a solution, as some missed frames may not be critical - the slaves are intended to run at 50Hz (20ms cycle time) as their position info is monitored by a graphics process (running on a pc). (Currently, the monolithic application does all that the slaves do in this period.) I have not yet run at this rate or given the slaves any work to do; this may then show up the flakiness of the scheduling problem again...

If so, I would probably try the $HIBER/$WAKE solution suggested by Hein; then I have control of the "wakeup list" of process IDs, rather than depending on what I thought $WAITFR would do. (I will need a mechanism for the slaves to register their process ID with the master.)

Finally, I may also try setting the slaves free and running them at 50Hz but asynchronously to each other. This may be no worse than tolerating them missing a cycle when running under the master.
Hoff
Honored Contributor

Re: $WAITFR behaviour

Why are the secondaries even having this problem?

(There's an ancient software design rule I subscribe to: if it hurts, don't do it.)

Consider another approach... Have a scheduled AST or $schdwk or other such in each of the secondaries (repeating TQE), and free-run the secondaries pulling in the frames from whatever the shared storage is.
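
As a rough illustration of a free-running secondary on a repeating timer, here is a minimal C sketch. It is a portable stand-in, not the OpenVMS mechanism: it uses POSIX clock_gettime and clock_nanosleep with TIMER_ABSTIME rather than a scheduled AST or $schdwk, and the names run_periodic, add_ns and count_work are hypothetical. Sleeping until an absolute deadline, instead of for a relative interval, keeps the period from drifting by however long the work takes.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Advance an absolute deadline by period_ns, keeping tv_nsec below 1e9. */
static void add_ns(struct timespec *t, long period_ns)
{
    t->tv_nsec += period_ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec++;
    }
}

/* Call work(cycle, ctx) once per period for the given number of cycles.
   Assumes 0 < period_ns < 1e9. Returns 0 on success, -1 on a clock error. */
int run_periodic(long period_ns, int cycles,
                 void (*work)(int cycle, void *ctx), void *ctx)
{
    struct timespec next;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < cycles; i++) {
        int rc;

        work(i, ctx);                /* e.g. pull the latest frame */
        add_ns(&next, period_ns);

        /* Sleep until the absolute deadline; retry if a signal interrupts.
           A deadline already in the past returns immediately, so a late
           cycle is followed by catch-up rather than accumulated drift. */
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;
    }
    return 0;
}

/* Example work function: count how many cycles ran. */
void count_work(int cycle, void *ctx)
{
    (void)cycle;
    (*(int *)ctx)++;
}
```

For the 50Hz case discussed in this thread, the period would be 20000000 ns, for example run_periodic(20000000L, 50, count_work, &n) for one second of cycles.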

The other approach I've used in this environment is an Ethernet multicast. I have used 60-06 here when I first started with this scheme an eon ago (as that stays on-LAN), but I'd also look to use a UDP multicast datagram for various environments.

http://h71000.www7.hp.com/DOC/82final/6529/6529pro_005.html

Basically, the primary multicasts the data (or just a ping) periodically, and if the secondaries pick up on the multicast, they display the data. So long as the next datagram is captured and the data in the datagrams are independent, if a UDP multicast datagram is occasionally dropped on the floor, well, oh, well...

I've successfully used this or a similar scheme for any number of monitoring tasks over the years. (It's also entirely platform-independent; it works on OpenVMS and on anything else that can receive a UDP multicast.) (If you want to chat off-line, I can describe a boo-boo or two I've made when designing these sorts of factory-floor and other such comm systems over the years.)

As for coordination and the election of a primary process, that should use locks. Coordinating processes within a cluster through other means is perilous at best, and a whole lot more work to get right. I'd not use $hiber and $wake for this task, and would definitely not use event flags here.

John Gillings
Honored Contributor

Re: $WAITFR behaviour

Barry,

Fairly simple rule... any synchronization mechanism that depends on differential priorities is wrong! It WILL break somewhere, sometime, when you least expect it, and you won't be able to reproduce the failure.

>By the way, my Windows programming
>colleagues tell me this is really simple in
>their world!

Synchronization between processes in OpenVMS is also really simple. Any operating system is fundamentally just a big synchronization engine, so it can't be any other way.

I suppose it could be argued that on OpenVMS it's made slightly harder because you have so much choice as to what mechanism you pick. So, rather than having to bend your design to fit what the operating system offers, you choose the best design for your application and then select the most appropriate mechanism to implement it.

Event flags can be good for "blind" signaling. "Wait here until something happens". There can be multiple processes all waiting on a common event flag, and they all get released at once, BUT you get no feedback, and no way to know all processes are ready before lowering the gate. They can be a good, lightweight mechanism, but unless you're very careful you may introduce timing windows, race conditions or false triggers.

Other mechanisms include HIBER/WAKE, locks, mailboxes, global sections, ICC, IPL, spinlocks, mutexes, ASTs, etc... See the Programming Concepts Manual.

Your first step is to make sure you can clearly describe what you want to do. I'm not sure I know that yet. I THINK it's like this:

master process
    timed loop
        send an event to each slave
        confirm slave has received event
    endloop

slave (multiple processes?)
    loop
        wait for event
        confirm receipt of event
    endloop
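
The outline above can be sketched in portable C. This is an analogue using POSIX threads inside one process, not OpenVMS system services and not separate processes; every name in it (sync_t, master_tick, slave_wait, run_demo) is hypothetical, and the master's timer is left out. The point it illustrates is the confirmation step: the master does not publish the next tick until every slave has acknowledged the current one, so no slave can miss a cycle regardless of scheduling or priorities.

```c
#include <pthread.h>
#include <stdlib.h>

#define MAX_SLAVES 16

/* Shared state; all names here are hypothetical. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    unsigned long   tick;     /* cycle number published by the master */
    int             acks;     /* slaves that have confirmed the current tick */
    int             nslaves;
    int             stop;
} sync_t;

typedef struct { sync_t *s; unsigned long seen; int in_order; } slave_arg_t;

/* Master: publish one tick, then block until every slave has confirmed it. */
void master_tick(sync_t *s)
{
    pthread_mutex_lock(&s->mu);
    s->tick++;
    s->acks = 0;
    pthread_cond_broadcast(&s->cv);
    while (s->acks < s->nslaves)
        pthread_cond_wait(&s->cv, &s->mu);
    pthread_mutex_unlock(&s->mu);
}

/* Slave: wait for a tick newer than last_seen and confirm receipt.
   Returns the new tick number, or 0 once the master has asked to stop. */
unsigned long slave_wait(sync_t *s, unsigned long last_seen)
{
    unsigned long t = 0;
    pthread_mutex_lock(&s->mu);
    while (s->tick == last_seen && !s->stop)
        pthread_cond_wait(&s->cv, &s->mu);
    if (s->tick != last_seen) {
        t = s->tick;
        s->acks++;
        pthread_cond_broadcast(&s->cv);   /* lets the master recheck acks */
    }
    pthread_mutex_unlock(&s->mu);
    return t;
}

static void *slave_main(void *p)
{
    slave_arg_t *a = p;
    unsigned long last = 0, t;
    while ((t = slave_wait(a->s, last)) != 0) {
        if (t != last + 1)
            a->in_order = 0;              /* would indicate a missed tick */
        last = t;
        a->seen++;                        /* real work would go here */
    }
    return NULL;
}

/* Runs nslaves slave threads for nticks master cycles and returns how many
   slaves saw every tick exactly once and in order (-1 on bad arguments). */
int run_demo(int nslaves, unsigned long nticks)
{
    sync_t s;
    pthread_t th[MAX_SLAVES];
    slave_arg_t args[MAX_SLAVES];
    int ok = 0;

    if (nslaves < 1 || nslaves > MAX_SLAVES)
        return -1;
    pthread_mutex_init(&s.mu, NULL);
    pthread_cond_init(&s.cv, NULL);
    s.tick = 0; s.acks = 0; s.nslaves = nslaves; s.stop = 0;

    for (int i = 0; i < nslaves; i++) {
        args[i] = (slave_arg_t){ &s, 0, 1 };
        if (pthread_create(&th[i], NULL, slave_main, &args[i]) != 0)
            abort();                      /* sketch only: no recovery path */
    }
    for (unsigned long n = 0; n < nticks; n++)
        master_tick(&s);

    pthread_mutex_lock(&s.mu);
    s.stop = 1;
    pthread_cond_broadcast(&s.cv);
    pthread_mutex_unlock(&s.mu);

    for (int i = 0; i < nslaves; i++) {
        pthread_join(th[i], NULL);
        if (args[i].in_order && args[i].seen == nticks)
            ok++;
    }
    pthread_cond_destroy(&s.cv);
    pthread_mutex_destroy(&s.mu);
    return ok;
}
```

Calling run_demo(3, 200) starts three slaves and drives 200 cycles. Note the cost of the confirmation: one stalled slave stalls the master, which is the trade-off against the "blind" event flag approach.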

There are lots of possible ways to do this, but event flags probably aren't the best choice.
A crucible of informative mistakes
Claus Olesen
Advisor

Re: $WAITFR behaviour

Hoff suggested using locks. I tried it using this:

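lksb A,B,C;

// master takes each lock in exclusive mode, slaves in concurrent read mode
if (!strcmp(argv[1],"master"))
    mode=LCK$K_EXMODE;
else //slaves
    mode=LCK$K_CRMODE;

for (x=0;;x++)
{
    printf("%d\n",x);
    // take the next lock in the A, B, C rotation before dropping the previous one to null mode
    sys_enq(A,mode);
    sys_enq(C,LCK$K_NLMODE);
    sys_enq(B,mode);
    sys_enq(A,LCK$K_NLMODE);
    sys_enq(C,mode);
    sys_enq(B,LCK$K_NLMODE);
    sleep(atoi(argv[2])); //your work here
}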

(sys_enq is pseudocode for your own convenience wrapper around sys$enq.) My test run with one master and 2 slaves showed lock step without missteps. It has the characteristic that you mentioned: the parties do not need to know about one another, and they can come and go.