Operating System - OpenVMS

Problem with Temporary Mail Box

 
Hein van den Heuvel
Honored Contributor

Re: Problem with Temporary Mail Box

>> One thing I can assure you there is no synchronization error.

And one thing I can assure you, and I'll put a decent wager on this if you are interested, is that there is!
There has to be! Otherwise you would not be asking the question right?

Your mindset to solve this problem has to be 'there is a communication / synchronization error somewhere in this application code'. You must assume VMS is doing the right thing, and that the program is doing something wrong somewhere. Maybe error handling, maybe logic, maybe both.

>> Actually P1 will also create a mailbox (say MBA345), and it will do a sys$qiow to this mailbox.
Now when P2 process has created its mailbox (say MBA346) then it will write the coded value in P1 mailbox (MBA345), since P1 is continuously reading

Well duh! Now you tell us! :-).
That sounds a lot more serious, but it is still not certain. For example, are you sure the IOSB for the QIOW read is not 'double booked' by accident (the same variable used for different I/Os)?

>> we are not sending the name of temporary mailbox in the message, rather we have defined a logical "TEMP_MAILBOX_" which will contain the name of temporary mailbox created by process with process id .

Why not send the mailbox name? Might as well!
Why go through the trouble of creating a logical name?
In what table?
Could that be failing?
Mind you.. you don't actually have to send anything special. Just 'Hi!' will do, as the IOSB after the read will contain the PID for the sender.
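A minimal sketch of that last point, in portable C so it can be tried anywhere. It assumes the usual layout of the IOSB after a mailbox read: a status word, a byte count word, and the sender's PID in the second longword. The type and function names (mbx_iosb_t, mbx_sender_pid) are made up for illustration and do not come from any system header.

```c
#include <stdint.h>

/* Hypothetical mirror of the 8-byte IOSB as filled in by a mailbox read.
   Field names are illustrative only. */
typedef struct {
    uint16_t status;      /* completion status of the I/O */
    uint16_t byte_count;  /* number of bytes transferred */
    uint32_t sender_pid;  /* PID of the process that wrote the message */
} mbx_iosb_t;

/* VMS condition values indicate success when the low bit is set. */
static int iosb_ok(const mbx_iosb_t *iosb)
{
    return (iosb->status & 1) != 0;
}

/* Returns the sender's PID, or 0 if the read did not complete successfully. */
static uint32_t mbx_sender_pid(const mbx_iosb_t *iosb)
{
    return iosb_ok(iosb) ? iosb->sender_pid : 0;
}
```

With this, P1 would not need the client to put its PID in the message body; it could take it from the IOSB of the completed read.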

I would recommend an LNM trace of the process startup:
$ANALYZE/SYSTEM
SDA>LNM LOAD ! If not already loaded
SDA>LNM START TRACE
SDA>SPAWN "start application"
SDA>LNM STOP TRACE
SDA>SET OUTPUT LNM_SHOW_TRACE.LOG
SDA>LNM SHOW TRACE

Good luck!
Hein.




Not applicable

Re: Problem with Temporary Mail Box

Processes P2-P10 write to the P1 mailbox as follows, using sys$qiow:

status = SYS$QIOW( 0,channel,code,&iosb,0,0,buffer,buf_size,0,0,0,0 );

where:

channel is the channel number assigned to the P1 mailbox.
code is IO$_WRITEVBLK | IO$M_NOW
buffer = ""
buf_size = 164.

Process P1 is reading the mailbox as follows:

qiow_status = sys$qiow(efn_flg, DECEDI$DS_CB_REMOTE.ds_mbx_channel_in, IO$_READVBLK,&iosb_blk,0,0,&dsr,sizeof(dsr),0,0,0,0);

where:

efn_flg value is obtained as follows:

LIB$GET_EF ( &efn_flg );

Regards,
ajaydec
Colin Butcher
Esteemed Contributor

Re: Problem with Temporary Mail Box

This does sound like a design flaw in the application. It's a classic example of one of the problems that commonly occurs when moving code from a uniprocessor environment to a multiprocessor environment. The introduction of parallelism often exposes design flaws in the state machine. I've had it happen to me with code that "worked perfectly" for over 7 years, then promptly blew up when we tested it on a multiprocessor environment. I'd failed to correctly guard against a race condition that I'd not considered as the implicit assumption was that things would happen in the correct sequence. That's not always true in a multiprocessor environment.

Please look closely at the way in which you both synchronise and serialise access to the mailbox during the startup sequence. It's not enough to do one thing at once, your code also has to do the right thing in the right order. The lock manager is the mechanism you want to be using to implement the serialisation and synchronisation mechanisms. Don't "roll your own" with flag bits.

As for inter-process communication - if it's only a small amount of data to be passed around, consider using the extended lock value block. If it's a lot of data shared between the processes, consider using a global section and having sufficiently fine granularity of the data structures protected by locks so as to minimise wait states. Don't forget about termination mailboxes either.

Have fun.

Cheers, Colin (http://www.xdelta.co.uk).

Entia non sunt multiplicanda praeter necessitatem (Occam's razor).
Not applicable

Re: Problem with Temporary Mail Box

Hi All,

Thanks for your help and time, but still I am not able to get it right. I'll again try to explain the process in detail:

1) Process P1 starts, creates a permanent mailbox, and reads it as follows:

qiow_status = sys$qiow(efn_flg, DECEDI$DS_CB_REMOTE.ds_mbx_channel_in, IO$_READVBLK,&iosb_blk,0,0,&dsr,sizeof(dsr),0,0,0,0);

where:

efn_flg value is obtained as follows:

LIB$GET_EF ( &efn_flg );

2) Processes P2-P10 are created, and each process does the following:

i) Assign a channel to mailbox created by process P1 using sys$assign.
ii) Create a temporary mailbox.

status1 = SYS$CREMBX( 0, channel, msg_size, 0, 0, 0, &mbx_log_name );

where:
mbx_log_name is the logical name for the temporary mailbox
iii) Pings/writes to the mailbox created by P1 as follows:
status = SYS$QIOW( 0,channel,code,&iosb,0,0,buffer,buf_size,0,0,0,0 );

where:

channel is the channel number assigned to the P1 mailbox.
code is IO$_WRITEVBLK | IO$M_NOW
buffer = ""
buf_size = 164.
iv) After this waits for process P1 to write in its temporary mailbox as follows:
status = SYS$QIOW(efn_flag,io_channel_in,IO$_READVBLK,0,0,0,&P1_reply,sizeof(P1_reply),0,0,0,0);


3) Once Process P2-P10 writes in mailbox of process P1, process P1 will assign a channel to temporary mailbox of process P2-P10. Temporary mail box name will be defined in logical tmp_mbx_name_, where process_id is the process id of the process P2-P10.

Could anyone help me find where synchronization or serialization is lacking? Because of it, process P1 sometimes gets an IVDEVNAM error when it tries to assign a channel to the temporary mailbox created by one of the processes P2-P10.

Regards,
ajaydec
Robert Gezelter
Honored Contributor

Re: Problem with Temporary Mail Box

ajaydec,

I agree with Hein, Hoff, Colin, et al that this is a synchronization error. The IVDEVNAM error can occur for any number of reasons, not necessarily only the obvious ones.

Personally, I prefer to chase these problems analytically, not by gathering a lot of data, which, in the end, will not be helpful. However, if one wishes to exclude some possibilities, one can collect the names of the devices being used by the ASSIGN call in a separate array, and then look at the array with the debugger. DO NOT use printf statements, as the extra time has a good chance of disrupting the timing behavior.
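A minimal sketch of such an in-memory trace, in portable C. Everything named here (trace_names, trace_next, trace_device_name, the slot sizes) is hypothetical. The name is passed with an explicit length, on the assumption that the caller holds it as a descriptor-style buffer rather than a NUL-terminated string.

```c
#include <string.h>

#define TRACE_SLOTS    64   /* number of names kept before wrapping */
#define TRACE_NAME_MAX 64   /* bytes per slot, including the terminator */

/* Hypothetical trace storage: examine these two variables in the debugger. */
static char     trace_names[TRACE_SLOTS][TRACE_NAME_MAX];
static unsigned trace_next;  /* total number of names recorded so far */

/* Copy the name about to be handed to the assign call into the next slot.
   No I/O is done here, so the timing impact stays small. */
static void trace_device_name(const char *name, size_t len)
{
    char *slot = trace_names[trace_next % TRACE_SLOTS];

    if (len >= TRACE_NAME_MAX)
        len = TRACE_NAME_MAX - 1;   /* truncate rather than overflow */
    memcpy(slot, name, len);
    slot[len] = '\0';
    trace_next++;
}
```

Calling trace_device_name just before each assign leaves the most recent TRACE_SLOTS names available for inspection after a failure.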

- Bob Gezelter, http://www.rlgsc.com
Guenther Froehlin
Valued Contributor

Re: Problem with Temporary Mail Box

So far I don't see a coding problem except that:

"iv) After this waits for process P1 to write in its temporary mailbox as follows:
status = SYS$QIOW(efn_flag,io_channel_in,IO$_READVBLK,0,0,0,&P1_reply,sizeof(P1_reply),0,0,0,0);"

...is not using an IOSB.
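For illustration, here is a hedged sketch in portable C of why the IOSB matters: the value returned by the service only says whether the request was accepted, while the IOSB carries the completion status of the I/O itself. The iosb_t type and the helper names are hypothetical; the only VMS fact relied on is that a condition value with the low bit set means success.

```c
#include <stdint.h>

/* Hypothetical 8-byte IOSB mirror; field names are illustrative only. */
typedef struct {
    uint16_t status;    /* completion status of the I/O */
    uint16_t count;     /* bytes transferred */
    uint32_t dev_info;  /* device-dependent information */
} iosb_t;

/* VMS condition values indicate success when the low bit is set. */
static int vms_success(uint32_t cond)
{
    return (cond & 1) != 0;
}

/* Returns the first failing condition value, or the IOSB completion
   status when the request was accepted. */
static uint32_t qiow_final_status(uint32_t service_status, const iosb_t *iosb)
{
    if (!vms_success(service_status))
        return service_status;   /* the request was never queued */
    return iosb->status;         /* outcome of the I/O itself */
}
```

A read issued without an IOSB can only be checked against the first of these two values, so a failed completion would go unnoticed.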

Here is what the book says about $ASSIGN and IVDEVNAM:

"No device name was specified, the logical name translation failed, or the device or mailbox name string contains invalid characters."

To get more information I would output/log the translation of the logical used in the $ASSIGN. My bet is that the transmission and/or handling of the logical name has a problem.
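As a rough aid for that kind of logging, the sketch below classifies a translated name before it is handed to $ASSIGN. It assumes the translation is available as a buffer plus length. The checks are deliberately conservative and are not the exact rules $ASSIGN applies: they only flag an empty string or bytes that are clearly suspect (NUL, control characters, blanks, non-ASCII), which is what a wrong descriptor length tends to produce. All names are hypothetical.

```c
#include <stddef.h>

typedef enum {
    NAME_OK,            /* nothing obviously wrong */
    NAME_EMPTY,         /* zero-length translation */
    NAME_SUSPECT_CHAR   /* NUL, control character, blank or non-ASCII byte */
} name_check_t;

/* Hypothetical pre-check of a translated mailbox name. */
static name_check_t check_translated_name(const char *buf, size_t len)
{
    if (len == 0)
        return NAME_EMPTY;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (c <= ' ' || c >= 0x7F)
            return NAME_SUSPECT_CHAR;
    }
    return NAME_OK;
}
```

Recording the result alongside the raw bytes and length would show whether the name was empty or padded at the moment of the failing assign.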

I don't see a synchronization problem.

/Guenther
Hoff
Honored Contributor

Re: Problem with Temporary Mail Box

[[[One common model for this type of situation is to have the master process (your P1) create a PERMANENT mailbox, with a system wide, well known logical name. The clients create a temporary mailbox then send a message to the master via the permanent mailbox, including the name of their temporary mailbox in the message. The master then opens a channel to the client mailbox and two way communication is established.]]]
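A minimal sketch of the client's first message in that model, in portable C. The layout (a length byte followed by the name) and the names client_hello_t and build_hello are assumptions made for illustration; the thread does not specify a message format.

```c
#include <stdint.h>
#include <string.h>

#define MBX_NAME_MAX 63   /* assumed upper bound on the mailbox name length */

/* Hypothetical registration message sent to the master's mailbox. */
typedef struct {
    uint8_t name_len;             /* number of valid bytes in name */
    char    name[MBX_NAME_MAX];   /* client's temporary mailbox name, not NUL-terminated */
} client_hello_t;

/* Fill in the message; returns 0 on success, -1 if the name does not fit. */
static int build_hello(client_hello_t *msg, const char *mbx_name, size_t len)
{
    if (len == 0 || len > MBX_NAME_MAX)
        return -1;
    memset(msg, 0, sizeof *msg);
    msg->name_len = (uint8_t)len;
    memcpy(msg->name, mbx_name, len);
    return 0;
}
```

Because the name travels in the same message that announces the client, the master never has to look it up separately.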

The uni-directional mailbox design here is certainly quite typical.

Conversely, permanent mailboxes can often serve as a way to trigger run-time bugs in my experience. Permanent mailboxes require cleanup, where temporary mailboxes do not.

I can't think of a recent case where I've chosen to use a permanent mailbox, and I have typically removed such usage from code I'm maintaining, as doing so can help the code deal better with failures and restarts.

Mailboxes themselves are not something I'd tend to use in new applications, save for specific cases. If you're willing to overtly tie into OpenVMS, I'd suggest ICC. If you're more interested in portability, I'd look to use middleware or IP (IPv4 or IPv6) sockets.

[[[We are also doing the same, might be I am not able to explain it properly. The only difference is that we are not sending the name of temporary mailbox in the message, rather we have defined a logical "TEMP_MAILBOX_" which will contain the name of temporary mailbox created by process with process id .]]]

I'd suggest not naming an object (a variable, file, mailbox, etc) for what it is (since any of us can look at same and figure that out), but to name the object for what it is used for, and for what particular application is using it.

Something akin to FOO_CLIENT_pid, for instance, identifies FOO as the application facility associated with the mailbox, and that the mailbox is a client mailbox. Or FOO_scsnode_CLIENT_pid, if you're working with a cluster-visible object. Use of the facility also avoids colliding with some other programmer who chose to use the logical name TEMP_MAILBOX_pid, too...
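A small sketch of building such names, in portable C. The helper name client_mbx_logical is hypothetical, and rendering the PID as eight uppercase hex digits is an assumption rather than a requirement.

```c
#include <stdint.h>
#include <stdio.h>

/* Build "<facility>_CLIENT_<pid>" or, when a node name is supplied,
   "<facility>_<node>_CLIENT_<pid>". Returns the name length, or -1 if
   the output buffer is too small. */
static int client_mbx_logical(char *out, size_t out_size,
                              const char *facility, const char *node,
                              uint32_t pid)
{
    int n;

    if (node != NULL && node[0] != '\0')
        n = snprintf(out, out_size, "%s_%s_CLIENT_%08X",
                     facility, node, (unsigned)pid);
    else
        n = snprintf(out, out_size, "%s_CLIENT_%08X",
                     facility, (unsigned)pid);

    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Using one helper on both the client and the master side keeps the two from disagreeing about how the name is spelled.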

Opinion only, of course.
Larry Bohan
Advisor

Re: Problem with Temporary Mail Box

maybe consider setting things up such that the temporary mailbox logicals (names) are entered in a group logical table?

iirc, this was (or similar to) the VMS 3.x
behavior.

we had some old code (mid/late 1980's) that required this behavior, but there might well be some larger (newer) reasons why this would *not* be a good idea in the general case.

$ DEFINE/TABLE=LNM$PROCESS_DIRECTORY -
  LNM$TEMPORARY_MAILBOX LNM$GROUP

..or (approximately):

l_sts = lib$set_logical(
$DESCR("LNM$TEMPORARY_MAILBOX"),
$DESCR2("LNM$GROUP"),
$DESCRL("LNM$PROCESS_DIRECTORY"),
0, 0);
Jan van den Ende
Honored Contributor

Re: Problem with Temporary Mail Box

@Larry,

Just be aware that ANYBODY _WRITING_ anything in that mailbox now requires GRPNAM privilege!

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Hoff
Honored Contributor

Re: Problem with Temporary Mail Box

[[[Just be aware that ANYBODY _WRITING_ anything in that mailbox now requires GRPNAM privilege!]]]

Privileges are initially checked when an object (file, device, mailbox, logical name table, queue, etc) is accessed (or created), and not generally checked (again) at run-time as the channel is accessed. (Yes, there are specific operations that might involve extra checks -- tossing an IO$_DIAGNOSE function at the device, for instance -- but these are not typical).

And a programmer can set a group mailbox to any protection that might be required. (The associated logical name goes in the group table and grpnam can be (is) required there, but the mailbox itself has its own and separate protection.)

And if you're so inclined, you can reconfigure the group logical name table protection for a table, as these tables are also security objects and have ownership, protections and ACL capabilities.

Further, programmers will generally want to confirm that the mailbox device ownership and the device protection are as expected, along with any ACL that might be associated with the device. (Mailboxes are an ideal site for injecting messages into an environment; there can be security implications in a security-critical environment. This traffic can potentially require protection.)