Operating System - OpenVMS

select() on non-sockets, porting library or C lib solution?

 
Ben Armstrong
Regular Advisor

I have been trying to resolve some outstanding problems in a port of Ruby to OpenVMS by Masamichi Akiyoshi[1]. Enough of this port works to be useful already, but much work is still to be done.

I am currently trying to fix a problem in the Ruby thread scheduling support. Ruby implements this with select(), which on VMS is famously restricted to sockets. If you pass select() any non-socket file descriptors, it fails with an ENOTSOCK error. Is there a current or planned solution for this in the porting library or in the C run-time library itself?
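
The failure described above can be reproduced outside Ruby with a few lines of C. The following is a minimal sketch, not code from the Ruby sources: probe_readable() and probe_pipe() are hypothetical names, and the header names follow POSIX and may need adjusting on VMS. It calls select() on the read end of a pipe, which is a non-socket descriptor.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: ask select() whether fd is readable, waiting at
 * most one second. Returns select()'s own result: the number of ready
 * descriptors, 0 on timeout, or -1 with errno set. */
int probe_readable(int fd)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}

/* Hypothetical driver: create a pipe, make its read end readable by
 * writing one byte, and probe it. Returns probe_readable()'s result,
 * or -2 if the pipe itself could not be set up. errno from select()
 * is preserved across the close() calls. */
int probe_pipe(void)
{
    int p[2];
    int n, saved;

    if (pipe(p) != 0)
        return -2;
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -2;
    }
    n = probe_readable(p[0]);
    saved = errno;
    close(p[0]);
    close(p[1]);
    errno = saved;
    return n;
}
```

On a Unix system probe_pipe() should return 1. On VMS, going by the behaviour described in this post, it would be expected to return -1 with errno set to ENOTSOCK.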

The release notes for The Jackets (A9) indicate that some work has been done on a select() wrapper, but the note is phrased in a way that leaves me wondering whether the wrapper is functional or just stubbed out.

I don't currently use the porting library for Ruby, but I might if it turns out to solve any problems that I'm currently faced with.

Thanks,
Ben

[1] http://www.geocities.jp/vmsruby/en/
Sadly, it looks like Masamichi cannot continue with this work.
Brad McCusker
Respected Contributor
Solution

Re: select() on non-sockets, porting library or C lib solution?

Full-function select() is on the roadmap, but it won't be in 8.3, sorry to say. Leo Demers (at hp dot com) is the business manager for UNIX Portability, and this falls into that category. Send him your cards and letters.

I can't speak for the "porting library" (not even sure which one you refer to).

Brad McCusker
Software Concepts International
Ben Armstrong
Regular Advisor

Re: select() on non-sockets, porting library or C lib solution?

The porting library I am referring to is this one:

http://h71000.www7.hp.com/openvms/products/ips/porting.html

The release note in question is from this page:

http://h71000.www7.hp.com/openvms/products/ips/porting_relnotes.html

* Initial support for select() jacket.

I wonder if that means select() works in a Unix way now, or if they've merely added a wrapper which is stubbed out for future implementation of Unix select(). (I'm hoping for the former, but fear the latter.) I have not looked at their select() code yet, but that is certainly my next step if nobody here already knows the answer.

Also, September 2003 (the release date of version A9) is now quite some time ago, so if there's something in the works in this package that fixes select() I'd like to know about it.

Ben
Kris Clippeleyr
Honored Contributor

Re: select() on non-sockets, porting library or C lib solution?

Ben,

The 'select' from the 'porting library' is a simple wrapper around the CRTL select function. No U**x stuff added.

Kris (aka Qkcl)
I'm gonna hit the highway like a battering ram on a silver-black phantom bike...
Craig A Berry
Honored Contributor

Re: select() on non-sockets, porting library or C lib solution?

Ben, for what it's worth, select() on non-socket file descriptors doesn't work on Win32 either, so you might see if there is a Win32 port of Ruby and what (if anything) they do about it.

One thing to look at is what file descriptors the select() is trying to process. If they are created specifically for the thread scheduling and not used for any other purpose, it may be possible to create them as sockets in the first place. Obviously that won't work if they need to be real files for some reason.
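
The idea of creating the descriptors as sockets in the first place might look like the sketch below. It is an illustration only: make_wakeup_pair() and wait_readable() are hypothetical names, and it assumes that socketpair() with AF_UNIX is available on the target, which has not been checked for VMS. Where it is not, a pair of connected loopback sockets would have to be built by hand.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>   /* read(), write(), close() work on the pair */

/* Hypothetical stand-in for pipe(): fills fds[0] and fds[1] with a
 * connected pair of stream sockets. Returns 0 on success, -1 on error.
 * Assumption: socketpair() and AF_UNIX exist on the target system. */
int make_wakeup_pair(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
}

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * Returns select()'s result: 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}
```

Because both ends are sockets, select() has no reason to reject them with ENOTSOCK. This only helps for descriptors the program creates itself; descriptors that must refer to real files or terminals are not covered.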

It strikes me as odd that select() would be used for "thread scheduling" since pthreads have their own documented techniques for synchronizing threads, but I know nothing about the Ruby implementation.
Ben Armstrong
Regular Advisor

Re: select() on non-sockets, porting library or C lib solution?

Chuck,

My understanding is that the win32 code is integrated in the main project CVS. Certainly there is evidence in the code of win32 support. So, I don't know what to make of the following.

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi/ruby/eval.c?rev=HEAD;content-type=text%2Fx-cvsweb-markup

For one thing, it does indeed look like this code is intended to run on non-socket fds.

Also, I am not seeing where Ruby's eval.c would skip the select() for win32. Nor can I see any exception for win32 that prevents the select() directly, or that skips entering the WAIT_SELECT state, or that prevents the call of rb_thread_fd_writable() which sets that state (see io.c, which calls this function in several places).

Furthermore, the one-line test case we devised, which uses threads (and therefore select(), as shown above), failed under OpenVMS but succeeded on Windows! I can't account for that if select() operates on Windows the same way as on OpenVMS.

I am confused. I can't reconcile the evidence I have examined so far with your assertion that select() on non-socket fds in win32 doesn't work. Can you shed any light on this mystery?

As for your comments on pthreads, I must confess my ignorance. Debugging this problem in this port of Ruby has been rather a "baptism by fire" for me, as I have no experience in programming with threads.

Ben

Craig A Berry
Honored Contributor

Re: select() on non-sockets, porting library or C lib solution?

Ben,

As far as I know I am no relation to Chuck, and the weakness of my guitar skills tends to support this :-).

On Win32 not supporting select() on non-sockets, see:

http://www.modperl.com/perl_networking/errata.html#ch12

I'm guessing that the reason Ruby doesn't give errors when doing select() on non-sockets on Windows is that the Windows version of select() apparently just ignores these calls rather than returning an error. As the URL above indicates, this can lead to race conditions or at least excessive CPU consumption.

VMS rather characteristically gives you the bad news early and helps prevent you from shooting yourself in the foot with code that appears to work but doesn't.

You can probably duplicate the Windows behavior by changing:

n = select(max+1, &readfds, &writefds, &exceptfds, delay_ptr);
if (n < 0) {
    int e = errno;

to

n = select(max+1, &readfds, &writefds, &exceptfds, delay_ptr);

#ifdef __VMS
if (n < 0 && errno == ENOTSOCK) n = 0;
#endif
if (n < 0) {
    int e = errno;

It seems unlikely that this would be a good way to go, but sometimes desperate hacks can at least buy you some time until you can come up with a better fix.
Ben Armstrong
Regular Advisor

Re: select() on non-sockets, porting library or C lib solution?

Hm, seems my personal name hash has a bit of a collision problem. Sorry about that, Craig.[0]

Your hack is essentially what I've done, only I hacked it at a different level, preventing "wait on select" state from ever being entered. And yes, we realized when we made this hack that it would be subject to those problems.

Ben
[0] Incidentally, ITRC's forums would be much more pleasant to use and less prone to this sort of accident if they'd show the entire thread as context on the "postanswer" page, as other forum systems do (e.g. Geeklog).
Ben Armstrong
Regular Advisor

Re: select() on non-sockets, porting library or C lib solution?

Well, it turns out my solution was buggy. I hadn't considered that WAIT_SELECT isn't the only case where needs_select=1. It is also set, for example, by WAIT_FD. So back to the drawing board.

I tried your hack, and it didn't work any better. I believe this is because when we set n = 0, we're not telling the scheduler to put any threads into THREAD_RUNNABLE state. It must be just too late in the day and I've been staring at the code too long, as I'm not seeing an elegant way to fake it.
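
One conceivable way to fake it, sketched here purely as an assumption and not as anything Ruby does, is to report every requested descriptor as ready when select() fails with ENOTSOCK, instead of reporting that nothing is ready. This assumes the scheduler decides which threads to wake by testing the descriptor sets with FD_ISSET after the call. The names select_or_assume_ready() and count_set() are hypothetical, and the sketch has not been tried on VMS.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Count the descriptors below nfds that are set in *set.
 * A NULL set counts as empty. */
int count_set(int nfds, fd_set *set)
{
    int fd, n = 0;

    if (set == NULL)
        return 0;
    for (fd = 0; fd < nfds; fd++)
        if (FD_ISSET(fd, set))
            n++;
    return n;
}

/* Hypothetical wrapper: behaves like select(), except that an ENOTSOCK
 * failure is turned into "all requested read and write descriptors are
 * ready" and no exceptional conditions are reported. */
int select_or_assume_ready(int nfds, fd_set *r, fd_set *w, fd_set *e,
                           struct timeval *tv)
{
    fd_set r0, w0;
    int n;

    /* Remember what the caller asked for before select() runs. */
    FD_ZERO(&r0);
    FD_ZERO(&w0);
    if (r) r0 = *r;
    if (w) w0 = *w;

    n = select(nfds, r, w, e, tv);
    if (n < 0 && errno == ENOTSOCK) {
        /* Pretend the call succeeded with everything ready. */
        if (r) *r = r0;
        if (w) *w = w0;
        if (e) FD_ZERO(e);
        n = count_set(nfds, r) + count_set(nfds, w);
    }
    return n;
}
```

The cost is the one already noted in this thread: a thread woken this way may find that its descriptor is not really ready, so its I/O call can block the whole process, or the scheduler can spin and burn CPU.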

Ben
Ben Armstrong
Regular Advisor

Re: select() on non-sockets, porting library or C lib solution?

My biggest barrier to resolving this on my own right now is that Windows select() apparently does the "right thing" but it is unclear to me what that actually is, and how to simulate that in the VMS case. If anyone could explain, I'd really appreciate it.

It's clear to me now that I ought to take this problem to the ruby-core list, as it looks like we're into the area of core Ruby problems that just happen to work out OK on Windows due to some quirk of how Windows select() is implemented. I think a more deliberate effort should be made to address the issue in a general way, given that it afflicts multiple non-Unix platforms (e.g. some configuration option like SELECT_SUPPORTS_FDS). I am surprised that this known issue with Windows select() doesn't even get flagged with a comment in the code.

Ben