VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

 
SOLVED
Craig A Berry
Honored Contributor

I'm dealing with some code that basically does this:

VAXC$ESTABLISH(handler);
LIB$FIND_IMAGE_SYMBOL(image, symbol ...);

Then the handler calls SYS$PUTMSG with (parts of) the signal array that it gets and provides SYS$PUTMSG with an action routine to store the resulting formatted message in a safe place. The definition of "safe" needs further examination, which I'll get to in a sec.

This all works peachy most of the time, but here's the kicker: the SYS$PUTMSG action routine calls pthread_getspecific in order to say, "What thread am I?" and then stores the error message in thread-specific data for use by somebody downstream doing clean-up. Occasionally, under heavy load and with multiple threads active, an error condition in LIB$FIND_IMAGE_SYMBOL results in either an access violation when trying to store the error message, or the sanity checking in the downstream destructor routine says, "Hey, this data belongs to some other thread, why are you giving it to me to clean up?"
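For readers less familiar with the POSIX calls mentioned above, here is a minimal, portable sketch of the thread-specific-data pattern: each thread keeps its own message buffer under a shared key, and a destructor frees it at thread exit. This is not the code under discussion; store_message, last_message, worker, and msg_key are hypothetical names, and nothing here involves SYS$PUTMSG or a VMS condition handler.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_key_t msg_key;                  /* hypothetical key name */
static pthread_once_t msg_once = PTHREAD_ONCE_INIT;

/* Destructor: called at thread exit with that thread's non-NULL value. */
static void free_message(void *p) { free(p); }

static void make_key(void) { pthread_key_create(&msg_key, free_message); }

/* Save a private copy of text in the calling thread's slot; 0 on success. */
int store_message(const char *text)
{
    char *copy, *old;

    pthread_once(&msg_once, make_key);
    copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);

    old = pthread_getspecific(msg_key);        /* previous message, if any */
    if (pthread_setspecific(msg_key, copy) != 0) {
        free(copy);
        return -1;
    }
    free(old);
    return 0;
}

/* Return the calling thread's saved message, or NULL if it has none. */
const char *last_message(void)
{
    pthread_once(&msg_once, make_key);
    return pthread_getspecific(msg_key);
}

/* Demo thread body: store arg, then check this thread sees its own text.
   Returns arg on success, NULL otherwise. */
void *worker(void *arg)
{
    const char *text = arg;

    if (store_message(text) != 0)
        return NULL;
    return strcmp(last_message(), text) == 0 ? arg : NULL;
}
```

A value stored by one thread is never visible through pthread_getspecific in another, which is why a destructor that finds another thread's data points to a bookkeeping problem elsewhere.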

I'm guessing that the exception handler does not necessarily execute in the same thread as the call to LIB$FIND_IMAGE_SYMBOL, so the call to pthread_getspecific reports some other thread (parent thread? thread 0?). Is that true? Is it documented anywhere? What about SYS$PUTMSG's action routine -- does it execute in the same thread as the routine that calls SYS$PUTMSG? Is there any way from the condition handler to find out what thread I was in when calling LIB$FIND_IMAGE_SYMBOL?

I'm guessing not, and the only solution I can think of is to get rid of all the thread-aware bits of the condition handler and the action routine and just store the error message in global static data. Then do something like:


VAXC$ESTABLISH(handler);
pthread_mutex_lock(&mymutex);
LIB$FIND_IMAGE_SYMBOL(image, symbol ...);

if error, copy error message from global
data to thread-specific data

pthread_mutex_unlock(&mymutex);

Is that the best/only way to deal with this situation?
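The proposal above can be sketched in portable POSIX C. This is only an illustration under stated assumptions: lookup_stub stands in for LIB$FIND_IMAGE_SYMBOL plus the handler that fills the global buffer, and global_msg, find_symbol_locked, and the message text are all hypothetical. The caller supplies the per-thread destination, so the copy happens before the lock is released.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX 256

static pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;
static char global_msg[MSG_MAX];    /* touched only while mymutex is held */

/* Stand-in for the real lookup (assumption): fails for an empty symbol
   name and leaves a message in global_msg, as the handler would. */
static int lookup_stub(const char *image, const char *symbol)
{
    if (symbol[0] == '\0') {
        snprintf(global_msg, sizeof global_msg,
                 "symbol lookup failed in image %s", image);
        return 0;
    }
    return 1;
}

/* Returns 1 on success. On failure returns 0 and copies the message
   into the caller's own buffer before releasing the lock. */
int find_symbol_locked(const char *image, const char *symbol,
                       char *out, size_t outlen)
{
    int ok;

    pthread_mutex_lock(&mymutex);
    ok = lookup_stub(image, symbol);
    if (!ok && outlen > 0) {
        strncpy(out, global_msg, outlen - 1);
        out[outlen - 1] = '\0';
    }
    pthread_mutex_unlock(&mymutex);
    return ok;
}
```

The lock serializes every lookup, which is the price of funnelling the message through one shared buffer.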
8 REPLIES
Hoff
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

I might try a global work queue, depending on what you're up to, to get the information back to where you need it.

While I like and use ASTs and KP threads and multi-processing designs and such, the DECthreads all-in-one design (to my mind) never seems to have realized its full potential or full stability; things just seem to go weird with threading models.

Please don't read this as me placing any blame on DECthreads, either. Getting DECthreads or any threading-based application code "right" is an effort within any non-trivial project. Application coding errors and the occasional error in the underlying API tend to be subtle. Threading -- of any sort -- is a tough approach.

http://www.artist-embedded.org/artist/The-Trouble-With-Threads.html

John Gillings
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

Craig,

Isn't this all a bit byzantine for what amounts to a fairly simple error? I think it was rather silly for LIB$FIS to be implemented as signalling errors. Would you be going to all that trouble if LIB$FIS just returned a condition code like any other normal RTL function?

What it comes down to is "Please give me an address for this image/symbol pair. Did it work or not?". Sure you lose a bit of information by not getting a signal array, but it's unlikely to tell you anything you didn't know anyway.

I usually wrap LIB$FIS into a jacket to convert the signalled error into a return status. MACRO32 for brevity and universality:

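.TITLE LIB_FIND_IMAGE_SYMBOL
.PSECT code,RD,NOWRT,EXE
.CALL_ENTRY,MAX_ARGS=5,HOME_ARGS=TRUE,LABEL=LIB_FIND_IMAGE_SYMBOL
MOVAB G^LIB$SIG_TO_RET,(FP)        ; establish LIB$SIG_TO_RET as this frame's condition handler
CALLG (AP),LIB$FIND_IMAGE_SYMBOL   ; call through with the caller's argument list
RET                                ; R0 holds the status, returned or converted from a signal
.END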

(Note that this will work on all 3 platforms; the MACRO compiler is smart enough to convert the VAXism of writing the handler address to FP into an appropriate construct for Alpha or IA64.)
A crucible of informative mistakes
Craig A Berry
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

Hoff, John, thanks for the replies.

Hoff, a global work queue is a fine idea, but would require a major architecture change that I'm not really up for. Professor Lee's essay is also interesting reading, but doing anything practical with it would involve inventing and implementing a new language that uses concurrency in a completely new way, including writing many new compilers from scratch targeting many platforms, then rewriting many thousands of lines of code in this new language. Since the entire resource pool for this project consists of me, myself, and I and as much of our Sunday afternoon as we are willing to spend on it, I think that approach will have to go on the back burner and POSIX threads will have to stay in the foreground for now.

John, thanks for the MACRO example of a handler jacket for LIB$FIS. Isn't that essentially the same thing VAXC$ESTABLISH or its cousin LIB$ESTABLISH will do for you? I'm under the impression that they stimulate the compiler to generate stack manipulations rather than actual RTL calls, with details varying by architecture and language but all similar in principle.

It is indeed a bit awkward to go to these lengths to find out what happened when trying to grab a symbol from a shared library. I suppose a lot of the potential error conditions involve RMS, so there is a possibility of having an STV as well as an STS that are of interest.

The code I'm fiddling with is here:

http://tinyurl.com/5lzsby

though that will not do you a lot of good if you aren't accustomed to reading the Perl sources. It will go through a pre-preprocessor before being handed to the C compiler and getting umpteen layers of macros peeled back.

I've implemented my own suggestion of a mutex lock and the good news is it didn't work. That is, the critical section I'm dealing with around LIB$FIND_IMAGE_SYMBOL is not where the corruption is coming from and it's possible it has nothing to do with threads at all. It would still be nice to get confirmation that my assumptions about threads and exception handlers are correct.
John Gillings
Honored Contributor
Solution

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

Craig,

> It would still be nice to get
>confirmation that my assumptions about
>threads and exception handlers are
>correct.
...
>I'm guessing that the exception handler
>does not necessarily execute in the same
>thread as the call to LIB$FIND_IMAGE_SYMBOL

From a purely theoretical perspective, an exception handler can only execute in the context of the thread which generated the exception. Everything to do with the exception lives on the stack, and switching thread context switches stacks. Therefore it's not possible to handle the exception from anywhere else.

I'd be very wary of calling any pthread routines from inside a handler.

Regarding the MACRO32, yes it's the same thing as VAXC$ESTABLISH, just shorter, simpler and doesn't need a compiler license to build.

Realistically, the kind of errors you're likely to get from LIB$FIS are very limited. STS and STV aren't really that critical. Simplify it down to reporting image, symbol & status, eliminating the signal by using LIB$SIG_TO_RET and convert LIB$FIS into a normal RTL routine.

Worst case, you can have a simple, single-threaded program that takes the image/symbol pair to investigate further (just let it signal the error). After all, for a given pair it either works or it doesn't!
A crucible of informative mistakes
Hoff
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

No answer here.

The AST and threads article (below) or related documentation should be extended to include dealing with signals; you've a very good question here. And you're almost certainly familiar with these topics, given what you're up to. But here's an overview -- there wasn't much on mixing ASTs and threads when this was written, and there doesn't look to be much with mixing threads and OpenVMS signals...

http://h71000.www7.hp.com/wizard/wiz_6099.html

I'd not assume anything about which threads were active when the signal hit; I expect you don't. (I'd half wonder if you got sliced in mid-signal here, too.)

Thread priority doesn't feel like the right solution here, though.

What's the handler? lib$sig_to_ret?

The other "fun" here involves cleaning up all the threads if this one hits an error.

A top-level queue would likely be the easiest way out here.
John Gillings
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

Craig,

Trying to understand your code sample...

I notice your findsym_handler returns SS$_CONTINUE. That doesn't seem right to me, as it passes control back to LIB$FIS at the point of the signal and attempts to continue - I'd expect bad things like an ACCVIO. Don't you want to SS$_UNWIND, to return to the caller of your jacket routine?

I still don't see what this complexity buys you, other than a headache. If you want to see a message written to SYS$OUTPUT why not change your handler to simply adjust the severity of the condition in the signal array to "W" and SS$_RESIGNAL (the signal array will be on the stack, and therefore writeable) - let the default traceback handler display the message for you (plus a stack dump), but because you've changed the severity, processing will continue after a RET to the caller of your jacket.
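To make the severity adjustment concrete: an OpenVMS condition value carries its severity in the low three bits (0 warning, 1 success, 2 error, 3 informational, 4 severe). The sketch below is plain C with local constants that mirror STS$M_SEVERITY, STS$K_WARNING, and STS$K_SEVERE; severity_of and demote_to_warning are hypothetical helpers. In a real handler the value being rewritten would be the condition name in the signal array, and the handler would then return SS$_RESIGNAL.

```c
#include <stdint.h>

#define SEVERITY_MASK    0x7u   /* mirrors STS$M_SEVERITY */
#define SEVERITY_WARNING 0x0u   /* mirrors STS$K_WARNING  */
#define SEVERITY_SEVERE  0x4u   /* mirrors STS$K_SEVERE   */

/* Extract the severity field from a condition value. */
uint32_t severity_of(uint32_t cond)
{
    return cond & SEVERITY_MASK;
}

/* Return the same condition with its severity field set to warning;
   all other bits (facility, message number, control) are unchanged. */
uint32_t demote_to_warning(uint32_t cond)
{
    return (cond & ~SEVERITY_MASK) | SEVERITY_WARNING;
}
```

For example, SS$_ACCVIO is 12 (%X0000000C), a severe condition; demote_to_warning(12) yields 8, the same message with warning severity.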
A crucible of informative mistakes
Craig A Berry
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

Well, hmm. I have tried both of John's suggestions. First I replaced SS$_CONTINUE with SS$_UNWIND. I still got the same accvio plus before that saw lots of other stack dumps signaled that had previously been caught. Maybe there are compiler-generated stack manipulations that aren't compatible with using SS$_UNWIND from C. That's just a guess based on the warnings in the docs to only use VAXC$ESTABLISH, not LIB$ESTABLISH, from C.

Then I got rid of the custom handler entirely and used lib$sig_to_ret instead. That seemed to do the trick, so that's what I've committed:

http://public.activestate.com/cgi-bin/perlbrowse/p/34046

As far as the complexity of an exception handler that tries to capture the full chain of errors, I have the traditional iron-clad excuse that I didn't write it. Plus I couldn't see that the code was doing anything evil or undocumented; in fact it looked a lot like various examples scattered here and there in the docs. But there are probably gotchas related to exception handling that are not fully documented, and simpler does look like it's better in this case.

For the record, it's not at all clear that threading directly caused the problem, but it certainly does multiply opportunities for mayhem, stack exhaustion, etc.

Thanks again for the analysis and advice.
Hoff
Honored Contributor

Re: VAXC$ESTABLISH, LIB$FIND_IMAGE_SYMBOL, and threads

In my experience, he who touched the code module last always trumps he who wrought it.

:-)