Operating System - OpenVMS

Re: HPBASIC Thread safe?

 
SOLVED
True_Geek
Occasional Visitor

HPBASIC Thread safe?

Hi

 

I wonder if there is anything in HPBASIC similar to Fortran's and Pascal's "VOLATILE"?

 

I have /NOOPT when compiling.

I have not tried /SYNCH, and I wonder if that would help?

Sometimes it seems that the code isn't executed in the order it's written...

 

BR

n00b

 

 

 

 

 

19 REPLIES
abrsvc
Respected Contributor

Re: HPBASIC Thread safe?

More specifics about the OS level, language version, etc. would be nice.

Given that you are requesting /NOOPT, you should not see anything optimized away that the equivalent of /VOLATILE would prevent.

One option that comes to mind though, would be to declare the variable as External. With the external reference, the compiler should not be able to optimize it in any way as it will not know much about it.

Why do you believe that there are out of order execution issues? Can you provide more details about the program environment?

Thanks,
Dan
Hein van den Heuvel
Honored Contributor

Re: HPBASIC Thread safe?

I don't think you can get BASIC to be truly thread safe, but if I recall correctly it is relatively safe anyway, since it does not optimize variables into registers and such.

 

I would stick anything that can change behind its back into a MAP or COMMON to be sure.

For more tricky stuff you may need to protect with $SETAST (yuck) or a wrapper function with a sync mechanism.

 

What problem are you trying to solve?

What are you observing that makes you conclude the order is not as expected/intended?

 

Itanium? OpenVMS version? Basic version?

 

Hein

 

 

Craig A Berry
Honored Contributor

Re: HPBASIC Thread safe?

I think I remember being told that BASIC is thread-safe but not AST-reentrant.  Or, in other words, it depends on the threading model being used, which I don't see specified yet.

True_Geek
Occasional Visitor

Re: HPBASIC Thread safe?

Hi

 

OpenVMS 8.3 Alpha and BASIC 1.7.

 

Sorry for my English in advance :)

 

I started to think a bit, yay :) And made myself some test programs, see test.zip.

 

The "run_counter1" program just simulates the sys$enqw case, but it did not generate what I expected when forcing the processes onto different CPUs... so skip that for now.

 

The run_counter2 program simulates my problem.

Edit build.com to match where you placed your source; see the MY_EXEC define...

@build

 

Then run run_odd.com and run_even.com in two different terminals.

Check which CPUs you have on your system, and adjust the /set and /clear values in run_even.com and run_odd.com to match.

 

The idea is simple:

 

 map (counter_common) byte in_use, &
                      long counter, &
                      long counter_array(32000)
 

Loop over every odd element of counter_array and write a value known only locally to the program, then loop again and check that every odd element still holds that value.

 

While the odd process does that, a second process does the same thing for the even elements...

 

In theory they access different parts of shared memory. You could expect problems if you force them onto different CPUs (cache updates etc.), but not if you force them onto the same CPU.

 

My test bench is a 2-CPU system; see the "set aff" in run_even/run_odd.

 

I was running both on the same CPU, but they still messed up.

 

Run 1

 

Terminal ODD:

@run_odd

$set process /affinity /permanent /set=0 /clear=(1)
$run run_counter2
1
 Interrupt - I interrupted with Ctrl-C because EVEN broke

 

Terminal EVEN:

@run_even

$set process /affinity /permanent /set=0 /clear=(1)
$run run_counter2
0
Process 2: Counter_array[ 19892 ]; 3753  <> last_counter: 3764

 

EVEN should find the same value all the way through the loop.

It expected to find 3764 but found 3753... how come???

 

It looks like a typical CPU cache problem, and it's repeatable for me...

 

In real life I force those old VAX programs onto one CPU, but from time to time they still "crash" because the shared array gets mixed data.

 

So is it an OpenVMS bug, i.e. that under heavy load it migrates processes to another CPU even though it should not?

Or a compiler bug?

 

BR

n00b

 

 

 

 

 

 

True_Geek
Occasional Visitor

Re: HPBASIC Thread safe?

Quick update.

 

If I select a FIXED value for LAST_COUNTER and do not increment it, then ODD and EVEN never break.

How odd...

 

BR

n00b

True_Geek
Occasional Visitor

Re: HPBASIC Thread safe?

 

Update:

 

Changed it so that the value I write to the array is ODD when written to an ODD array element and EVEN when written to an EVEN one.

I also print the value of element i-2, and I don't exit the program; it just runs through the rest of the array.

 

Terminal ODD:

run run_counter3
Start value for this process? 1
Process 2: Counter_array[ 12011 ]; 2297  <> last_counter: 2305
Prev: Counter_array[ 12009 ]; 2305
Process 2: Counter_array[ 12013 ]; 2297  <> last_counter: 2305
Prev: Counter_array[ 12011 ]; 2297
Process 2: Counter_array[ 19459 ]; 3737  <> last_counter: 3759
Prev: Counter_array[ 19457 ]; 3759
Process 2: Counter_array[ 19461 ]; 3737  <> last_counter: 3759
Prev: Counter_array[ 19459 ]; 3737
Process 2: Counter_array[ 337 ]; 4333  <> last_counter: 4339

 

Terminal EVEN:

 run run_counter3
Start value for this process? 0
Process 2: Counter_array[ 30340 ]; 5276  <> last_counter: 5304
Prev: Counter_array[ 30338 ]; 5304
Process 2: Counter_array[ 7234 ]; 7690  <> last_counter: 7714
Prev: Counter_array[ 7232 ]; 7714
Process 2: Counter_array[ 24178 ]; 8220  <> last_counter: 8274
Prev: Counter_array[ 24176 ]; 8274
Process 2: Counter_array[ 25090 ]; 9170  <> last_counter: 9198
Prev: Counter_array[ 25088 ]; 9198

 

The thing with this is that ODD only ever sees a wrong "ODD" value, like the value from many cycles ago, and the same goes for EVEN.

 

When running alone it's OK, but when I start the other process, the first one seems to read OLD data...

 

The code:

 

        ! 0 = this process owns the even elements, 1 = the odd elements
        input "Start value for this process";start

        last_counter = start

        while (1=1) ! loop forever

                ! Pass 1: write the new value into every second element
                i = start
                last_counter = last_counter + 2

                while (i < 32000)
                        counter_array(i) = last_counter
                        i = i + 2
                next

                ! Pass 2: verify that every element we own still holds that value
                i = start

                while (i < 32000)
                        if counter_array(i) <> last_counter then
                                print "Process 2: Counter_array[";i;"];";counter_array(i);" <> last_counter:";last_counter
                                print "Prev: Counter_array[";i-2;"];";counter_array(i-2)
                        end if
                        i = i + 2
                next

        next

 

 

BR

n00b

 

 

John Gillings
Honored Contributor
Solution

Re: HPBASIC Thread safe?

 

BR,

   I don't think this has anything to do with thread safety, AST reentrancy or stale CPU caches, it's a simple case of unsynchronised access to a shared write access data structure.

 

  There's no point in arguing about theory, or trying to figure out what might be happening. IF you have a shared data structure that will be written by multiple processes, ESPECIALLY when the unit of reference is smaller than a quadword, THEN you need some form of synchronisation. No ifs, ands, buts or maybes. I present your posting as evidence. If you find you need to lock a process to a CPU in order for your algorithm to work, then IT'S WRONG! Locking multiple processes to a single CPU is basically using the CPU itself as a very expensive lock.

 

  What you need to work out is the required granularity of locking and choose an appropriate mechanism. $ENQ is general and fairly simple. Depending on the frequency of access and the performance requirements you may be able to lock the whole table. Typically your writers would $ENQ a null mode lock, then convert to EX to write. Readers would $ENQ a null mode lock and convert to CR when they want to read.
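As a rough illustration of the whole-table locking pattern described above, here is a minimal Python sketch. It is an analogy only: `threading.Lock` stands in for an exclusive-mode lock, and threads stand in for the separate processes. The names `writer` and `run_writers` are hypothetical, and a real OpenVMS implementation would go through $ENQ/$DEQ lock conversions rather than anything shown here.

```python
import threading

TABLE_SIZE = 1000


def writer(table, table_lock, start, rounds):
    """Increment every second slot, beginning at `start`, one locked update at a time."""
    for _ in range(rounds):
        for i in range(start, TABLE_SIZE, 2):
            # Roughly: convert NL -> EX, update, convert back to NL.
            with table_lock:
                table[i] += 1


def run_writers(rounds=50):
    """Run an 'even' and an 'odd' writer against one shared table and return it."""
    table = [0] * TABLE_SIZE          # stand-in for the shared MAP
    table_lock = threading.Lock()     # one lock for the whole table
    threads = [
        threading.Thread(target=writer, args=(table, table_lock, start, rounds))
        for start in (0, 1)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table


if __name__ == "__main__":
    result = run_writers()
    print(all(value == 50 for value in result))
```

Locking the whole table for each update is the coarsest choice; whether that is acceptable depends on the access frequency and performance requirements mentioned above.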

 

  Don't even think about trying to hack around this fundamental principle. Without proper synchronisation your code WILL FAIL somewhere, sometime, and most likely in a manner that's very difficult to reproduce or diagnose.

 

  Remember that the underlying hardware accesses memory in whole quadwords. So, to update a longword, the hardware reads the surrounding quadword, masks in the updated longword and writes back the whole quadword. If two processes attempt to update adjacent longwords simultaneously, only one will win. This is known as "word tearing".
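The lost update described above can be modelled in a few lines of Python. This is a simplified sketch, not a description of actual Alpha instruction behaviour: memory is a `bytearray`, and each "process" updates a longword in two separate steps (read the enclosing quadword, then merge and write the whole quadword back). The function names are hypothetical, and the interleaving is forced by hand so the result is deterministic.

```python
import struct

QUAD = 8   # bytes in a quadword
LONG = 4   # bytes in a longword


def load_quadword(memory, long_index):
    """Step 1 of an update: read the quadword that contains the longword."""
    base = (long_index * LONG) // QUAD * QUAD
    return base, bytearray(memory[base:base + QUAD])


def store_merged(memory, base, quad_copy, long_index, value):
    """Step 2: mask the new longword into the private copy and write the quadword back."""
    offset = long_index * LONG - base
    quad_copy[offset:offset + LONG] = struct.pack("<i", value)
    memory[base:base + QUAD] = quad_copy


def read_longword(memory, long_index):
    return struct.unpack_from("<i", memory, long_index * LONG)[0]


def torn_update():
    memory = bytearray(QUAD)   # longwords 0 and 1 share this one quadword
    # Both processes read the quadword before either of them writes.
    base_a, copy_a = load_quadword(memory, 0)
    base_b, copy_b = load_quadword(memory, 1)
    store_merged(memory, base_a, copy_a, 0, 1111)   # A updates longword 0
    store_merged(memory, base_b, copy_b, 1, 2222)   # B writes back a stale longword 0
    return read_longword(memory, 0), read_longword(memory, 1)


if __name__ == "__main__":
    print(torn_update())
```

In this model `torn_update()` returns `(0, 2222)`: process A's value 1111 is overwritten by the stale copy that process B read earlier, even though the two processes never wrote to the same longword.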

 

A crucible of informative mistakes
True_Geek
Occasional Visitor

Re: HPBASIC Thread safe?

Hi

 

Thank you for your reply.

 

It was most helpful.

 

In the real application we have sync with ENQ, ASTs etc., but that doesn't help.

 

The thing you said about Quadword sounded like the right thing.

So I tested it

 

So I changed the step to I=I+4

And I start the processes with "0" and "2" (since a long is 4 bytes).

 

Then it works perfectly!!! 

 

So now I have a solution:

Encapsulate the struct/record with 1-8 bytes of padding at the start (8 bytes is safest) and add a STRING FILL = 0 at the end to bring it to an even boundary.
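The amount of filler needed is simple arithmetic. Here is a small Python sketch, assuming the record's fields are packed with no compiler-inserted padding and that the record starts on a quadword boundary (neither is confirmed in this thread). The helper names and the example field sizes are hypothetical.

```python
QUADWORD = 8  # bytes


def fill_bytes(record_size, boundary=QUADWORD):
    """Number of filler bytes to append so the record ends on `boundary`."""
    return (-record_size) % boundary


def padded_layout(field_sizes, boundary=QUADWORD):
    """Return (payload_size, fill, total_size) for a record with the given field sizes."""
    payload = sum(field_sizes)
    fill = fill_bytes(payload, boundary)
    return payload, fill, payload + fill


if __name__ == "__main__":
    # Hypothetical record: one BYTE flag plus two LONG fields = 9 bytes of payload.
    print(padded_layout([1, 4, 4]))
```

For the hypothetical 9-byte record this gives 7 filler bytes and a 16-byte total, so the next record would again start on a quadword boundary.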

 

Just one question remains... just because I'm a n00b on Alpha.

 

If SET AFF really works and the kernel doesn't have a bug that starts to migrate processes under load, then:

 

What I don't understand is that both processes sit on the same CPU and share the same cache.

I would expect only one copy of the memory area in the CPU, and therefore the QUADWORD problem should not exist.

But it does exist, so either the memory area exists in two instances in the CPU or SET AFF is broken...

 

Anyway, that discussion is perhaps pointless if I can get my hands on a good document about Alpha internal memory/cache handling. Anyone have a link?

 

 

 

Br

n00b

 

 

 

 

 

 

abrsvc
Respected Contributor

Re: HPBASIC Thread safe?

The issue here has less to do with the cache and more to do with the sync that John explained. For more information, look up the LDL_L/LDQ_L and STL_C/STQ_C instructions for Alpha; see the Alpha Architecture Reference Manual, pages 4-8 through 4-12. These are more commonly known as the "load locked/store conditional" instruction group. This will further explain the problem/solution.
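To give a feel for the load locked/store conditional idea before reading the manual, here is a toy Python model. It is a sketch under heavy simplification: a single reservation flag per `Cell` stands in for the hardware's lock flag, everything runs in one thread, and the `interfere` hook is a hypothetical way to imitate another writer sneaking in between the load and the store. None of these names come from the Alpha documentation.

```python
class Cell:
    """Toy model of one memory location with a load-locked reservation."""

    def __init__(self, value=0):
        self.value = value
        self._reserved = False

    def load_locked(self):
        self._reserved = True
        return self.value

    def plain_store(self, value):
        # Any other write to the location clears the reservation.
        self.value = value
        self._reserved = False

    def store_conditional(self, value):
        if not self._reserved:
            return False          # something else wrote since load_locked
        self.value = value
        self._reserved = False
        return True


def atomic_add(cell, delta, interfere=None):
    """Retry load-locked / store-conditional until the store succeeds.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        old = cell.load_locked()
        if interfere is not None:
            interfere(cell)       # imitate a competing writer, first attempt only
            interfere = None
        if cell.store_conditional(old + delta):
            return attempts


if __name__ == "__main__":
    cell = Cell(10)
    tries = atomic_add(cell, 5, interfere=lambda c: c.plain_store(100))
    print(tries, cell.value)
```

The point of the retry loop is that an update based on a stale read is never written back: when the conditional store fails, the sequence starts over with a fresh load.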

Dan