Operating System - OpenVMS

Re: porting c programs from alpha to IA64

 
saulmashbaum
Occasional Contributor

porting c programs from alpha to IA64

We moved our application from Alpha OpenVMS 6.2 to IA64 OpenVMS 8.3 about 9 months ago.

 

We recompiled all our C programs with the HP C compiler. After a bit of work, we got a clean compile

for all our C programs.

 

Our application works as expected in the new environment, but from time to time "bombs",

suddenly stopping and sending the users back to the OS.

 

This happens quite rarely, so it's basically only an irritant, but when it happens, it doesn't exactly inspire user confidence

in our system

 

The cc qualifiers we used are /standard=vaxc/float=ieee_float/nooptimize/nolist.

 

On a one-time basis, we checked our C programs during compile using /check=all/warnings=warnings=all, and the

programs compiled cleanly.

 

We strongly suspect other cc qualifiers will help solve this problem.

 

Any suggestions, tips, and tricks would be greatly appreciated.

 

Saul Mashbaum

 

 

 

 

13 REPLIES
Hein van den Heuvel
Honored Contributor

Re: porting c programs from alpha to IA64

99.99% sure there are still issues with the application code.

No compiler switch will fix this.

The fact that you indicate compiling with /NOOPT is highly suspect.

To me that alone already suggests insufficiently understood / bad code.


 So when the process goes 'back to DCL' is there an error message? Status?

Any last gasp messages?  Can a process DUMP be forced? (set proc /dump)

Has there been an attempt to monitor process resources and quotas (SHOW PROC/CONT .... hit: Q)?

The issue is most likely in a timing-sensitive area.

Anything asynchronous going on? Accidentally or not?

(Is SYS$GETJPI or SYS$QIO used, or the W variant?)

 

Good luck!

Regards,

Hein van den Heuvel 

 

 

 

Hoff
Honored Contributor

Re: porting c programs from alpha to IA64

This is the wrong forum for the question, but since Hein started the thread here....

 

The rationale behind that /STANDARD=VAXC setting is compatibility with a C compiler that was replaced in 1993 (on OpenVMS VAX), and particularly compatibility with a C compiler that didn't follow ANSI/ISO, and that had very poor error diagnostics and error detection.  

 

That /NOOPTIMIZE is also specified implies other latent errors exist in the code.  Access to uninitialized or stack memory, buffer overruns, etc, can sometimes be masked by disabling the optimizer.

 

Put another way, the choice of compilation qualifiers here tends to suppress the compiler-detected errors that the current C compiler provides, and to mask errors that might trigger failures in optimized object code.  Remove the /STANDARD=VAXC qualifier from the source code build, fix the diagnostics and the errors that will get reported, and see if your code gets more stable.  And see if the /OPTIMIZE now works.

 

Further, many programmers and various organizations consider it best-practices to increase the sensitivity of the compilers to errors.  To increase the sensitivity of the tools to source code errors; not to decrease that sensitivity with /STANDARD=VAXC or similar compiler-specific settings on other C compilers.  For instance, the OpenVMS C compiler's /WARN=ENABLE=QUESTCODE error setting can be requested, and can then report source code that's syntactically valid, but that might not work as intended.  

 

Consider /WARN=(VERBOSE,ENABLE=(QUESTCODE)) as a starting point, removing the /STANDARD=VAXC.

 

And FWIW, please don't cross-post.  If you have the urge or the scheduling need to cross-post a question to get the fastest possible answer, then consider that you'll want to acquire a formal escalation channel.  An organization where you can get a (fast) answer to these questions, and tailored to your particular requirements.  Forums are hit-or-miss, with answers posted when we see the questions, and folks (like me) can sometimes provide you with bad answers.  And system management?  Eh? This is a programming bug, at least until you prove it's not.

saulmashbaum
Occasional Contributor

Re: porting c programs from alpha to IA64

I thank Hein van den Heuvel for his very prompt response.

 

I will freely grant that if the underlying application code is flawed, tweaking the compiler qualifiers

is not going to solve the problem.

OTOH since the same code did *not* bomb on alpha, and porting c to IA64 is known not to be problem-free,

it seems to me to be reasonable to strongly suspect that the problem is due to differences in the Alpha/Itanium

architecture, particularly regarding float variables. It seems possible that some compiler qualifier

can help us with this.

 

I'd be interested in understanding the claim that the /noopt qualifier is "highly suspect" and points to "insufficiently understood / bad code".

Should we be compiling without this qualifier?  

 

We haven't yet identified on what the application is bombing; since the phenomenon is so rare, it's hard

to take system actions like a process dump to catch it.

 

Does anyone else here strongly feel that HP C compiler qualifiers are not relevant to the problem?

 

Saul Mashbaum  

saulmashbaum
Occasional Contributor

Re: porting c programs from alpha to IA64

I'm new here; when I posted my question here, I was unaware that there is a languages forum, clearly a more appropriate one for a question on porting C.

 

Some of the threads on this forum, which I saw before I posted, related to C, so it was reasonable to believe that

a question on C is okay here.

 

As I pointed out, we compiled all our programs with /warning=warning=all/check=all. With these qualifiers, the

compiler did not find anything suspect. Indeed, we will now compile without the /STANDARD=VAXC

qualifier and /noopt qualifiers, and see if the compiler comes up with anything.

 

Saul Mashbaum

H.Becker
Honored Contributor

Re: porting c programs from alpha to IA64

>>> ...noopt qualifier is "highly suspect"... Should we be compiling without
this qualifier?
 
Yes, use (the default) /optimize and use /noopt when you want to make debugging easier.
 
Without knowing what kind of error message/exception you are getting ... You mentioned floating point and I'm sure you are aware that floating-point operations are not associative. Here /noopt and /opt may show differences, because the optimizer may be free to re-arrange terms in an expression as long as the result is mathematically equivalent.
 
Anyway, taking a process dump is cheap and easy. You probably read about it in a recent thread, here in this forum.
Hoff
Honored Contributor

Re: porting c programs from alpha to IA64

That the code didn't overtly fail on Alpha does not imply that there are no latent bugs; that the code is bug-free.  

 

There have been latent bugs for decades in some commonly-used code, too.  

 

I've found some of my own after a decade or more of use of the code.

 

As for your questions around the applicability of the qualifiers, various programmers will turn off the optimizer because they encountered odd run-time errors with the code.  If that compilation switch change then masked the error, then some programmers will choose to leave it; not to research what is often a latent error in the code, and the code will remain unoptimized.  (These can be somewhat nasty to find, too; stack-smashing, buffer overflows, IOSBs or variables that are used asynchronously and particularly after the stack frame leaves scope, etc.)

 

Various programmers also use /STANDARD=VAXC because the compiler-generated diagnostics were viewed as noise.  Some will use it because some module actually contains K&R constructs or VAX C extensions, and because they (for whatever reason) didn't remove those constructs.

 

Together, the use of /NOOPTIMIZE and /STANDARD=VAXC implies that this might be poor-quality and/or older C code.   Most of the VAX C code (K&R-era code, rather than the ANSI C compiler for VAX) was pretty wonky.

 

Put another way, few folks will choose to use those qualifiers for any reason other than masking errors and diagnostics; errors that the compiler found, and that the programmer decided were spurious and/or too much work to address.  Some of the diagnostics generated in these older code-bases will be spurious.  Some will be pedantic.  Some will be latent bugs.

 

I've seen more than a few of these cases over the years that tracked back to (for instance) code that dropped a quadword 1 into the stack of a now-inactive stack frame, and nothing in the code ever noticed that.  Well, not until after a VMS upgrade, or a faster VAX processor or an SMP box was deployed, or after an Alpha or Itanium port.

 

>it seems to me to be reasonable to strongly suspect that the problem is due to differences in the Alpha/Itanium
architecture...

 

My preferred rule of thumb for these sorts of run-time errors: if the error is cropping up within my code, then it's my bug, at least until I prove it's not my bug.  Usually starting with higher-than-typical diagnostics.  Once I've proved that the trigger is not my bug, then I usually have some sort of a reproducer, and I can hand that reproducer off to whomever is supporting the errant code (which can also make for faster turn-around for a fix, too), and I can later use that reproducer to verify the fix when it arrives.  

 

And if I find that the bug is in my code, well, I've found the bug.  

 

Assuming that these sorts of run-time errors are caused by somebody else's bug - as a starting point for the debugging and the related discussion - usually only delays the resolution. 

 

To harden a production application, adding process dumps to normal operations is typical, as is adding signal handlers into code, and adding integrated debugging.  All typical.  Increasing the reliability of existing code and getting to where it can be more easily debugged is an ugly, slow and nasty slog - particularly in large and gnarly and old C code-bases - but that's how applications are made more reliable, and how you can catch and debug and resolve errors with fewer failures.

 

 

Craig A Berry
Honored Contributor

Re: porting c programs from alpha to IA64

It wasn't clear to me from the original post when the /float=ieee_float qualifier was added to the mix.  It's actually redundant on Itanium because it's the default there.  If that qualifier was not being used on Alpha, then one potential problem is that D_FLOAT or G_FLOAT data may be getting pulled into the program from somewhere and acted upon as if it were IEEE data.  That is certainly guaranteed to produce incorrect results and crashes are likely, though not guaranteed.

GuentherF
Trusted Contributor

Re: porting c programs from alpha to IA64

I doubt there is any issue with floating point. If there were, the application would immediately show vastly different numbers/errors in floating point results.

 

So the application "bombs"!? Haha...pretty precise error statement :-)

 

If the application goes right back to DCL without any further indication I'd expect the code is doing something at executive level. Does the code use "User Written System Services" or call SYS$CMKRNL at some point?

 

/Guenther

John Gillings
Honored Contributor

Re: porting c programs from alpha to IA64

>but from time to time "bombs",

 

  Rather than expecting wild guesses based on this description, could you please post the exact and complete error message that you're describing as "bombs".

 

A crucible of informative mistakes