Fortran and C alignment

Jenwae · ‎08-13-2009

Hi,

I have searched the KB, but did not find what is needed.

We have a legacy application written in Fortran, that we are converting to C.
For unit testing purposes (also divide-and-conquer), we typically convert a process to C, and leave the rest in Fortran. This will require our converted C process to call certain modules that are in Fortran, that is, calling Fortran from C (sharing programs in mixed languages). Our problem now lies in the alignment especially for COMMON blocks (shared memory).

Our legacy code uses many COMMON blocks, and we have "successfully" mapped the global section in C to the COMMON in Fortran. However, we found out (through compile maps) that the global section size in C does not match the one in Fortran, and this will cause issues at run-time. This may be caused by alignment.

Using MMS, we built the Fortran using:
FORTRAN/noopt/float=ieee_float/align=(commons=(natural,multilanguage),records=natural)

For C:
cc/noopt/standard=C99/member_alignment/extern_mode=common_block/float=ieee_float

We think the build commands are correct; previously, records=packed was used for Fortran.

However, because of the way our legacy code was written, the build commands may not cure the problem. In a typical declaration, the different data types are interspersed rather than cleanly ordered. For example:

INTEGER*2 c1w
INTEGER*4 c2l
INTEGER*2 c3w
REAL*8 costr
INTEGER*2 c4w

COMMON /xyz/ c1w,c2l,c3w,costr,c4w

Is there a way to resolve this alignment issue without having to change the legacy code (there are countless such COMMON declarations)?

Thank you very much.

Hoff · ‎08-13-2009

If I understand the question: in the C code, you can control alignment at the structure level with the following:

#pragma member_alignment save
#pragma [no]member_alignment
whatever data you are looking to control
#pragma member_alignment restore

The pragmas can also be used to control ref-def and related external addressing, as well. When working with a common, these are key. See #pragma extern_model et al.

Personally, I'd look to ditch the Fortran common stuff entirely going forward. Global sections have various advantages, and the installed commons have various disadvantages.

Though you're probably past this stage, here's some stuff I wrote up on working with Fortran commons in C code, if that's of interest:

http://h71000.www7.hp.com/wizard/wiz_2486.html

Steven Schweda · ‎08-13-2009

Or manually pad the C structures so that the
(real) C data align with the Fortran data.

Someone had to create the C structures for
the new C code, right? How was that done?
(I see your Fortran example code. I don't
see the corresponding C code.)
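
A minimal sketch of the manual-padding idea, assuming the Fortran side places the members of COMMON /xyz/ at the naturally aligned offsets 0, 4, 8, 16 and 24 with a total length of 32 bytes (the layout reported further down this thread). The struct name `xyz_common` and the `pad*` members are hypothetical, and fixed-width types from `<stdint.h>` stand in for `short int` and `long int`, whose sizes differ between platforms. The checks need a C11 compiler for `static_assert`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* int16_t, int32_t */

/* Hypothetical C view of COMMON /xyz/ c1w,c2l,c3w,costr,c4w.
   Every gap is spelled out as a filler member, so the member
   offsets are the same with or without compiler padding. */
struct xyz_common {
    int16_t       c1w;      /* INTEGER*2, offset 0    */
    unsigned char pad0[2];  /* filler up to offset 4  */
    int32_t       c2l;      /* INTEGER*4, offset 4    */
    int16_t       c3w;      /* INTEGER*2, offset 8    */
    unsigned char pad1[6];  /* filler up to offset 16 */
    double        costr;    /* REAL*8, offset 16      */
    int16_t       c4w;      /* INTEGER*2, offset 24   */
    unsigned char pad2[6];  /* tail filler up to 32   */
};

/* Compile-time checks: the build fails if the layout drifts. */
static_assert(offsetof(struct xyz_common, c2l)   == 4,  "c2l offset");
static_assert(offsetof(struct xyz_common, c3w)   == 8,  "c3w offset");
static_assert(offsetof(struct xyz_common, costr) == 16, "costr offset");
static_assert(offsetof(struct xyz_common, c4w)   == 24, "c4w offset");
static_assert(sizeof(struct xyz_common)          == 32, "total size");

/* Returns the struct size so a test program can print or compare it. */
size_t xyz_common_size(void)
{
    return sizeof(struct xyz_common);
}
```

The offsets to assert against should come from the Fortran compiler listing, not from guesswork; the numbers above are only the ones quoted in this thread.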

Joseph Huber_1 · ‎08-13-2009

Assuming the order of variables in the Fortran common and the C structures are the same, then the length of the Psects is determined by padding.
Specify /PSECT_MODEL=MULTILANGUAGE in CC, corresponding to /align=commons=(natural,multilanguage) in Fortran, to get the same Psect size.

http://www.mpp.mpg.de/~huber

Joseph Huber_1 · ‎08-14-2009

BTW,
if compiled with NOMULTILANGUAGE, DECC still pads the Psect to a multiple of 8 bytes, while Fortran does no padding at all in this case.

So to mix languages, always align natural AND multilanguage.

http://www.mpp.mpg.de/~huber

Robert Gezelter · ‎08-14-2009

Jenwae,

Needless to say, using the compiler listings to VERIFY that the variable alignments actually are correct is an excellent idea.

Both compilers have various listing options. A careful review of actual printed variable assignments (offsets into structures) would be sound procedure.

- Bob Gezelter, http://www.rlgsc.com

Jenwae · ‎08-14-2009

Hi All,

Thank you so much for the replies.

Hoff, I almost did that but the application is huge, and to insert that pragma section for each module is not too practical. Also, trying the pragma directives on one individual module reveals that it does not work (seemingly).

Steve:
The real issue is that the section for Fortran is SMALLER than the one for C. I want to avoid changing anything in the legacy Fortran.
Also, when I used the option (COMMONS=NATURAL, RECORDS=NATURAL) instead of the original RECORDS=PACKED, the Fortran section is still SMALLER than the C one, and I don't understand why.

In the example below:

INTEGER*2 c1w
INTEGER*4 c2l
INTEGER*2 c3w
REAL*8 costr
INTEGER*2 c4w

the C equivalent (converted automatically by a translator) is:

short int cw1
long int c21
short int c3w
double costr
short int c4w

For every short int, the C compiler will pad an additional 2 bytes (to make it natural-aligned). However, Fortran still keeps an INTEGER*2 as is (without padding), even after using RECORDS=NATURAL.

Joseph:
I tried both multi and nomulti, and all other option combinations... to no avail.

Robert:
I linked with the option /map. From the map, I could see the section for C is always bigger than that of Fortran for that particular global section.

Steven Schweda · ‎08-14-2009

> #pragma member_alignment save
> #pragma [no]member_alignment
> whatever data you are looking to control
> #pragma member_alignment restore

This seems to work for me. Did you try it?

alp $ type cs.c
#include <stdio.h>

main()
{
struct
{
short int cw1;
long int c21;
short int c3w;
double costr;
short int c4w;
} cs;

printf( " cw1: %2lld.\n", (long long) &cs.cw1- (long long) &cs);
printf( " c21: %2lld.\n", (long long) &cs.c21- (long long) &cs);
printf( " c3w: %2lld.\n", (long long) &cs.c3w- (long long) &cs);
printf( " costr: %2lld.\n", (long long) &cs.costr- (long long) &cs);
printf( " c4w: %2lld.\n", (long long) &cs.c4w- (long long) &cs);
printf( " size: %2d.\n", sizeof cs);
}
alp $ cc cs
alp $ link cs
alp $ run cs
cw1: 0.
c21: 4.
c3w: 8.
costr: 16.
c4w: 24.
size: 32.

So, the default is 4-byte alignment, but:

alp $ type cs2.c
#include <stdio.h>

main()
{
#pragma member_alignment save
#pragma nomember_alignment
struct
{
short int cw1;
long int c21;
short int c3w;
double costr;
short int c4w;
} cs;
#pragma member_alignment restore

printf( " cw1: %2lld.\n", (long long) &cs.cw1- (long long) &cs);
printf( " c21: %2lld.\n", (long long) &cs.c21- (long long) &cs);
printf( " c3w: %2lld.\n", (long long) &cs.c3w- (long long) &cs);
printf( " costr: %2lld.\n", (long long) &cs.costr- (long long) &cs);
printf( " c4w: %2lld.\n", (long long) &cs.c4w- (long long) &cs);
printf( " size: %2d.\n", sizeof cs);
}
alp $ cc cs2
alp $ link cs2
alp $ run cs2
cw1: 0.
c21: 2.
c3w: 6.
costr: 8.
c4w: 16.
size: 18.

If that's not what you want, then what _do_
you want? One of us seems to be missing
something. You or I?

Hoff · ‎08-14-2009

I've used the exact technique and the exact construction with the compiler pragmas. It works in all the cases I've tried. The pragma-based approach also works nicely in larger applications, when module-wide settings for non-native alignments can be detrimental to application performance.

If you're repeating structures (eg: a C array) within the common, then you can get padding at the end of the structure.
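
To illustrate the point about repeated structures, here is a sketch with a hypothetical two-member record (`struct rec` is not from the original code), assuming a compiler that uses natural member alignment by default. Such a compiler rounds the record size up to the alignment of its strictest member so that every array element stays aligned; a packed layout has no such tail padding.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* int16_t, int32_t */

/* Hypothetical record: a 4-byte member followed by a 2-byte member. */
struct rec {
    int32_t id;    /* offset 0 */
    int16_t flag;  /* offset 4; the members end at byte 6 */
};

/* The record size is always a multiple of the strictest member alignment. */
static_assert(sizeof(struct rec) % _Alignof(int32_t) == 0,
              "record size keeps array elements aligned");

/* Sum of the member sizes, i.e. what a packed layout would occupy. */
size_t rec_packed_size(void)
{
    return sizeof(int32_t) + sizeof(int16_t);
}

/* Distance in bytes between consecutive array elements. */
size_t rec_stride(void)
{
    struct rec table[2];
    return (size_t)((char *)&table[1] - (char *)&table[0]);
}
```

On a naturally aligning compiler the stride is 8 bytes while the members only need 6, so an array of 400 such records is 800 bytes longer than the packed equivalent.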

Use the debugger. Examine memory. Figure out what is going on.

And chuck the common-based design out the airlock at your first opportunity; position-independent designs are much more flexible.

Joseph Huber_1 · ‎08-15-2009

>>
Joseph:
I tried both multi and nomulti, and all other option combinations... to no avail.

What compilers/system versions do You have ?

On mine (DEC C 6.5, Fortran 7.2, VMS 7.3-1 and upwards, Alpha and Itanium), the Fortran and C commons are exactly the same length (32 bytes), with the individual fields placed as Steven showed, when compiled with natural alignment and multilanguage padding.

http://www.mpp.mpg.de/~huber

Steven Schweda · ‎08-15-2009

> What compilers/system versions do You
> have ?

Why stop there? Where's the real code?
Where's the real build procedure? Where're
the real results? What's the real problem?

All we have so far is a lot of hand-waving
and claims, but no actual failing test case
with which to work. _My_ test cases seem to
work as expected (following advice already
given here), so I'm happy.

Robert Gezelter · ‎08-15-2009

Jenwae,

Non-naturally aligned data is a long-standing practice. It is admittedly difficult to imagine in these days of gigabyte hand-held devices, but not that long ago every byte, not to say bit, was precious, and data structures were packed very tightly. Remember, the original VAX-11/780 was sold in configurations of 128 Kbytes (VAX-11/780 Hardware Handbook, p. 11, 1978).

The data structures used for interfacing to X-Windows were packed very tightly, and there is associated documentation. The DECnet connection block is another example of a record format that is packed non-naturally aligned. There are many others.

The pragma referred to by Hoff is placed in the external include file for the record definitions, not in "every module". If the existing code base has COMMON block definitions in every module, that is unfortunate. The convention in C/C++ is to use the #include compiler directive to incorporate an external file (.h) containing record definitions. Typically, the #pragma directives are included in the .h file.

This does work. I would suggest that it would be appropriate to produce a small FORTRAN/C file which demonstrates the problem so that the problem can be seen precisely.

- Bob Gezelter, http://www.rlgsc.com

Robert Gezelter · ‎08-15-2009

Jenwae,

One of the followup posts contained:

"Robert:
I linked with the option /map. From the map, I could see the section for C is always bigger than that of Fortran for that particular global sections."

Note the "linked with the option /map". This will not be particularly useful, although it will highlight the differences in the overall computed lengths.

One wants the COMPILER layouts of structures. This will be in the COMPILER listing. By the time the LINKER is involved, that information is all irrelevant.

- Bob Gezelter, http://www.rlgsc.com

John Gillings · ‎08-17-2009

>For every short int, the C compiler will
>pad an additional 2 bytes (to make it
>natural-aligned). However for Fortran,
>it still maintain a Integer*2 (without
>padding) even after using RECORDS=NATURAL

I think C is getting it wrong. The natural alignment for a WORD (short int) is WORD alignment (ie: even byte boundaries), so you'd expect either no padding, or a single byte to move the field to an even byte boundary. If C is adding 2 bytes to longword align the field, that's a different type of alignment from the Fortran definition of "natural".

You may need to manually align your records to ensure they match across languages. Since you say you don't want to change the Fortran, use a compiler listing to determine the exact placement of fields, then use NOMEMBER_ALIGNMENT pragma in C and define the record, if necessary with your own padding to place fields where they need to be.

A crucible of informative mistakes

Steven Schweda · ‎08-17-2009

> I think C is getting it wrong. [...]

We still don't know what happens in the
secret code cited in the original complaint,
but in the example code which I created from
the original vague description, the C
compiler seems to do fine. The last two
"short" guys _are_ 4-byte aligned, but they
follow properly aligned 4- or 8-byte members
("long int", "double"), so they get
better-than-needed alignment as a free bonus.

> >For every short int, the C compiler will
> >pad an additional 2 bytes (to make it
> >natural-aligned).[ ...]

Remember, we have only this assertion, with
no actual evidence for it. What's true in my
actual example code is that padding is added
_after_ a "short int", so that the thing
_after_ that "short int" is getting naturally
aligned. No one is doing any padding to
super-align the "short int" members. The
only padding done is to align naturally the
guys which follow them. Which is what I'd
expect. Isn't it? I think that that's what
I expect.

Joseph Huber_1 · ‎08-17-2009

>>
I think C is getting it wrong. The natural alignment for a WORD (short int) is WORD alignment (ie: even byte boundaries),

No, C is doing it exactly right: just arrange the short words in sequence, and they are aligned really naturally, i.e. on consecutive word addresses as one would assume. There is no 4 byte default or padding.
Only at the end of a structure C pads to the next full 8 byte boundary, so that the total length is always a multiple of 8.
Fortran does the same natural alignment, it just does not pad at the end of a common unless "multilanguage" is specified, then both compilers pad to the next 16-byte multiple.

Just see Steven's example with rearranged variables below, with 2 consecutive shorts, and the single byte at the end forcing size=40.

#include <stdio.h>
int main()
{
struct
{
short int cw1;
long int c21;
short int c3w;
short int c4w;
char b1[3];
short int cw5;
char b2[3];
double costr;
char b3;
} cs;

printf( " cw1: %2lld.\n", (long long) &cs.cw1- (long long) &cs);
printf( " c21: %2lld.\n", (long long) &cs.c21- (long long) &cs);
printf( " c3w: %2lld.\n", (long long) &cs.c3w- (long long) &cs);
printf( " c4w: %2lld.\n", (long long) &cs.c4w- (long long) &cs);
printf( " b1: %2lld.\n", (long long) &cs.b1- (long long) &cs);
printf( " cw5: %2lld.\n", (long long) &cs.cw5- (long long) &cs);
printf( " b2: %2lld.\n", (long long) &cs.b2- (long long) &cs);
printf( " costr: %2lld.\n", (long long) &cs.costr- (long long) &cs);
printf( " b3: %2lld.\n", (long long) &cs.b3- (long long) &cs);
printf( " size: %2d.\n", sizeof cs);
}

_HUB>cc commonc2.c
_HUB>link commonc2
_HUB>run commonc2
cw1: 0.
c21: 4.
c3w: 8.
c4w: 10.
b1: 12.
cw5: 16.
b2: 18.
costr: 24.
b3: 32.
size: 40.
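
The rule described above (each member goes to the next multiple of its own alignment, and only the tail is rounded up) can be sketched as a small calculator. This is an illustration of the arithmetic, not the compiler's actual algorithm; `struct member`, `round_up` and `natural_layout` are hypothetical names, and the 8-byte tail rounding is the behaviour reported in this thread for DEC C.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */

/* One member: its total size in bytes and its natural alignment.
   For an array, size is the whole array and align is the element's. */
struct member {
    size_t size;
    size_t align;
};

/* Round n up to the next multiple of m (m > 0). */
size_t round_up(size_t n, size_t m)
{
    return (n + m - 1) / m * m;
}

/* Store the naturally aligned offset of each member in offsets[]
   and return the total length rounded up to tail_align. */
size_t natural_layout(const struct member *m, size_t count,
                      size_t *offsets, size_t tail_align)
{
    size_t pos = 0;
    assert(tail_align > 0);
    for (size_t i = 0; i < count; i++) {
        pos = round_up(pos, m[i].align);  /* padding goes BEFORE the member */
        offsets[i] = pos;
        pos += m[i].size;
    }
    return round_up(pos, tail_align);     /* tail padding only */
}
```

Fed with the sizes 2, 4, 2, 8, 2 and a tail alignment of 8, this gives offsets 0, 4, 8, 16, 24 and a length of 32, matching Steven's output. Fed with the nine members of the rearranged struct (the two `char[3]` arrays have size 3 and alignment 1), it gives offsets 0, 4, 8, 10, 12, 16, 18, 24, 32 and a length of 40, matching the run above. Note that `long int` is 4 bytes in these listings.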

http://www.mpp.mpg.de/~huber

Jenwae · ‎08-18-2009

Hello to All of You,

Sorry for my late reply. I've been trying out a few things as per your suggestion.

OK, some of the findings to your suggestions:

Using the #pragma directives (noalignment) for this BFCCOM struct, it works brilliantly as Hoff, Steve and Robert suggested. This approach works after I build the Fortran without alignment too, using /align=(commons=packed,record=packed).

Joseph suggested using the /psect_mode=multilanguage. It works brilliantly, after I adjust the Fortran option /align=(commons=natural,record=natural).

So the clue is to use the right combination. Thank you so much to all of you. I really appreciate that. I will use Joseph's solution so that I can avoid inserting the directives.

There's still some peculiarity as to how the alignment actually works. The compiler does not behave consistently, or I may have misunderstood. Anyway, the content below is just for curiosity as to how the sizes can differ.

To make it more understandable, I'm using a similar code definition.

The Fortran header file BFCCOM.INC looks like this:

INTEGER*2 DATEW(400)
INTEGER*4 FILEL(400)
INTEGER*4 SIZEL(400)
INTEGER*2 CNDXW
INTEGER*2 CTOPW
LOGICAL*1 LIVT(400)
BYTE CURB
INTEGER*4 BITL(768)
BYTE CEND

COMMON /BFCCOM/ DATEW, FILEL, SIZEL,
CNDXW, CTOPW, LIVT, CURB,
BITL, CEND

The converted C equivalent is:

typedef struct bfccom {
short int datew[400];
long int filel[400];
long int sizel[400];
short int cndxw;
short int ctopw;
BYTE livt[400];
BYTE curb;
long int bitl[768];
BYTE cend;
} BFCCOM;

extern BFCCOM bfccom;

In our legacy system, we have an abc.FOR program calling a def.FOR subroutine, and both have the BFCCOM common as an include, so that they can share data.

In our "new" system, we are converting code to C, but at this point of time, we need to inter-mix C and Fortran.
So we changed to abc.C calling the same def.FOR (C calling Fortran).
Obviously, we need to map the BFCCOM in C to match the one included in def.FOR so that they can share data, as before.

Hence, I compile the C using the option cc/extern_mode=common_block/member_alignment/standard=C99/float=ieee_float

I compile the Fortran using important options such as /align=(commons=natural,records=natural)

In the abc.C routine, I did a series of printf statements for both the sizes and the starting addresses of the members. I get (size, then start address)

datew = 800 -> 0
filel = 1600 -> 800
sizel = 1600 -> 2400
cndxw = 2 -> 4000
ctopw = 2 -> 4002
clivt = 400 -> 4004
curb = 1 -> 4404
bitl = 3072 -> 4408
cend = 1 -> 7480

In the abc.FOR routine, I did the same prints, and got the same result. (the map in def.FOR is same as abc.FOR, so I use abc.FOR for convenience).

However, when I look at the map produced by each C and Fortran build, the length is different!

In the C build, I see:

Psect Name    Module/Image    Length
BFCCOM
              ABC             ( 7488.)
              DEF             ( 7481.)

In the Fortran build, I see:

Psect Name    Module/Image    Length
BFCCOM
              ABC             ( 7481.)
              DEF             ( 7481.)

Somehow, after the build, the BFCCOM is extended by another 7 bytes in the C program! Because of this, ABC cannot successfully share data with DEF.

I'm not sure where the padding happens, but the sizes of the following might help:
1) size of bfccom object/instance = 7484
2) sum of struct's (bfccom) member sizes = 7478
3) size of the BFCCOM in the map = 7488

I suppose the C compiler pads the "bytes" CURB and CEND to natural-align to long (4 bytes), and these additional 6 bytes increase the size from (2) to (1). Maybe someone can explain if you have the time.

Thanks!

Hoff · ‎08-18-2009

I'd probably use C pointers and the Structure Definition Language (SDL) here, and create and assign data structures. SDL lets you declare the data structures, and then load the definitions into every module using includes. And you can use these same definitions across a mixture of programming languages; C or Fortran or whatever. (This is how OpenVMS itself is built; using SDL.)

I'd not look to use strict COMMON mapping, and would not look to try to force-fit a Fortran design into some new C code.

You're writing new C code, after all. Not old Fortran code.

If you really want to do this without SDL, I'd use the following:

typedef struct bfccom {
short int *datew;
long int *filel;
long int *sizel;
short int *cndxw;
short int *ctopw;
BYTE *livt;
BYTE *curb;
long int *bitl;
BYTE *cend; } DFCCOM;

And set the pointers to the COMMON at run-time.
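
A rough sketch of what setting the pointers at run-time could look like, under stated assumptions: the member offsets are the ones printed by the test program earlier in this thread (natural alignment), `BFCCOM_VIEW`, `bfccom_bind` and the `OFF_*` constants are hypothetical names, and fixed-width types stand in for `short int` and `long int`. How the base address of the COMMON is obtained on OpenVMS is not shown here; the function simply takes it as a parameter.

```c
#include <assert.h>   /* assert */
#include <stdint.h>   /* int16_t, int32_t, uintptr_t */

typedef unsigned char BYTE;

/* Pointer view of the common, following the struct in the post above. */
typedef struct bfccom_view {
    int16_t *datew;
    int32_t *filel;
    int32_t *sizel;
    int16_t *cndxw;
    int16_t *ctopw;
    BYTE    *livt;
    BYTE    *curb;
    int32_t *bitl;
    BYTE    *cend;
} BFCCOM_VIEW;

/* Byte offsets of the members as reported in this thread, plus the
   length of the Fortran psect. Verify them against a compiler listing. */
enum {
    OFF_DATEW = 0,    OFF_FILEL = 800,  OFF_SIZEL = 2400,
    OFF_CNDXW = 4000, OFF_CTOPW = 4002, OFF_LIVT  = 4004,
    OFF_CURB  = 4404, OFF_BITL  = 4408, OFF_CEND  = 7480,
    BFCCOM_BYTES = 7481
};

/* Point every field of the view at its place inside the block at base. */
void bfccom_bind(BFCCOM_VIEW *v, void *base)
{
    BYTE *p = (BYTE *)base;
    assert((uintptr_t)p % 4 == 0);  /* the block must be longword aligned */
    v->datew = (int16_t *)(void *)(p + OFF_DATEW);
    v->filel = (int32_t *)(void *)(p + OFF_FILEL);
    v->sizel = (int32_t *)(void *)(p + OFF_SIZEL);
    v->cndxw = (int16_t *)(void *)(p + OFF_CNDXW);
    v->ctopw = (int16_t *)(void *)(p + OFF_CTOPW);
    v->livt  = p + OFF_LIVT;
    v->curb  = p + OFF_CURB;
    v->bitl  = (int32_t *)(void *)(p + OFF_BITL);
    v->cend  = p + OFF_CEND;
}
```

With this arrangement the C code never depends on how the C compiler would lay out a matching struct; only the offset table has to agree with the Fortran side, and it can be replaced by dynamically sized storage once the Fortran code is gone.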

In particular, please stop replicating the existing Fortran limits and the existing designs. For instance, there are fixed-size arrays in the Fortran code. You don't need to do that when you're not using a COMMON, so (here) set up to use the COMMON, and when you excise the last of the Fortran code from your environment you can then use C and RTL and OpenVMS system service calls to adjust and tune and even increase the size of this stuff on the fly.

If you replicate the exact design, you can end up replicating the same old limits.

Joseph Huber_1 · ‎08-18-2009

>>

I suppose C compiler pads the "bytes" CURB and CEND to natural-align to long (4 bytes), and this additional 6 bytes increase size from (2) to (1). Maybe someone can explain if you have the time.
>>

Don't suppose: the C padding happens at the end of the struct up to the next 8-byte boundary, not for individual members, as I have shown in a previous reply.
You (purposely?) omitted the MULTILANGUAGE option in the Fortran align, and the /PSECT_MODE=MULTILANGUAGE option in CC:
this forces both compilers to pad up to the next 16-byte boundary. The member alignment stays as is with natural, but the total length of the psect/common will be the same for both languages.
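
The numbers from the BFCCOM example can be reproduced with a short sketch, assuming a compiler with natural member alignment and using fixed-width types in place of `short int` and `long int`. The 8-byte and 16-byte psect rounding is taken from the description in this thread and is only modelled arithmetically here; `round_up` and `bfccom_raw_end` are hypothetical helper names.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* int16_t, int32_t */

typedef unsigned char BYTE;

/* BFCCOM as posted, with fixed-width types. */
struct bfccom {
    int16_t datew[400];
    int32_t filel[400];
    int32_t sizel[400];
    int16_t cndxw;
    int16_t ctopw;
    BYTE    livt[400];
    BYTE    curb;
    int32_t bitl[768];
    BYTE    cend;
};

/* With natural alignment, bitl cannot start on an odd address. */
static_assert(offsetof(struct bfccom, bitl) % 4 == 0,
              "bitl is longword aligned");

/* Round n up to the next multiple of m (m > 0). */
size_t round_up(size_t n, size_t m)
{
    return (n + m - 1) / m * m;
}

/* First byte after the last member, i.e. the length without tail padding. */
size_t bfccom_raw_end(void)
{
    return offsetof(struct bfccom, cend) + sizeof(BYTE);
}
```

The member sizes add up to 7478 bytes. Three bytes of padding after `curb` put `bitl` at offset 4408, so the last member ends at 7481, which is the Fortran psect length. The C object is rounded up to a multiple of 4 (7484), and rounding 7481 up to a multiple of 8 gives the 7488 seen in the C map. Rounding up to a multiple of 16 also gives 7488, so with multilanguage padding on both sides the lengths agree.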

http://www.mpp.mpg.de/~huber

circepb · ‎08-19-2009

Try the following and I think it should work
$ ccxd == "CC/DecC/noopt/noWarn/Float=D_Float/SHARE_GLOBALS/extend/" + -
"EXTERN_MODEL=COMMON_BLOCK/deb/lis"

Jenwae · ‎08-20-2009

Hi Guys,

Thanks for all the replies. On my machine, it works. But when we try to do the same on another Integrity machine, we get the earlier alignment problems. I attach the information as a doc, so that it retains the format.

Kindly help.

Hoff · ‎08-20-2009

Two obvious paths here include altering the C code to use C pointers to reference the data structures (an approach which has other benefits going forward), or an upgrade of the OpenVMS I64 boxes to use consistent versions of the operating system and tools.

Joseph Huber_1 · ‎08-20-2009

It looks as if the Fortran compiler used natural alignment, as seen after the single byte: BFCSBFBITL starts at a longword-aligned address.
If You really did what You said, then this would mean the Fortran compiler ignored the packed alignment option on this system, but I can't test it here, and anyway doubt it. Or Your effective MMS rules result in a different set of options.

You have to look into the Fortran compiler listing to see the options actually used.

And I can't understand why You go back to packed alignment: at least on Itanium this is a no-go!

http://www.mpp.mpg.de/~huber

Joseph Huber_1 · ‎08-20-2009

And I tested on a HP rx2620 using HP Fortran V8.1-1-104930 , and got

Address Type Name
1-00001134 I*1 BFCCURB
1-00001135 I*4 BFCSBFBITL
1-00001D35 I*1 BFCEND

i.e. alignment is packed: longword BFCSBFBITL has an odd byte address.

Also the linker did not magically align it. At run time the addresses look like:

BFCCURB: 69940
BFCSBFBITL: 69941
BFCEND: 73013
clearly unaligned !

So review Your compiler listing!

http://www.mpp.mpg.de/~huber

Jenwae · ‎08-25-2009

Hi Hoff,

Thanks for your suggestion. I haven't had the chance to try out your C pointer approach, since I finally managed to get it working. But you obviously know what you are talking about, and I am going to seriously try it in the near future.

Hi Joseph,

I got it working, thanks to your hint! I enabled /LIST in the MMS description file, and in doing so, I also deleted some commented lines. These lines were commented out to preserve the original flag options used in the build command. The comments appeared only on the second system, the rx2600 (where it did not work). One would think it should not matter, but it did.

Once they were taken out, the MMS description file was read correctly, and the mapping completed with matching sizes.

Thanks, you have been a great help.

I'm closing this thread, because contributions from all of you have made this possible.
