Operating System - OpenVMS
alignment faults

SOLVED
Jess Goodman
Esteemed Contributor

alignment faults

We have a set of batch jobs running on a VMS 7.4 Itanium that generate a fairly high rate of alignment faults (~10K/sec). I traced this down to code that calls LIB$CVT_VECTIM. The code passes in successive elements of an array of NUMTIMs. Since one NUMTIM element is 7 short-words (14 bytes), every other NUMTIM in the array is at an address that is word-aligned, but not long-word aligned.
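The arithmetic behind this can be checked with a short sketch. It is written in Python purely for illustration (the code in this thread is Fortran), it assumes the array itself starts on a longword boundary, and the function names are made up for the example.

```python
# Illustrative only: models byte offsets, not real VMS memory.
# Assumption: the array base address is longword (4-byte) aligned.
WORD_BYTES = 2
NUMTIM_WORDS = 7  # year, month, day, hour, minute, second, hundredths


def element_offsets(count, words_per_element=NUMTIM_WORDS):
    """Byte offset of each array element from the start of the array."""
    stride = words_per_element * WORD_BYTES
    return [index * stride for index in range(count)]


def is_longword_aligned(offset):
    """True when the offset is a multiple of 4 bytes."""
    return offset % 4 == 0


offsets = element_offsets(4)  # [0, 14, 28, 42]
alignment = [is_longword_aligned(offset) for offset in offsets]

# The same array with a hypothetical 8th pad word per element (16-byte stride).
padded_alignment = [
    is_longword_aligned(offset)
    for offset in element_offsets(4, words_per_element=8)
]

for number, (offset, aligned) in enumerate(zip(offsets, alignment), start=1):
    print(f"element {number}: offset {offset:2d}, longword aligned: {aligned}")
```

With a 14-byte stride the elements alternate between longword-aligned and merely word-aligned; with a 16-byte stride every element lands on a longword boundary.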

My questions are:

* Since the argument to LIB$CVT_VECTIM is defined as an array of words, why does it fault when the argument is word-aligned?
* It took a while to track this down because the faults did not show up on my screen, although I had run a program that called sys$perm_report_align_fault. They also did not show up when run in the debugger with SET BREAK/EXCEPTION on. Does that mean that LIB$CVT_VECTIM somehow hides these alignment faults?

Below is a test Fortran program that generates an alignment fault by calling LIB$CVT_VECTIM this way. Note that the call to SYS$NUMTIM with the same argument does not fault.

Attached is the output of SDA> FLT SHOW TRACE for this test program.

      IMPLICIT NONE
      INTEGER*8 VMS_TIME

      STRUCTURE /NUMTIM_DEF/
          INTEGER*2 YEAR
          INTEGER*2 MONTH
          INTEGER*2 DAY
          INTEGER*2 HOUR
          INTEGER*2 MINUTE
          INTEGER*2 SECOND
          INTEGER*2 HUNDREDTHS_OF_SECOND
      END STRUCTURE
      RECORD /NUMTIM_DEF/ NUMTIMES(2)

      CALL SYS$NUMTIM( NUMTIMES(2), )
      CALL LIB$CVT_VECTIM(
     1     NUMTIMES(2),VMS_TIME)

      CALL EXIT
      END

I have one, but it's personal.
9 REPLIES
John McL
Trusted Contributor

Re: alignment faults

I suspect that NUMTIM won't suffer alignment faults because the target 7-word array will be aligned to a longword boundary.

What happens if you specify your data structure as having 8 bytes, the last being essentially an unused pad?
RBrown_1
Trusted Contributor

Re: alignment faults

John McL >>>>
I suspect that NUMTIM won't suffer alignment faults because the target 7-word array will be aligned to a longword boundary.
<<<<

I don't see it that way, John. If Jess calls SYS$NUMTIM with NUMTIMES(2) and then calls LIB$CVT_VECTIM with NUMTIMES(2), why would the alignment of the 7-word array be different between the two calls?
Robert Gezelter
Honored Contributor

Re: alignment faults

Jess,

So that we are all on (pardon the pun) precisely the same page, could you please do your compile of the example with the machine code listing enabled and post the resulting listing as an attachment? It should give everyone the precise details of the parameters and data structures, without involving the potential clutter of different settings and options.

[A small nit [smile], I strongly suspect that you are running OpenVMS 8.4 on Itanium]

- Bob Gezelter, http://www.rlgsc.com
GuentherF
Trusted Contributor

Re: alignment faults

Smells like LIB$CVT_VECTIM is using a quad or longword instruction to move the array to a temporary buffer (wild guess). And $NUMTIM is just more carefully coded.

Options I see are adding an extra word to the structure or escalating to HP to fix LIB$CVT_VECTIM.

/Guenther
John Gillings
Honored Contributor

Re: alignment faults

Jess,

You'll need to look at the source for LIB$CVT_VECTIM to see for certain. Here are a few guesses...

The LIB$ routine probably calls the system service. DEBUG won't break inside a system service.

When you call the system service directly, you give it an aligned field. Perhaps the LIB$ routine is passing it an unaligned buffer and copying back to your argument (seems unlikely, but stranger things have happened).

Remember that alignment faults will occur when referencing anything not longword aligned, so anything with word fields will fault. Maybe the system service carefully masks and shifts the fields into a register, then copies to memory, while the LIB$ routine does word references?

Get hold of the source (or just change your code to use SYS$NUMTIM).
A crucible of informative mistakes
Hein van den Heuvel
Honored Contributor
Solution

Re: alignment faults


LIBDATE_CVT ... LIB$CVT_VECTIM does not call a system service but rolls its own thing.

It starts out by testing the first two elements as a single longword.
See below.

Just add an 8th word as a filler and be happy.

Regards,
Hein

:
GLOBAL ROUTINE LIB$CVT_VECTIM( INPUT_TIME, ! address of array
:
BIND TIMBUFF = .INPUT_TIME : VECTOR [7,WORD]
:

!+
! now check to see if the vector represents an absolute time or a
! delta time. If TIMBUFF is a delta time then the first 2 words will
! equal zero, year and month.
!-

TIME_FLAG = K_ABS_TIME; !assume absolute time

IF .TIMBUFF<0,32,0> EQL 0
THEN
TIME_FLAG = K_DELT_TIME; !Time is a delta time
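The logic of that test can be modelled outside VMS. The sketch below is Python, not Bliss, and Python cannot take an alignment fault, so it only shows that reading the first two words as one 32-bit value gives the same answer as testing year and month separately. The helper names and sample values are invented for illustration, and nothing here is taken from the VMS sources beyond the test quoted above.

```python
import struct

# Illustrative model only: a bytes buffer stands in for the NUMTIM array.
NUMTIM_FORMAT = "<7H"  # seven little-endian 16-bit words
NUMTIM_BYTES = struct.calcsize(NUMTIM_FORMAT)  # 14 bytes per element


def is_delta_by_longword(buffer, offset):
    """Model of the quoted test: first two words read as one 32-bit value."""
    (first_longword,) = struct.unpack_from("<I", buffer, offset)
    return first_longword == 0


def is_delta_by_words(buffer, offset):
    """Equivalent test using two separate 16-bit reads (year, month)."""
    year, month = struct.unpack_from("<2H", buffer, offset)
    return year == 0 and month == 0


absolute = struct.pack(NUMTIM_FORMAT, 2010, 5, 17, 12, 0, 0, 0)
delta = struct.pack(NUMTIM_FORMAT, 0, 0, 3, 1, 30, 0, 0)
array = absolute + delta  # second element starts at byte offset 14

for offset in (0, NUMTIM_BYTES):
    print(offset, offset % 4 == 0,
          is_delta_by_longword(array, offset),
          is_delta_by_words(array, offset))
```

Both tests agree on every element; the difference on real hardware is only that the 32-bit read needs a longword-aligned address to avoid a fault, while the two 16-bit reads need only word alignment.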

Jess Goodman
Esteemed Contributor

Re: alignment faults

Thanks everyone for your replies! I'm always amazed at how quickly answers get posted here. Sorry it took me a day to get back to this.

Hein, thanks for finding the VMS source code that generates the alignment fault. That looks like Bliss code, which surprises me. I thought I had heard that the Itanium port was done without a Bliss compiler, but that could have been misinformation.

I still don't understand why after using:
$ MCR SYS$EXAMPLES:SET_ALIGN_REPORT ON
the alignment faults in LIB$CVT_VECTIM do not display to SYS$OUTPUT. Anyone know why?

(This was embarrassing, since I had assured a colleague that if they didn't show up when she ran the program interactively, then that program was not causing the faults we were seeing with MONITOR ALIGN.)

Actually, for a while today I thought that SET_ALIGN_REPORT was totally broken on VMS 8.4. Then I realized that after you do SDA> FLT LOAD that SET_ALIGN_REPORT ON is ignored until you do SDA> FLT UNLOAD.

I also figured out that to catch these with the debugger I first have to call SYS$START_ALIGN_FAULT_REPORT, which must be what the SDA FLT code does.

Adding a pad word to our code would not be easy for us. The 7-word NUMTIM array is being passed into a subroutine that is linked into hundreds of programs, and for many of them this NUMTIM argument is part of a larger structure that is mirrored to disk.

So instead we decided to rewrite that subroutine today. I used some old code that I wrote before LIB$CVT_VECTIM was available on VMS.
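For readers wondering what a hand-rolled replacement involves, here is a minimal sketch of the conversion itself. It is not the code we put online (that is Fortran and is not shown here); it is Python for illustration only, the function name is invented, and it relies only on two well-known facts: a VMS binary time counts 100-nanosecond units from 17-NOV-1858 00:00, and delta times are stored as negative values. It does no range checking of the input fields.

```python
from datetime import datetime

# Illustrative sketch, not the production routine.
VMS_EPOCH = datetime(1858, 11, 17)   # base date of the VMS binary time format
TICKS_PER_SECOND = 10_000_000        # 100 ns units
TICKS_PER_HUNDREDTH = 100_000        # 1/100 second in 100 ns units


def numtim_to_vms_time(numtim):
    """Convert a 7-field NUMTIM tuple to a 64-bit style VMS time value."""
    year, month, day, hour, minute, second, hundredths = numtim

    if year == 0 and month == 0:
        # Delta time: days plus hh:mm:ss.cc, stored as a negative value.
        seconds = ((day * 24 + hour) * 60 + minute) * 60 + second
        return -(seconds * TICKS_PER_SECOND + hundredths * TICKS_PER_HUNDREDTH)

    # Absolute time: ticks elapsed since the VMS base date.
    elapsed = datetime(year, month, day, hour, minute, second) - VMS_EPOCH
    seconds = elapsed.days * 86_400 + elapsed.seconds
    return seconds * TICKS_PER_SECOND + hundredths * TICKS_PER_HUNDREDTH


print(numtim_to_vms_time((1858, 11, 18, 0, 0, 0, 50)))
print(numtim_to_vms_time((0, 0, 1, 0, 0, 0, 0)))
```

Because each field is read individually, a routine written this way only ever needs word alignment for its input, which is why it does not care whether the NUMTIM sits at an odd-longword address.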

Our user-mode alignment faults on our IA64 box dropped to almost zero when we put the new version of the application online.

However we are still getting executive-mode alignment faults at a rate of almost 4,000/second. These all occur at the same place within RMS - see the attached output from SDA> FLT SHOW TRACE.

Does anyone know how to stop these faults?
I have one, but it's personal.
Hein van den Heuvel
Honored Contributor

Re: alignment faults

>>> However we are still getting executive-mode alignment faults at a rate of almost 4,000/second.
>>> These all occur at the same place within RMS - see the attached output from SDA> FLT SHOW TRACE.

>>> Does anyone know how to stop these faults?

Yes,

1) Ask HP for a special RMS image. There is one.
or
2) Undo Update 12
or
3) Disable Global buffers.
or
4) Wait for the next HP update cycle

or just ignore them for now at that low rate.
Alignment faults are only a serious issue when they become the cause of Memory Management MPsync.

See C.O.V discussion: "V8.3-1H1 SYS V12 patch causes exec alignment faults using RMS Global buffers"

http://groups.google.com/group/comp.os.vms/browse_thread/thread/140fce225a83960e

Hein
Jess Goodman
Esteemed Contributor

Re: alignment faults

Hein,

Thanks once again. I already got the new RMS.EXE from VMS support. After rebooting, my exec-mode alignment faults have dropped significantly; the peak is now only around 200/sec.

FYI: For VMS 8.4 it was Update 4 that introduced this problem.

I am still wondering why the LIB$CVT_VECTIM faults do not show up with SET_ALIGN_REPORT ON.

Jess
I have one, but it's personal.