
Seemingly random crashes with VMS 8.3 Alpha

Martin Vorlaender
Honored Contributor


Hi!

Yesterday I replaced a faulty CPU fan in one of our DS10Ls. Since then, the system has been crashing seemingly at random. However, I noticed a pattern: many of the crashes occur at OTS$MOVE_C+00024, but with various bugcheck types in various images. More recently, it seems to crash consistently with "SHADDETINCON, SHADOWING detects inconsistent state" early in the boot.

The configuration is 3 DS10Ls (1 vote each) connected to two EVAs holding the shadowed system and data disks, and a quorum disk (2 votes); OpenVMS Alpha V8.3 with all current ECOs.

Anyone seen something like this before?

I'm appending all CLUE files (the last one generated by hand, as I have not yet tried to reboot the machine).

Thanks,
Martin
Volker Halle
Honored Contributor

Re: Seemingly random crashes with VMS 8.3 Alpha

Martin,

I seem to be unable to download your attachment. Could you mail it to me?

Volker.
Volker Halle
Honored Contributor

Re: Seemingly random crashes with VMS 8.3 Alpha

Martin,

The first obvious question would be: what else did you change (on the system disk) since you last booted that DS10L, or any other node that boots from that disk?

The 19 crashes may seem random, but they show only a few different footprints:

SHADDETINCON, 2 different offsets

SSRVEXCEPT, ACCVIO on S0 address, OTS$MOVEC+00024, RA=EXE$PERSONA_IMPORT_ARB_C+001F4

UNXSIGNAL, seems to go together with stack corruption

If you want to rule out a HW problem: can you boot the other DS10L from that root, and vice versa? Otherwise, this kind of 'random' footprint may point to pool corruption. Perhaps set SYSTEM_CHECK=1?

Volker.
Volker Halle
Honored Contributor
Solution

Re: Seemingly random crashes with VMS 8.3 Alpha

Martin,

EXE$PERSONA_IMPORT_ARB calls OTS$MOVE to copy data from the JIB to the PSB. I would guess that R18 points to the SRC address.

But if you look at the 64-bit values for R18 in the SSRVEXCEPT crashes, you'll notice that the high-order longword of this S0 space address is not always FFFFFFFF!

$ sea *helios*.lis "Argument #3 "

R18 64-bit values seen are:

FFF7F7FF.81FB778C JIB+0000C
FFFFF7FF.8203968C JIB+0000C

Maybe the bad CPU fan has actually caused a more severe HW problem...
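
To make the pattern in the R18 values above explicit, here is a small Python sketch. It is not part of the original crash analysis, and the helper names `parse_quad` and `flipped_bits` are made up for illustration. It assumes that a valid S0 space address is the sign extension of its low longword, so the high longword should be FFFFFFFF whenever bit 31 of the low longword is set, and it lists the bit positions that deviate from that.

```python
def parse_quad(text: str) -> int:
    """Parse a quadword written in SDA style, e.g. 'FFFFFFFF.81FB778C'."""
    high, low = text.split(".")
    return (int(high, 16) << 32) | int(low, 16)


def flipped_bits(value: int) -> list[int]:
    """Return the bit positions (0 = least significant) where `value`
    differs from the sign extension of its own low longword."""
    low = value & 0xFFFFFFFF
    # Assumption: a valid address is the low longword sign-extended to 64 bits.
    expected = low | (0xFFFFFFFF00000000 if low & 0x80000000 else 0)
    diff = (value ^ expected) & 0xFFFFFFFFFFFFFFFF
    return [bit for bit in range(64) if (diff >> bit) & 1]


if __name__ == "__main__":
    for text in ("FFF7F7FF.81FB778C", "FFFFF7FF.8203968C"):
        print(text, flipped_bits(parse_quad(text)))
    # FFF7F7FF.81FB778C [43, 51]
    # FFFFF7FF.8203968C [43]
```

For the two values listed above, this reports bit 43 cleared in both cases, plus bit 51 in the first. A few cleared bits in an otherwise all-ones longword would be consistent with a hardware fault rather than a software bug, though the sketch alone cannot prove that.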

Volker.
Martin Vorlaender
Honored Contributor

Re: Seemingly random crashes with VMS 8.3 Alpha

I cross-booted the machine that crashed with one of the other DS10Ls: the crashes stayed with the hardware and did not follow the system roots. So I assume something else died when the CPU fan failed.

Thanks, Volker.