Operating System - OpenVMS

INVEXCEPTN, Exception while above ASTDEL

BG Jeong
Advisor

Hello all.

My DS20 cluster system suddenly crashed. After it rebooted, I tried to find out why it crashed, but I didn't find any hardware problems.

Attached is the SDA CLUE CRASH output.


Thanks in advance
Tru64 from Korea
4 REPLIES
Hein van den Heuvel
Honored Contributor

Re: INVEXCEPTN, Exception while above ASTDEL

Well, the error is

%SYSTEM-F-ROPRAND, reserved operand fault at

At that point in the code the system is getting an IRP from the system-wide queue.
It has done so billions of times, so the code is reasonable, but the queue data itself must have been bad.

This could 'easily' be a totally unrelated piece of (privileged) code accidentally stomping on the queue header.
So IOPOST could be a victim or a cause here.

What changed recently?

New products installed?
(Oracle) product upgrades?
New hardware interfaces?

I would take the event as an excuse to upgrade to 7.3-2. The code in the IOC$IOPOST space changed significantly and, at first glance, protects itself better.

A real support engineer would look at the data pointed to by the register contents and IOC$GQ_POSTIQ to get a better picture of how/why it tripped.

fwiw.
Hein van den Heuvel
Hoff
Honored Contributor

Re: INVEXCEPTN, Exception while above ASTDEL

As Hein writes, I/O Postprocessing blew up. A detailed look at the crashdump "carcass" might reveal more.

There could be any number of triggers here, from buggy kernel software to a latent OpenVMS driver error to an Oracle issue to the potential for an underlying hardware error.

Apply the current ECOs to OpenVMS V7.2-1 (ancient, unsupported), to Oracle, and to any other kernel-mode code present. Consider upgrading to more current (and supported) versions of these packages, and (if the problem persists) escalate to support. If you do escalate, you'll probably be asked to upgrade, as V7.2-1 is no longer supported.

HP support might be willing to run the CLUE CRASH against a database or three, if you have a support contract. (But I'll bet they'll simply tell you to do what I've just told you to do here, too. They're unlikely to be willing to fix new bugs in old releases. So if this is buggy software and not fixed by an existing ECO, you're probably going to be upgrading...)

Robert Brooks_1
Honored Contributor

Re: INVEXCEPTN, Exception while above ASTDEL

For anyone within HP who is paying attention, this IO_ROUTINES execlet is from the 121st remedial build for V7.2-1.

Here's the offending line from [SYS]IOCIOPOST.LIS. As diagnosed earlier, the fault has to be due to a corrupt queue:

$REMQHI_R IOC$GQ_POSTIQ,R5,- ; Get IRP from system-wide queue
ENTRY_REMOVED=60$ ; If entry removed then go process it
; (Uses: R0, R1, R5, R22, R23)
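
To illustrate why a remove-from-head can fault on bad queue data, here is a minimal Python model of a self-relative queue. This is only a sketch, not the OpenVMS code: it assumes longword links stored as displacements from the entry's own address and quadword-aligned entries, and the names remove_head, ReservedOperandFault and the addresses are made up for the example.

```python
QUADWORD = 8  # assumed alignment requirement for header and entries


class ReservedOperandFault(Exception):
    """Hypothetical stand-in for the fault raised on a bad queue link."""


def remove_head(memory, header):
    """Remove and return the address of the first entry, or None if empty.

    memory maps an address to a (flink, blink) pair; each link is a
    displacement relative to the address of the entry that holds it.
    """
    flink, blink = memory[header]
    if flink == 0:
        return None  # empty queue: header links to itself
    entry = header + flink
    if entry % QUADWORD or entry not in memory:
        raise ReservedOperandFault(hex(entry))
    successor = entry + memory[entry][0]
    if successor % QUADWORD or successor not in memory:
        raise ReservedOperandFault(hex(successor))
    if successor == header:
        memory[header] = (0, 0)  # queue is now empty
    else:
        memory[header] = (successor - header, blink)
        memory[successor] = (memory[successor][0], header - successor)
    return entry


# Header at 0x1000 followed by two entries at 0x2000 and 0x3000.
queue = {
    0x1000: (0x1000, 0x2000),
    0x2000: (0x1000, -0x1000),
    0x3000: (-0x2000, -0x1000),
}
first = remove_head(queue, 0x1000)

# Simulate a stray write over the forward link of the remaining entry.
queue[0x3000] = (0x1235, queue[0x3000][1])
try:
    remove_head(queue, 0x1000)
    faulted = False
except ReservedOperandFault:
    faulted = True
```

The remove code itself is unchanged between the two calls. Only the data differs, which is why the routine that faults is not necessarily the one that did the damage.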

-- Rob
Volker Halle
Honored Contributor

Re: INVEXCEPTN, Exception while above ASTDEL

BG Jeong,

You can check the I/O post-processing queue IOC$GQ_POSTIQ in the dump with:

SDA> vali que/self/list ioc$gq_postiq

and/or

SDA> vali que/self/list/back ioc$gq_postiq

until you find a corrupt (IRP) packet. If you find the address of a corrupt packet, try to format the packet with:

SDA> FORMAT

This may tell you which software or device driver owned that packet and may have been responsible for this problem.

Are you using AMDS/Availability Manager?
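
For readers unfamiliar with what such a queue check involves, the idea can be sketched in Python. This is an assumed, simplified model and not how SDA is implemented: links are displacements relative to the entry holding them, entries are taken to be quadword aligned, and the function name and addresses are hypothetical.

```python
ALIGN = 8  # assumed: entries of a longword self-relative queue are quadword aligned


def validate_self_relative_queue(memory, header, max_entries=10_000):
    """Walk the forward links starting at header.

    memory maps an address to a (flink, blink) pair of self-relative links.
    Returns (entries_checked, None) for a consistent queue, or
    (entries_checked, address) for the first address whose links look wrong.
    """
    current = header
    count = 0
    while count <= max_entries:
        if current not in memory:
            return count, current
        successor = current + memory[current][0]
        # The forward link must lead to an aligned, known address.
        if successor % ALIGN != 0 or successor not in memory:
            return count, current
        # The successor's backward link must lead back to where we came from.
        if successor + memory[successor][1] != current:
            return count, successor
        if successor == header:
            return count, None  # walked the full circle
        current = successor
        count += 1
    return count, current  # the walk never returned to the header


# Header at 0x1000 with two entries at 0x2000 and 0x3000.
good = {
    0x1000: (0x1000, 0x2000),
    0x2000: (0x1000, -0x1000),
    0x3000: (-0x2000, -0x1000),
}
good_result = validate_self_relative_queue(good, 0x1000)

# Same queue after a stray write over the forward link at 0x2000.
bad = dict(good)
bad[0x2000] = (0x1234, -0x1000)
bad_result = validate_self_relative_queue(bad, 0x1000)
```

In this model the walk reports the address where the links stop making sense. That is the packet worth formatting, since its other fields may still identify the owner.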

Volker.