
Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

 
Jeremy Begg
Trusted Contributor

BADVECTOR bugcheck following VMS 7.3-2 upgrade

Hi,

We are in the process of upgrading an AlphaServer 8200 from OpenVMS V7.2-1 to OpenVMS V7.3-2.

We first installed COBRTL 2.8 and FORRTL 7.5 from the layered products library CD-ROM because we knew from past experience that 7.3-2 upgrades had caused problems without those RTLs installed first.

Having done that, we booted from the OpenVMS V7.3-1 CD-ROM and upgraded the system to 7.3-1.

We then shut down the system and rebooted (to let the post-upgrade procedures run), then shut down again and booted from the V7.3-2 CD-ROM.

We upgraded to VMS 7.3-2, rebooted the system, then installed the PCSI V5 patch update and logged out.

We logged in again, installed the VMS732_UPDATE V19 kit, and rebooted. This is when we ran into a serious problem.

The system begins to boot, displays the VMS banner, displays some information about the RAID disks, then issues a BUGCHECK code 704: "BADVECTOR, inconsistency in system service vector table". The current process is SWAPPER.

It writes a compressed dump and then tries to reboot, repeating ad infinitum.

Has anyone seen this problem before?

Thanks,
Jeremy Begg
Volker Halle
Honored Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Jeremy,

If you can, please escalate this problem to HP! There is most likely a problem in one of the patches in VMS732_UPDATE-V1900.

Could you provide a CLUE file from this crash as an attachment?

Boot from the V7.3-2 CD, mount the upgraded disk and issue the following commands and capture the output:

$ SET TERM/WIDTH=132
$ ANAL/CRASH system_disk:[SYSn.SYSEXE]
SDA> CLUE CRASH
SDA> CLUE REGISTER
SDA> CLUE STACK
SDA> CLUE CONF
SDA> CLUE MEM/STAT
SDA> EXIT

This bugcheck is declared when an inconsistency is detected while loading and connecting a system service.

It would be most interesting to find out which system service load failed, and from which execlet.

Look for old .EXE files in SYS$SPECIFIC:[SYS$LDR]. Are you using any third-party privileged-code product that provides execlets?

Volker.
Jeremy Begg
Trusted Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

My thoughts exactly, and a fault report has been logged with HP. Unfortunately this site does not have weekend coverage so resolution is going to have to wait until Monday.

The CLUE commands are a good idea ... but too late I'm afraid as we've already started to restore the system disk from BACKUP. The plan now is to reload VMS 7.3-1 then VMS 7.3-2 as before, then apply PCSI V5 and UPDATE V16.

(Updates 17 & 18 seem to have been a problem for other sites, and V16 is a year old now -- hopefully old enough that it won't contain whatever has broken in V19!)

If we still get a bugcheck after applying V16 I'll definitely do the CLUE thing.

Thanks,
Jeremy Begg
Volker Halle
Honored Contributor
Solution

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Jeremy,

I tested an upgrade from V7.3-2 SSB + PCSI-V0300 by installing VMS732_PCSI-V0500 + VMS732_UPDATE-V1900 on my PersonalAlpha and it worked fine. No problems after reboot.

This makes it much more likely that the problem lies in the contents of your system disk. Please check for SYS$SPECIFIC:[SYS$LDR]*.EXE files from older versions. Also check the contents of SYS$UPDATE:VMS$SYSTEM_IMAGES.IDX.
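
One way to run the check suggested above is a plain directory listing with creation dates, so that leftovers from older versions stand out. This is only a sketch: the second command assumes that stray execlets might also be reachable through SYS$LOADABLE_IMAGES, which the post itself does not say.

```dcl
$ ! List execlets in the system-specific loader directory with creation date and size
$ DIRECTORY/DATE=CREATED/SIZE=ALL SYS$SPECIFIC:[SYS$LDR]*.EXE
$ ! Assumption: also scan everything visible via SYS$LOADABLE_IMAGES;
$ ! files dated years before the current release are the ones to look at
$ DIRECTORY/DATE=CREATED/SIZE=ALL SYS$LOADABLE_IMAGES:*.EXE
```

Anything with a creation date far older than the installed version, and not belonging to a product still in use, is a candidate for closer inspection.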

I'm still interested in the CLUE file. I would like to work out a couple of SDA commands that show which system service or execlet was being loaded, to make this kind of crash easier to diagnose.

Volker.
Jeremy Begg
Trusted Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Hi Volker,

We reloaded from backup, upgraded to VMS 7.3-2, and tried the UPDATE V16 kit instead. The BUGCHECKs continued so I ran the CLUE commands you suggested, and found the crash was in a POSIX$KERNEL routine.

At this point I was advised that the POSIX product had been installed a long time ago (the files date from December 1995) and found these two images in SYS$LOADABLE_IMAGES:

POSIX$CFS_SERVICES.EXE;1 1101/1107 19-DEC-1995 09:35:51.52
POSIX$KERNEL.EXE;1 1390/1395 19-DEC-1995 09:35:49.76

I renamed them to .EXE-BAD and rebooted the system. Much to our relief it no longer crashed and so I conclude there's an inconsistency between the old POSIX software environment and one of the patch updates issued after OpenVMS V7.3-2 was shipped.

The only sign now of any lingering problem is these error messages when the system reboots:

ICES.EXE
%EXECINIT-W- Couldn't load POSIX$CFS_
%EXECINIT-E-LOADERR, error loading POSIX$CFS_SERVICES.EXE status = 00000910

RNEL.EXE
%EXECINIT-W- Couldn't load POSI
%EXECINIT-E-LOADERR, error loading POSIX$KERNEL.EXE status = 00000910
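
The status value in these LOADERR lines can be translated into message text with the F$MESSAGE lexical function. A minimal sketch follows; the expectation (an assumption, not something stated in the thread) is that %X00000910 is the system "no such file" condition, which would fit the two images having been renamed.

```dcl
$ ! Translate the status value from the LOADERR message into its message text
$ WRITE SYS$OUTPUT F$MESSAGE(%X00000910)
```

If the text does indicate a missing file, the boot-time loader is still being told to load the two images and simply cannot find them.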

I can only assume there's a control file somewhere that still refers to those two POSIX images.

Regards,
Jeremy Begg
Volker Halle
Honored Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Jeremy,

I would have expected 'system version mismatch' messages when trying to load these old POSIX images. Maybe you can check the console messages from a boot with V7.2-1.

Assuming there is no (working) de-installation procedure to deinstall POSIX, you might want to do the following:

To remove these files from SYS$UPDATE:VMS$SYSTEM_IMAGES.IDX, you need to issue the following commands:

$ MC SYSMAN SYS_LOADABLE REMOVE filename
for both images and then:

$ @SYS$UPDATE:VMS$SYSTEM_IMAGES.COM

This will rebuild the .IDX file, which OpenVMS uses during boot.

If possible, please send me the CLUE file from the crash or post it as an attachment. I'm still collecting OpenVMS crash footprints ;-)

Volker.
Jeremy Begg
Trusted Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Hi Volker,

I'm a little confused. The VMS$SYSTEM_IMAGES.IDX file appears to be a log of images which didn't load -- not a control file telling VMS what to load. At present this file contains three lines:

DECRAM DECRAM$EXECLET SSsystem image DECRAM$EXECLET load failed

_LOCAL_ POSIX$CFS_SERVICES IWCouldn't load POSIX$CFS_SERVICES

_LOCAL_ POSIX$KERNEL IWCouldn't load POSIX$KERNEL

Perhaps 'SYS_LOADABLE REMOVE imagname' writes an entry to the .IDX file telling VMS *not* to load that file?

And no, we can't go back to V7.2-1.
Hoff
Honored Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Jeremy, Volker gave you the correct answer.

Please issue the SYSMAN commands.

The SYSMAN commands write to the indexed file, and update it, and tell OpenVMS to not load those files.

That data file is read-write, and it gets updated with the status from the last bootstrap.
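
For the two images listed in the .IDX file above, the sequence might look like the sketch below. It assumes that SYS_LOADABLE REMOVE takes a product name followed by an image name, and that the product is _LOCAL_, as the .IDX entries suggest; check both against the SYSMAN help on your own system before running it.

```dcl
$ ! Assumption: syntax is SYS_LOADABLE REMOVE product image, with the product
$ ! name _LOCAL_ taken from the entries shown in VMS$SYSTEM_IMAGES.IDX
$ MCR SYSMAN SYS_LOADABLE REMOVE _LOCAL_ POSIX$CFS_SERVICES
$ MCR SYSMAN SYS_LOADABLE REMOVE _LOCAL_ POSIX$KERNEL
$ ! Then run the command procedure so the change takes effect at the next boot
$ @SYS$UPDATE:VMS$SYSTEM_IMAGES.COM
```

After the next reboot, the two "Couldn't load" warnings should no longer appear if the entries were removed.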
Volker Halle
Honored Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Jeremy,

The message you see in VMS$SYSTEM_IMAGES.IDX is the one OpenVMS will print if it FAILS to load the execlet.

If you are not using a console manager application and never recorded the OpenVMS V7.2-1 boot and startup messages, then we will never know whether these messages were already present with V7.2-1. No need to boot your old system again.

Before doing an OpenVMS system upgrade, it is advisable to record the console output from booting the system at least once. This will always tell you whether errors showing up after the upgrade are NEW or OLD problems.

May I ask again for the CLUE file from the BADVECTOR crash?

Volker.
Jur van der Burg
Respected Contributor

Re: BADVECTOR bugcheck following VMS 7.3-2 upgrade

Minor nit:

> $ @SYS$UPDATE:VMS$SYSTEM_IMAGES.COM
> This will rebuild the .IDX file, which OpenVMS uses during boot.

Almost. SYSMAN writes the .IDX file, and the command procedure will write sys$common:[sys$ldr]vms$system_images.data which is the real file used during boot.

I admit, it's hard to see a mistake from Volker... :-)

Jur.