Re: Memory dump again.

 
SOLVED
Darijo
Frequent Advisor

Memory dump again.

OK, so we got a "new" Alpha 1000, and the application that was working with the <DIGITAL 21164 PICMG SBC 5/500> configuration is now not working with it.
The whole installation goes well, but the first time I boot the machine I get a memory dump when it tries to load that application.

The CPU then halts and drops back to SRM.
Can I bypass the init scripts and start the application manually?

Problem/output:

-----------------------------------------------

-----------------Starting MCC Version 008

Job BOOTCHECK (queue SYS$BATCH, entry 3) started on SYS$BATCH

End of AEC_STARTUP at 16:20:50.28

Job STARTUP$1 (queue SYS$BATCH, entry 4) pending
pending status caused by queue busy state

**** OpenVMS (TM) Alpha Operating System V7.1-1H2 - BUGCHECK ****
** Bugcheck code = 000001CC: INVEXCEPTN, Exception while above ASTDEL
** Crash CPU: 00 Primary CPU: 00 Active CPUs: 00000001
** Current Process = BATCH_2
** Image Name = MZR1P1$DKB0:[SYS0.SYSCOMMON.][SYSEXE]SYSMAN.EXE
**** Starting compressed selective memory dump at 7-JUN-2008 16:21...
...........................................................
...Complete ****
------------------------------
35 REPLIES
Andy Bustamante
Honored Contributor

Re: Memory dump again.

>>> Can I bypass init scripts and start application manually?

That depends. We don't know the application or what its environment should be.

An AlphaServer 1000 is a fairly dated system, and it appears you're running VMS 7.2-1H2. The first thing I would do is run "$ analyse/crash" and see if there's any useful data in the crash dump. Second, check the patches on the "old" 1000 against the "new" 1000. User account privileges?

If none of this provides a useful result, I would shut the 1000 down to SRM and run console diagnostics, memexer for example. New old hardware sometimes takes a bit of debugging.

Andy
If you don't have time to do it right, when will you have time to do it over? Reach me at first_name + "." + last_name at sysmanager net
Darijo
Frequent Advisor

Re: Memory dump again.

>>>That depends. We don't know the application and what it's environment should be.

Actually, the application is made up of several processes...

>>>An Alphaserver 1000 is fairly dated system and it appears you're running VMS 7.2-1H2.

But shouldn't it depend on the processor's ISA???

>>>First thing I would start with is with "$ analyse/crash" and see if there's any useful data in the crash dump.

I can't access VMS because of that crash...

>>Second, check patches on the "old" 1000 against the "new" 1000. User account privileges?

This is the first time I have used an AlphaServer 1000 to replace that newer configuration.
The thing is, we have one special fibre-optic adapter (ISA/EISA) that uses non-standard, vendor-specific protocols to communicate with some machines. Therefore we cannot replace it with newer Alphas that have a PCI system bus, and our supplier is asking an enormous amount of money for the <DIGITAL 21164 PICMG SBC 5/500>.
So we found a couple of old Alphas in storage to replace the <21164 PICMG SBC 5/500>.

Bojan Nemec
Honored Contributor

Re: Memory dump again.

Darijo,

You can do a "conversational boot":

>>> boot -fl 0,1
SYSBOOT> SET STARTUP_P1 "MIN"
SYSBOOT> CONTINUE

You can add the device name at the end of the boot command.
This boots VMS but does not execute SYSTARTUP_VMS.COM.

When you are finished, you must set STARTUP_P1 back to "":

$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SET STARTUP_P1 ""
SYSGEN> WRITE CURRENT

Bojan
Darijo
Frequent Advisor

Re: Memory dump again.

Thank you, Bojan... I'll try this first thing in the morning when I get back to work.

Hoff
Honored Contributor
Solution

Re: Memory dump again.

[[[[>>>That depends. We don't know the application and what it's environment should be.

Actually there are more processes making that application...]]]]

It's a batch job that is blowing up, which means it's something you're doing in or after startup.

[[[[>>>An Alphaserver 1000 is fairly dated system and it appears you're running VMS 7.2-1H2.]]]]

Technically, the quite buggy V7.1-1H2 release. V7.1-2 was released as a way to roll up and install all of the ECO kits that existed for V7.1, V7.1-1H1 and V7.1-1H2, and PCSI was implemented as a way to better manage ECOs.

[[[[But shouldn't it depend on the processor's ISA???]]]]

The instruction set is relevant, and so are the details of the system platform configuration; the devices and hardware that are (often uniquely) involved in the platform. There was far more to booting a new Alpha platform than which EV processor was used.

Here, the official support is "v6.2-1H3, or v7.1 or later", which means OpenVMS Alpha should work on this box.

[[[[>>>First thing I would start with is with "$ analyse/crash" and see if there's any useful data in the crash dump.

I can't access VMS because of that crash...]]]]

Sure you can. Swap the disk over and use the other box to analyze the crash.
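
A minimal sketch of that approach, under assumptions: the system disk from the crashing box appears on the working system as DKB100: (a hypothetical device name; use whatever SHOW DEVICE reports), and the dump file sits in the SYS0 root, as the [SYS0...] path in the crash banner suggests. SDA normally expects the analyzing system to run the same OpenVMS version as the one that crashed, so expect warnings if the versions differ.

```dcl
$ ! DKB100: is a hypothetical device name for the disk moved from the crashed system.
$ SHOW DEVICE DK                          ! find the unit the moved disk appears as
$ MOUNT/OVERRIDE=IDENTIFICATION DKB100:   ! private mount, no volume label needed
$ ANALYZE/CRASH_DUMP DKB100:[SYS0.SYSEXE]SYSDUMP.DMP
SDA> SHOW CRASH
SDA> SHOW STACK
SDA> EXIT
$ DISMOUNT DKB100:
```

SHOW CRASH summarizes the bugcheck and the process that was current, and SHOW STACK gives the call frames to walk back through.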

>>Second, check patches on the "old" 1000 against the "new" 1000. User account privileges?

[[[[This is the first time I have used AlphaServer 1000 to replace that newer configuration.]]]]

You're technically using an AlphaServer 1000A 5/333 here, based on what I see listed in this thread -- there are some differences between the AlphaServer 1000 and AlphaServer 1000A boxes. Graphics support is better, IIRC.

[[[[Thing is that we have one special fibre optic adapter (ISA/EISA) which uses non-standard, vendor-specific protocols for communication with some machines. Therefore we cannot replace it with newer Alphas with PCI sys. bus and our supplier is asking for an enormous amount of money for < DIGITAL 21164 PICMG SBC 5/500>.
So we found a couple of old Alphas in storage to replace the <21164 PICMG SBC 5/500>]]]]

I'd be willing to bet that the device driver for that device is what is blowing up here, too. It's probably a SYSMAN I/O connect in that batch job, and if you wander up the stack, you'll find it's connecting the driver.
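
One way to check that guess, sketched under assumptions: after a minimum boot, search the startup command procedures for SYSMAN I/O commands. The directories below are the standard site-specific startup locations; where the application kit actually placed its own procedure (AEC_STARTUP in the console output) is not known from this thread, so the file specifications may need adjusting.

```dcl
$ ! Look for driver connect/load commands in the startup procedures.
$ ! "IO CONN" also matches the full spelling "IO CONNECT".
$ SEARCH SYS$MANAGER:*.COM,SYS$SYSROOT:[SYS$STARTUP]*.COM "IO CONN","IO LOAD"
$ ! Show five lines of context before and after each hit.
$ SEARCH/WINDOW=(5,5) SYS$MANAGER:*.COM "IO CONN"
```

If a hit names the adapter's driver, running that one command by hand after a minimum boot should show whether it alone triggers the bugcheck.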

Do you have specifications and/or driver source code for the host view of the adapter? Or is it a complete buy-out? Debugging an existing driver or writing a new one can range from easy to ugly. Specs and/or samples make the process a whole lot easier.

Stephen Hoffman
HoffmanLabs LLC

Darijo
Frequent Advisor

Re: Memory dump again.

---------------------------------------------
>>>I'd be willing to bet that the device driver for that device is what is blowing up here, too.
---------------------------------------------

I was thinking the same thing...

---------------------------------------------
>>>Do you have specifications and/or source code driver for the host view of the adapter?
---------------------------------------------

No, I don't...
The whole package comes on one CD (VMS + application kit) and everything is installed together.
Sadly, I'm not a VMS expert, nor do we have one here.
Too bad there is an ocean between us :)

Anyway, the vendor of that adapter is some German firm that doesn't provide any support for that hardware. So I guess it was made specifically for the needs of these machines.

---------------------------------------------
>>>Debugging existing or writing a new driver can range from easy to ugly. Specs and/or samples make the process a whole lot easier.
---------------------------------------------

Ufff... I'm not paid that well ;)
...but I would love to check things under the hood.

So what are my options?
I don't know whether this makes any difference, but that adapter was in an ISA slot and is now in an EISA slot... and if I remember correctly, hardware made for ISA is also compatible with 32-bit EISA...

Also, I did this:

---------------------------------------------
$ dir *.dmp*

Directory SYS$SYSROOT:[SYSEXE]

SYS$ERRLOG.DMP;1 SYSDUMP.DMP;1

Total of 2 files.
$ analyse/crash
_Dump File: SYSDUMP.DMP;1



OpenVMS (TM) Alpha system dump analyzer
...analyzing a compressed selective memory dump...

%SDA-W-SDALINKMISM, link time of SYS$BASE_IMAGE built into SDA$SHARE (19-OCT-1998 23:37) does not match link time of image in system dump (20-OCT-1998 11:36)
Dump taken on 7-JUN-2008 16:21:48.32
INVEXCEPTN, Exception while above ASTDEL

SDA>

---------------------------------------------

Darijo
Frequent Advisor

Re: Memory dump again.

Well... if it helps:

SDA> show stack

Process Stacks (on CPU 00)
--------------------------
Current Operating Stack (KERNEL):
00000000.7FFA1C08 00000000.00001100 UCB$M_UNLOAD+00100
00000000.7FFA1C10 00000000.00000001
00000000.7FFA1C18 00000000.00000002
00000000.7FFA1C20 FFFFFFFF.FFE040B8
SP => 00000000.7FFA1C28 00000000.7FFA1DF8
00000000.7FFA1C30 00000000.7FFA1D48
00000000.7FFA1C38 00000000.000001C8
00000000.7FFA1C40 00000000.00000050
00000000.7FFA1C48 00000000.00000210 BUG$_MACHINECHK
00000000.7FFA1C50 00000000.002A86CC
00000000.7FFA1C58 00000000.00000000
00000000.7FFA1C60 00000000.00000009
00000000.7FFA1C68 FFFFFFFF.80C301F8 MMG$ALLOC_SVA_MAP
00000000.7FFA1C70 FFFFFFFF.00000250 BUG$_NETRCVPKT
00000000.7FFA1C78 00000000.00000001
CHF$IS_MCH_ARGS 00000000.7FFA1C80 00000000.0000002C
CHF$PH_MCH_FRAME 00000000.7FFA1C88 00000000.7FFA1E90


Press RETURN for more.
SDA>
John Travell
Valued Contributor

Re: Memory dump again.

Darijo,
There is something you can do that will help us to help you. It is quite likely to confirm the suspicion that the crash occurs when SYSMAN configures the device driver for your fibre optic adapter.
At the SDA> prompt do:
SDA> set out sys$login:cluecrash.txt
SDA> clue crash
SDA> clue stack
SDA> clue register
SDA> clue config
SDA> set out tt:
SDA> Exit
then post cluecrash.txt as an attachment.

You may get suggestions to upgrade, but unless your vendor has shown that their driver works on later versions of VMS you may be stuck on V7.1-*.
While not a universal truth, if it works on V7.1-1H2, then generally it will also work on V7.1-2.

Question to everyone: does anyone have any information on how the I/O mapping is done on the particular SBC this fibre-optic adapter works in, as compared to the AlphaServer 1000(A?)? Could the driver be trying to access registers that are at different relative addresses in the 'new' machine?
JT:
Darijo
Frequent Advisor

Re: Memory dump again.

John,

I'm currently running VMS without any services or drivers... so I can't transfer the file via FTP to my laptop to send it as an attachment.

I did everything as you told me, but it won't write the output of 'SDA> clue register' to the file, since that command doesn't exist on this version of VMS.
I looked in the help for CLUE... but found nothing.

I tried to start UCX manually to get networking working, but then I also get a dump... this time in some UCX$INET_ACP process.

--------------------------------------------------------------------
$ @UCX$STARTUP.COM;1
%JBC-E-JOBQUEDIS, system job queue manager is not running
%JBC-E-JOBQUEDIS, system job queue manager is not running
%UCX$PPP-I-INFO, Loading PPP Drivers and CallBack
%RUN-S-PROC_ID, identification of created process is 00000209
The Internet driver and ACP were successfully loaded.
%%%%%%%%%%% OPCOM 8-JUN-2008 12:23:06.78 %%%%%%%%%%%
Message from user INTERnet on MZR1P1
INTERnet Loaded

%UCX-I-SETLOCAL, Setting domain and/or local host
%UCX-I-SETPROTP, Setting protocol parameters
%UCX-I-STARTCOMM, Starting communication
%%%%%%%%%%% OPCOM 8-JUN-2008 12:23:07.42 %%%%%%%%%%%
Message from user INTERnet on MZR1P1
INTERnet Started

%UCX-I-DEFINTE, Defining interfaces

**** OpenVMS (TM) Alpha Operating System V7.1-1H2 - BUGCHECK ******* keyboard n.

** Bugcheck code = 000001CC: INVEXCEPTN, Exception while above ASTDEL
** Crash CPU: 00 Primary CPU: 00 Active CPUs: 00000001
** Current Process = UCX$INET_ACP
** Image Name =
**** Starting compressed selective memory dump at 8-JUN-2008 12:23...
.......................
...Complete ****

halted CPU 0

halt code = 5
HALT instruction executed
PC = ffffffff8006df00
--------------------------------------------------------------------