Re: alphaserver 255 continuously crashing

 
Juan PiNero_1
Advisor

alphaserver 255 continuously crashing

Hi all
My alphaserver is crashing every day with messages like this.

halted CPU 0
halt code = 1
operator initiated halt
pc = ffffffff8011ae04
>>>

If I type "b" it boots without any problem.

There is no dump file at all:
SYS>SET DEF DSA0:[SYS0.SYSCOMMON.SYSERR]
SYS>DIR /SINCE=20-MAR-2009 CLUE*.*;* /DATE
%DIRECT-W-NOFILES, no files found
SYS>

Attached is the ERRLOG.CVT from yesterday (the last crash).

Cheers
Juan


Jess Goodman
Esteemed Contributor

Re: alphaserver 255 continuously crashing

Are there any lines with "BUGCHECK" in them on the console? If not, then your system did not really crash. A keypress on the console keypad may cause the system to halt. There is also a halt button on the front of the server.

Dump files are not created automatically; you must create one and then reboot. The simplest way to do this is to run:
$ @SYS$UPDATE:SWAPFILES
Check your memory size, the SYSGEN parameter DUMPSTYLE, and the VMS documentation to decide what size to make it.
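
A rough sketch of that check, assuming a suitably privileged account. SYSGEN spells the parameter DUMPSTYLE. DUMPBUG is an extra parameter, not mentioned above, that controls whether a dump is written at all on a bugcheck:

```
$ ! Physical memory size, which drives the size of a full dump
$ SHOW MEMORY /PHYSICAL_PAGES
$ ! Current dump-related system parameters
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> SHOW DUMPSTYLE
SYSGEN> SHOW DUMPBUG
SYSGEN> EXIT
```

What the values mean depends on the OpenVMS version, so read them against the documentation for your release.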

Even without a dump file you may get some information after a crash with:
$ TYPE CLUE$HISTORY
(I'm not sure if that file will be updated when there is no dump file.)
I have one, but it's personal.
Jess Goodman
Esteemed Contributor

Re: alphaserver 255 continuously crashing

Oh, just re-read your post. By default the system dump file is:
SYS$SPECIFIC:[SYSEXE]SYSDUMP.DMP
cnb
Honored Contributor
Solution

Re: alphaserver 255 continuously crashing

Check the OPERATOR.LOG files to see who or what is shutting the system down.
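
One way to look, as a sketch only. It assumes the operator log is in its default location, SYS$MANAGER:OPERATOR.LOG, and the search string is a guess at what a shutdown would leave behind:

```
$ ! List all versions of the operator log with their dates
$ DIRECTORY /DATE SYS$MANAGER:OPERATOR.LOG;*
$ ! Search string is an assumption; adjust it to the messages your site logs
$ SEARCH SYS$MANAGER:OPERATOR.LOG;* "shutdown" /WINDOW=(2,5)
```

If nothing turns up around the time of the halt, that points away from an orderly shutdown and toward a console halt or a hardware event.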

Could it be an environmental issue (temp, low input power, etc...)?

The error log wasn't much to go on, apart from an excessive number of earlier device errors from devices attached to PKA0:. The system clock was also way off, so check the TOD battery.

Does it always shut down at the same time of day?

hth,
Juan PiNero_1
Advisor

Re: alphaserver 255 continuously crashing

The dump file does exist, but it was last modified some time ago (27-FEB-2009). The crashes began a couple of days ago.

SYS>dir SYSDUMP.DMP /full

Directory SYS$SPECIFIC:[SYSEXE]

SYSDUMP.DMP;1 File ID: (8280,2,0)
Size: 148071/148086 Owner: [1,1]
Created: 31-MAY-2004 16:46:09.28
Revised: 27-FEB-2009 12:04:06.75 (6)
...
Total of 1 file, 148071/148086 blocks.
SYS>show def
SYS$SPECIFIC:[SYSEXE]
SYS>
Hoff
Honored Contributor

Re: alphaserver 255 continuously crashing

ITRC conspired to eat my previous reply here, so here is a fast re-write of the key items from that vaporized reply.

AlphaStation 255/233 box with hard halt on OpenVMS Alpha V7.3-2.

This can be bad hardware or bad software.

Set the SRM console AUTO_ACTION to restart, as that gets crashdumps written on halt operations.
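
At the SRM console prompt (the >>> shown in the original post), that would look roughly like this, with SHOW before and after to confirm the change:

```
>>> SHOW AUTO_ACTION
>>> SET AUTO_ACTION RESTART
>>> SHOW AUTO_ACTION
```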

Apply mandatory and UPDATE ECO kits to bring to current for OpenVMS V7.3-2. Apply mandatory ECO kits for any kernel-mode software in use here other than OpenVMS. This includes TCP/IP Services ECO kits (or whatever IP stack) as well as any other third-party packages.

Use ANALYZE /ERROR /ELV here, and TRANSLATE /SINCE=YESTERDAY. That'll get you a better list of errors.
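
Spelled out as a single DCL command, a sketch assuming the error log is in its default location, SYS$ERRORLOG:ERRLOG.SYS:

```
$ ! ELV is the error log viewer in OpenVMS V7.3 and later
$ ANALYZE /ERROR_LOG /ELV TRANSLATE /SINCE=YESTERDAY SYS$ERRORLOG:ERRLOG.SYS
```

The same TRANSLATE command can also be typed at the ELV> prompt after starting the utility with ANALYZE /ERROR_LOG /ELV alone.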

Start a new error log: RENAME ERRLOG.SYS ERRLOG-END-27-MAR-2009.OLD or such. That'll help target the error information.

Following all proper personal safety procedures and proper electrostatic protection procedures, plan to unplug the box from power and then re-seat all memory sticks and all PCI widgets, and unplug and plug all cables and connectors. Inside the box and outside. Blow out the inevitably accumulated dust, too.

Also start planning to replace at least some hardware parts here, or (probably more economical) start to plan to move to something newer than this box. This case could be anything from a bad disk to a bad memory stick to a processor error to a bad or loose or ill-terminated cable to buggy driver software... But one thing is certain: this AlphaStation 255 is ancient, and ancient parts will fail.

Juan PiNero_1
Advisor

Re: alphaserver 255 continuously crashing

Hi all
Thanks a lot to everybody for replying. I finally checked the hardware and replaced the TOD battery. Since then this machine has been up for more than one day.

Uptime 1 02:07:04

Let's see how things go today.
Juan PiNero_1
Advisor

Re: alphaserver 255 continuously crashing

Hi cnb

In the end it has crashed twice since I replaced the battery. The funny thing is that it always happened at the same time (more or less).
What does this really mean?

Cheers
Juan
cnb
Honored Contributor

Re: alphaserver 255 continuously crashing

Did it generate a good dump file this time? Check the operator and error logs for any device or error entries before the crash. Did you start a new error log file per Hoff's instructions? If so, please post it. Crashing always at or around the same time might indicate a power spike. Is it on an isolated, properly grounded circuit?
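
To check the dump, something along these lines should do. This is a sketch that assumes the default dump file location given earlier in the thread:

```
$ ! Has the dump file been rewritten since the last crash?
$ DIRECTORY /DATE=MODIFIED SYS$SPECIFIC:[SYSEXE]SYSDUMP.DMP
$ ! If so, open it with SDA and summarize the crash
$ ANALYZE /CRASH_DUMP SYS$SPECIFIC:[SYSEXE]SYSDUMP.DMP
SDA> SHOW CRASH
SDA> CLUE CRASH
SDA> EXIT
```

If the modification date still predates the crash, no dump was written and SDA will only show the older contents.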

FYI, If anyone's answer(s) are helpful please assign points to those who've taken the time to help.

hth,
Kumar_Sanjay
Regular Advisor

Re: alphaserver 255 continuously crashing

Juan,

Try to recall what changed just before this problem started. Have you installed any incompatible hardware or software?


Sanjay Kumar