Operating System - OpenVMS

Re: System Crash of DS 25 after CPU Upgrade

C.Eichert
Advisor

Re: System Crash of DS 25 after CPU Upgrade

Ian,

I have already installed the VMS732_UPDATE-V0300 patch kit.
Please have a look at the CLUE file attached to the reply from 11:39.

Regards
Christoph
Mobeen_1
Esteemed Contributor

Re: System Crash of DS 25 after CPU Upgrade

Christoph,
Have all of your crashes so far had a failing PC in TCPIP? If so, I would suggest that you first rule that out by moving to the latest TCPIP version, or seek help from the vendor on your situation. I understand that in this case the vendor is HP :)

regards
Mobeen
Ian Miller.
Honored Contributor
Solution

Re: System Crash of DS 25 after CPU Upgrade

I had missed that the UPDATE patch was already installed. What is the value of the system parameter MULTITHREAD? If it is now > 1, you could try setting it to 1, which would prevent more than one kernel thread from being active in any process.
____________________
Purely Personal Opinion
C.Eichert
Advisor

Re: System Crash of DS 25 after CPU Upgrade

I am currently no longer connected to the DS25. I have finished work for today (I am in Thailand). I will continue the reply tomorrow. Thank you so far.

Christoph
John Travell
Valued Contributor

Re: System Crash of DS 25 after CPU Upgrade

Please excuse the lengthy reply. I have included extracts from the CLUE file to support my conclusion.

Crash Time: 10-FEB-2005 00:51:27.29
Bugcheck Type: INVEXCEPTN,
CPU Type: AlphaServer DS25
VMS Version: V7.3-2

Crash/Primary CPU: 01/00 <<< crash on CPU #1

Signal Array:
Arg Count = 00000005
Condition = 0000000C <<< access violation
Argument #2 = 00000000
Argument #3 = 00000020 <<< failed Virtual address
Argument #4 = 8062BFE4 <<< Error PC.
Argument #5 = 00000800


Failing Instruction:
TCPIP$INTERNET_SERVICES+97FE4: SUBL R7,R17,R7

This instruction does not involve ANY memory access at all. It is NOT POSSIBLE for an ACCESS VIOLATION to occur if this code is executed correctly.
(Other types of exceptions, maybe, but not an ACCVIO!)

Current Registers: PCB: 8164E480 (CPU 1)

R7 = 00000000.00000028
R17 = 00000000.00000014

Nothing exceptional in the registers, so no reason for an exception.

CPU#1 did not correctly execute the code present in its Istream.
Looks like hardware to me.

System Information:
System Type AlphaServer DS25
Cycle Time 1.0 nsec (1000 MHz)
CPU ID 01
CPU Type EV68CB Pass 2.4 (21264C)
PAL Code 1.98-42
CPU Revision ....
Serial Number JA40701068
Console Vers V6.9-2

Based on this crash I would place CPU#1 under suspicion.
However, while unlikely, please bear in mind that the problem could also be on the main system board, in the slot support logic.
Try swapping over the CPUs. If the crashes then occur on CPU#0, you have a diagnosis.

Alternatively, show us some more clue files.

As always, if you cannot get HP to investigate the crashes and are willing to pay a moderate sum, send them to me.

John Travell, john@jomatech.com, http://www.jomatech.com
Volker Halle
Honored Contributor

Re: System Crash of DS 25 after CPU Upgrade

I fully agree with John!

Before you try any wild guesses about patches etc., just have a look at the crash information in the CLUE file.

If it's an exception-related crash and the failing instruction, the register contents and the data in the signal array do NOT MATCH, it must be hardware or firmware.

As John pointed out, a SUBL instruction only referencing registers CANNOT generate any access violation.
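The consistency check described above can be sketched in Python. This is only an illustration, not a tool anyone in this thread uses: the function names, the (incomplete) list of memory-referencing Alpha mnemonics, and the rule that a fault address equal to the PC counts as an instruction-fetch fault are all assumptions. The sample text is taken from the CLUE extract John quoted.

```python
import re

SS_ACCVIO = 0x0C  # condition value shown in the CLUE extract for an access violation

# Assumption: a small, incomplete set of Alpha mnemonics that reference memory.
MEMORY_MNEMONICS = {
    "LDL", "LDQ", "LDBU", "LDWU", "LDQ_U", "LDL_L", "LDQ_L",
    "STL", "STQ", "STB", "STW", "STQ_U", "STL_C", "STQ_C",
    "LDS", "LDT", "LDF", "LDG", "STS", "STT", "STF", "STG",
}

FIELD = re.compile(
    r"^\s*(Arg Count|Condition|Argument #\d+)\s*=\s*([0-9A-Fa-f]+)", re.M
)

def parse_signal_array(text):
    """Map each signal array field name to its value (parsed as hex)."""
    return {name: int(value, 16) for name, value in FIELD.findall(text)}

def parse_failing_instruction(line):
    """Split 'IMAGE+offset: MNEMONIC operands' into (mnemonic, operands)."""
    _, _, instr = line.partition(":")
    parts = instr.split(None, 1)
    operands = parts[1] if len(parts) > 1 else ""
    return parts[0].upper(), operands

def accvio_is_plausible(signal, instruction_line):
    """Return False when an ACCVIO is reported on a register-only instruction."""
    if signal.get("Condition") != SS_ACCVIO:
        return True  # this sketch only judges access violations
    # Assumption: fault address == PC means the instruction fetch itself faulted.
    if signal.get("Argument #3") == signal.get("Argument #4"):
        return True
    mnemonic, operands = parse_failing_instruction(instruction_line)
    # Memory operands in the disassembly look like 'disp(Rn)'.
    return mnemonic in MEMORY_MNEMONICS or "(" in operands

SAMPLE = """
Arg Count = 00000005
Condition = 0000000C <<< access violation
Argument #2 = 00000000
Argument #3 = 00000020 <<< failed Virtual address
Argument #4 = 8062BFE4 <<< Error PC.
Argument #5 = 00000800
"""
FAILING = "TCPIP$INTERNET_SERVICES+97FE4: SUBL R7,R17,R7"

signal = parse_signal_array(SAMPLE)
print(accvio_is_plausible(signal, FAILING))
```

For the crash above the check reports the access violation as not plausible, which is the mismatch that points towards hardware or firmware rather than software.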

Look at the crash information in the CLUE files from the other crashes with the same focus. Does the failing instruction make sense, given the register contents etc.? If not, which CPU was the crashing one?

Volker.
C.Eichert
Advisor

Re: System Crash of DS 25 after CPU Upgrade

Mobeen,

Some of the crashes show the module TCPIP, but not all. Please have a look at the attached CLUE$HISTORY.


Ian,

The system parameter MULTITHREAD was set to 2. Following your recommendation I changed it to 1. Does this have an influence on our application? It is linked with the Fortran linker flags /THREADS_ENABLE=(MULTIPLE_KERNEL_THREADS,UPCALLS)/noinfo.


John,
Volker,

The crashes happened mostly on CPU 01, but sometimes on CPU 00. Following your proposal I will swap them today. The complete history is in the attached file. If you need particular CLUE files, please let me know.


To all,

HP recommends installing the following patches:
VMS732_BACKUP-V0300
VMS732_CPU270F-V0100
VMS732_MQ-V0100
VMS732_PTHREAD-V0200
VMS732_TRACE-V0200

What are your opinions?


Regards
Christoph

Volker Halle
Honored Contributor

Re: System Crash of DS 25 after CPU Upgrade

Christoph,

I have a tool to collect and evaluate CLUE files. If you don't mind, you can mail me all your CLUE$COLLECT:CLUE*.LIS files from this server (just put them in a ZIP archive).

You can find my mail address if you look carefully at my forum profile ;-)

Volker.
Ian Miller.
Honored Contributor

Re: System Crash of DS 25 after CPU Upgrade

MULTITHREAD=1 may affect your application performance; however, so does a system crash.

Most of the crashes are on CPU1. I would go with John's plan of swapping the CPU modules around, or even just replacing it.
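To make the swap test easier to read off, the crashing CPU can be tallied across the CLUE files before and after the swap. Below is a minimal Python sketch, assuming every CLUE listing contains a "Crash/Primary CPU: xx/yy" line like the one John quoted; the function names and the verdict wording are made up for illustration, and the sample listings are hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "Crash/Primary CPU: 01/00"; group 1 is the crashing CPU.
CPU_LINE = re.compile(r"Crash/Primary CPU:\s*([0-9A-Fa-f]+)/([0-9A-Fa-f]+)")

def crashing_cpu(clue_text):
    """Return the crashing CPU id from one CLUE listing, or None if absent."""
    match = CPU_LINE.search(clue_text)
    return match.group(1) if match else None

def tally_by_cpu(clue_texts):
    """Count crashes per CPU id over a collection of CLUE listings."""
    cpus = (crashing_cpu(text) for text in clue_texts)
    return Counter(cpu for cpu in cpus if cpu is not None)

def dominant_cpu(tally):
    """Return the CPU id with the most crashes, or None for an empty tally."""
    return tally.most_common(1)[0][0] if tally else None

def swap_verdict(before, after):
    """Compare the dominant crashing CPU before and after swapping modules."""
    old, new = dominant_cpu(before), dominant_cpu(after)
    if old is None or new is None:
        return "not enough crashes to tell"
    if old != new:
        return "crashes followed the module: suspect the CPU module"
    return "crashes stayed with the slot: suspect the system board"

# Hypothetical listings, reduced to the one line this sketch reads.
before = tally_by_cpu([
    "Crash/Primary CPU: 01/00",
    "Crash/Primary CPU: 01/00",
    "Crash/Primary CPU: 00/00",
])
after = tally_by_cpu([
    "Crash/Primary CPU: 00/00",
    "Crash/Primary CPU: 00/00",
])
print(before, after, swap_verdict(before, after))
```

With only a handful of crashes the tally is weak evidence, so a verdict from this kind of count should be read as a hint rather than a diagnosis.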
____________________
Purely Personal Opinion
C.Eichert
Advisor

Re: System Crash of DS 25 after CPU Upgrade

Ian,

MULTITHREAD has been changed to 1, and the CPU boards have just been swapped. I will keep you informed of any news. Replacing the CPU is quite difficult: spares are not available here in Thailand, so I have to get them from overseas.
May I have your opinion on whether it makes sense to do a fresh installation of VMS on a spare disk?

Thanks
Christoph