
MACHINECHK, Machine check while in kernel mode

 
Ferdinand Macatangay
Occasional Advisor


Hello all, I would like to ask for help with the error below. It occurs on an AlphaStation 255/233.


SCS NODE: D OpenVMS AXP V6.2-1H3

HW_MODEL: 00000000 Hardware Model = 0.

FATAL BUGCHECK AlphaStation 255/233

MACHINECHK, Machine check while in kernel mode

PROCESS NAME NULL
PROCESS ID 00010000

ERROR PC FFFFFFFF 80048618

Process Status = 20000000 00001F04, SW = 00, Previous Mode = KERNEL
System State = 01, Current Mode = KERNEL
VMM = 00 IPL = 31, SP Alignment = 32

STACK POINTERS

KSP FFFFFFFF 85437D20 ESP FFFFFFFF 85439000 SSP FFFFFFFF 85435000
USP FFFFFFFF 85435000

GENERAL REGISTERS



R0 00000000 00000002 R1 00000000 0000940E R2 FFFFFFFF 824ADD50
R3 FFFFFFFF 82484D98 R4 00000000 00000048 R5 00000000 00001F04
R6 00000000 00000008 R7 FFFFFFFF 824A4248 R8 00000000 00000340
R9 00000000 0000000D R10 FFFFFFFF 8008290C R11 00000000 7FFBE3E0
R12 00000000 7EE35AA8 R13 FFFFFFFF 824A6A00 R14 FFFFFFFF 82484200
R15 00000000 00000000 R16 00000000 00000215 R17 00000000 00000001
R18 00000000 00000001 R19 00000000 00000000 R20 FFFFFFFF FFFFFFF8
R21 00000000 00000017 R22 00000000 00000100 R23 FFFFFFFF 80CA2368
R24 FFFFFFFF 80CA2000 R25 00000000 00000003 R26 00000000 00000210
R27 FFFFFFFF 824B4F70 R28 FFFFFFFF 8003B9C4 FP FFFFFFFF 85437D20
SP FFFFFFFF 85437D20 PC FFFFFFFF 80048618 PS 20000000 00001F04

SYSTEM REGISTERS

PTBR 00000000 000000F2
Page Table Base Register
PCBB 00000000 00F08080
Privileged Context Block Base
PRBR FFFFFFFF 80F08000
Processor Base Register
VPTB 00000002 00000000
Virtual Page Table Base Register
SCBB 00000000 000001A1
System Control Block Base
SISR 00000000 00000000
Software Interrupt Summary Register
ASN 00000000 00000000
Address Space Number
ASTSR_ASTEN 00000000 00000000
AST Summary/AST Enable
FEN 00000000 00000000
Floating-Point Enable
IPL 00000000 0000001F
Interrupt Priority Level
MCES 00000000 00000000
Machine Check Error Summary
Volker Halle
Honored Contributor
Solution

Re: MACHINECHK, Machine check while in kernel mode

Ferdinand,

welcome to the OpenVMS ITRC forum.

MACHINECHK crashes are nearly always caused by hardware errors. A hardware error should have been logged immediately before the bugcheck entry.

Try this:

$ ANAL/CRASH SYS$SYSTEM:
SDA> CLUE ERRLOG
...
SDA> EXIT

This should show a short summary of the most recent errors logged before the crash and also create SYS$SCRATCH:CLUE$ERRLOG.SYS. You need DECevent to translate the errors in this file ($ DIAGNOSE clue$errlog.sys).

DECevent can be downloaded from the Analysis Service Tools page:

http://h18023.www1.hp.com/support/svctools/

Volker.
roose
Regular Advisor

Re: MACHINECHK, Machine check while in kernel mode

Ferdinand,

As Volker has mentioned, a MACHINECHK error normally points to a hardware fault. If your system is covered by a maintenance contract, I would suggest opening a case with HP so they can look into this further.

By the way, are you a Pinoy? If so, glad to know a kababayan who is also a VMS person, and who's also based in Singapore.

Roose
Ferdinand Macatangay
Occasional Advisor

Re: MACHINECHK, Machine check while in kernel mode

Hello Volker and Roose, thanks for the prompt reply, and sorry for my late response; I have been very busy lately. Last night the system rebooted again (I had to come back in at 2 am). Below is the log for yesterday's reset. I scanned through the past error logs and noticed that a different process was running at each reboot, but the ERROR PC is always the same:
ERROR PC FFFFFFFF 80048618

The intervals between resets were 192, 63 and 4 days, and the most recent one was 15 days.
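
The comparison described here (different process, same ERROR PC) can be automated when the error log is available as text. The following is a minimal sketch, assuming the ERF-style layout shown in this thread, where a `PROCESS NAME` line precedes each `ERROR PC` line; the function name `bugcheck_summary` and the `sample` text are hypothetical, with values copied from the two bugcheck entries posted here.

```python
import re

def bugcheck_summary(errlog_text):
    """Return (process name, error PC) pairs from ERF-style bugcheck text."""
    results = []
    process = None
    for line in errlog_text.splitlines():
        m = re.match(r"\s*PROCESS NAME\s+(\S+)", line)
        if m:
            process = m.group(1)
            continue
        # ERROR PC is printed as two 8-digit hex longwords, high half first
        m = re.match(r"\s*ERROR PC\s+([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})", line)
        if m:
            results.append((process, m.group(1) + m.group(2)))
            process = None
    return results

# Values taken from the two bugcheck entries in this thread
sample = """\
PROCESS NAME NULL
PROCESS ID 00010000
ERROR PC FFFFFFFF 80048618
PROCESS NAME FILSRV200_BG937
PROCESS ID 00120014
ERROR PC FFFFFFFF 80048618
"""

crashes = bugcheck_summary(sample)
for name, pc in crashes:
    print(name, pc)

# A single distinct PC across crashes in unrelated processes is consistent
# with the fault being reported from one common place in the kernel
distinct_pcs = {pc for _, pc in crashes}
```
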

Roose, yup, I'm also a Pinoy. We don't currently have a contract with HP, as the warranty expired a long time ago.

Below is the errlog:
-----------------------------------

******************************* ENTRY 3102961. *******************************
ERROR SEQUENCE 2272. LOGGED ON: CPU_TYPE 00000006
DATE/TIME 2-MAR-2010 01:41:19.06 SYS_TYPE 0000000D
SYSTEM UPTIME: 15 DAYS 17:29:08
SCS NODE: DIFWS3 OpenVMS AXP V6.2-1H3

HW_MODEL: 000004EC Hardware Model = 1260.

TIME STAMP AlphaStation 255/233
******************************* ENTRY 3102962. *******************************
ERROR SEQUENCE 2273. LOGGED ON: CPU_TYPE 00000006
DATE/TIME 2-MAR-2010 01:47:35.03 SYS_TYPE 0000000D
SYSTEM UPTIME: 15 DAYS 17:35:24
SCS NODE: DIFWS3 OpenVMS AXP V6.2-1H3

HW_MODEL: 000004EC Hardware Model = 1260.

"UNKNOWN ENTRY" AlphaStation 255/233

ERROR LOG RECORD

ERF$L_SID 000004EC
SYSTEM ID REGISTER
ERL$W_ENTRY 0017
ERROR ENTRY TYPE
EXE$GQ_SYSTIME CB63FC05
00A99D32 64 BIT TIME WHEN ERROR LOGGED
ERL$GL_SEQUENCE 08E1
UNIQUE ERROR SEQUENCE = 2273.

BYTE <3:0> 000002E8 /è.../
BYTE <7:4> 00000000 /..../
BYTE <11:8> 00000110 /..../
BYTE <15:12> 000001A0 /..../
BYTE <19:16> 00000092 /..../
BYTE <23:20> 00000001 /..../
BYTE <27:24> 00000088 /..../
BYTE <31:28> 00000000 /..../
BYTE <35:32> 00000004 /..../
BYTE <39:36> 001044F8 /øD../
BYTE <43:40> 20202020 / /
BYTE <47:44> 30202030 /0 0/
BYTE <51:48> 7FFBF818 /.øû./
BYTE <55:52> 00000000 /..../
BYTE <59:56> 7FFBF938 /8ùû./
BYTE <63:60> 00000000 /..../
BYTE <67:64> 002AA003 /..*./
BYTE <71:68> 00000000 /..../
BYTE <75:72> 00004200 /.B../
BYTE <79:76> 00000000 /..../
BYTE <83:80> 00000400 /..../
BYTE <87:84> 00000000 /..../
BYTE <91:88> 00000803 /..../
BYTE <95:92> 00000000 /..../
BYTE <99:96> 296700D4 /Ã .g)/
BYTE <103:100> 00000087 /..../
BYTE <107:104> 00000000 /..../
BYTE <111:108> 00000000 /..../
BYTE <115:112> 7FF91BC0 /à .ù./
BYTE <119:116> 00000000 /..../
BYTE <123:120> 82518870 /p.Q./
BYTE <127:124> FFFFFFFF /..../
BYTE <131:128> 00000001 /..../
BYTE <135:132> 00000000 /..../
BYTE <139:136> 80FFC278 /xà ../
BYTE <143:140> FFFFFFFF /..../
BYTE <147:144> 80FFC45C /\Ã ../
BYTE <151:148> FFFFFFFF /..../
BYTE <155:152> 80FFBCE8 /è¼../
BYTE <159:156> FFFFFFFF /..../
BYTE <163:160> 00000000 /..../
BYTE <167:164> 00000000 /..../
BYTE <171:168> 00000000 /..../
BYTE <175:172> 00000000 /..../
BYTE <179:176> 0A0AA8C0 /à ¤../
BYTE <183:180> 66055014 /.P.f/
BYTE <187:184> 00000000 /..../
BYTE <191:188> 00000000 /..../
BYTE <195:192> 85233118 /.1#./
BYTE <199:196> FFFFFFFF /..../
BYTE <203:200> 00000041 /A.../
BYTE <207:204> 00000000 /..../
BYTE <211:208> 80F08000 /.. ./
BYTE <215:212> FFFFFFFF /..../
BYTE <219:216> 00010000 /..../
BYTE <223:220> 00000000 /..../
BYTE <227:224> 7FF92000 /. ù./
BYTE <231:228> 00000000 /..../
BYTE <235:232> 00000000 /..../
BYTE <239:236> 00000000 /..../
BYTE <243:240> 02CDA000 /..Ã ./
BYTE <247:244> 00000000 /..../
BYTE <251:248> 00000000 /..../
BYTE <255:252> 00000002 /..../
BYTE <259:256> 00342000 /. 4./
BYTE <263:260> 00000000 /..../
BYTE <267:264> 01EFC080 /.à ï./
BYTE <271:268> 00000000 /..../
BYTE <275:272> 802DD2FE / Ã -./
BYTE <279:276> FFFFFFFF /..../
BYTE <283:280> 77FF0033 /3..w/
BYTE <287:284> 76100036 /6..v/
BYTE <291:288> 4A003690 /.6.J/
BYTE <295:292> 40803524 /$5.@/
BYTE <299:296> 00000004 /..../
BYTE <303:300> 001044F8 /øD../
BYTE <307:304> 00008000 /..../
BYTE <311:308> 00000000 /..../
BYTE <315:312> FFC014E0 /à.à ./
BYTE <319:316> 00000001 /..../
BYTE <323:320> 00000000 /..../
BYTE <327:324> 00000000 /..../
BYTE <331:328> 000017B0 /°.../
BYTE <335:332> 00000000 /..../
BYTE <339:336> 00000003 /..../
BYTE <343:340> 00000000 /..../
BYTE <347:344> FFFFFFFF /..../
BYTE <351:348> 00000007 /..../
BYTE <355:352> 0000940E /..../
BYTE <359:356> 00000000 /..../
BYTE <363:360> 00001440 /@.../
BYTE <367:364> 00000000 /..../
BYTE <371:368> 00FFC940 /@Ã ../
BYTE <375:372> 00000000 /..../
BYTE <379:376> 10002225 /%"../
BYTE <383:380> 00000008 /..../
BYTE <387:384> 00000001 /..../
BYTE <391:388> 00000000 /..../
BYTE <395:392> 00FFC940 /@Ã ../
BYTE <399:396> 00000000 /..../
BYTE <403:400> 000061D0 / a../
BYTE <407:404> 00000000 /..../
BYTE <411:408> 00000F15 /..../
BYTE <415:412> 00000000 /..../
BYTE <419:416> 6D8D00B4 / ..m/
BYTE <423:420> 00000000 /..../
BYTE <427:424> 0000A1F0 / ¡../
BYTE <431:428> 00000000 /..../
BYTE <435:432> 6D8D3FF0 / ?.m/
BYTE <439:436> 00000000 /..../
BYTE <443:440> 7D81FFFF /...}/
BYTE <447:444> 00000000 /..../
BYTE <451:448> 7D811FFF /...}/
BYTE <455:452> 00000000 /..../
BYTE <459:456> 7D81022C /,..}/
BYTE <463:460> 00000000 /..../
BYTE <467:464> 7D810004 /...}/
BYTE <471:468> 00000000 /..../
BYTE <475:472> 7D810000 /...}/
BYTE <479:476> 00000000 /..../
BYTE <483:480> 7D810000 /...}/
BYTE <487:484> 00000000 /..../
BYTE <491:488> 7D810000 /...}/
BYTE <495:492> 00000000 /..../
BYTE <499:496> 7D810067 /g..}/
BYTE <503:500> 00000000 /..../
BYTE <507:504> 6D8D0000 /...m/
BYTE <511:508> 00000000 /..../
BYTE <515:512> 6D8D0000 /...m/
BYTE <519:516> 00000000 /..../
BYTE <523:520> 800E001C /..../
BYTE <527:524> FFFFFFFF /..../
BYTE <531:528> 010012C0 /Ã .../
BYTE <535:532> 00000000 /..../
BYTE <539:536> 003E19D0 / .>./
BYTE <543:540> 00000000 /..../
BYTE <547:544> 00180000 /..../
BYTE <551:548> 00000000 /..../
BYTE <555:552> 00000000 /..../
BYTE <559:556> 00000000 /..../
BYTE <563:560> 000C0000 /..../
BYTE <567:564> 00000000 /..../
BYTE <571:568> 40080000 /...@/
BYTE <575:572> 00000000 /..../
BYTE <579:576> 07F00000 /.. ./
BYTE <583:580> 00000000 /..../
BYTE <587:584> 3FF00000 /.. ?/
BYTE <591:588> 00000000 /..../
BYTE <595:592> 80000000 /..../
BYTE <599:596> FFFFFFFF /..../
BYTE <603:600> 00000000 /..../
BYTE <607:604> 00000000 /..../
BYTE <611:608> 000000FF /..../
BYTE <615:612> 00000000 /..../
BYTE <619:616> 01000000 /..../
BYTE <623:620> 00000000 /..../
BYTE <627:624> 01000000 /..../
BYTE <631:628> 00000000 /..../
BYTE <635:632> 01000000 /..../
BYTE <639:636> 00000000 /..../
BYTE <643:640> 01000000 /..../
BYTE <647:644> 00000000 /..../
BYTE <651:648> 01000000 /..../
BYTE <655:652> 00000000 /..../
BYTE <659:656> 01000000 /..../
BYTE <663:660> 00000000 /..../
BYTE <667:664> 01010000 /..../
BYTE <671:668> 00000000 /..../
BYTE <675:672> 01000000 /..../
BYTE <679:676> 00000000 /..../
BYTE <683:680> 00000F40 /@.../
BYTE <687:684> 00000000 /..../
BYTE <691:688> 00000F40 /@.../
BYTE <695:692> 00000000 /..../
BYTE <699:696> 00000F40 /@.../
BYTE <703:700> 00000000 /..../
BYTE <707:704> 00000F40 /@.../
BYTE <711:708> 00000000 /..../
BYTE <715:712> 00000F40 /@.../
BYTE <719:716> 00000000 /..../
BYTE <723:720> 00000F40 /@.../
BYTE <727:724> 00000000 /..../
BYTE <731:728> 00000F3E />.../
BYTE <735:732> 00000000 /..../
BYTE <739:736> 00000F40 /@.../
BYTE <743:740> 00000000 /..../
BYTE <747:744> 00010538 /8.../
BYTE <751:748> 00010000 /..../
BYTE <755:752> 00000700 /..../
BYTE <759:756> 00000009 /..../
BYTE <763:760> 00010538 /8.../
BYTE <767:764> 00010000 /..../
BYTE <771:768> 0002012E /..../
BYTE <775:772> 00010000 /..../
BYTE <779:776> 00000000 /..../
BYTE <783:780> 00000000 /..../
BYTE <787:784> 00000000 /..../
BYTE <791:788> 00000000 /..../
BYTE <795:792> 00000000 /..../
BYTE <799:796> 00000000 /..../
BYTE <803:800> 00000000 /..../
BYTE <807:804> 00000000 /..../
BYTE <811:808> 00000000 /..../
BYTE <815:812> 00000000 /..../
BYTE <819:816> 00000000 /..../
BYTE <823:820> 00000000 /..../
BYTE <827:824> 00000000 /..../
BYTE <831:828> 00000000 /..../
BYTE <835:832> 00000000 /..../
BYTE <839:836> 00000000 /..../
BYTE <843:840> 00000000 /..../
BYTE <847:844> 00000000 /..../
BYTE <851:848> 00000000 /..../
BYTE <855:852> 00000000 /..../
BYTE <859:856> 00000000 /..../
BYTE <863:860> 00000000 /..../
BYTE <867:864> 00000000 /..../
BYTE <871:868> 00000000 /..../
BYTE <875:872> 00000000 /..../
BYTE <879:876> 00000000 /..../
%ERF-I-UNKENTRY, unknown entry type, 37
******************************* ENTRY 3102963. *******************************
ERROR SEQUENCE 2274. LOGGED ON: CPU_TYPE 00000006
DATE/TIME 2-MAR-2010 01:47:35.03 SYS_TYPE 0000000D
SYSTEM UPTIME: 15 DAYS 17:35:24
SCS NODE: DIFWS3 OpenVMS AXP V6.2-1H3

HW_MODEL: 00000000 Hardware Model = 0.

FATAL BUGCHECK AlphaStation 255/233

MACHINECHK, Machine check while in kernel mode

PROCESS NAME FILSRV200_BG937
PROCESS ID 00120014

ERROR PC FFFFFFFF 80048618

Process Status = 20000000 00001F04, SW = 00, Previous Mode = KERNEL
System State = 01, Current Mode = KERNEL
VMM = 00 IPL = 31, SP Alignment = 32

STACK POINTERS

KSP 00000000 7FF91AA0 ESP 00000000 7FF96000 SSP 00000000 7FF9C100
USP 00000000 7EE2F860

GENERAL REGISTERS

R0 00000000 000002BC R1 00000000 00000000 R2 FFFFFFFF 824ADD50
R3 FFFFFFFF 82484D98 R4 00000000 00000048 R5 00000000 00001F04
R6 2020382E 00000000 R7 00000000 0000000C R8 00000000 000005A8
R9 FFFFFFFF 80FFC9D8 R10 00000000 00000000 R11 00000000 00000000
R12 00000000 00000000 R13 FFFFFFFF 82518870 R14 00000000 00000001
R15 FFFFFFFF 80FFC278 R16 00000000 00000215 R17 00000000 00000001
R18 00000000 00000001 R19 00000000 00000002 R20 FFFFFFFF FFFFFFF8
R21 00000000 00000017 R22 00000000 00000100 R23 FFFFFFFF 80CA2368
R24 FFFFFFFF 80CA2000 R25 00000000 00000003 R26 00000000 00000210
R27 FFFFFFFF 824B4F70 R28 FFFFFFFF 8003B9C4 FP 00000000 7FF91AA0
SP 00000000 7FF91AA0 PC FFFFFFFF 80048618 PS 20000000 00001F04

SYSTEM REGISTERS

PTBR 00000000 0000166D
Page Table Base Register
PCBB 00000000 01EFC080
Privileged Context Block Base
PRBR FFFFFFFF 80F08000
Processor Base Register
VPTB 00000002 00000000
Virtual Page Table Base Register
SCBB 00000000 000001A1
System Control Block Base
SISR 00000000 00000000
Software Interrupt Summary Register
ASN 00000000 00000020
Address Space Number
ASTSR_ASTEN 00000000 0000000F
AST Summary/AST Enable
FEN 00000000 00000001
Floating-Point Enable
IPL 00000000 0000001F
Interrupt Priority Level
MCES 00000000 00000000
Machine Check Error Summary
******************************* ENTRY 3102964. *******************************
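
The `EXE$GQ_SYSTIME` quadword in entry 3102962 can be cross-checked against the DATE/TIME line in the entry header. This is a minimal sketch, assuming the standard OpenVMS system time format (a count of 100-nanosecond ticks since 17-NOV-1858) and assuming the dump prints the low longword (CB63FC05) before the high longword (00A99D32); the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Base date of the OpenVMS system time
VMS_EPOCH = datetime(1858, 11, 17)

def vms_time_to_datetime(high_longword, low_longword):
    """Convert a 64-bit OpenVMS system time (two longwords) to a datetime."""
    ticks = (high_longword << 32) | low_longword  # 100 ns units
    # datetime resolves microseconds, so the last decimal digit is dropped
    return VMS_EPOCH + timedelta(microseconds=ticks // 10)

# Longwords as printed under EXE$GQ_SYSTIME in entry 3102962
logged = vms_time_to_datetime(0x00A99D32, 0xCB63FC05)
print(logged)
```

The result is a naive timestamp in whatever time the system clock was set to; it should agree with the 2-MAR-2010 01:47:35.03 shown in the entry header.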
Volker Halle
Honored Contributor

Re: MACHINECHK, Machine check while in kernel mode

Ferdinand,

as I already pointed out, you need DECevent V3.4 to translate these errlog entries.

The PAL Error Code in this entry (00000092 at offset x0010) indicates:

BIU received data with a parity error from outside the CPU while performing either a Data Cache or I Cache fill

Most likely cause: Memory or B-Cache

This is an intermittent HARDWARE problem.

Volker.
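
For anyone repeating this lookup on an untranslated dump, the longword at a given offset can be pulled out of the `BYTE <hi:lo>` lines mechanically. This is a minimal sketch, assuming the line layout shown in the dump above; `longword_at` is a hypothetical helper, and the sample lines are copied from entry 3102962. It only extracts the value; the meaning of code 92 is taken from the post above, not decoded by the script.

```python
import re

# Matches lines such as "BYTE <19:16> 00000092 /..../"
DUMP_LINE = re.compile(r"BYTE\s*<(\d+):(\d+)>\s+([0-9A-Fa-f]{8})")

def longword_at(dump_text, offset):
    """Return the longword whose lowest byte sits at `offset`, or None."""
    for line in dump_text.splitlines():
        m = DUMP_LINE.search(line)
        if m and int(m.group(2)) == offset:
            return int(m.group(3), 16)
    return None

# Lines copied from the dump of entry 3102962
sample = """\
BYTE <11:8> 00000110 /..../
BYTE <15:12> 000001A0 /..../
BYTE <19:16> 00000092 /..../
BYTE <23:20> 00000001 /..../
"""

pal_code = longword_at(sample, 0x10)  # offset x0010 is byte 16
print(hex(pal_code))
```
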
Ferdinand Macatangay
Occasional Advisor

Re: MACHINECHK, Machine check while in kernel mode

Hello Volker,

You're a genius, man! I ran DIAGNOSE and the error entry points to the cache, so most probably it's the motherboard. I tried swapping the motherboard but could not get past the ewa0 error. It keeps saying the connection is lost, and I cannot see any lights on the network card's LED. I think I need to configure the network card in the SRM console. Two questions:
Which one is ewa0 and which is ewb0, the one built into the motherboard or the add-on card? And how do I configure it? Thanks again, man!
Volker Halle
Honored Contributor

Re: MACHINECHK, Machine check while in kernel mode

Ferdinand,

there is no need to configure the 'network card'.

You can set the speed/duplex mode using:

>>> set ewa0_mode xxx

Entering an invalid value such as xxx will make the console list the valid options, e.g. FASTFD or similar.

Volker.
Ferdinand Macatangay
Occasional Advisor

Re: MACHINECHK, Machine check while in kernel mode

We are using an AUI (thick-wire) connection. I already managed to set the ewa0 mode to AUI, but there is still no light on the card. That is why I'm wondering whether ewb0 also needs to be set, since I don't know which of ewa0 and ewb0 is the built-in card and which is the add-on. Thanks again.
Ferdinand Macatangay
Occasional Advisor

Re: MACHINECHK, Machine check while in kernel mode

Bro, I would also like to ask whether there is any test I can run on the memory. I am running OpenVMS 6.2. I searched through Google and found that SRM can run memexer, but I don't find it in the list of commands. Thanks.
Volker Halle
Honored Contributor

Re: MACHINECHK, Machine check while in kernel mode

Ferdinand,

EWA0 is the onboard card. Did you boot OpenVMS? Sometimes the link-up light only comes on once the driver has been started. According to the user guide (see below), this system should have a twisted-pair connector on the back (for EWA0).

On OpenVMS you can then pull the cable and watch for the carrier check failures counter to increment, using ANAL/SYS and SDA> SHOW LAN/FULL/COUNT/DEV=EWx0, to find out which card is which.

Volker.

AlphaStation 255 User Guide:

http://www.compaq.com/alphaserver/download/ek-vllxa-ui-b01.pdf
lokanath bagh
Occasional Advisor

Re: MACHINECHK, Machine check while in kernel mode

Hi Ferdinand,

This might be a workaround for your problem.

I have an Alpha test system, a COMPAQ AlphaServer DS20E 666 MHz. I faced a similar problem while trying to boot it. The problem is more likely to be seen when booting from a newly created system disk. For whatever reason, I removed all disks and inserted only the system disk that I wanted to boot from. The system came up and stayed up. Then I inserted the remaining disks and used the command "mc sysman io auto" to make them available to the system. I never tried to debug the issue and have been living with this workaround to date, as the machine is a test machine. Not a good habit, though.

Regards,
Lokanath