<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: System crashes every 3 weeks. in Operating System - OpenVMS</title>
    <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515501#M67851</link>
    <description>Doug,&lt;BR /&gt;&lt;BR /&gt;so the mchk code is the SAME. You have now documented the machine check logout frame from the running system. After the next MACHINECHK crash, compare the data from the crash against the data just captured from the running system. If the data is IDENTICAL (all quadwords), we can be sure that no mchk frame was logged before the crash.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
    <pubDate>Tue, 05 Apr 2005 11:37:12 GMT</pubDate>
    <dc:creator>Volker Halle</dc:creator>
    <dc:date>2005-04-05T11:37:12Z</dc:date>
    <item>
      <title>System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515478#M67828</link>
      <description>Here's one I've been wrestling with for a few months; I would appreciate any assistance.&lt;BR /&gt;&lt;BR /&gt;Basically, I have a 2-node Alpha cluster and one node crashes every 3 weeks.  If I reboot this node prior to it crashing, it runs for 3 weeks from that reboot.&lt;BR /&gt;&lt;BR /&gt;Let's call them node1 and node2.&lt;BR /&gt;&lt;BR /&gt;Node1 crashes every 3 weeks and it is an&lt;BR /&gt;AlphaServer 1000 4/233&lt;BR /&gt;Main Memory (1024.00Mb)&lt;BR /&gt;OpenVMS V7.1-1H2&lt;BR /&gt;&lt;BR /&gt;Node2 is stable - currently up for 120 days.&lt;BR /&gt;AlphaServer 1200 5/400 4MB&lt;BR /&gt;Main Memory (1024.00Mb)&lt;BR /&gt;OpenVMS V7.3&lt;BR /&gt;------------------------------&lt;BR /&gt;Here's an excerpt from the clue file:&lt;BR /&gt;&lt;BR /&gt;Bugcheck Type:     MACHINECHK, Machine check while in kernel mode&lt;BR /&gt;Failing PC:        FFFFFFFF.80066BCC    EXE$GEN_BUGCHK_C+0003C&lt;BR /&gt;Failing PS:        10000000.00001F04&lt;BR /&gt;Module:            EXCEPTION&lt;BR /&gt;Offset:            00018BCC&lt;BR /&gt;-------------------------------------&lt;BR /&gt;Unfortunately, my client has a contract with a 3rd party support company, so I can't contact HP directly to get the crash dump analyzed, and they haven't been very helpful with their analysis.  So, I've come to the experts....&lt;BR /&gt;&lt;BR /&gt;The system is running OpenVMS V7.1-1H2 and all the patches (for this version) have been installed.&lt;BR /&gt;&lt;BR /&gt;I suspect that I'm running out of some resource after 3 weeks, but I can't figure out which one.&lt;BR /&gt;&lt;BR /&gt;Any ideas/suggestions?&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Doug</description>
      <pubDate>Thu, 31 Mar 2005 11:32:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515478#M67828</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T11:32:00Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515479#M67829</link>
      <description>can you post the clue file SYS$ERRORLOG:CLUE$*.LIS&lt;BR /&gt;&lt;BR /&gt;Anything in the errorlog?&lt;BR /&gt;&lt;BR /&gt;What layered products are you running?&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 11:36:25 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515479#M67829</guid>
      <dc:creator>Ian Miller.</dc:creator>
      <dc:date>2005-03-31T11:36:25Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515480#M67830</link>
      <description>A machine check is almost always a hardware problem. I once had a similar problem, around 1987, on a VAX-8650 that went down every 14 days on a Friday afternoon.&lt;BR /&gt;&lt;BR /&gt;After many parts replacements and swaps we had to escalate... It turned out to be bad memory. The machine ran rock-solid after that.</description>
      <pubDate>Thu, 31 Mar 2005 11:38:38 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515480#M67830</guid>
      <dc:creator>Uwe Zessin</dc:creator>
      <dc:date>2005-03-31T11:38:38Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515481#M67831</link>
      <description>I agree 100% with Uwe; I doubt it's resource exhaustion. Maybe you just happen to be hitting that bad memory after three weeks.</description>
      <pubDate>Thu, 31 Mar 2005 11:53:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515481#M67831</guid>
      <dc:creator>Tom O'Toole</dc:creator>
      <dc:date>2005-03-31T11:53:30Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515482#M67832</link>
      <description>I've attached the clue file.&lt;BR /&gt;&lt;BR /&gt;Uwe:&lt;BR /&gt;Your suggestion re. memory is interesting.  A month ago our 3rd party hardware support group suggested "so far their thinking is a bad memory stick", but I haven't heard anything since.&lt;BR /&gt;&lt;BR /&gt;Are there some diagnostics I can run to check the memory?&lt;BR /&gt;Shouldn't the error log file show any memory errors?&lt;BR /&gt;&lt;BR /&gt;Here's the last entry in the error log prior to the last crash:&lt;BR /&gt;&lt;BR /&gt; ******************************* ENTRY     377. *******************************&lt;BR /&gt; ERROR SEQUENCE 19263.                           LOGGED ON:  CPU_TYPE 00000006&lt;BR /&gt; DATE/TIME 29-MAR-2005 14:54:23.14                            SYS_TYPE 00000011&lt;BR /&gt; SYSTEM UPTIME: 23 DAYS 10:28:05&lt;BR /&gt; SCS NODE: ALPHA2                                           OpenVMS AXP V7.1-1H2&lt;BR /&gt;&lt;BR /&gt; HW_MODEL: 00000000 Hardware Model = 0.&lt;BR /&gt;&lt;BR /&gt; FATAL BUGCHECK AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt; MACHINECHK, Machine check while in kernel mode&lt;BR /&gt;&lt;BR /&gt;       PROCESS NAME    INTEGRA_DF&lt;BR /&gt;       PROCESS ID      002E0152&lt;BR /&gt;&lt;BR /&gt;       ERROR PC        FFFFFFFF 80066BD0&lt;BR /&gt;&lt;BR /&gt;    Process Status = 10000000 00001F04, SW = 00, Previous Mode = KERNEL&lt;BR /&gt;    System State = 01, Current Mode = KERNEL&lt;BR /&gt;    VMM = 00 IPL = 31, SP Alignment = 16&lt;BR /&gt;&lt;BR /&gt; STACK POINTERS&lt;BR /&gt;&lt;BR /&gt; KSP 00000000 7FFA1E90  ESP 00000000 7FFA6000  SSP 00000000 7FFAC100&lt;BR /&gt; USP 00000000 7AF76DC0&lt;BR /&gt;&lt;BR /&gt; GENERAL REGISTERS&lt;BR /&gt;&lt;BR /&gt; R0  FFFFFFFF 8A0E01E8  R1  00000000 0000940E  R2  FFFFFFFF 839A6EB0&lt;BR /&gt; R3  FFFFFFFF 8A0E0000  R4  00000000 00200040  R5  FFFFFFFF FFFFFFFF&lt;BR /&gt; R6  00000000 00000001  R7  00000000 00000003  R8  00000000 0000005C&lt;BR /&gt; R9  00000000 00000000  R10 00000000 
00000006  R11 00000000 00000006&lt;BR /&gt; R12 00000000 00000000  R13 00000000 0000001C  R14 00000000 00000010&lt;BR /&gt; R15 00000000 00000000  R16 00000000 00000215  R17 00000000 00000001&lt;BR /&gt; R18 00000000 00000001  R19 00000000 00000001  R20 00000000 00C42414&lt;BR /&gt; R21 FFFFFFFF 8A0E0000  R22 FFFFFFFF FFFFFFFF  R23 00000000 00000086&lt;BR /&gt; R24 00000000 00000086  R25 00000000 00000003  R26 00000000 00000210&lt;BR /&gt; R27 FFFFFFFF 839BD680  R28 00000000 00000000  FP  00000000 7FFA1E90&lt;BR /&gt; SP  00000000 7FFA1E90  PC  FFFFFFFF 80066BD0  PS  10000000 00001F04&lt;BR /&gt;&lt;BR /&gt; SYSTEM REGISTERS&lt;BR /&gt;&lt;BR /&gt;       PTBR            00000000 0000F7BF&lt;BR /&gt;                                       Page Table Base Register&lt;BR /&gt;       PCBB            00000000 11B4E080&lt;BR /&gt;                                       Privileged Context Block Base&lt;BR /&gt;       PRBR            FFFFFFFF 8100E000&lt;BR /&gt;                                       Processor Base Register&lt;BR /&gt;       VPTB            FFFFFFFC 00000000&lt;BR /&gt;                                       Virtual Page Table Base Register&lt;BR /&gt;       SCBB            00000000 000001A0&lt;BR /&gt;                                       System Control Block Base&lt;BR /&gt;       SISR            00000000 00000000&lt;BR /&gt;                                       Software Interrupt Summary Register&lt;BR /&gt;       ASN             00000000 00000006&lt;BR /&gt;                                       Address Space Number&lt;BR /&gt;&lt;BR /&gt; V M S                SYSTEM ERROR REPORT         COMPILED 31-MAR-2005 17:51:38&lt;BR /&gt;                                                                      PAGE   4.&lt;BR /&gt;&lt;BR /&gt;       ASTSR_ASTEN     00000000 0000000F&lt;BR /&gt;                                       AST Summary/AST Enable&lt;BR /&gt;       FEN             00000000 00000001&lt;BR /&gt;                                       
Floating-Point Enable&lt;BR /&gt;       IPL             00000000 0000001F&lt;BR /&gt;                                       Interrupt Priority Level&lt;BR /&gt;       MCES            00000000 00000000&lt;BR /&gt;                                       Machine Check Error Summary&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 11:56:59 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515482#M67832</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T11:56:59Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515483#M67833</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;the key to MACHINECHK crashes is the MCHK errlog entries, if there are any.&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; CLUE ERRLOG&lt;BR /&gt;&lt;BR /&gt;will list them and extract them from the dump into a file CLUE$ERRLOG.SYS in your login or default directory.&lt;BR /&gt;&lt;BR /&gt;Run this file through ANAL/ERR or, better, DECevent ($ DIAGNOSE).&lt;BR /&gt;&lt;BR /&gt;The CLUE file is not of much help, especially as the MACHINECHK stack is not correctly decoded until V7.3-2.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Thu, 31 Mar 2005 12:00:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515483#M67833</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-03-31T12:00:14Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515484#M67834</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;Since Volker forgot you're new to VMS:&lt;BR /&gt;&lt;BR /&gt;Do &lt;BR /&gt;$ analyze/system&lt;BR /&gt;&lt;BR /&gt;to get to the SDA&amp;gt; prompt.&lt;BR /&gt;&lt;BR /&gt;Steve</description>
      <pubDate>Thu, 31 Mar 2005 12:11:04 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515484#M67834</guid>
      <dc:creator>Steve Nimr</dc:creator>
      <dc:date>2005-03-31T12:11:04Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515485#M67835</link>
      <description>Sorry Doug I got threads mixed up. :(&lt;BR /&gt;I guess there is no way to recall a reply once it's submitted.</description>
      <pubDate>Thu, 31 Mar 2005 12:21:26 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515485#M67835</guid>
      <dc:creator>Steve Nimr</dc:creator>
      <dc:date>2005-03-31T12:21:26Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515486#M67836</link>
      <description>Steve,&lt;BR /&gt;&lt;BR /&gt;it's ANAL/CRASH SYS$SYSTEM:SYSDUMP.DMP to access a system dump file (in its default location).&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Thu, 31 Mar 2005 12:26:53 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515486#M67836</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-03-31T12:26:53Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515487#M67837</link>
      <description>Here's the output from diagnose around the time period of the last crash:&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  376 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19262.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 14:45:02&lt;BR /&gt;Time since reboot                    23 Day(s) 10:18:45&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       38. Time Stamp Entry&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   7. Timestamp&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  377 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19263.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 14:54:23&lt;BR /&gt;Time since reboot                    23 Day(s) 10:28:05&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       37. Crash Re-Start&lt;BR /&gt;&lt;BR /&gt;Bugcheck Minor class              1. 
Crash Re-start&lt;BR /&gt;&lt;BR /&gt;Bugcheck Msg                         MACHINECHK, Machine check while in kernel&lt;BR /&gt;                                     mode&lt;BR /&gt;Process ID                x002E0152&lt;BR /&gt;Process Name&lt;BR /&gt;KSP                       x000000007FFA1E90&lt;BR /&gt;ESP                       x000000007FFA6000&lt;BR /&gt;SSP                       x000000007FFAC100&lt;BR /&gt;USP                       x000000007AF76DC0&lt;BR /&gt;R0                        xFFFFFFFF8A0E01E8&lt;BR /&gt;R1                        x000000000000940E&lt;BR /&gt;R2                        xFFFFFFFF839A6EB0&lt;BR /&gt;R3                        xFFFFFFFF8A0E0000&lt;BR /&gt;R4                        x0000000000200040&lt;BR /&gt;R5                        xFFFFFFFFFFFFFFFF&lt;BR /&gt;R6                        x0000000000000001&lt;BR /&gt;R7                        x0000000000000003&lt;BR /&gt;R8                        x000000000000005C&lt;BR /&gt;R9                        x0000000000000000&lt;BR /&gt;R10                       x0000000000000006&lt;BR /&gt;R11                       x0000000000000006&lt;BR /&gt;R12                       x0000000000000000&lt;BR /&gt;R13                       x000000000000001C&lt;BR /&gt;R14                       x0000000000000010&lt;BR /&gt;R15                       x0000000000000000&lt;BR /&gt;R16                       x0000000000000215&lt;BR /&gt;R17                       x0000000000000001&lt;BR /&gt;R18                       x0000000000000001&lt;BR /&gt;R19                       x0000000000000001&lt;BR /&gt;R20                       x0000000000C42414&lt;BR /&gt;R21                       xFFFFFFFF8A0E0000&lt;BR /&gt;R22                       xFFFFFFFFFFFFFFFF&lt;BR /&gt;R23                       x0000000000000086&lt;BR /&gt;R24                       x0000000000000086&lt;BR /&gt;R25                       x0000000000000003&lt;BR /&gt;R26                       x0000000000000210&lt;BR /&gt;R27                       
xFFFFFFFF839BD680&lt;BR /&gt;R28                       x0000000000000000&lt;BR /&gt;FP                        x000000007FFA1E90&lt;BR /&gt;SP                        x000000007FFA1E90&lt;BR /&gt;PC                        xFFFFFFFF80066BD0&lt;BR /&gt;PS                        x1000000000001F04&lt;BR /&gt;PTBR                      x000000000000F7BF&lt;BR /&gt;Process Ctl Block Base Re x0000000011B4E080&lt;BR /&gt;PRBR                      xFFFFFFFF8100E000&lt;BR /&gt;VPTB                      xFFFFFFFC00000000&lt;BR /&gt;System Ctl Block Base Reg x00000000000001A0&lt;BR /&gt;Software Interrupt Summar x0000000000000000&lt;BR /&gt;ASN                       x0000000000000006&lt;BR /&gt;ASTSR ASTEN               x000000000000000F&lt;BR /&gt;FEN                       x0000000000000001&lt;BR /&gt;ASN                       x0000000000000006&lt;BR /&gt;IPL                       x000000000000001F&lt;BR /&gt;MCES                      x0000000000000000&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  378 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19263.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:00:11&lt;BR /&gt;Time since reboot                    0 Day(s) 0:00:17&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       32. Cold Start (ie: System Boot)&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   2. System startup&lt;BR /&gt;&lt;BR /&gt;TODR                      x3D202445&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  379 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. 
OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19264.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:00:12&lt;BR /&gt;Time since reboot                    0 Day(s) 0:00:17&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       64. Volume Mount&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   4. Volume mount&lt;BR /&gt;&lt;BR /&gt;Owner UIC                 x00010001&lt;BR /&gt;Error count                       0.&lt;BR /&gt;OP count                        517.&lt;BR /&gt;Unit Number                     100.&lt;BR /&gt;Unit Name                            ALPHA2$DKA&lt;BR /&gt;Volume number                     0.&lt;BR /&gt;Volumes in set                    0.&lt;BR /&gt;Volume Label                         ALPHA2SYS&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  380 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19265.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:01:15&lt;BR /&gt;Time since reboot                    0 Day(s) 0:01:21&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       98. 
Asynchronous Device Attention&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;---- Device Profile ----&lt;BR /&gt;Unit                                 ALPHA2$PEA0&lt;BR /&gt;Product Name                         NI-SCA Port&lt;BR /&gt;&lt;BR /&gt;---- NISCA Port Data ----&lt;BR /&gt;Error Type and SubType        x0700  Device Error, Fatal Error Detected by&lt;BR /&gt;                                     Datalink&lt;BR /&gt;Status                    x0000120100000500&lt;BR /&gt;Datalink Device Name                 FWA2:&lt;BR /&gt;Remote Node Name&lt;BR /&gt;Remote Address            x0000000000000000&lt;BR /&gt;Local Address             x00000405000400AA&lt;BR /&gt;Error Count                       1. Error Occurrences This Entry&lt;BR /&gt;&lt;BR /&gt;----- Software Info -----&lt;BR /&gt;UCB$x_ERRCNT                      1. Errors This Unit&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  381 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19266.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:01:16&lt;BR /&gt;Time since reboot                    0 Day(s) 0:01:22&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       98. 
Asynchronous Device Attention&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;---- Device Profile ----&lt;BR /&gt;Unit                                 ALPHA2$PEA0&lt;BR /&gt;Product Name                         NI-SCA Port&lt;BR /&gt;&lt;BR /&gt;---- NISCA Port Data ----&lt;BR /&gt;Error Type and SubType        x0700  Device Error, Fatal Error Detected by&lt;BR /&gt;                                     Datalink&lt;BR /&gt;Status                    x0000120000000400&lt;BR /&gt;Datalink Device Name                 FWA2:&lt;BR /&gt;Remote Node Name&lt;BR /&gt;Remote Address            x0000000000000000&lt;BR /&gt;Local Address             x00000405000400AA&lt;BR /&gt;Error Count                       1. Error Occurrences This Entry&lt;BR /&gt;&lt;BR /&gt;----- Software Info -----&lt;BR /&gt;UCB$x_ERRCNT                      2. Errors This Unit&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  382 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19267.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:01:23&lt;BR /&gt;Time since reboot                    0 Day(s) 0:01:30&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       64. Volume Mount&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   4. 
Volume mount&lt;BR /&gt;&lt;BR /&gt;Owner UIC                 x00010004&lt;BR /&gt;Error count                       0.&lt;BR /&gt;OP count                         15.&lt;BR /&gt;Unit Number                       1.&lt;BR /&gt;Unit Name                            213260$DUA&lt;BR /&gt;Volume number                     0.&lt;BR /&gt;Volumes in set                    0.&lt;BR /&gt;Volume Label                         USER2&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 12:46:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515487#M67837</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T12:46:31Z</dc:date>
    </item>
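The two reports above print the uptime at the crash in slightly different layouts ("23 DAYS 10:28:05" in the error log entry, "23 Day(s) 10:18:45" in the DECevent output). A minimal sketch of turning either form into a duration, so that crashes can be compared numerically, might look like the following. The function name `parse_uptime` and the regular expression are illustrative assumptions, not part of any OpenVMS tool.

```python
import re
from datetime import timedelta

# Assumed layouts, taken from the two reports quoted in this thread:
#   "SYSTEM UPTIME: 23 DAYS 10:28:05"
#   "Time since reboot                    23 Day(s) 10:18:45"
UPTIME_RE = re.compile(r"(\d+)\s+DAYS?(?:\(S\))?\s+(\d+):(\d+):(\d+)", re.IGNORECASE)


def parse_uptime(text: str) -> timedelta:
    """Extract the first uptime found in `text` (hypothetical helper)."""
    match = UPTIME_RE.search(text)
    if match is None:
        raise ValueError(f"no uptime found in {text!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    crash = parse_uptime("SYSTEM UPTIME: 23 DAYS 10:28:05")
    stamp = parse_uptime("Time since reboot                    23 Day(s) 10:18:45")
    # Gap between the last timestamp entry and the crash entry.
    print(crash - stamp)
```

Subtracting the two parsed values gives the interval between the last timestamp entry (376) and the crash entry (377).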
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515488#M67838</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;bad luck - OpenVMS V7.1-1H2 did NOT log any machine check entry.&lt;BR /&gt;&lt;BR /&gt;This is the SAME machine/problem as already discussed in a previous thread:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=808549" target="_blank"&gt;http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=808549&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;I keep a database of all crashes, that's why I know ;-)&lt;BR /&gt;&lt;BR /&gt;Could you please try to provide the stack data as requested in the previous thread:&lt;BR /&gt;&lt;BR /&gt;$ ANAL/CRASH SYS$SYSTEM:SYSDUMP.DMP&lt;BR /&gt;SDA&amp;gt; READ/EXEC&lt;BR /&gt;SDA&amp;gt; SHOW STACK/QUAD 7FFA1FC0;40&lt;BR /&gt;&lt;BR /&gt;It may also be possible to find the machine check logout frame in the dump.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Thu, 31 Mar 2005 12:54:27 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515488#M67838</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-03-31T12:54:27Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515489#M67839</link>
      <description>Thanks for the link, Volker.&lt;BR /&gt;You're absolutely right.  Adrian is my hardware support contact and I'm that "sysadmin is in west coast Canada" he referred to.&lt;BR /&gt;&lt;BR /&gt;In any case, I was not aware that they were using this forum to troubleshoot the problem. I thought I'd try, as I'm not getting anywhere through the official channels.&lt;BR /&gt;&lt;BR /&gt;Here's the output from the SHOW STACK/QUAD 7FFA1FC0;40 command:&lt;BR /&gt;&lt;BR /&gt;Specified Stack Range&lt;BR /&gt;---------------------&lt;BR /&gt;                       00000000.7FFA1FC0    00000000.0002F030&lt;BR /&gt;                       00000000.7FFA1FC8    00000000.010E0019&lt;BR /&gt;                       00000000.7FFA1FD0    00000000.7AF77A5C&lt;BR /&gt;                       00000000.7FFA1FD8    00000000.7AF78AA0&lt;BR /&gt;                       00000000.7FFA1FE0    00000000.00000001&lt;BR /&gt;                       00000000.7FFA1FE8    00000000.00000003&lt;BR /&gt;                       00000000.7FFA1FF0    00000000.0030F080&lt;BR /&gt;                       00000000.7FFA1FF8    00000000.0000001B&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 13:17:49 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515489#M67839</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T13:17:49Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515490#M67840</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;Just curious: how precisely do you mean "every 3 weeks"?&lt;BR /&gt;&lt;BR /&gt;1) Every 3 weeks, within a few milliseconds&lt;BR /&gt;2) Every 3 weeks, within a couple of hours&lt;BR /&gt;3) Every 3 weeks, within a few days &lt;BR /&gt;&lt;BR /&gt;I'll bet your answer is 3. :-)&lt;BR /&gt;&lt;BR /&gt;To hazard a little speculation around each possibility:&lt;BR /&gt;  &lt;BR /&gt;1) would be pretty strange, to me at least. Perhaps a flaw in the fabric of space-time. :-)&lt;BR /&gt;&lt;BR /&gt;2) might suggest a link to some calendar-related activity. Perhaps a procedure or device that is used every couple of weeks? But you'd probably have noticed that.&lt;BR /&gt;&lt;BR /&gt;3) suggests something a lot more random or at least aperiodic, which is why I guessed you'd pick this answer.&lt;BR /&gt;&lt;BR /&gt;Just a few thoughts which may at least stimulate some thought, if they're of any use at all...&lt;BR /&gt;&lt;BR /&gt;Galen</description>
      <pubDate>Thu, 31 Mar 2005 13:44:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515490#M67840</guid>
      <dc:creator>Galen Tackett</dc:creator>
      <dc:date>2005-03-31T13:44:42Z</dc:date>
    </item>
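Galen's three cases can be told apart by looking at how far the recorded uptimes at crash time differ from one another. The following is only a rough sketch of that idea: the function name, the thresholds (one day, one minute) and the sample values are all hypothetical and chosen for illustration, not taken from this system.

```python
from statistics import mean
from typing import Sequence


def uptime_spread(uptimes_days: Sequence[float]) -> str:
    """Classify how tightly crash uptimes cluster (thresholds are arbitrary assumptions)."""
    spread = max(uptimes_days) - min(uptimes_days)
    if spread >= 1.0:              # differs by a day or more: case 3
        return "days"
    if spread * 24 * 60 >= 1.0:    # differs by a minute or more: case 2
        return "hours"
    return "near-exact"            # essentially identical uptimes: case 1


if __name__ == "__main__":
    sample = [21.0, 24.5, 22.0]    # hypothetical uptimes in days
    print(f"mean uptime {mean(sample):.1f} days, spread class: {uptime_spread(sample)}")
```

A result of "days" would point toward something aperiodic, while "hours" or "near-exact" would make a scheduled job or a counter overflow worth a closer look.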
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515491#M67841</link>
      <description>Volker, &lt;BR /&gt;"I keep a database of all crashes, that's why I know"&lt;BR /&gt;and I thought you just remembered them all rather than having a private copy of canasta :-)&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 15:11:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515491#M67841</guid>
      <dc:creator>Ian Miller.</dc:creator>
      <dc:date>2005-03-31T15:11:31Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515492#M67842</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;the interrupt/exception stack frame shows that the current PC at the time of the MACHINECHK is in P0 space and the PS shows user mode at IPL 0:&lt;BR /&gt;&lt;BR /&gt;00000000.7FFA1FF0 00000000.0030F080 &amp;lt;&amp;lt;&amp;lt; PC&lt;BR /&gt;00000000.7FFA1FF8 00000000.0000001B &amp;lt;&amp;lt;&amp;lt; PS&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; eva/ps 0000001B&lt;BR /&gt;         MBZ SPAL      MBZ    IPL VMM MBZ CURMOD INT PRVMOD&lt;BR /&gt;         0   00   00000000000 00  0   0   USER   0   USER&lt;BR /&gt;&lt;BR /&gt;so whatever the instruction is&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080&lt;BR /&gt;&lt;BR /&gt;it CANNOT have caused a MACHINECHK through a programming error (i.e. access into IO-space), because you can't do that in USER mode. It could have caused access to a bad memory page, but that would be pure speculation !!&lt;BR /&gt;&lt;BR /&gt;Please issue the following commands in SDA:&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080-30;40&lt;BR /&gt;&lt;BR /&gt;to examine the instruction stream. If the current instruction includes a memory access and you're able to figure out the address, also do&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; SHOW PROC/PAGE address;1000&lt;BR /&gt;&lt;BR /&gt;Otherwise, I'll help you figure out the page number...&lt;BR /&gt;&lt;BR /&gt;To get an overview of the last couple of crashes on this node, just try TYPE CLUE$HISTORY - if there is something timing related, you might be able to spot a pattern.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Fri, 01 Apr 2005 02:13:46 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515492#M67842</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-04-01T02:13:46Z</dc:date>
    </item>
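The PS decoding in the post above can be reproduced by hand. The sketch below is not SDA's implementation; it assumes the bit layout of the Alpha processor status as used by OpenVMS (previous mode in bits 0 to 1, interrupt flag in bit 2, current mode in bits 3 to 4, VMM in bit 7, IPL in bits 8 to 12, stack alignment in bits 56 to 61), and the helper names `bits` and `decode_ps` are made up for illustration. Field names follow the header line of the SDA display.

```python
# Mode encoding assumed from the Alpha architecture: 0 = kernel ... 3 = user.
MODES = ("KERNEL", "EXEC", "SUPER", "USER")


def bits(value: int, low: int, width: int) -> int:
    """Return `width` bits of `value`, starting at bit position `low`."""
    return (value // 2**low) % 2**width


def decode_ps(ps: int) -> dict:
    """Split a 64-bit PS value into the fields shown by SDA (layout is an assumption)."""
    return {
        "prvmod": MODES[bits(ps, 0, 2)],   # previous mode
        "int": bits(ps, 2, 1),             # interrupt flag
        "curmod": MODES[bits(ps, 3, 2)],   # current mode
        "vmm": bits(ps, 7, 1),             # virtual machine monitor
        "ipl": bits(ps, 8, 5),             # interrupt priority level
        "spal": bits(ps, 56, 6),           # stack pointer alignment
    }


if __name__ == "__main__":
    print(decode_ps(0x1B))                 # PS from the exception frame
    print(decode_ps(0x1000000000001F04))   # PS from the bugcheck entry
```

With this layout, 0000001B decodes to user mode at IPL 0, matching the SDA display, and the bugcheck PS 10000000.00001F04 decodes to kernel mode at IPL 31 with stack alignment 16, matching the error log entry quoted earlier in the thread.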
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515493#M67843</link>
      <description>Doug, &lt;BR /&gt;&lt;BR /&gt;If you really suspect the memory, then shut down the machine and bring it to the SRM console. Then start 2 memexers per CPU and let them run for a few hours. If there really is bad RAM, it should show on the console. To stop the memexers, give the kill_diag command (or init the system). To show the status of the memexers, type show_diag. &lt;BR /&gt;&lt;BR /&gt;(I could be a little off with the commands; look in the manual or try help or man for the exact commands.) &lt;BR /&gt;&lt;BR /&gt;It is possible that the RAM has gone bad. At my current site we have had several issues with bad RAM.</description>
      <pubDate>Fri, 01 Apr 2005 02:28:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515493#M67843</guid>
      <dc:creator>DICTU OpenVMS</dc:creator>
      <dc:date>2005-04-01T02:28:00Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515494#M67844</link>
      <description>Volker:&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080&lt;BR /&gt;00000000.0030F080:      BIS             R31,#X1D,R7&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080-30;40&lt;BR /&gt;00000000.0030F050:      CVTDG                   F3,F3&lt;BR /&gt;00000000.0030F054:      ADDG            F4,F3,F3&lt;BR /&gt;00000000.0030F058:      CVTGD                   F3,F3&lt;BR /&gt;00000000.0030F05C:      STD             F3,#X0CF8(FP)&lt;BR /&gt;00000000.0030F060:      TRAPB&lt;BR /&gt;00000000.0030F064:      LDA             R16,#X0008(FP)&lt;BR /&gt;00000000.0030F068:      BIS             R31,#X01,R25&lt;BR /&gt;00000000.0030F06C:      LDQ             R26,#XFF60(R2)&lt;BR /&gt;00000000.0030F070:      LDQ             R27,#XFF68(R2)&lt;BR /&gt;00000000.0030F074:      JSR             R26,(R26)&lt;BR /&gt;00000000.0030F078:      JMP             R31,(R0)&lt;BR /&gt;00000000.0030F07C:      TRAPB&lt;BR /&gt;00000000.0030F080:      BIS             R31,#X1D,R7&lt;BR /&gt;00000000.0030F084:      STL             R7,#X0020(FP)&lt;BR /&gt;00000000.0030F088:      LDL             R3,#X0CE0(FP)&lt;BR /&gt;00000000.0030F08C:      ADDL/V          R3,#X01,R3&lt;BR /&gt;00000000.0030F090:      LDA             R16,#X8000(R31)&lt;BR /&gt;&lt;BR /&gt;I looked at the clue$history file and there doesn't appear to be any pattern other than approx every 3 weeks.&lt;BR /&gt;e.g. The previous 4 crashes are:&lt;BR /&gt;Date         Uptime&lt;BR /&gt;========     ==========&lt;BR /&gt;Dec 29       22 days&lt;BR /&gt;Jan 20       25 days&lt;BR /&gt;Feb 14       25 days&lt;BR /&gt;Mar 29       23 days&lt;BR /&gt;&lt;BR /&gt;Sorry, I don't know what address to put in the SHOW PROC/PAGE address;1000 command.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 01 Apr 2005 14:24:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515494#M67844</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-04-01T14:24:42Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515495#M67845</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;the exception PC points to a BIS R31,#X1D,R7 instruction, so no memory accesses are involved in executing this instruction - except the access to the page where the instruction itself is stored. Please remember to repeat these steps against the next crash(es).&lt;BR /&gt;&lt;BR /&gt;Now let's try to find the machine check logout frame in the dump:&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; READ SYSDEF&lt;BR /&gt;SDA&amp;gt; SHOW STACK @(@smp$gl_cpu_data+CPU$L_PROC_MCHK_ABORT_SVAPTE+4);2F0&lt;BR /&gt;&lt;BR /&gt;You have to enter the command on one line.&lt;BR /&gt;(The above command only applies to a single-CPU system - which this node is.)&lt;BR /&gt;&lt;BR /&gt;Try to include the output as a text file attachment in your next reply (or mail it to me - see my forum profile).&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Sat, 02 Apr 2005 01:12:46 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515495#M67845</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-04-02T01:12:46Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515496#M67846</link>
      <description>Thanks for your help Volker.&lt;BR /&gt;I've attached a text file with the output.</description>
      <pubDate>Mon, 04 Apr 2005 10:30:01 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515496#M67846</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-04-04T10:30:01Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515497#M67847</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;thanks for the data:&lt;BR /&gt;&lt;BR /&gt;8A0E0058    00000001.00000205 = mchk code&lt;BR /&gt;&lt;BR /&gt;Could you please compare this data with the output of the same SDA command on the running system? Sometimes mchk data is left in this buffer from 'expected' machine checks (like during SYSMAN IO AUTOCONFIGURE when scanning the device configuration).&lt;BR /&gt;&lt;BR /&gt;If the same data exists in the running system, we know that no machine check frame was logged for the crash, and we need to find out why OpenVMS has crashed with a MACHINECHK bugcheck.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Tue, 05 Apr 2005 10:39:38 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515497#M67847</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-04-05T10:39:38Z</dc:date>
    </item>
  </channel>
</rss>

