<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: System crashes every 3 weeks. in Operating System - OpenVMS</title>
    <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515492#M67842</link>
    <description>Doug,&lt;BR /&gt;&lt;BR /&gt;the interrupt/exception stack frame shows that the current PC at the time of the MACHINECHK is in P0 space and the PS shows user-mode IPL 0:&lt;BR /&gt;&lt;BR /&gt;00000000.7FFA1FF0 00000000.0030F080 &amp;lt;&amp;lt;&amp;lt; PC&lt;BR /&gt;00000000.7FFA1FF8 00000000.0000001B &amp;lt;&amp;lt;&amp;lt; PS&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; eva/ps 0000001B&lt;BR /&gt;         MBZ SPAL      MBZ    IPL VMM MBZ CURMOD INT PRVMOD&lt;BR /&gt;         0   00   00000000000 00  0   0   USER   0   USER&lt;BR /&gt;&lt;BR /&gt;so whatever the instruction is&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080&lt;BR /&gt;&lt;BR /&gt;it CANNOT have caused a MACHINECHK through a programming error (i.e. access into IO-space), because you can't do that in USER mode. It could have caused access to a bad memory page, but that would be pure speculation!&lt;BR /&gt;&lt;BR /&gt;Please issue the following command in SDA:&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080-30;40&lt;BR /&gt;&lt;BR /&gt;to examine the instruction stream. If the current instruction includes a memory access and you're able to figure out the address, also do&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; SHOW PROC/PAGE address;1000&lt;BR /&gt;&lt;BR /&gt;Otherwise, I'll help you to figure out the page number...&lt;BR /&gt;&lt;BR /&gt;To get an overview of the last couple of crashes on this node, just try TYPE CLUE$HISTORY - if there is something timing-related, you might be able to spot a pattern.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
    <pubDate>Fri, 01 Apr 2005 02:13:46 GMT</pubDate>
    <dc:creator>Volker Halle</dc:creator>
    <dc:date>2005-04-01T02:13:46Z</dc:date>
    <item>
      <title>System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515478#M67828</link>
      <description>Here's one I've been wrestling with for a few months and would appreciate any assistance.&lt;BR /&gt;&lt;BR /&gt;Basically, I have a 2-node Alpha cluster and 1 node crashes every 3 weeks.  If I reboot this node prior to it crashing, it runs for 3 weeks from that reboot.&lt;BR /&gt;&lt;BR /&gt;Let's call them node1 and node2.&lt;BR /&gt;&lt;BR /&gt;Node1 crashes every 3 weeks and it is a&lt;BR /&gt;AlphaServer 1000 4/233&lt;BR /&gt;Main Memory (1024.00Mb)&lt;BR /&gt;OpenVMS V7.1-1H2&lt;BR /&gt;&lt;BR /&gt;Node2 is stable - currently up for 120 days.&lt;BR /&gt;AlphaServer 1200 5/400 4MB&lt;BR /&gt;Main Memory (1024.00Mb)&lt;BR /&gt;OpenVMS V7.3&lt;BR /&gt;------------------------------&lt;BR /&gt;Here's an excerpt from the clue file:&lt;BR /&gt;&lt;BR /&gt;Bugcheck Type:     MACHINECHK, Machine check while in kernel mode&lt;BR /&gt;Failing PC:        FFFFFFFF.80066BCC    EXE$GEN_BUGCHK_C+0003C&lt;BR /&gt;Failing PS:        10000000.00001F04&lt;BR /&gt;Module:            EXCEPTION&lt;BR /&gt;Offset:            00018BCC&lt;BR /&gt;-------------------------------------&lt;BR /&gt;Unfortunately, my client has a contract with a 3rd party support company, so I can't contact HP directly to get the crash dump analyzed, and they haven't been very useful with their analysis.  So, I've come to the experts....&lt;BR /&gt;&lt;BR /&gt;The system is running OpenVMS V7.1-1H2 and all the patches (for this version) have been installed.&lt;BR /&gt;&lt;BR /&gt;I suspect that I'm running out of some resource after 3 weeks, but I can't figure out which one.&lt;BR /&gt;&lt;BR /&gt;Any ideas/suggestions?&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Doug</description>
      <pubDate>Thu, 31 Mar 2005 11:32:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515478#M67828</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T11:32:00Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515479#M67829</link>
      <description>Can you post the clue file SYS$ERRORLOG:CLUE$*.LIS?&lt;BR /&gt;&lt;BR /&gt;Anything in the errorlog?&lt;BR /&gt;&lt;BR /&gt;What layered products are you running?&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 11:36:25 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515479#M67829</guid>
      <dc:creator>Ian Miller.</dc:creator>
      <dc:date>2005-03-31T11:36:25Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515480#M67830</link>
      <description>A machine check is almost always a hardware problem. I once had a similar problem, around 1987, on a VAX-8650 which went down every 14 days on a Friday afternoon.&lt;BR /&gt;&lt;BR /&gt;After many part replacements and swaps we had to escalate... It turned out to be bad memory. The machine ran rock-solid after that.</description>
      <pubDate>Thu, 31 Mar 2005 11:38:38 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515480#M67830</guid>
      <dc:creator>Uwe Zessin</dc:creator>
      <dc:date>2005-03-31T11:38:38Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515481#M67831</link>
      <description>100% agree with Uwe; I doubt it's resource exhaustion. Maybe you just happen to be hitting that bad memory after three weeks.</description>
      <pubDate>Thu, 31 Mar 2005 11:53:30 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515481#M67831</guid>
      <dc:creator>Tom O'Toole</dc:creator>
      <dc:date>2005-03-31T11:53:30Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515482#M67832</link>
      <description>I've attached the clue file.&lt;BR /&gt;&lt;BR /&gt;Uwe:&lt;BR /&gt;Your suggestion re. memory is interesting.  A month ago our 3rd party hardware support group suggested that "so far their thinking is a bad memory stick", but I haven't heard anything since.&lt;BR /&gt;&lt;BR /&gt;Are there some diagnostics I can run to check the memory?&lt;BR /&gt;Shouldn't the error log file show any memory errors?&lt;BR /&gt;&lt;BR /&gt;Here's the last entry in the error log prior to the last crash:&lt;BR /&gt;&lt;BR /&gt; ******************************* ENTRY     377. *******************************&lt;BR /&gt; ERROR SEQUENCE 19263.                           LOGGED ON:  CPU_TYPE 00000006&lt;BR /&gt; DATE/TIME 29-MAR-2005 14:54:23.14                            SYS_TYPE 00000011&lt;BR /&gt; SYSTEM UPTIME: 23 DAYS 10:28:05&lt;BR /&gt; SCS NODE: ALPHA2                                           OpenVMS AXP V7.1-1H2&lt;BR /&gt;&lt;BR /&gt; HW_MODEL: 00000000 Hardware Model = 0.&lt;BR /&gt;&lt;BR /&gt; FATAL BUGCHECK AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt; MACHINECHK, Machine check while in kernel mode&lt;BR /&gt;&lt;BR /&gt;       PROCESS NAME    INTEGRA_DF&lt;BR /&gt;       PROCESS ID      002E0152&lt;BR /&gt;&lt;BR /&gt;       ERROR PC        FFFFFFFF 80066BD0&lt;BR /&gt;&lt;BR /&gt;    Process Status = 10000000 00001F04, SW = 00, Previous Mode = KERNEL&lt;BR /&gt;    System State = 01, Current Mode = KERNEL&lt;BR /&gt;    VMM = 00 IPL = 31, SP Alignment = 16&lt;BR /&gt;&lt;BR /&gt; STACK POINTERS&lt;BR /&gt;&lt;BR /&gt; KSP 00000000 7FFA1E90  ESP 00000000 7FFA6000  SSP 00000000 7FFAC100&lt;BR /&gt; USP 00000000 7AF76DC0&lt;BR /&gt;&lt;BR /&gt; GENERAL REGISTERS&lt;BR /&gt;&lt;BR /&gt; R0  FFFFFFFF 8A0E01E8  R1  00000000 0000940E  R2  FFFFFFFF 839A6EB0&lt;BR /&gt; R3  FFFFFFFF 8A0E0000  R4  00000000 00200040  R5  FFFFFFFF FFFFFFFF&lt;BR /&gt; R6  00000000 00000001  R7  00000000 00000003  R8  00000000 0000005C&lt;BR /&gt; R9  00000000 00000000  R10 00000000 
00000006  R11 00000000 00000006&lt;BR /&gt; R12 00000000 00000000  R13 00000000 0000001C  R14 00000000 00000010&lt;BR /&gt; R15 00000000 00000000  R16 00000000 00000215  R17 00000000 00000001&lt;BR /&gt; R18 00000000 00000001  R19 00000000 00000001  R20 00000000 00C42414&lt;BR /&gt; R21 FFFFFFFF 8A0E0000  R22 FFFFFFFF FFFFFFFF  R23 00000000 00000086&lt;BR /&gt; R24 00000000 00000086  R25 00000000 00000003  R26 00000000 00000210&lt;BR /&gt; R27 FFFFFFFF 839BD680  R28 00000000 00000000  FP  00000000 7FFA1E90&lt;BR /&gt; SP  00000000 7FFA1E90  PC  FFFFFFFF 80066BD0  PS  10000000 00001F04&lt;BR /&gt;&lt;BR /&gt; SYSTEM REGISTERS&lt;BR /&gt;&lt;BR /&gt;       PTBR            00000000 0000F7BF&lt;BR /&gt;                                       Page Table Base Register&lt;BR /&gt;       PCBB            00000000 11B4E080&lt;BR /&gt;                                       Privileged Context Block Base&lt;BR /&gt;       PRBR            FFFFFFFF 8100E000&lt;BR /&gt;                                       Processor Base Register&lt;BR /&gt;       VPTB            FFFFFFFC 00000000&lt;BR /&gt;                                       Virtual Page Table Base Register&lt;BR /&gt;       SCBB            00000000 000001A0&lt;BR /&gt;                                       System Control Block Base&lt;BR /&gt;       SISR            00000000 00000000&lt;BR /&gt;                                       Software Interrupt Summary Register&lt;BR /&gt;       ASN             00000000 00000006&lt;BR /&gt;                                       Address Space Number&lt;BR /&gt;&lt;BR /&gt; V M S                SYSTEM ERROR REPORT         COMPILED 31-MAR-2005 17:51:38&lt;BR /&gt;                                                                      PAGE   4.&lt;BR /&gt;&lt;BR /&gt;       ASTSR_ASTEN     00000000 0000000F&lt;BR /&gt;                                       AST Summary/AST Enable&lt;BR /&gt;       FEN             00000000 00000001&lt;BR /&gt;                                       
Floating-Point Enable&lt;BR /&gt;       IPL             00000000 0000001F&lt;BR /&gt;                                       Interrupt Priority Level&lt;BR /&gt;       MCES            00000000 00000000&lt;BR /&gt;                                       Machine Check Error Summary&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 11:56:59 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515482#M67832</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T11:56:59Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515483#M67833</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;The key to MACHINECHK crashes is the MCHK errlog entries, if there are any.&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; CLUE ERRLOG&lt;BR /&gt;&lt;BR /&gt;will list them and extract them from the dump into a file CLUE$ERRLOG.SYS in your login or default directory.&lt;BR /&gt;&lt;BR /&gt;Run this file through ANAL/ERR or, better, DECevent ($ DIAGNOSE).&lt;BR /&gt;&lt;BR /&gt;The CLUE file is not of much help, especially as the MACHINECHK stack is not correctly decoded until V7.3-2.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Thu, 31 Mar 2005 12:00:14 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515483#M67833</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-03-31T12:00:14Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515484#M67834</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;Since Volker forgot you're new to VMS:&lt;BR /&gt;&lt;BR /&gt;Do &lt;BR /&gt;$ analyze/system&lt;BR /&gt;&lt;BR /&gt;to get to the SDA&amp;gt; prompt.&lt;BR /&gt;&lt;BR /&gt;Steve</description>
      <pubDate>Thu, 31 Mar 2005 12:11:04 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515484#M67834</guid>
      <dc:creator>Steve Nimr</dc:creator>
      <dc:date>2005-03-31T12:11:04Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515485#M67835</link>
      <description>Sorry Doug I got threads mixed up. :(&lt;BR /&gt;I guess there is no way to recall a reply once it's submitted.</description>
      <pubDate>Thu, 31 Mar 2005 12:21:26 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515485#M67835</guid>
      <dc:creator>Steve Nimr</dc:creator>
      <dc:date>2005-03-31T12:21:26Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515486#M67836</link>
      <description>Steve,&lt;BR /&gt;&lt;BR /&gt;it's ANAL/CRASH SYS$SYSTEM:SYSDUMP.DMP to access a system dump file (in its default location).&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Thu, 31 Mar 2005 12:26:53 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515486#M67836</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-03-31T12:26:53Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515487#M67837</link>
      <description>Here's the output from diagnose around the time period of the last crash:&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  376 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19262.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 14:45:02&lt;BR /&gt;Time since reboot                    23 Day(s) 10:18:45&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       38. Time Stamp Entry&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   7. Timestamp&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  377 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19263.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 14:54:23&lt;BR /&gt;Time since reboot                    23 Day(s) 10:28:05&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       37. Crash Re-Start&lt;BR /&gt;&lt;BR /&gt;Bugcheck Minor class              1. 
Crash Re-start&lt;BR /&gt;&lt;BR /&gt;Bugcheck Msg                         MACHINECHK, Machine check while in kernel&lt;BR /&gt;                                     mode&lt;BR /&gt;Process ID                x002E0152&lt;BR /&gt;Process Name&lt;BR /&gt;KSP                       x000000007FFA1E90&lt;BR /&gt;ESP                       x000000007FFA6000&lt;BR /&gt;SSP                       x000000007FFAC100&lt;BR /&gt;USP                       x000000007AF76DC0&lt;BR /&gt;R0                        xFFFFFFFF8A0E01E8&lt;BR /&gt;R1                        x000000000000940E&lt;BR /&gt;R2                        xFFFFFFFF839A6EB0&lt;BR /&gt;R3                        xFFFFFFFF8A0E0000&lt;BR /&gt;R4                        x0000000000200040&lt;BR /&gt;R5                        xFFFFFFFFFFFFFFFF&lt;BR /&gt;R6                        x0000000000000001&lt;BR /&gt;R7                        x0000000000000003&lt;BR /&gt;R8                        x000000000000005C&lt;BR /&gt;R9                        x0000000000000000&lt;BR /&gt;R10                       x0000000000000006&lt;BR /&gt;R11                       x0000000000000006&lt;BR /&gt;R12                       x0000000000000000&lt;BR /&gt;R13                       x000000000000001C&lt;BR /&gt;R14                       x0000000000000010&lt;BR /&gt;R15                       x0000000000000000&lt;BR /&gt;R16                       x0000000000000215&lt;BR /&gt;R17                       x0000000000000001&lt;BR /&gt;R18                       x0000000000000001&lt;BR /&gt;R19                       x0000000000000001&lt;BR /&gt;R20                       x0000000000C42414&lt;BR /&gt;R21                       xFFFFFFFF8A0E0000&lt;BR /&gt;R22                       xFFFFFFFFFFFFFFFF&lt;BR /&gt;R23                       x0000000000000086&lt;BR /&gt;R24                       x0000000000000086&lt;BR /&gt;R25                       x0000000000000003&lt;BR /&gt;R26                       x0000000000000210&lt;BR /&gt;R27                       
xFFFFFFFF839BD680&lt;BR /&gt;R28                       x0000000000000000&lt;BR /&gt;FP                        x000000007FFA1E90&lt;BR /&gt;SP                        x000000007FFA1E90&lt;BR /&gt;PC                        xFFFFFFFF80066BD0&lt;BR /&gt;PS                        x1000000000001F04&lt;BR /&gt;PTBR                      x000000000000F7BF&lt;BR /&gt;Process Ctl Block Base Re x0000000011B4E080&lt;BR /&gt;PRBR                      xFFFFFFFF8100E000&lt;BR /&gt;VPTB                      xFFFFFFFC00000000&lt;BR /&gt;System Ctl Block Base Reg x00000000000001A0&lt;BR /&gt;Software Interrupt Summar x0000000000000000&lt;BR /&gt;ASN                       x0000000000000006&lt;BR /&gt;ASTSR ASTEN               x000000000000000F&lt;BR /&gt;FEN                       x0000000000000001&lt;BR /&gt;ASN                       x0000000000000006&lt;BR /&gt;IPL                       x000000000000001F&lt;BR /&gt;MCES                      x0000000000000000&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  378 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19263.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:00:11&lt;BR /&gt;Time since reboot                    0 Day(s) 0:00:17&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       32. Cold Start (ie: System Boot)&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   2. System startup&lt;BR /&gt;&lt;BR /&gt;TODR                      x3D202445&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  379 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. 
OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19264.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:00:12&lt;BR /&gt;Time since reboot                    0 Day(s) 0:00:17&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       64. Volume Mount&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   4. Volume mount&lt;BR /&gt;&lt;BR /&gt;Owner UIC                 x00010001&lt;BR /&gt;Error count                       0.&lt;BR /&gt;OP count                        517.&lt;BR /&gt;Unit Number                     100.&lt;BR /&gt;Unit Name                            ALPHA2$DKA&lt;BR /&gt;Volume number                     0.&lt;BR /&gt;Volumes in set                    0.&lt;BR /&gt;Volume Label                         ALPHA2SYS&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  380 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19265.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:01:15&lt;BR /&gt;Time since reboot                    0 Day(s) 0:01:21&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       98. 
Asynchronous Device Attention&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;---- Device Profile ----&lt;BR /&gt;Unit                                 ALPHA2$PEA0&lt;BR /&gt;Product Name                         NI-SCA Port&lt;BR /&gt;&lt;BR /&gt;---- NISCA Port Data ----&lt;BR /&gt;Error Type and SubType        x0700  Device Error, Fatal Error Detected by&lt;BR /&gt;                                     Datalink&lt;BR /&gt;Status                    x0000120100000500&lt;BR /&gt;Datalink Device Name                 FWA2:&lt;BR /&gt;Remote Node Name&lt;BR /&gt;Remote Address            x0000000000000000&lt;BR /&gt;Local Address             x00000405000400AA&lt;BR /&gt;Error Count                       1. Error Occurrences This Entry&lt;BR /&gt;&lt;BR /&gt;----- Software Info -----&lt;BR /&gt;UCB$x_ERRCNT                      1. Errors This Unit&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  381 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19266.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:01:16&lt;BR /&gt;Time since reboot                    0 Day(s) 0:01:22&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       98. 
Asynchronous Device Attention&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;---- Device Profile ----&lt;BR /&gt;Unit                                 ALPHA2$PEA0&lt;BR /&gt;Product Name                         NI-SCA Port&lt;BR /&gt;&lt;BR /&gt;---- NISCA Port Data ----&lt;BR /&gt;Error Type and SubType        x0700  Device Error, Fatal Error Detected by&lt;BR /&gt;                                     Datalink&lt;BR /&gt;Status                    x0000120000000400&lt;BR /&gt;Datalink Device Name                 FWA2:&lt;BR /&gt;Remote Node Name&lt;BR /&gt;Remote Address            x0000000000000000&lt;BR /&gt;Local Address             x00000405000400AA&lt;BR /&gt;Error Count                       1. Error Occurrences This Entry&lt;BR /&gt;&lt;BR /&gt;----- Software Info -----&lt;BR /&gt;UCB$x_ERRCNT                      2. Errors This Unit&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;******************************** ENTRY  382 ********************************&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Logging OS                        1. OpenVMS&lt;BR /&gt;System Architecture               2. Alpha&lt;BR /&gt;OS version                           V7.1-1H2&lt;BR /&gt;Event sequence number         19267.&lt;BR /&gt;Timestamp of occurrence              29-MAR-2005 15:01:23&lt;BR /&gt;Time since reboot                    0 Day(s) 0:01:30&lt;BR /&gt;Host name                            ALPHA2&lt;BR /&gt;&lt;BR /&gt;System Model                         AlphaServer 1000 4/233&lt;BR /&gt;&lt;BR /&gt;Entry type                       64. Volume Mount&lt;BR /&gt;&lt;BR /&gt;SWI Minor class                   4. 
Volume mount&lt;BR /&gt;&lt;BR /&gt;Owner UIC                 x00010004&lt;BR /&gt;Error count                       0.&lt;BR /&gt;OP count                         15.&lt;BR /&gt;Unit Number                       1.&lt;BR /&gt;Unit Name                            213260$DUA&lt;BR /&gt;Volume number                     0.&lt;BR /&gt;Volumes in set                    0.&lt;BR /&gt;Volume Label                         USER2&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 12:46:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515487#M67837</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T12:46:31Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515488#M67838</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;bad luck - OpenVMS V7.1-1H2 did NOT log any machine check entry.&lt;BR /&gt;&lt;BR /&gt;This is the SAME machine/problem as already discussed in previous thread:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=808549" target="_blank"&gt;http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=808549&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;I keep a database of all crashes, that's why I know ;-)&lt;BR /&gt;&lt;BR /&gt;Could you please try to provide the stack data as requested in the previous thread:&lt;BR /&gt;&lt;BR /&gt;$ ANAL/CRASH SYS$SYSTEM:SYSDUMP.DMP&lt;BR /&gt;SDA&amp;gt; READ/EXEC&lt;BR /&gt;SDA&amp;gt; SHOW STACK/QUAD 7FFA1FC0;40&lt;BR /&gt;&lt;BR /&gt;It may also be possible to find the machine check logout frame in the dump.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Thu, 31 Mar 2005 12:54:27 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515488#M67838</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-03-31T12:54:27Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515489#M67839</link>
      <description>Thanks for the link Volker.&lt;BR /&gt;You're absolutely right.  Adrian is my hardware support contact and I'm that "sysadmin is in west coast Canada" he referred to.&lt;BR /&gt;&lt;BR /&gt;In any case, I was not aware that they were using this forum to troubleshoot the problem. I thought I'd try as I'm not getting anywhere following the official channels.&lt;BR /&gt;&lt;BR /&gt;Here's the output from the SHOW STACK/QUAD 7FFA1FC0;40 command:&lt;BR /&gt;&lt;BR /&gt;Specified Stack Range&lt;BR /&gt;---------------------&lt;BR /&gt;                       00000000.7FFA1FC0    00000000.0002F030&lt;BR /&gt;                       00000000.7FFA1FC8    00000000.010E0019&lt;BR /&gt;                       00000000.7FFA1FD0    00000000.7AF77A5C&lt;BR /&gt;                       00000000.7FFA1FD8    00000000.7AF78AA0&lt;BR /&gt;                       00000000.7FFA1FE0    00000000.00000001&lt;BR /&gt;                       00000000.7FFA1FE8    00000000.00000003&lt;BR /&gt;                       00000000.7FFA1FF0    00000000.0030F080&lt;BR /&gt;                       00000000.7FFA1FF8    00000000.0000001B&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 13:17:49 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515489#M67839</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-03-31T13:17:49Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515490#M67840</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;Just curious--just how precisely do you mean "every 3 weeks":&lt;BR /&gt;&lt;BR /&gt;1) every 3 weeks, within a few milliseconds&lt;BR /&gt;2) every 3 weeks, within a couple of hours&lt;BR /&gt;3) every 3 weeks, within a few days &lt;BR /&gt;&lt;BR /&gt;I'll bet your answer is 3. :-)&lt;BR /&gt;&lt;BR /&gt;To hazard a little speculation around each possibility:&lt;BR /&gt;  &lt;BR /&gt;1) would be pretty strange, to me at least. Perhaps a flaw in the fabric of space-time. :-)&lt;BR /&gt;&lt;BR /&gt;2) might suggest a link to some calendar-related activity. Perhaps a procedure or device that is used every couple of weeks? But you'd probably have noticed that.&lt;BR /&gt;&lt;BR /&gt;3) suggests something a lot more random or at least aperiodic, which is why I guessed you'd pick this answer.&lt;BR /&gt;&lt;BR /&gt;Just a few thoughts which may at least stimulate some thought, if they're of any use at all...&lt;BR /&gt;&lt;BR /&gt;Galen</description>
      <pubDate>Thu, 31 Mar 2005 13:44:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515490#M67840</guid>
      <dc:creator>Galen Tackett</dc:creator>
      <dc:date>2005-03-31T13:44:42Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515491#M67841</link>
      <description>Volker, &lt;BR /&gt;"I keep a database of all crashes, that's why I know"&lt;BR /&gt;and I thought you just remembered them all rather than having a private copy of canasta :-)&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 31 Mar 2005 15:11:31 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515491#M67841</guid>
      <dc:creator>Ian Miller.</dc:creator>
      <dc:date>2005-03-31T15:11:31Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515492#M67842</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;the interrupt/exception stack frame shows that the current PC at the time of the MACHINECHK is in P0 space and the PS shows user-mode IPL 0:&lt;BR /&gt;&lt;BR /&gt;00000000.7FFA1FF0 00000000.0030F080 &amp;lt;&amp;lt;&amp;lt; PC&lt;BR /&gt;00000000.7FFA1FF8 00000000.0000001B &amp;lt;&amp;lt;&amp;lt; PS&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; eva/ps 0000001B&lt;BR /&gt;         MBZ SPAL      MBZ    IPL VMM MBZ CURMOD INT PRVMOD&lt;BR /&gt;         0   00   00000000000 00  0   0   USER   0   USER&lt;BR /&gt;&lt;BR /&gt;so whatever the instruction is&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080&lt;BR /&gt;&lt;BR /&gt;it CANNOT have caused a MACHINECHK through a programming error (i.e. access into IO-space), because you can't do that in USER mode. It could have caused access to a bad memory page, but that would be pure speculation!&lt;BR /&gt;&lt;BR /&gt;Please issue the following command in SDA:&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080-30;40&lt;BR /&gt;&lt;BR /&gt;to examine the instruction stream. If the current instruction includes a memory access and you're able to figure out the address, also do&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; SHOW PROC/PAGE address;1000&lt;BR /&gt;&lt;BR /&gt;Otherwise, I'll help you to figure out the page number...&lt;BR /&gt;&lt;BR /&gt;To get an overview of the last couple of crashes on this node, just try TYPE CLUE$HISTORY - if there is something timing-related, you might be able to spot a pattern.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Fri, 01 Apr 2005 02:13:46 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515492#M67842</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-04-01T02:13:46Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515493#M67843</link>
      <description>Doug, &lt;BR /&gt;&lt;BR /&gt;If you really suspect the memory, then try shutting down the machine and bringing it to the SRM console. Then start 2 memexers per CPU and let them run for a few hours. If there really is bad RAM, it should show up on the console. To stop the memexers, give the kill_diag command (or init the system). To show the status of the memexers, type show_diag. &lt;BR /&gt;&lt;BR /&gt;(I could be a little off with the commands; look in the manual or try help or man for the exact commands). &lt;BR /&gt;&lt;BR /&gt;It is possible that the RAM has gone bad. At my current site we have had several issues with bad RAM.</description>
      <pubDate>Fri, 01 Apr 2005 02:28:00 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515493#M67843</guid>
      <dc:creator>DICTU OpenVMS</dc:creator>
      <dc:date>2005-04-01T02:28:00Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515494#M67844</link>
      <description>Volker:&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080&lt;BR /&gt;00000000.0030F080:      BIS             R31,#X1D,R7&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; EXA/INS 30F080-30;40&lt;BR /&gt;00000000.0030F050:      CVTDG                   F3,F3&lt;BR /&gt;00000000.0030F054:      ADDG            F4,F3,F3&lt;BR /&gt;00000000.0030F058:      CVTGD                   F3,F3&lt;BR /&gt;00000000.0030F05C:      STD             F3,#X0CF8(FP)&lt;BR /&gt;00000000.0030F060:      TRAPB&lt;BR /&gt;00000000.0030F064:      LDA             R16,#X0008(FP)&lt;BR /&gt;00000000.0030F068:      BIS             R31,#X01,R25&lt;BR /&gt;00000000.0030F06C:      LDQ             R26,#XFF60(R2)&lt;BR /&gt;00000000.0030F070:      LDQ             R27,#XFF68(R2)&lt;BR /&gt;00000000.0030F074:      JSR             R26,(R26)&lt;BR /&gt;00000000.0030F078:      JMP             R31,(R0)&lt;BR /&gt;00000000.0030F07C:      TRAPB&lt;BR /&gt;00000000.0030F080:      BIS             R31,#X1D,R7&lt;BR /&gt;00000000.0030F084:      STL             R7,#X0020(FP)&lt;BR /&gt;00000000.0030F088:      LDL             R3,#X0CE0(FP)&lt;BR /&gt;00000000.0030F08C:      ADDL/V          R3,#X01,R3&lt;BR /&gt;00000000.0030F090:      LDA             R16,#X8000(R31)&lt;BR /&gt;&lt;BR /&gt;I looked at the clue$history file and there doesn't appear to be any pattern other than approx every 3 weeks.&lt;BR /&gt;e.g. The previous 4 crashes are:&lt;BR /&gt;Date         Uptime&lt;BR /&gt;========     ==========&lt;BR /&gt;Dec 29       22 days&lt;BR /&gt;Jan 20       25 days&lt;BR /&gt;Feb 14       25 days&lt;BR /&gt;Mar 29       23 days&lt;BR /&gt;&lt;BR /&gt;Sorry, I don't know what address to put in the SHOW PROC/PAGE address;1000 command.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 01 Apr 2005 14:24:42 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515494#M67844</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-04-01T14:24:42Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515495#M67845</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;the exception PC points to a BIS R31,#X1D,R7 instruction, so no memory accesses are involved in executing this instruction - except the access to the page where the instruction itself is stored. Please remember to repeat these steps against the next crash(es).&lt;BR /&gt;&lt;BR /&gt;Now let's try to find the machinecheck logout frame in the dump:&lt;BR /&gt;&lt;BR /&gt;SDA&amp;gt; READ SYSDEF&lt;BR /&gt;SDA&amp;gt; SHOW STACK @(@smp$gl_cpu_data+CPU$L_PROC_MCHK_ABORT_SVAPTE+4);2F0&lt;BR /&gt;&lt;BR /&gt;You have to enter the SHOW STACK command on one line.&lt;BR /&gt;(The above command only applies to a single-CPU system - which this node is.)&lt;BR /&gt;&lt;BR /&gt;Try to include the output as a text file attachment in your next reply (or mail it to me - see my forum profile).&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Sat, 02 Apr 2005 01:12:46 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515495#M67845</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-04-02T01:12:46Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515496#M67846</link>
      <description>Thanks for your help Volker.&lt;BR /&gt;I've attached a text file with the output.</description>
      <pubDate>Mon, 04 Apr 2005 10:30:01 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515496#M67846</guid>
      <dc:creator>Doug_81</dc:creator>
      <dc:date>2005-04-04T10:30:01Z</dc:date>
    </item>
    <item>
      <title>Re: System crashes every 3 weeks.</title>
      <link>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515497#M67847</link>
      <description>Doug,&lt;BR /&gt;&lt;BR /&gt;thanks for the data:&lt;BR /&gt;&lt;BR /&gt;8A0E0058    00000001.00000205 = mchk code&lt;BR /&gt;&lt;BR /&gt;Could you please compare this data with the output of the same SDA command in the running system? Sometimes mchk data is left in this buffer from 'expected' machinechecks (like during SYSMAN IO AUTOCONFIGURE when scanning the device configuration).&lt;BR /&gt;&lt;BR /&gt;If the same data exists in the running system, we know that no machine check frame has been logged for the crash, and we need to find out why OpenVMS has crashed with a MACHINECHK crash.&lt;BR /&gt;&lt;BR /&gt;Volker.</description>
      <pubDate>Tue, 05 Apr 2005 10:39:38 GMT</pubDate>
      <guid>https://community.hpe.com/t5/operating-system-openvms/system-crashes-every-3-weeks/m-p/3515497#M67847</guid>
      <dc:creator>Volker Halle</dc:creator>
      <dc:date>2005-04-05T10:39:38Z</dc:date>
    </item>
  </channel>
</rss>

