
Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

 
jpd252
Occasional Advisor

Alpha ES80 OpenVMS 8.3 crash on reboot.

Can anyone help decipher this error log? This machine crashed and then tried to reboot numerous times before finally completing. It took over an hour to boot.

 


 

abrsvc
Respected Contributor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

In order for us to help, you need to post the errorlog. Please post this when you can.

Thanks,
Dan
Volker Halle
Honored Contributor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

If this was a real system crash (OpenVMS bugcheck), try to provide the CLUE file (CLUE$COLLECT:CLUE$node_ddmmyy.LIS) from the time of the crash, if there is one.

 

If a crashdump has been written, extract the errlog information from that dump with:

 

$ ANALYZE/CRASH_DUMP dumpfile

SDA> CLUE ERRLOG

... will show the most recent (and therefore relevant) errlog entries ...

SDA> EXIT

 

You'll find the errlog extract in CLUE$ERRLOG.SYS.

 

To format the errorlog, you'll need DECevent or WEBES SEA.

The console output (if it has been captured) may also help to find out what kind of problem caused this behaviour.

 

Volker.
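
As a small aid for locating the listing mentioned above, here is a minimal Python sketch that builds the expected file name from the pattern CLUE$node_ddmmyy.LIS. It assumes the day, month and two-digit year are zero-padded and that the node name is upper case; the helper name `clue_listing_name` is hypothetical, and the actual name on a given system should be confirmed with a directory listing of CLUE$COLLECT.

```python
from datetime import date


def clue_listing_name(node: str, crash_date: date) -> str:
    """Build a CLUE$node_ddmmyy.LIS file name (assumed zero-padded ddmmyy)."""
    return f"CLUE${node.upper()}_{crash_date.strftime('%d%m%y')}.LIS"


# Node name and crash date as they appear later in this thread.
print(clue_listing_name("ASOCC2", date(2013, 7, 17)))
```

For the node and crash date in this thread, that would mean looking for a file named along the lines of CLUE$ASOCC2_170713.LIS.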

jpd252
Occasional Advisor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

Sorry, I tried to and it didn't post. I have it in a text file, but it doesn't seem to be able to attach.

jpd252
Occasional Advisor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

Output for SYS$SYSROOT:[SYSMGR]CLUE$ERRLOG.SYS;1

EVENT  EVENT_TYPE_____________________________  TIMESTAMP______________  NODE__
 EVENT_CLASS____________________________
1      System Configuration                     17-JUL-2013 13:59:28.29  ASOCC2
 CONFIGURATION

DESCRIPTION__________________________________   RANGE___   VALUE_____________
TRANSLATED_VALUE_______________________
   Hardware Architecture                                   4
Alpha
   Hardware System Type                                    39

   Logging CPU                                             0

   Number of CPU's in Active Set                           0

   System Marketing Model                                  2031
hp AlphaServer ES80 Series
   Seconds Since Boot                                      103

   Chip Type                                               15
EV7 (21364)
   Error Sequence Number                                   1

   DSR String                                              hp AlphaServer ES80 7/1150
   Operating System Version                                V8.3


%ELV-E-B2TNOTFND, valid bit-to-text translation data not found
-ELV-W-NODNOTFND, bit-to-text node not found

EVENT  EVENT_TYPE_____________________________  TIMESTAMP______________  NODE__
 EVENT_CLASS____________________________
2      Machine Check 6A0/6B0 - RUE              17-JUL-2013 14:00:53.08  ASOCC2
 MACHINE_CHECKS

DESCRIPTION__________________________________   RANGE___   VALUE_____________
TRANSLATED_VALUE_______________________
   Hardware Architecture                                   4
Alpha
   Hardware System Type                                    39

   Logging CPU                                             0

   Number of CPU's in Active Set                           4

   Device Class                                            1

   System Marketing Model                                  2031
hp AlphaServer ES80 Series
   Seconds Since Boot                                      103

   Chip Type                                               15
EV7 (21364)
   Error Sequence Number                                   61

   DSR String                                              hp AlphaServer ES80 7/1150
   Operating System Version                                V8.3


%ELV-E-B2TNOTFND, valid bit-to-text translation data not found
-ELV-W-ENTNOTFND, bit-to-text entry not found

EVENT  EVENT_TYPE_____________________________  TIMESTAMP______________  NODE__
 EVENT_CLASS____________________________
3      Crash Restart                            17-JUL-2013 14:00:53.08  ASOCC2
 BUGCHECKS

DESCRIPTION__________________________________   RANGE___   VALUE_____________
TRANSLATED_VALUE_______________________
   Hardware Architecture                                   4
Alpha
   Hardware System Type                                    39

   Logging CPU                                             0

   Number of CPU's in Active Set                           0

   System Marketing Model                                  2031
hp AlphaServer ES80 Series
   Seconds Since Boot                                      103

   Chip Type                                               15
EV7 (21364)
   Error Sequence Number                                   62

   DSR String                                              hp AlphaServer ES80 7/1150
   Operating System Version                                V8.3


   Kernel Stack Pointer                                    0x000000007FF87B60

   Executive Stack Pointer                                 0x000000007FF8B910

   Supervisor Stack Pointer                                0x000000007FF9CC80

   User Stack Pointer                                      0x000000007AE17530

   Register R0                                             0x0000000000000001

   Register R1                                             0xFFFFFFFF8643E000

   Register R2                                             0x0000000000000210

   Register R3                                             0x0000000000000001

   Register R4                                             0x0000000000000000

   Register R5                                             0x0000000000002000

   Register R6                                             0xFFFFFFFF827B2400

   Register R7                                             0xFFFFFFFF81F5AD30

   Register R8                                             0xFFFFFFFF827B26CC

   Register R9                                             0x0000000000000001

   Register R10                                            0x0000000000000000

   Register R11                                            0x0000000000000000

   Register R12                                            0x0000000000000020

   Register R13                                            0xFFFFFFFF88FE0980

   Register R14                                            0xFFFFFFFF81F5ADD0

   Register R15                                            0x0000000000000000

   Register R16                                            0x0000000000000215

   Register R17                                            0xFFFFFFFF81C41A58

   Register R18                                            0x0000000000000002

   Register R19                                            0x0000000000000036

   Register R20                                            0x0000000000000036

   Register R21                                            0x0000000000000036

   Register R22                                            0x0000000000000000

   Register R23                                            0xFFFFFFFF8643E000

   Register R24                                            0xFFFFFFFF8643E000

   Register R25                                            0x0000000000000002

   Register R26                                            0xFFFFFFFF8001964C

   Register R27                                            0xFFFFFFFF81C3F7B0

   Register R28                                            0x0000000000000000

   Frame Pointer                                           0x000000007FF87B60

   Current Stack Pointer                                   0x000000007FF87B60

   Program Counter                                         0xFFFFFFFF80019654

   Processor Status                             <63:00>:   0x2000000000001F04

      Interrupt Pending                            <02>:   0x1

      Current Mode                              <04:03>:   0x0
Kernel
      Interrupt Priority Level (IPL)                       0x1F

      Stack Alignment                           <61:56>:   0x20

   Page Table Base Register (PTBR)                         0x0000000000007CAC

   Privileged Context Block Base (PCBB)                    0x000000000F95A080

   Processor Base Register (PRBR)                          0xFFFFFFFF82038000

   Virtual Page Table Base (VPTB)                          0xFFFFFEFA00000000

   System Control Block Base (SCBB)                        0x000000000000096C

   Software Interrupt Summary (SISR)            <63:00>:   0x0000000000000000

   Address Space Number (ASN)                              120

   AST Enable/AST Summary (ASTEN/ASTSR)         <63:00>:   0x000000000000000F

      Kernel Mode AST Enabled/Pending

      Executive Mode AST Enabled/Pending

      Supervisor Mode AST Enabled/Pending

      User Mode AST Enabled/Pending

   Floating-Point Enable (FEN)                  <63:00>:   0x0000000000000001

      Floating Point Enabled

   Interrupt Priority Level (IPL)                          31

   Machine Check Error Summary (MCES)           <63:00>:   0x0000000000000000

   Bugcheck/Crash Code                          <31:00>:   0x00000215

      Reboot Type                                  <00>:   0x1
COLD
      Severity                                     <02>:   0x1
FATAL
      Type                                      <31:03>:   0x00000042
MACHINECHK
   Current Process ID                                      0x00010005

   Current Process Name                                    .STARTUP


ERROR_LOG_SUMMARY______________________________________________________

Total number of events:                         3
Number of the first event:                      1
Number of the last event:                       3
Earliest event occurred:                        17-JUL-2013 13:59:28.29
Latest event occurred:                          17-JUL-2013 14:00:53.08
Number of events by event class:
        BUGCHECKS                               1
        CONFIGURATION                           1
        MACHINE_CHECKS                          1
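
The bit-field breakdown that ELV printed for event 3 can be cross-checked by hand. The following Python sketch extracts the same fields from the raw Bugcheck/Crash Code and Processor Status values shown above. The field ranges are copied from the RANGE column of the log, except for IPL, for which the log prints no range; bits <12:8> are assumed here, following the Alpha OpenVMS PS layout. The helper names are hypothetical.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Return the field value<hi:lo> (inclusive, bit 0 is least significant)."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def decode_bugcheck(code: int) -> dict:
    # Ranges <00>, <02> and <31:03> are taken from the log above.
    return {
        "reboot_type": bits(code, 0, 0),
        "severity": bits(code, 2, 2),
        "type": bits(code, 31, 3),
    }


def decode_ps(ps: int) -> dict:
    return {
        "interrupt_pending": bits(ps, 2, 2),
        "current_mode": bits(ps, 4, 3),
        "ipl": bits(ps, 12, 8),  # assumption: the log gives no range for IPL
        "stack_alignment": bits(ps, 61, 56),
    }


print(decode_bugcheck(0x00000215))
print(decode_ps(0x2000000000001F04))
```

The decoded values agree with the log: type 0x42 (MACHINECHK), severity 1 (FATAL), reboot type 1 (COLD), kernel mode (0) and IPL 0x1F.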

jpd252
Occasional Advisor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

ASOCC2$ type CLUE$STARTUP_ASOCC2.LOG;105
$!
$!*************************************************************************
$!*                                                                       *
$!*    ©     Digital Equipment Corporation, 1993                           *
$!*              All Rights Reserved.                                     *
$!* Unpublished rights reserved under the copyright laws  of  the  United *
$!* States.                                                               *
$!*                                                                       *
$!* The software contained on this media is proprietary to  and  embodies *
$!* the   confidential   technology  of  Digital  Equipment  Corporation. *
$!* Possession, use, duplication or dissemination  of  the  software  and *
$!* media  is  authorized  only  pursuant to a valid written license from *
$!* Digital Equipment Corporation.                                        *
$!*                                                                       *
$!* RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure by the  U.S. *
$!* Government  is  subject  to restrictions as set forth in Subparagraph *
$!* (c)(1)(ii) of DFARS 252.227-7013, or in FAR 52.227-19, as applicable. *
$!*                                                                       *
$!*************************************************************************
$!
$!
$! ...  Facility:       CLUE$SDA
$!
$! ...  Abstract:       This procedure has two purposes.  It is called by
$!                      VMS$DEVICE_STARTUP during system startup to define
$!                      CLUE related logical names. It will run a detached
$!                      process that uses SDA and CLUE$SDA commands to
$!                      collect and save specific dump information.
$!
$! ...  Environment:    DCL
$!
$! ...  Author:         Christian Moser / Digital Equipment Corporation (DEC)
$!
$! ...  Date:           22-MAY-1992
$!
$! ...  Modified by:
$!
$!      X-10    RAB             Richard A. Bishop       14-Oct-2004
$!              Re-enable on IA64
$!
$!      X-9     GHJ             Gregory H. Jordan       24-Apr-2003
$!              Disable on IA64 for now
$!
$!      X-8     CMOS            Christian Moser         23-MAR-2001
$!              Remove CLUE$HELP, this is obsolete, as SDA> CLUE HELP will
$!              use the SDA help library.
$!
$!      X-7     RAB             Richard A. Bishop       02-Aug-2000
$!              Fix up file attributes as necessary for DOSD dump file
$!              (protection, caching, nobackup, nomove)
$!
$!      X-6     CMOS            Christian Moser         24-AUG-1998
$!              Add a default file extension to clue$errlog.
$!
$!      X-5     CMOS            Christian Moser         30-APR-1997
$!              Add support for DOSD, i.e. dumpfile off system disk.
$!
$!      X-4     CMOS            Christian Moser         20-JAN-1997
$!              Use SYS$SCRATCH instead of SYS$ERRORLOG for the CLUE
$!              error logfile.
$!
$!      X-3     CMOS036         Christian Moser         22-SEP-1994
$!              Remove comment from SDA Command READ/EXEC to avoid any
$!              error messages. SDA doesn't like this.
$!
$!      X-2     CMOS027         Christian Moser          2-MAY-1994
$!              Read in symbol tables from execlets and drivers
$!              before running any CLUE commands.
$!
$!      X-1     CMOS001         Christian Moser         22-JUN-1993
$!              Adapted header and copyright, integrate it into
$!              SYS$STARTUP:VMS$DEVICE_STARTUP.COM
$!
$!--------------------------------------------------------------------------
$!
$!
$ set noon
$ prcnam = f$getjpi("","prcnam")
$ if (f$mode().eqs."OTHER") .and. (prcnam.nes."STARTUP") then goto check
$check:
$!
$!      Perform some sanity checks and find the file where the dump is stored.
$!
$ if .not. f$getsyi("dumpbug")
$ endif
$!
$!      DOSD: check if bit 2 is set in DUMPSTYLE, and if true verify there
$!      is indeed a dumpfile on the device pointed to by the logical
$!      CLUE$DOSD_DEVICE
$!
$ dump_file = ""
$ if ( f$int(f$getsyi("dumpstyle")) .and. 4 ) .ne. 0
$ endif
$!
$ if dump_file .eqs. ""
$ then
$   dump_file = f$search("sys$specific:[sysexe]sysdump.dmp")
$   if dump_file .eqs. ""
$   endif
$ endif
$!
$ if dump_file .nes. ""
$ then
$   analyze/crash SYS$SPECIFIC:[SYSEXE]SYSDUMP.DMP;1

OpenVMS system dump analyzer
...analyzing an Alpha compressed selective memory dump...

Dump taken on 17-JUL-2013 13:56:32.12 using version V8.3
MACHINECHK, Machine check while in kernel mode

        read /executive /nolog
        clue history            ! crashdump summary and history information
        exit
$ else
$ endif
$!
$!      Perform some housekeeping as per clue$max_blocks
$!
$ analyze/system

OpenVMS system analyzer

        clue cleanup            ! perform housekeeping
%CLUE-I-CLEANUP, housekeeping started...
%CLUE-I-MAXBLOCK, maximum blocks allowed 5000 blocks
%CLUE-I-STAT, total of 63 CLUE files, 4016 blocks.
        exit
$!
$ exit
  SYSTEM       job terminated at 17-JUL-2013 14:00:36.15

  Accounting information:
  Buffered I/O count:                919      Peak working set size:      25152
  Direct I/O count:                  920      Peak virtual size:         193072
  Page faults:                      1739      Mounted volumes:                0
  Charged CPU time:        0 00:00:01.70      Elapsed time:       0 00:00:01.94

Volker Halle
Honored Contributor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

I didn't ask for CLUE$STARTUP.LOG, but for CLUE$node_ddmmyy.LIS - but nevertheless:

 

MACHINECHK - this is a hardware-related crash, so you MUST decode the errorlog entries. OpenVMS ELV cannot decode these machine check entries (hence the %ELV-E-B2TNOTFND messages in your output). For an ES80, you need WEBES SEA (System Event Analyzer).

 

2      Machine Check 6A0/6B0 - RUE              17-JUL-2013 14:00:53.08  ASOCC2  <<< this is the hardware error !

 

Volker.
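
To pick the hardware events out of a longer ELV listing, a rough Python sketch along the following lines could be used. It assumes the wrapped layout seen in the output posted earlier in this thread, where the event class sits on the line after each event header; the regular expression and the function name `parse_events` are illustrative and are not part of ELV or any HP tool.

```python
import re

# Event header as printed in this thread: number, type, timestamp, node.
HEADER = re.compile(
    r"^(?P<num>\d+)\s+(?P<type>.+?)\s{2,}"
    r"(?P<time>\d{1,2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{2})\s+(?P<node>\S+)\s*$"
)


def parse_events(text: str) -> list:
    """Return one dict per event header, with the class read from the next line."""
    events = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        match = HEADER.match(line)
        if match and i + 1 < len(lines):
            event = match.groupdict()
            event["class"] = lines[i + 1].strip()
            events.append(event)
    return events


SAMPLE = """\
1      System Configuration                     17-JUL-2013 13:59:28.29  ASOCC2
 CONFIGURATION
2      Machine Check 6A0/6B0 - RUE              17-JUL-2013 14:00:53.08  ASOCC2
 MACHINE_CHECKS
3      Crash Restart                            17-JUL-2013 14:00:53.08  ASOCC2
 BUGCHECKS
"""

hardware = [e for e in parse_events(SAMPLE) if e["class"] == "MACHINE_CHECKS"]
for e in hardware:
    print(e["num"], e["type"], e["time"])
```

This only locates the machine check entries; decoding their contents still requires WEBES SEA, as noted above.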

jpd252
Occasional Advisor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.


Reporting Node:
ASOCC2

Full Description:
 South Port has detected a address to a Non-existent CSR
 The error occurred during a WRITE.
 Latched PCI Command:  South Port 1 - Command Not Valid

 There was no North Port Errors latched in the P07 Error
 Summary Register, however the Error Summary Register for
 South Port 1 is latched and valid.

 This condition invokes the collection of PCI configuration data from
 the PCI adapters which, is appended to the errorlog for analysis.

 This callout is reporting that there was no error bits detected in any of
 the collected PCI configuration subpackets.  Please review the Bit-to-text
 report for any additional information.

 Check the devices on South Port 1

 Reported Condition: South Port 1 initiated an UnCorrectable Interrupt.

FRU List:
Probability       : High
Fru Manufacturer  : Not available
Fru Model         : Not available
Fru Part Number   : Not available
Fru Serial Number : Not available
Fru Firmware Rev  : Not available
Fru Description   : IO Adapter
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : No Bus Master Detected
Fru Assembly      : IO Adapter


Probability       : Medium
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : -
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO7 ASIC
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 is in a ES Series Platform
Fru Assembly      : SBB Backplane

 


Evidence:
Rule Set  : GS1280 IO7 Rule 051127
Qualifiers: Release Version
Event Id  : 0.16
Event Time: Wed 17 Jul 2013 13:29:46 GMT+00:00
Additional Information:    South Port 1 - Address Not Valid

Notifications:
  Console


Analysis Mode:
  Manual


SEA Version:
  System Event Analyzer for OpenVMS V4.4-3 (Build 34)


WCC Version:
  Web-Based Enterprise Services Common Components for
  OpenVMS V4.4-3 (Build 34), member of Web-Based
  Enterprise Services Suite for OpenVMS V4.4.3 (Build
  34)


---------- Problem Found:  South Port detected a NXM during a Write  at Thu 18 Jul 2013 18:31:40 GMT+00:00 ----------

Problem Report Times:
    Event Time:      Wed 17 Jul 2013 13:50:51 GMT+00:00
    Report Time:     Thu 18 Jul 2013 18:31:40 GMT+00:00
    Expiration Time: Thu 18 Jul 2013 13:50:51 GMT+00:00

Managed Entity:
 System Type :hp AlphaServer ES80 7/1150
 Computer Name :ASOCC2
 System Serial Number :AY72800153
 Operating System Version :V8.3

Service Obligation Data:

   Service Obligation:            Valid
   Service Obligation Number:     AY73000596
   System Serial Number:          AY73000596
   Service Provider Company Name: Hewlett-Packard Company


Brief Description:
 South Port detected a NXM during a Write

Callout ID:
x311286000007EF05

Severity:
2

Reporting Node:
ASOCC2

Full Description:
 South Port has detected a address to a Non-existent CSR
 The error occurred during a WRITE.
 Latched PCI Command:  South Port 1 - Command Not Valid

 There was no North Port Errors latched in the P07 Error
 Summary Register, however the Error Summary Register for
 South Port 1 is latched and valid.

 This callout is contained within a Console Data Log (CDL), or also
 knowned as a entry type 113.  In a 113, PALcode builds the errorlog
 entry and Servermanagement firmware writes the enty to the MBM.

 There are no PCI configuration subpackets in this entry, hence
 callout information is provided strictly by analyzing the CSR's
 in the IO7 ASIC.

 Check the devices on South Port 1

 Reported Condition: South Port 1 initiated an UnCorrectable Interrupt.

FRU List:
Probability       : High
Fru Manufacturer  : Not available
Fru Model         : Not available
Fru Part Number   : Not available
Fru Serial Number : Not available
Fru Firmware Rev  : Not available
Fru Description   : IO Adapter
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : No Bus Master Detected
Fru Assembly      : IO Adapter


Probability       : Medium
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : -
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO7 ASIC
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 is in a ES Series Platform
Fru Assembly      : SBB Backplane

 


Evidence:
Rule Set  : GS1280 IO7 Rule 051127
Qualifiers: Release Version
Event Id  : 0.46
Event Time: Wed 17 Jul 2013 13:47:31 GMT+00:00
Additional Information:    South Port 1 - Address Not Valid

Notifications:
  Console


Analysis Mode:
  Manual


SEA Version:
  System Event Analyzer for OpenVMS V4.4-3 (Build 34)


WCC Version:
  Web-Based Enterprise Services Common Components for
  OpenVMS V4.4-3 (Build 34), member of Web-Based
  Enterprise Services Suite for OpenVMS V4.4.3 (Build
  34)


---------- Problem Found: D-Stream Error Response packet was received on CPU #0 at Thu 18 Jul 2013 18:31:46 GMT+00:00 ----------

Problem Report Times:
    Event Time:      Wed 17 Jul 2013 14:23:37 GMT+00:00
    Report Time:     Thu 18 Jul 2013 18:31:46 GMT+00:00
    Expiration Time: Thu 18 Jul 2013 14:23:37 GMT+00:00

Managed Entity:
System Name   : ASOCC2
System Type   : hp AlphaServer ES80 7/1150
System Serial : AY72800153
OS Type       : OpenVMS/V8.3

Service Obligation Data:

   Service Obligation:            Valid
   Service Obligation Number:     AY73000596
   System Serial Number:          AY73000596
   Service Provider Company Name: Hewlett-Packard Company


Brief Description:
D-Stream Error Response packet was received on CPU #0

Callout ID:
x483983000007EF05

Severity:
1

Reporting Node:
ASOCC2

Full Description:
A D-Stream Error Response packet was received on CPU #0.  This error condition
is the result of an uncorrectable error occurring on this or another CPU or
I/O subsystem in response to a D-Stream request made from this CPU.
It is unlikely that this error is the root cause of the failure and for that reason
no FRU call-out is provided this event.  It is recommended that manual analysis
be performed on the event log to isolate the cause for the Error Response packet
being generated.
Common causes for this condition are link, memory, cache, I/O errors, or in the case
of I/O related events, possibly the driver software or I/O adapter.
The additional information section below will contain further information regarding
the possible source of the error

FRU List:


Evidence:
Rule Set  : GS1280 EV7 Core 051127
Qualifiers: Released Version
Event Id  : 0.8
Event Time: Wed 17 Jul 2013 14:09:02 GMT+00:00
Additional Information:
The IO7 being addressed at the time of this error is attached to EV7 CPU: 0
The offset address is: x010FFFC0
The offset address references Port/Bus 1 (PCI-X bus 1)

Notifications:
  Console


Analysis Mode:
  Manual


SEA Version:
  System Event Analyzer for OpenVMS V4.4-3 (Build 34)


WCC Version:
  Web-Based Enterprise Services Common Components for
  OpenVMS V4.4-3 (Build 34), member of Web-Based
  Enterprise Services Suite for OpenVMS V4.4.3 (Build
  34)


---------- Problem Found:  IO7 Exceeded the maximum number of retries as PCI Master  at Thu 18 Jul 2013 18:31:46 GMT+00:00 ----------

Problem Report Times:
    Event Time:      Wed 17 Jul 2013 14:23:37 GMT+00:00
    Report Time:     Thu 18 Jul 2013 18:31:46 GMT+00:00
    Expiration Time: Thu 18 Jul 2013 14:23:37 GMT+00:00

Managed Entity:
 System Type :hp AlphaServer ES80 7/1150
 Computer Name :ASOCC2
 System Serial Number :AY72800153
 Operating System Version :V8.3

Service Obligation Data:

   Service Obligation:            Valid
   Service Obligation Number:     AY73000596
   System Serial Number:          AY73000596
   Service Provider Company Name: Hewlett-Packard Company


Brief Description:
 IO7 Exceeded the maximum number of retries as PCI Master

Callout ID:
x310683000007EF05

Severity:
2

Reporting Node:
ASOCC2

Full Description:
 The IO7 ASIC has retry counters and time-out limit values
 to control transactions on the south port buses.  In an attempt
 to complete a transaction, the IO7 exceeded the maximum
 allowable retry count and set this error condition.
 PCI Bus Master:  The IO7 ASIC was Bus Master

 Latched PCI Command:  Memory Read

 There was no North Port Errors latched in the P07 Error
 Summary Register, however the Error Summary Register for
 South Port 1 is latched and valid.

 This callout is contained within a Console Data Log (CDL), or also
 knowned as a entry type 113.  In a 113, PALcode builds the errorlog
 entry and Servermanagement firmware writes the enty to the MBM.

 There are no PCI configuration subpackets in this entry, hence
 callout information is provided strictly by analyzing the CSR's
 in the IO7 ASIC.

 Check the devices on South Port 1

 Reported Condition: South Port 1 initiated an UnCorrectable Interrupt.

FRU List:
Probability       : High
Fru Manufacturer  : Not available
Fru Model         : Not available
Fru Part Number   : Not available
Fru Serial Number : Not available
Fru Firmware Rev  : Not available
Fru Description   : IO Adapter
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  :
Fru Assembly      : IO Adapter


Probability       : Medium
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : -
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO7 ASIC
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 is in a ES Series Platform
Fru Assembly      : SBB Backplane

 


Evidence:
Rule Set  : GS1280 IO7 Rule 051127
Qualifiers: Release Version
Event Id  : 0.8
Event Time: Wed 17 Jul 2013 14:09:02 GMT+00:00
Additional Information: The PCI address was:x010FFFFC

Notifications:
  Console


Analysis Mode:
  Manual


SEA Version:
  System Event Analyzer for OpenVMS V4.4-3 (Build 34)


WCC Version:
  Web-Based Enterprise Services Common Components for
  OpenVMS V4.4-3 (Build 34), member of Web-Based
  Enterprise Services Suite for OpenVMS V4.4.3 (Build
  34)


---------- Problem Found:  IO7 Received Master Abort on PCI Bus  at Thu 18 Jul 2013 18:31:46 GMT+00:00 ----------

Problem Report Times:
    Event Time:      Wed 17 Jul 2013 14:23:37 GMT+00:00
    Report Time:     Thu 18 Jul 2013 18:31:46 GMT+00:00
    Expiration Time: Thu 18 Jul 2013 14:23:37 GMT+00:00

Managed Entity:
 System Type :hp AlphaServer ES80 7/1150
 Computer Name :ASOCC2
 System Serial Number :AY72800153
 Operating System Version :V8.3

Service Obligation Data:

   Service Obligation:            Valid
   Service Obligation Number:     AY73000596
   System Serial Number:          AY73000596
   Service Provider Company Name: Hewlett-Packard Company


Brief Description:
 IO7 Received Master Abort on PCI Bus

Callout ID:
x331586000007EF05

Severity:
2

Reporting Node:
ASOCC2

Full Description:
 The IO7 as the initiator of the transaction received a Master
 Abort on the PCI bus.
 PCI Bus Master:  The IO7 ASIC was Bus Master

 Latched PCI Command:  Memory Write

 There was no North Port Errors latched in the P07 Error
 Summary Register, however the Error Summary Register for
 South Port 3 is latched and valid.

 This callout is contained within a Console Data Log (CDL), or also
 knowned as a entry type 113.  In a 113, PALcode builds the errorlog
 entry and Servermanagement firmware writes the enty to the MBM.

 There are no PCI configuration subpackets in this entry, hence
 callout information is provided strictly by analyzing the CSR's
 in the IO7 ASIC.

 This detected condition is indicating that the SRM was already in
 console mode.  This implies that the system was trying to handle a
 error condition when a second error occurred.

 Check the devices on South Port 3

 Reported Condition: South Port 3 initiated an UnCorrectable Interrupt.

FRU List:
Probability       : High
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : Radeon 7500 AGP
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO Adapter - AGP Module
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 ASIC was Bus Master
Fru Assembly      : IO Adapter - AGP Module in South Port 3


Probability       : Medium
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : -
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO7 ASIC
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 is in a ES Series Platform
Fru Assembly      : SBB Backplane

 


Evidence:
Rule Set  : GS1280 IO7 Rule 051127
Qualifiers: Release Version
Event Id  : 0.8
Event Time: Wed 17 Jul 2013 14:09:39 GMT+00:00
Additional Information: The PCI address was:x000B8F00

Notifications:
  Console


Analysis Mode:
  Manual


SEA Version:
  System Event Analyzer for OpenVMS V4.4-3 (Build 34)


WCC Version:
  Web-Based Enterprise Services Common Components for
  OpenVMS V4.4-3 (Build 34), member of Web-Based
  Enterprise Services Suite for OpenVMS V4.4.3 (Build
  34)


---------- Problem Found:  IO7 Received Master Abort on PCI Bus  at Thu 18 Jul 2013 18:31:46 GMT+00:00 ----------

Problem Report Times:
    Event Time:      Wed 17 Jul 2013 14:23:37 GMT+00:00
    Report Time:     Thu 18 Jul 2013 18:31:46 GMT+00:00
    Expiration Time: Thu 18 Jul 2013 14:23:37 GMT+00:00

Managed Entity:
 System Type :hp AlphaServer ES80 7/1150
 Computer Name :ASOCC2
 System Serial Number :AY72800153
 Operating System Version :V8.3

Service Obligation Data:

   Service Obligation:            Valid
   Service Obligation Number:     AY73000596
   System Serial Number:          AY73000596
   Service Provider Company Name: Hewlett-Packard Company


Brief Description:
 IO7 Received Master Abort on PCI Bus

Callout ID:
x331586000007EF05

Severity:
2

Reporting Node:
ASOCC2

Full Description:
 The IO7 as the initiator of the transaction received a Master
 Abort on the PCI bus.
 PCI Bus Master:  The IO7 ASIC was Bus Master

 Latched PCI Command:  IO Write

 There was no North Port Errors latched in the P07 Error
 Summary Register, however the Error Summary Register for
 South Port 3 is latched and valid.

 This callout is contained within a Console Data Log (CDL), or also
 knowned as a entry type 113.  In a 113, PALcode builds the errorlog
 entry and Servermanagement firmware writes the enty to the MBM.

 There are no PCI configuration subpackets in this entry, hence
 callout information is provided strictly by analyzing the CSR's
 in the IO7 ASIC.

 This detected condition is indicating that the SRM was already in
 console mode.  This implies that the system was trying to handle a
 error condition when a second error occurred.

 Check the devices on South Port 3

 Reported Condition: South Port 3 initiated an UnCorrectable Interrupt.

FRU List:
Probability       : High
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : Radeon 7500 AGP
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO Adapter - AGP Module
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 ASIC was Bus Master
Fru Assembly      : IO Adapter - AGP Module in South Port 3


Probability       : Medium
Fru Manufacturer  : -
Fru Model         : -
Fru Part Number   : -
Fru Serial Number : -
Fru Firmware Rev  : -
Fru Description   : IO7 ASIC
Fru Information   : Cabinet 0
                  : 2P - Drawer 0
                  : The IO7 is in a ES Series Platform
Fru Assembly      : SBB Backplane

 


Evidence:
Rule Set  : GS1280 IO7 Rule 051127
Qualifiers: Release Version
Event Id  : 0.8
Event Time: Wed 17 Jul 2013 14:09:39 GMT+00:00
Additional Information: The PCI address was:x000003D4

Notifications:
  Console


Analysis Mode:
  Manual


SEA Version:
  System Event Analyzer for OpenVMS V4.4-3 (Build 34)


WCC Version:
  Web-Based Enterprise Services Common Components for
  OpenVMS V4.4-3 (Build 34), member of Web-Based
  Enterprise Services Suite for OpenVMS V4.4.3 (Build

jpd252
Occasional Advisor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

First off, I would like to thank you all for the assistance; I have learned quite a bit just by posting here. Am I reading this correctly? It appears to be calling out the Radeon 7500 AGP card. In other research I have done, I thought I saw that there have been issues with video cards in ES80s running OpenVMS. Is there any truth to this? We have had this issue about every 3 months or so, with more than one ES80. Does this mean the video card needs replacing?
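
One way to sanity-check the Radeon callout is to compare the latched PCI addresses in the SEA reports with the standard legacy VGA resources. The Python sketch below does that. The VGA ranges are the conventional PC assignments (memory window 0xA0000 to 0xBFFFF, register ports 0x3C0 to 0x3DF), not something stated in the SEA output, and the function name `classify` is hypothetical.

```python
# Legacy VGA resources (standard PC assignments, not taken from the SEA report).
VGA_MEMORY = range(0xA0000, 0xC0000)  # legacy frame-buffer window
VGA_IO = range(0x3C0, 0x3E0)          # VGA register ports; 0x3D4 is the CRTC index


def classify(command: str, address: int) -> str:
    """Label a latched PCI command/address pair from the SEA callouts."""
    if "IO" in command and address in VGA_IO:
        return "legacy VGA I/O port"
    if "Memory" in command and address in VGA_MEMORY:
        return "legacy VGA memory window"
    return "not a legacy VGA address"


# (South Port, latched PCI command, PCI address) as reported in the callouts.
CALLOUTS = [
    (1, "Memory Read", 0x010FFFFC),
    (3, "Memory Write", 0x000B8F00),
    (3, "IO Write", 0x000003D4),
]

for port, command, address in CALLOUTS:
    label = classify(command, address)
    print(f"South Port {port}: {command} at x{address:08X} -> {label}")
```

Both South Port 3 accesses fall in the legacy VGA ranges, which is consistent with SEA naming the AGP graphics module as the high-probability FRU. On its own this does not show whether the card, the slot, the IO7 or the software driving the card is at fault.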

Dennis Handly
Acclaimed Contributor

Re: Alpha ES80 OpenVMS 8.3 crash on reboot.

>in a text file it doesn't seem to be able to attach

 

You need to have a valid suffix, like .txt or .zip.