nibble
Super Advisor

RP3440 (A7137A) Sudden Reboot

Hi everyone,

One of our rp3440 servers suddenly rebooted, and I could not find anything about it in syslog. It did not even register a reboot entry in /etc/shutdownlog.

I suspect that it has something to do with the CPU, though this server has long had a SCSI bus reset error issue (for about 2 years) with its disk enclosure. It has never rebooted or crashed before, however.

My plan is:
1. Rule out the SCSI bus error issue as the cause of the reboot

2. Concentrate on the CPU or the board (I've attached the infolog from cstm and the ts99 tombstone, in which I found some entries that I have never seen in previous infologs)

As for the disk enclosure, we're planning to migrate the data enclosure to an EVA4400 because we found out that the DS2300 can no longer accommodate the wio requirement of the application and database.

Any input will be highly appreciated. Thanks.



-- Information Tool Log for CPU on path 128 --

Log creation time: Tue Apr 15 05:15:27 2008

Hardware path: 128


Product ID: CPU Module Type: 0
Hardware Model: 0x896 Software Model: 0x4
Hardware Revision: 0 Software Revision: 0
Hardware ID: 0 Software ID: 1568112656
Boot ID: 0x2 Software Option: 0x91
Processor Number: 0 Path: 128
Hard Physical Address: 0xfffffffffe780000 Soft Physical Address: 0

Slot Number: 0 Software Capability: 0x100000f0
PDC Firmware Revision: 45.11 IODC Revision: 0
Instruction Cache [Kbyte]: 768 Processor Speed: 3000000
Processor State: CPU Present Configured
Monarch: Yes Active: Yes
Data Cache [Kbyte]: 768
Instruction TLB [entry]: 240 Processor Chip Revisions: 3.2
Data TLB Size [entry]: 240 2nd Level Cache Size:[KB] 65536
Serial Number: 4434872de43f0608


----------------- Processor 0 HPMC Information - PDC Version: 45.11 ------

Timestamp = Mon Apr 14 19:02:18 GMT 2008 (20:08:04:14:19:02:18)

HPMC Chassis Codes

Chassis Code Extension
------------ ---------
0xe800035c00e00000 0x0000000000288848
0x57000f7300e00000 0x8040004000000000
0xf600105e00e00000 0x000000003e900000
0x140003b200e00000 0x000000000000000b
0x5600109b00e00000 0x000000000002e024


General Registers 0 - 31
00-03 0000000000000000 0000000000c75b58 0000000000288680 0000000041ece800
04-07 000000004812b000 0000000041ef9800 0000000003f8e64c 0000000000000000
08-11 0000000000400190 0000000000288630 0000000000030000 ffffffffffffffff
12-15 ffffffffffffffff ffffffffffffffff 0000000000000000 0000000000000000
16-19 00000000025a1000 00000000025a0640 00000000025a1238 0000000000000276
20-23 0000000000000001 0000000000000000 0000000002575000 0000000000030000
24-27 0000000000000004 0000000002575000 0000000041eb5f80 0000000001046b58
28-31 ffffff00ffffffff 0000000000c6aa00 0000000000c6aa30 ffffffffa0022000



Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000cff900000000 0000000000000000 00000000000000c0 000000000000003f
12-15 0000000000000000 00008dda00000000 000000000002e000 c400000000000000
16-19 00025e3db9cc2966 0000000000000000 0000000000288848 0000000043e80028
20-23 00000000a607fe80 c000000008832014 000000ff0804fb0b 0000000800000000
24-27 0000000002575000 000000000465d1d6 0000000041eb5980 0000000000000000
28-31 0000000002575000 00025e3db9bdd519 400000000085e03f 0000000000c6aa30


Space Registers 0 - 7
00-03 000000000b0e9800 0000000000000000 0000000004f87000 0000000000058c28
04-07 0000000000000000 0000000004f87000 0000000008326000 0000000000000000



IIA Space (back entry) = 0x0000000000000000
IIA Offset (back entry) = 0x000000000028884c
Check Type = 0x20000000
Cpu State = 0x9e000000
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x00000000
Assists Check = 0x00058c28
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x0000000000000000
System Requestor Address = 0x0000000000000000



Floating Point Registers 0 - 31
00-03 0820002000000000 0000000000000000 0000000000000000 0000000000000000
04-07 3ff0000000000005 3ff0000000000000 0000006400000005 4059000000000000
08-11 00000001a052bf61 3ff0000000000005 4059000000000008 0000000000000064
12-15 05f5e10000000000 3fb99999999999a2 0000000a00000000 0000000000000000
16-19 0000000000000000 0000000000000000 0000000000000000 0000000000000000
20-23 0000000000000000 0000000000000000 00000000000003e8 4058c00000000000
24-27 0000000000000000 0000000000000014 3f90ecdbf5257500 3fce980343e2cee9
28-31 3fca7697a9b0da85 0000000000000000 3ff8b4b3ad55396d bfc1c9e9c46e595f


PIM Revision = 0x0000000000000001
CPU ID = 0x0000000000000014
CPU Revision = 0x0000000000000032
Cpu Serial Number = 0x4434872de43f0608
Check Summary = 0x8040004000000000
SAL Timestamp = 0x000000004803aa3a
System Firmware Rev. = 0x00000b4b0000119f
PDC Relocation Address = 0x000000003e900000
Available Memory = 0x00000001ffe00000
CPU Diagnose Register 2 = 0x3212026000002228
MIB_STAT = 0x0040000000200000
MIB_LOG1 = 0x000000000055e000
MIB_LOG2 = 0x0000800000000000
MIB_ECC_DATA = 0x1010a6c41010aac0
ICache Info = 0x0000000000000000
DCache Info = 0x0000000000000000

Sharedcache Info1 = 0x0000000000000000
Sharedcache Info2 = 0x0000000000000040
MIB_RSLOG1 = 0x0000000000000004
MIB_RSLOG2 = 0x0000010000000000
MIB_RQLOG = 0x06908800bffe9510
MIB_REQLOGa = 0x8000000000000200
MIB_REQLOGb = 0x01000aa400000000
Reserved = 0x0000000000000000
Cache Repair Detail = 0x0000000000000000

PIM Detail Text:



-------------- Memory Error Log Information --------------

No errors logged for this bus

------------ I/O Module Error Log Information ------------

IO Subsystem Log Entries

Found 1 IOC error
Found 1 PCI Bus error
------------------------------------------------

Detail display of IO subsystem log entries
------------------------------------------

IOC Error information

IOC Error 1
--- Section Header ---
GUID
data1 0xe429faf7
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x80
SECTION_LENGTH 0x000000b8
VALIDATION_BITS 0x000000000000004f
ERROR_STATUS 0x0000000000314000
REQUESTOR_ID 0x00000000fed24000
RESPONDER_ID 0xfffffffffed00000
TARGET_ID 0x0000000040010000
BUS_DATA 0x0000000000000000
OEM_COMPONENT_ID 0x0000000000edaf30
HP_DEV_PATH 0x0000000000000000

HP IOC Information
CEC Header:
--- Section Header ---
GUID
data1 0x13276c76
data2 0x37de
data3 0x42e9
data4 0x a5 2f 41 89 ba 10 dd ed
SECTION_LENGTH 0x00000058
CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
CEC Data:
CEC Pluto Data:
ROPE_CONFIG 0x0000000000000400
ROPE2_ERROR 0x0000000000000240
ROPE2_REQ_ERR_LOG 0x8100000040010002
ROPE2_IBF_ERR_LOG 0x001fff013999fd00
ROPE2_DID_LOG 0x0000000000100000
ROPE2_ENABLE 0x00000000000006f7
ROPE2_CONFIG 0x001200055ffff8a4
HP Mercury Rope Data
CEC Mercury Data:
ROPE_ERROR_LOG 0x0000000000000400


End of IOC Error Information for Error 1

End of IOC Error Information
PCI Bus Error information

PCI Bus Error 1

--- Section Header ---
GUID
data1 0xe429faf4
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x84
SECTION_LENGTH 0x00000108
VALIDATION_BITS 0x0000000000000604
PCI_BUS_ERROR_STATUS 0x0000000000000000
PCI_BUS_ERROR_TYPE 0x0000000000000000
PCI_BUS_ID 0x0000000000000040
PCI_BUS_ADDRESS 0x0000000000000000
PCI_BUS_DATA 0x0000000000000000
PCI_BUS_CMD 0x0000000000000000
PCI_BUS_REQUESTOR_ID 0x0000000000000000
PCI_BUS_COMPLETER_ID 0x0000000000000000
PCI_BUS_TARGET_ID 0x0000000000000000
PCI_BUS_OEM_ID 0x0000000000edc158
Bus OEM Data
CEC Header:
--- OEM Data Header ---

GUID
data1 0x9fe64482
data2 0xa02d
data3 0x4ef7
data4 0xad e6 c6 63 59 62 53 99

--- OEM Data Body ---

CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
--- Mercury Info ---
ERROR_STATUS 0x0000000000000000
ERROR_MASTER_ID_LOG 0x0000000000000000
INBOUND_ERR_ADDRESS 0x0000000000000000
INBOUND_ERR_ATTRIBUTE 0x0000000000000000
COMPLETION_MESSAGE_LOG 0x0000000000000000
OUTBOUND_ERR_ADDRESS 0x0000000000000000
ERROR_CONFIG 0x0000000000000030
STATUS_INFO_CONTROL 0x0000000100000000
FUNC_ID 0x02b00146122e103c
CAPABILITIES_LIST 0x0f00023700200002
AGP_COMMAND 0x0000000000000000
PCIX_CAPABILITIES 0x0013ff0000010007
OLR_CONTROL 0x00023f1d00032403
CLOCK_CONTROL 0x0000000000000038
BUS_MODE 0xa599656d2f2504e0

End of PCI Bus Error Information for Error 1

End of PCI Bus Error Information

FRU INFORMATION

Module Revision
------ --------
PA 8900 CPU Module 3.2
PA 8900 CPU Module 3.2

Board Info!
Format Version : 0x1 Language Code : 0x0
Mfg Date : Mfg Name : JABIL
Product Name : rp3440 SYSTEM BOARD
Serial Number : 52JAPE4547001132
Part Number : A7136-60001
Fru File Tp/Len : 0x1 Fru File : ^P
Revision : A1 Eng Date Code : 4531
Artwork Rev : D Fru Info :



=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=-+-=

-- Information Tool Log for CPU on path 129 --

Log creation time: Tue Apr 15 05:15:02 2008

Hardware path: 129


Product ID: CPU Module Type: 0
Hardware Model: 0x896 Software Model: 0x4
Hardware Revision: 0 Software Revision: 0
Hardware ID: 0 Software ID: 1568112656
Boot ID: 0x2 Software Option: 0x91
Processor Number: 1 Path: 129
Hard Physical Address: 0xfffffffffe781000 Soft Physical Address: 0

Slot Number: 0 Software Capability: 0x100000f0
PDC Firmware Revision: 45.11 IODC Revision: 0
Instruction Cache [Kbyte]: 768 Processor Speed: N/A
Processor State: CPU Present Configured
Monarch: No Active: Yes
Data Cache [Kbyte]: 768
Instruction TLB [entry]: 240 Processor Chip Revisions: 3.2
Data TLB Size [entry]: 240 2nd Level Cache Size:[KB] 65536
Serial Number: 4434872de43f0608


----------------- Processor 1 HPMC Information - PDC Version: 45.11 ------

Timestamp = Mon Apr 14 19:02:19 GMT 2008 (20:08:04:14:19:02:19)

HPMC Chassis Codes

Chassis Code Extension
------------ ---------
0xe800035c00e00000 0x000000000012e388
0x57000f7300e00000 0x8040004000000000
0x5600109b00e00000 0x000000000002e024


General Registers 0 - 31
00-03 0000000000000000 000000000104f358 00000000025a1000 0000000002576970
04-07 0000000000000020 0000000000000001 000000000101e358 000000000101cb58
08-11 0000000000000000 0000000003f8e64c 0000000000000001 0000000000000003
12-15 ffffffffffffffff 0000000003f8e62c 0000000000000000 0000000000000000
16-19 00000000025a1000 00000000025a0640 0000000002575000 0000000000000000
20-23 0000000000001970 00000000025a0040 0000000002576970 0000000000000000
24-27 0000000000000000 0000000000000000 0000000000000000 0000000001046b58
28-31 0000000000000001 00000000025a0648 000000000d0aa3a0 0000000000000000



Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000000000000000 0000000000000000 00000000000000c0 000000000000001a
12-15 0000000000000000 0000000000000000 000000000002e000 fffffff0ffffffff
16-19 00025e3ddf6262ee 0000000000000000 000000000012e388 000000004ab60008
20-23 0000000000000000 0000000000000000 000000ff0806f90f 0000000100000000
24-27 0000000002576970 00000000de640320 0000000000000000 0000000000000000
28-31 00000000025a1000 00025e3ddee901ae 0000000001005b58 000000000d0aa3a0


Space Registers 0 - 7
00-03 000000000b0e9800 0000000004f87000 000000000acd0800 0000000000000000
04-07 0000000000000000 00000000ffffffff 0000000008326000 0000000000000000


IIA Space (back entry) = 0x0000000000000000
IIA Offset (back entry) = 0x000000000012e38c

Check Type = 0x00000000
Cpu State = 0x9e000000
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x00000000
Assists Check = 0x00000000
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x0000000000000000
System Requestor Address = 0x0000000000000000



Floating Point Registers 0 - 31
00-03 1814080000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000500000000 3fc99999999999a2 000a290600020834 4124520c00000000
08-11 410041a333333339 00000000000a2904 0000000000000002 3feda7005384d787
12-15 40c0000000000000 4100000000000000 4100000000000000 4080000000000000
16-19 4080000000000000 4080000000000000 4100000000000000 4100000000000000
20-23 5555555555555555 5555555555555555 0000000064c70000 0000000000198000
24-27 0000000000000000 0000000000000014 3f90ecdbf5257500 3fce980343e2cee9
28-31 3fca7697a9b0da85 0000000000000000 3ff8b4b3ad55396d bfc1c9e9c46e595f


PIM Revision = 0x0000000000000001
CPU ID = 0x0000000000000014
CPU Revision = 0x0000000000000032
Cpu Serial Number = 0x4434872de43f0608
Check Summary = 0x8040004000000000
SAL Timestamp = 0x000000004803aa3b
System Firmware Rev. = 0x00000b4b0000119f
PDC Relocation Address = 0x000000003e900000
Available Memory = 0x00000001ffe00000
CPU Diagnose Register 2 = 0x3252020008082228
MIB_STAT = 0x0000000000000000
MIB_LOG1 = 0x0000000000000000
MIB_LOG2 = 0x0000000000000000
MIB_ECC_DATA = 0x0000000000000000
ICache Info = 0x0000000000000000
DCache Info = 0x0000000000000000
Sharedcache Info1 = 0x0000000000000000
Sharedcache Info2 = 0x0000000000000000

MIB_RSLOG1 = 0x0000000000000000
MIB_RSLOG2 = 0x0000000000000000
MIB_RQLOG = 0x0000000000000000
MIB_REQLOGa = 0x0000000000000000
MIB_REQLOGb = 0x0000000000000000
Reserved = 0x0000000000000000
Cache Repair Detail = 0x0000000000000000

PIM Detail Text:



-------------- Memory Error Log Information --------------

No errors logged for this bus

------------ I/O Module Error Log Information ------------

IO Subsystem Log Entries

Found 1 IOC error
Found 1 PCI Bus error
------------------------------------------------

Detail display of IO subsystem log entries
------------------------------------------

IOC Error information

IOC Error 1
--- Section Header ---
GUID
data1 0xe429faf7
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x80
SECTION_LENGTH 0x000000b8
VALIDATION_BITS 0x000000000000004f
ERROR_STATUS 0x0000000000314000
REQUESTOR_ID 0x00000000fed24000
RESPONDER_ID 0xfffffffffed00000
TARGET_ID 0x0000000040010000
BUS_DATA 0x0000000000000000
OEM_COMPONENT_ID 0x0000000000edaf30
HP_DEV_PATH 0x0000000000000000

HP IOC Information
CEC Header:
--- Section Header ---
GUID
data1 0x13276c76
data2 0x37de
data3 0x42e9
data4 0x a5 2f 41 89 ba 10 dd ed
SECTION_LENGTH 0x00000058
CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
CEC Data:
CEC Pluto Data:
ROPE_CONFIG 0x0000000000000400
ROPE2_ERROR 0x0000000000000240
ROPE2_REQ_ERR_LOG 0x8100000040010002
ROPE2_IBF_ERR_LOG 0x001fff013999fd00
ROPE2_DID_LOG 0x0000000000100000
ROPE2_ENABLE 0x00000000000006f7
ROPE2_CONFIG 0x001200055ffff8a4
HP Mercury Rope Data
CEC Mercury Data:
ROPE_ERROR_LOG 0x0000000000000400


End of IOC Error Information for Error 1

End of IOC Error Information
PCI Bus Error information

PCI Bus Error 1
--- Section Header ---
GUID
data1 0xe429faf4
data2 0x3cb7
data3 0x11d4

data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x84
SECTION_LENGTH 0x00000108
VALIDATION_BITS 0x0000000000000604
PCI_BUS_ERROR_STATUS 0x0000000000000000
PCI_BUS_ERROR_TYPE 0x0000000000000000
PCI_BUS_ID 0x0000000000000040
PCI_BUS_ADDRESS 0x0000000000000000
PCI_BUS_DATA 0x0000000000000000
PCI_BUS_CMD 0x0000000000000000
PCI_BUS_REQUESTOR_ID 0x0000000000000000
PCI_BUS_COMPLETER_ID 0x0000000000000000
PCI_BUS_TARGET_ID 0x0000000000000000
PCI_BUS_OEM_ID 0x0000000000edc158
Bus OEM Data
CEC Header:
--- OEM Data Header ---

GUID
data1 0x9fe64482
data2 0xa02d
data3 0x4ef7
data4 0xad e6 c6 63 59 62 53 99

--- OEM Data Body ---

CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
--- Mercury Info ---
ERROR_STATUS 0x0000000000000000
ERROR_MASTER_ID_LOG 0x0000000000000000
INBOUND_ERR_ADDRESS 0x0000000000000000
INBOUND_ERR_ATTRIBUTE 0x0000000000000000
COMPLETION_MESSAGE_LOG 0x0000000000000000
OUTBOUND_ERR_ADDRESS 0x0000000000000000
ERROR_CONFIG 0x0000000000000030
STATUS_INFO_CONTROL 0x0000000100000000
FUNC_ID 0x02b00146122e103c
CAPABILITIES_LIST 0x0f00023700200002
AGP_COMMAND 0x0000000000000000
PCIX_CAPABILITIES 0x0013ff0000010007
OLR_CONTROL 0x00023f1d00032403
CLOCK_CONTROL 0x0000000000000038
BUS_MODE 0xa599656d2f2504e0

End of PCI Bus Error Information for Error 1

End of PCI Bus Error Information

FRU INFORMATION

Module Revision
------ --------
PA 8900 CPU Module 3.2
PA 8900 CPU Module 3.2

Board Info!
Format Version : 0x1 Language Code : 0x0
Mfg Date : Mfg Name : JABIL
Product Name : rp3440 SYSTEM BOARD
Serial Number : 52JAPE4547001132
Part Number : A7136-60001
Fru File Tp/Len : 0x1 Fru File : ^P
Revision : A1 Eng Date Code : 4531
Artwork Rev : D Fru Info :



==============================================================

tombstone ts99

clyde:/var/tombstones:# cat ts99
HP-UX clyde B.11.23 U 9000/800 1568112656

Return Errors:
Non Cellullar System
----------------- Processor 0 HPMC Information - PDC Version: 45.11 ------

Timestamp = Mon Apr 14 19:02:18 GMT 2008 (20:08:04:14:19:02:18)

HPMC Chassis Codes

Chassis Code Extension
------------ ---------
0xe800035c00e00000 0x0000000000288848
0x57000f7300e00000 0x8040004000000000
0xf600105e00e00000 0x000000003e900000
0x140003b200e00000 0x000000000000000b
0x5600109b00e00000 0x000000000002e024


General Registers 0 - 31
00-03 0000000000000000 0000000000c75b58 0000000000288680 0000000041ece800
04-07 000000004812b000 0000000041ef9800 0000000003f8e64c 0000000000000000
08-11 0000000000400190 0000000000288630 0000000000030000 ffffffffffffffff
12-15 ffffffffffffffff ffffffffffffffff 0000000000000000 0000000000000000
16-19 00000000025a1000 00000000025a0640 00000000025a1238 0000000000000276
20-23 0000000000000001 0000000000000000 0000000002575000 0000000000030000
24-27 0000000000000004 0000000002575000 0000000041eb5f80 0000000001046b58
28-31 ffffff00ffffffff 0000000000c6aa00 0000000000c6aa30 ffffffffa0022000



Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000cff900000000 0000000000000000 00000000000000c0 000000000000003f
12-15 0000000000000000 00008dda00000000 000000000002e000 c400000000000000
16-19 00025e3db9cc2966 0000000000000000 0000000000288848 0000000043e80028
20-23 00000000a607fe80 c000000008832014 000000ff0804fb0b 0000000800000000
24-27 0000000002575000 000000000465d1d6 0000000041eb5980 0000000000000000
28-31 0000000002575000 00025e3db9bdd519 400000000085e03f 0000000000c6aa30


Space Registers 0 - 7
00-03 000000000b0e9800 0000000000000000 0000000004f87000 0000000000058c28
04-07 0000000000000000 0000000004f87000 0000000008326000 0000000000000000


IIA Space (back entry) = 0x0000000000000000
IIA Offset (back entry) = 0x000000000028884c
Check Type = 0x20000000
Cpu State = 0x9e000000
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x00000000
Assists Check = 0x00058c28
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x0000000000000000
System Requestor Address = 0x0000000000000000



Floating Point Registers 0 - 31
00-03 0820002000000000 0000000000000000 0000000000000000 0000000000000000
04-07 3ff0000000000005 3ff0000000000000 0000006400000005 4059000000000000
08-11 00000001a052bf61 3ff0000000000005 4059000000000008 0000000000000064
12-15 05f5e10000000000 3fb99999999999a2 0000000a00000000 0000000000000000
16-19 0000000000000000 0000000000000000 0000000000000000 0000000000000000
20-23 0000000000000000 0000000000000000 00000000000003e8 4058c00000000000
24-27 0000000000000000 0000000000000014 3f90ecdbf5257500 3fce980343e2cee9
28-31 3fca7697a9b0da85 0000000000000000 3ff8b4b3ad55396d bfc1c9e9c46e595f


PIM Revision = 0x0000000000000001
CPU ID = 0x0000000000000014
CPU Revision = 0x0000000000000032
Cpu Serial Number = 0x4434872de43f0608
Check Summary = 0x8040004000000000
SAL Timestamp = 0x000000004803aa3a
System Firmware Rev. = 0x00000b4b0000119f
PDC Relocation Address = 0x000000003e900000
Available Memory = 0x00000001ffe00000
CPU Diagnose Register 2 = 0x3212026000002228
MIB_STAT = 0x0040000000200000
MIB_LOG1 = 0x000000000055e000
MIB_LOG2 = 0x0000800000000000
MIB_ECC_DATA = 0x1010a6c41010aac0
ICache Info = 0x0000000000000000
DCache Info = 0x0000000000000000
Sharedcache Info1 = 0x0000000000000000
Sharedcache Info2 = 0x0000000000000040
MIB_RSLOG1 = 0x0000000000000004
MIB_RSLOG2 = 0x0000010000000000
MIB_RQLOG = 0x06908800bffe9510
MIB_REQLOGa = 0x8000000000000200
MIB_REQLOGb = 0x01000aa400000000
Reserved = 0x0000000000000000
Cache Repair Detail = 0x0000000000000000

PIM Detail Text:



-------------- Memory Error Log Information --------------

No errors logged for this bus

------------ I/O Module Error Log Information ------------

IO Subsystem Log Entries

Found 1 IOC error
Found 1 PCI Bus error
------------------------------------------------

Detail display of IO subsystem log entries
------------------------------------------

IOC Error information

IOC Error 1
--- Section Header ---
GUID
data1 0xe429faf7
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x80
SECTION_LENGTH 0x000000b8
VALIDATION_BITS 0x000000000000004f
ERROR_STATUS 0x0000000000314000
REQUESTOR_ID 0x00000000fed24000
RESPONDER_ID 0xfffffffffed00000
TARGET_ID 0x0000000040010000
BUS_DATA 0x0000000000000000
OEM_COMPONENT_ID 0x0000000000edaf30
HP_DEV_PATH 0x0000000000000000

HP IOC Information
CEC Header:
--- Section Header ---
GUID
data1 0x13276c76
data2 0x37de
data3 0x42e9
data4 0x a5 2f 41 89 ba 10 dd ed
SECTION_LENGTH 0x00000058
CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
CEC Data:
CEC Pluto Data:
ROPE_CONFIG 0x0000000000000400
ROPE2_ERROR 0x0000000000000240
ROPE2_REQ_ERR_LOG 0x8100000040010002
ROPE2_IBF_ERR_LOG 0x001fff013999fd00
ROPE2_DID_LOG 0x0000000000100000
ROPE2_ENABLE 0x00000000000006f7
ROPE2_CONFIG 0x001200055ffff8a4
HP Mercury Rope Data
CEC Mercury Data:
ROPE_ERROR_LOG 0x0000000000000400


End of IOC Error Information for Error 1

End of IOC Error Information
PCI Bus Error information

PCI Bus Error 1
--- Section Header ---
GUID
data1 0xe429faf4
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x84
SECTION_LENGTH 0x00000108
VALIDATION_BITS 0x0000000000000604
PCI_BUS_ERROR_STATUS 0x0000000000000000
PCI_BUS_ERROR_TYPE 0x0000000000000000
PCI_BUS_ID 0x0000000000000040
PCI_BUS_ADDRESS 0x0000000000000000
PCI_BUS_DATA 0x0000000000000000
PCI_BUS_CMD 0x0000000000000000
PCI_BUS_REQUESTOR_ID 0x0000000000000000
PCI_BUS_COMPLETER_ID 0x0000000000000000
PCI_BUS_TARGET_ID 0x0000000000000000
PCI_BUS_OEM_ID 0x0000000000edc158
Bus OEM Data
CEC Header:
--- OEM Data Header ---

GUID
data1 0x9fe64482
data2 0xa02d
data3 0x4ef7
data4 0xad e6 c6 63 59 62 53 99

--- OEM Data Body ---

CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
--- Mercury Info ---
ERROR_STATUS 0x0000000000000000
ERROR_MASTER_ID_LOG 0x0000000000000000
INBOUND_ERR_ADDRESS 0x0000000000000000
INBOUND_ERR_ATTRIBUTE 0x0000000000000000
COMPLETION_MESSAGE_LOG 0x0000000000000000
OUTBOUND_ERR_ADDRESS 0x0000000000000000
ERROR_CONFIG 0x0000000000000030
STATUS_INFO_CONTROL 0x0000000100000000
FUNC_ID 0x02b00146122e103c
CAPABILITIES_LIST 0x0f00023700200002
AGP_COMMAND 0x0000000000000000
PCIX_CAPABILITIES 0x0013ff0000010007
OLR_CONTROL 0x00023f1d00032403
CLOCK_CONTROL 0x0000000000000038
BUS_MODE 0xa599656d2f2504e0

End of PCI Bus Error Information for Error 1

End of PCI Bus Error Information

----------------- Processor 1 HPMC Information - PDC Version: 45.11 ------

Timestamp = Mon Apr 14 19:02:19 GMT 2008 (20:08:04:14:19:02:19)

HPMC Chassis Codes

Chassis Code Extension
------------ ---------
0xe800035c00e00000 0x000000000012e388
0x57000f7300e00000 0x8040004000000000
0x5600109b00e00000 0x000000000002e024


General Registers 0 - 31
00-03 0000000000000000 000000000104f358 00000000025a1000 0000000002576970
04-07 0000000000000020 0000000000000001 000000000101e358 000000000101cb58
08-11 0000000000000000 0000000003f8e64c 0000000000000001 0000000000000003
12-15 ffffffffffffffff 0000000003f8e62c 0000000000000000 0000000000000000
16-19 00000000025a1000 00000000025a0640 0000000002575000 0000000000000000
20-23 0000000000001970 00000000025a0040 0000000002576970 0000000000000000
24-27 0000000000000000 0000000000000000 0000000000000000 0000000001046b58
28-31 0000000000000001 00000000025a0648 000000000d0aa3a0 0000000000000000



Control Registers 0 - 31
00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
08-11 0000000000000000 0000000000000000 00000000000000c0 000000000000001a
12-15 0000000000000000 0000000000000000 000000000002e000 fffffff0ffffffff
16-19 00025e3ddf6262ee 0000000000000000 000000000012e388 000000004ab60008
20-23 0000000000000000 0000000000000000 000000ff0806f90f 0000000100000000
24-27 0000000002576970 00000000de640320 0000000000000000 0000000000000000
28-31 00000000025a1000 00025e3ddee901ae 0000000001005b58 000000000d0aa3a0


Space Registers 0 - 7
00-03 000000000b0e9800 0000000004f87000 000000000acd0800 0000000000000000
04-07 0000000000000000 00000000ffffffff 0000000008326000 0000000000000000


IIA Space (back entry) = 0x0000000000000000
IIA Offset (back entry) = 0x000000000012e38c
Check Type = 0x00000000
Cpu State = 0x9e000000
Cache Check = 0x00000000
TLB Check = 0x00000000
Bus Check = 0x00000000
Assists Check = 0x00000000
Assist State = 0x00000000
Path Info = 0x00000000
System Responder Address = 0x0000000000000000
System Requestor Address = 0x0000000000000000



Floating Point Registers 0 - 31
00-03 1814080000000000 0000000000000000 0000000000000000 0000000000000000
04-07 0000000500000000 3fc99999999999a2 000a290600020834 4124520c00000000
08-11 410041a333333339 00000000000a2904 0000000000000002 3feda7005384d787
12-15 40c0000000000000 4100000000000000 4100000000000000 4080000000000000
16-19 4080000000000000 4080000000000000 4100000000000000 4100000000000000
20-23 5555555555555555 5555555555555555 0000000064c70000 0000000000198000
24-27 0000000000000000 0000000000000014 3f90ecdbf5257500 3fce980343e2cee9
28-31 3fca7697a9b0da85 0000000000000000 3ff8b4b3ad55396d bfc1c9e9c46e595f


PIM Revision = 0x0000000000000001
CPU ID = 0x0000000000000014
CPU Revision = 0x0000000000000032
Cpu Serial Number = 0x4434872de43f0608
Check Summary = 0x8040004000000000
SAL Timestamp = 0x000000004803aa3b
System Firmware Rev. = 0x00000b4b0000119f
PDC Relocation Address = 0x000000003e900000
Available Memory = 0x00000001ffe00000
CPU Diagnose Register 2 = 0x3252020008082228
MIB_STAT = 0x0000000000000000
MIB_LOG1 = 0x0000000000000000
MIB_LOG2 = 0x0000000000000000
MIB_ECC_DATA = 0x0000000000000000
ICache Info = 0x0000000000000000
DCache Info = 0x0000000000000000
Sharedcache Info1 = 0x0000000000000000
Sharedcache Info2 = 0x0000000000000000
MIB_RSLOG1 = 0x0000000000000000
MIB_RSLOG2 = 0x0000000000000000
MIB_RQLOG = 0x0000000000000000
MIB_REQLOGa = 0x0000000000000000
MIB_REQLOGb = 0x0000000000000000
Reserved = 0x0000000000000000
Cache Repair Detail = 0x0000000000000000

PIM Detail Text:



-------------- Memory Error Log Information --------------

No errors logged for this bus

------------ I/O Module Error Log Information ------------

IO Subsystem Log Entries

Found 1 IOC error
Found 1 PCI Bus error
------------------------------------------------

Detail display of IO subsystem log entries
------------------------------------------

IOC Error information

IOC Error 1
--- Section Header ---
GUID
data1 0xe429faf7
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x80
SECTION_LENGTH 0x000000b8
VALIDATION_BITS 0x000000000000004f
ERROR_STATUS 0x0000000000314000
REQUESTOR_ID 0x00000000fed24000
RESPONDER_ID 0xfffffffffed00000
TARGET_ID 0x0000000040010000
BUS_DATA 0x0000000000000000
OEM_COMPONENT_ID 0x0000000000edaf30
HP_DEV_PATH 0x0000000000000000

HP IOC Information
CEC Header:
--- Section Header ---
GUID
data1 0x13276c76
data2 0x37de
data3 0x42e9
data4 0x a5 2f 41 89 ba 10 dd ed
SECTION_LENGTH 0x00000058
CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
CEC Data:
CEC Pluto Data:
ROPE_CONFIG 0x0000000000000400
ROPE2_ERROR 0x0000000000000240
ROPE2_REQ_ERR_LOG 0x8100000040010002
ROPE2_IBF_ERR_LOG 0x001fff013999fd00
ROPE2_DID_LOG 0x0000000000100000
ROPE2_ENABLE 0x00000000000006f7
ROPE2_CONFIG 0x001200055ffff8a4
HP Mercury Rope Data
CEC Mercury Data:
ROPE_ERROR_LOG 0x0000000000000400


End of IOC Error Information for Error 1

End of IOC Error Information
PCI Bus Error information

PCI Bus Error 1
--- Section Header ---
GUID
data1 0xe429faf4
data2 0x3cb7
data3 0x11d4
data4 0xbc a7 0 80 c7 3c 88 81
REVISION 0x0200
ERROR_RECOVERY_INFO 0x84
SECTION_LENGTH 0x00000108
VALIDATION_BITS 0x0000000000000604
PCI_BUS_ERROR_STATUS 0x0000000000000000
PCI_BUS_ERROR_TYPE 0x0000000000000000
PCI_BUS_ID 0x0000000000000040
PCI_BUS_ADDRESS 0x0000000000000000
PCI_BUS_DATA 0x0000000000000000
PCI_BUS_CMD 0x0000000000000000
PCI_BUS_REQUESTOR_ID 0x0000000000000000
PCI_BUS_COMPLETER_ID 0x0000000000000000
PCI_BUS_TARGET_ID 0x0000000000000000
PCI_BUS_OEM_ID 0x0000000000edc158
Bus OEM Data
CEC Header:
--- OEM Data Header ---

GUID
data1 0x9fe64482
data2 0xa02d
data3 0x4ef7
data4 0xad e6 c6 63 59 62 53 99

--- OEM Data Body ---

CELL_NUMBER 0
SBA_NUMBER 0
ROPE_NUMBER 2
--- Mercury Info ---
ERROR_STATUS 0x0000000000000000
ERROR_MASTER_ID_LOG 0x0000000000000000
INBOUND_ERR_ADDRESS 0x0000000000000000
INBOUND_ERR_ATTRIBUTE 0x0000000000000000
COMPLETION_MESSAGE_LOG 0x0000000000000000
OUTBOUND_ERR_ADDRESS 0x0000000000000000
ERROR_CONFIG 0x0000000000000030
STATUS_INFO_CONTROL 0x0000000100000000
FUNC_ID 0x02b00146122e103c
CAPABILITIES_LIST 0x0f00023700200002
AGP_COMMAND 0x0000000000000000
PCIX_CAPABILITIES 0x0013ff0000010007
OLR_CONTROL 0x00023f1d00032403
CLOCK_CONTROL 0x0000000000000038
BUS_MODE 0xa599656d2f2504e0

End of PCI Bus Error Information for Error 1

End of PCI Bus Error Information
Processor [2]: Not Present.
Processor [3]: Not Present.
Processor [4]: Not Present.
Processor [5]: Not Present.
Processor [6]: Not Present.
Processor [7]: Not Present.

Return Warnings:



Return Revisions:

FRU INFORMATION

Module Revision
------ --------
PA 8900 CPU Module 3.2
PA 8900 CPU Module 3.2

Board Info!
Format Version : 0x1 Language Code : 0x0
Mfg Date : Mfg Name : JABIL
Product Name : rp3440 SYSTEM BOARD
Serial Number : 52JAPE4547001132
Part Number : A7136-60001
Fru File Tp/Len : 0x1 Fru File :
Revision : A1 Eng Date Code : 4531
Artwork Rev : D Fru Info :

4 REPLIES
Jeeshan
Honored Contributor

Re: RP3440 (A7137A) Sudden Reboot

Hi

Was a crash dump generated?

Check the /var/adm/crash directory.

a warrior never quits
YAQUB_1
Respected Contributor

Re: RP3440 (A7137A) Sudden Reboot

Can you send us the system MP event log? It will help to analyze this issue better.

As per your log, I have found one PCI error:
"PCI Bus Error Information for Error 1"

nibble
Super Advisor

Re: RP3440 (A7137A) Sudden Reboot

No crash dump was generated. At the moment, I'm organizing a LAN console port connection to the server to get access to the MP.
Stefan Stechemesser
Honored Contributor
Solution

Re: RP3440 (A7137A) Sudden Reboot

Hi,

The HPMC (High Priority Machine Check) log is from April 14th and is a tricky one.
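
As a side note, the date can be cross-checked from the log itself. The sketch below assumes that the "SAL Timestamp" field in the PIM dump is a plain count of seconds since the Unix epoch; that interpretation is an assumption, not something the log states. The function name decode_sal_timestamp is made up for this example.

```python
from datetime import datetime, timezone


def decode_sal_timestamp(raw: int) -> datetime:
    """Interpret a SAL Timestamp value as Unix epoch seconds (assumption)."""
    return datetime.fromtimestamp(raw, tz=timezone.utc)


# Values copied from the "SAL Timestamp" lines of processors 0 and 1.
for cpu, raw in ((0, 0x4803AA3A), (1, 0x4803AA3B)):
    stamp = decode_sal_timestamp(raw)
    print(f"Processor {cpu}: {stamp:%Y-%m-%d %H:%M:%S} GMT")
```

Under that assumption the two values decode to 19:02:18 and 19:02:19 GMT on April 14, 2008, which agrees with the "Timestamp" lines printed at the top of each HPMC section.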

The PCI bus error was:
During a DMA transaction performed by the PCI card in slot 3 (HW path 0/2/1), an I/O virtual-to-physical address translation (which is done by the ZX1 chip) failed: there was no entry in the I/O page directory for the requested I/O address.
This PCI bus error is not recoverable, and to avoid silent data corruption the system chooses to crash by sending a "hard fail response" to the CPU that performs the next I/O access after the DMA error.
Unfortunately, many possible causes can be responsible for something like this, and the HPMC does not tell which one it was:
-software (bad OS driver, missing patches)
-a miscalculating CPU producing bad addresses in the translation table
-bad ZX1 chip (system board)
-bad PCI bus controller (PCI backplane)
-bad PCI card
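
To make the failure mode a bit more concrete, here is a toy model of an I/O page directory lookup. It is only an illustration of the idea described above, not how the ZX1 chip is actually implemented: the class names, the 4 KB page size, and all addresses are made up, and the real hardware reports the failure to a CPU on its next I/O access rather than raising an error at translation time.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class HardFail(Exception):
    """Raised when a DMA address has no translation entry (toy stand-in)."""


class IoPageDirectory:
    """Hypothetical I/O page directory: I/O virtual page -> physical page."""

    def __init__(self):
        self._entries = {}

    def map(self, io_vaddr, phys_addr):
        # The OS driver sets up a mapping before starting the DMA.
        self._entries[io_vaddr // PAGE_SIZE] = phys_addr // PAGE_SIZE

    def translate(self, io_vaddr):
        # The chipset looks up the page for every DMA address from the card.
        page, offset = divmod(io_vaddr, PAGE_SIZE)
        if page not in self._entries:
            raise HardFail(f"no I/O pdir entry for 0x{io_vaddr:x}")
        return self._entries[page] * PAGE_SIZE + offset


pdir = IoPageDirectory()
pdir.map(0x10000, 0x200000)          # made-up addresses
print(hex(pdir.translate(0x10010)))  # mapped page: translation succeeds

try:
    pdir.translate(0x30000)          # never mapped: translation fails
except HardFail as err:
    print("DMA aborted:", err)
```

Each cause in the list above would end in the same missing entry: a driver that never created or prematurely removed the mapping, a CPU that wrote a wrong entry, or hardware that presented a corrupted address for lookup.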

First, I would recommend updating all hardware-relevant patches and firmware. If the error still happens after that, I would strongly recommend getting in contact with HP support. Sometimes the offline diagnostics (makodiag, plutodiag) are able to find the cause, but sometimes not; then trial and error is the only troubleshooting strategy.

best regards

Stefan