RX2620 - Itanium hardware crash / Machine Check

 
Luke Walker
Occasional Contributor

Hi,

Our production RX2620 crashed last night with what looks like a hardware exception. I suspect a bad disk, but everything has been up and running again since a power off/on cycle.

Does anyone have any pointers as to whether this would be a disk failure or where we should start looking?

The iLO logs show:
Keyword: Type-02 137001 1273857
Machine Check initiated
Sensor: Critical Interrupt
Data2: OEM Code2: 0x00
0xC14BA003E5020010 003FA17000130300

# Location|Alert| Encoded Field | Data Field | Keyword / Timestamp
-------------------------------------------------------------------------------
0 SFW *7 0xC14BA003E5020010 003FA17000130300 Type-02 137001 1273857
16 Mar 2010 22:19:17
1 SFW 0 *7 0xF480009800E00020 000000000000000B MC_INITIATED
16 Mar 2010 22:19:17


I've attached the full output and the output from Shell> errdump mca

Any pointers would be most appreciated.
Thane M. Larson
Occasional Visitor

Re: RX2620 - Itanium hardware crash / Machine Check

Hi Luke,

MCAs (Machine Check Aborts) can be caused by a number of things, including bad processor transactions, I/O errors, and memory errors.

The MCA dump can only be analyzed by HP Support; there are no customer tools that decode it directly. HP Support can be reached at: +1 (800) 633-3600

Best Regards,
TML
Robert_Jewell
Honored Contributor

Re: RX2620 - Itanium hardware crash / Machine Check

A disk failure that caused an MCA would likely have logged errors before failing (unless the bad disk is the root disk and it simply hard-failed, powered off, or was removed).

Check the end of /var/adm/syslog/OLDsyslog.log (the syslog saved from before the last reboot) for any errors.
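
If it helps, here is a rough sketch of that check as a small shell function. The function name, the default of 200 lines, and the keyword list are my own assumptions rather than anything HP documents, so adjust them to whatever your system actually logs.

```shell
# Hypothetical helper (not an HP tool): show suspicious lines from the
# end of the previous boot's syslog.
# $1 = log file (assumed default: /var/adm/syslog/OLDsyslog.log)
# $2 = number of trailing lines to inspect (assumed default: 200)
scan_syslog_tail() {
    file="${1:-/var/adm/syslog/OLDsyslog.log}"
    count="${2:-200}"
    # The keyword list is a guess at what disk or I/O trouble tends to log.
    tail -n "$count" "$file" | grep -i -E 'error|fail|panic|timeout|scsi'
}
```

Run it as `scan_syslog_tail` for the defaults, or `scan_syslog_tail /path/to/copy 500` against a copy of the file. No output just means none of the keywords matched, not that the log is clean.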

The diagnostics daemon may have also captured errors. You can view these logs using STM as follows:

1) Start STM in command line mode
# cstm

2) From the CSTM prompt, run a utility ('ru') and select logtool
CSTM> ru
-- Run Utility --
Select Utility
1 MOutil
2 logtool
Enter selection : logtool

3) Format the current raw log file for viewing and create a summary.
Logtool> fr
You can accept the default file location, since this file is used only by logtool.
Let the entries finish processing. When done, logtool produces a summary of the logs.
Enter 'q' to quit and then 'sa' to save the file. Choose a name such as 'logsummary' in a location you can recall.
Enter 'done' to go back to the logtool prompt.

4) Display the formatted log file.
Logtool> fl
Now press 'q' to quit and then 'sa' to save the file. Choose a name such as 'logdetails' in a location you can recall.
Enter 'done' to get back to the logtool prompt.

5) Review the files you saved to see if there are any relevant entries.
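
For step 5, something like the sketch below can narrow down the saved files. The function name is made up, and I'm assuming the files are plain text and that a search term such as part of the crash date appears in them in some form; check the actual date format in your files first.

```shell
# Hypothetical helper: list every line in the saved logtool files that
# mentions a search term (case-insensitive), prefixed by file and line number.
# $1 = search term, remaining arguments = saved files
search_saved_logs() {
    term="$1"
    shift
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "skipping unreadable file: $f" >&2
            continue
        fi
        # Adding /dev/null makes grep print the file name even for one file.
        grep -i -n -e "$term" "$f" /dev/null
    done
}
```

For example, `search_saved_logs "Mar 16" logsummary logdetails` would list the matching lines from both files saved above.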

Regards,
Bob
S.N.S
Valued Contributor

Re: RX2620 - Itanium hardware crash / Machine Check

Hi Luke,

I would second Thane: best to log a HW case with HP. Those folks have the tools to analyse such cases (I think it's called the p4 tool, HP internal).

My server had a similar issue: it would go down every third day and boot up only after manual intervention. Finally HP HW support advised a motherboard replacement; now it works like a gem.

This link should reduce the mail exchange (and time) with HP Support:
Reporting Your Problems to HP

http://g4u0420c.houston.hp.com/en/AB587-96012/ch05s10.html

HTH
SNS


"Genius is 1% inspiration, 99% Perspiration" - Edison