Cell halt

 
Khalil Hesam
Advisor

Dear all,
I have an rx8620 with 2 cells. When both cells are installed but the partition is configured with only one cell, the system starts without any errors. From HP-UX I then add the 2nd cell to the partition with 'parmodify -p 0 -a 1:base:y:ri:100%', which reports no error. But when I run 'shutdown -R' so that the 2nd cell can join, I get these errors during boot, before reaching the EFI boot menu:

Log Entry 9418: 10/15/2007 13:44:26
Alert level 7: Fatal
Keyword: HALT
Forward Progress is stopping. The Cell or System will not boot further.
Reporting Entity: System Firmware located in cabinet 0, slot 0, cpu 0
System State Change: Cell halt
0xf48001fd00e0009a 0x00000000000b000c
0xeb0001fd00e0009b 0x0100000047136eba

Log Entry 9417: 10/15/2007 13:44:26
Alert level 7: Fatal
Keyword: HALT
Forward Progress is stopping. The Cell or System will not boot further.
Reporting Entity: System Firmware located in cabinet 0, slot 0, cpu 0
Physical Address: 0x00000000ffe1e550
0xe18001fd00e00098 0x00000000ffe1e550
0xeb0001fd00e00099 0x0100000047136eba

Log Entry 9414: 10/15/2007 13:44:26
Alert level 3: Warning
Keyword: OS_MCA_NOT_REGISTERED
OS MCA address not registered
Reporting Entity: System Firmware located in cabinet 0, slot 0, cpu 0
Implementation Dependent: 0x0000000000000000
0x7680010700e00093 0x0000000000000000
0x6b00010700e00094 0x0100000047136eba

Log Entry 9413: 10/15/2007 13:44:26
Alert level 3: Warning
Keyword: MC_PAL_CANT_ESCALATE_TO_BINIT
MC: MCA to BINIT escalation not supported by PAL
Reporting Entity: System Firmware located in cabinet 0, slot 0, cpu 0
0x6b00029000e00092 0x0100000047136eba

Log Entry 9340: 10/15/2007 13:44:13
Alert level 7: Fatal
Keyword: MC_INITIATED
Machine Check initiated
Reporting Entity: System Firmware located in cabinet 0, slot 0, cpu 0
System State Change: HPMC processing
0xf480009800e00002 0x000000000000000b
0xeb00009800e00003 0x0100000047136ead
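
The log above is listed newest entry first, so the sequence of events is easier to follow when read in reverse: the machine check (entry 9340) comes first and the cell halt (entry 9418) is its consequence. As a rough sketch only, the entries could be re-ordered offline with a few lines of Python. The `parse_entries` helper and the abbreviated `SAMPLE` text are illustrative assumptions, not part of any HP tool, and the regular expression only covers the three header lines of each entry as they appear in this post.

```python
import re
from datetime import datetime

# Abbreviated copy of the MP log from this post (header lines only).
SAMPLE = """\
Log Entry 9418: 10/15/2007 13:44:26
Alert level 7: Fatal
Keyword: HALT
Log Entry 9414: 10/15/2007 13:44:26
Alert level 3: Warning
Keyword: OS_MCA_NOT_REGISTERED
Log Entry 9340: 10/15/2007 13:44:13
Alert level 7: Fatal
Keyword: MC_INITIATED
"""

# Matches "Log Entry N: date time", then "Alert level N: Name", then "Keyword: X".
ENTRY_RE = re.compile(
    r"Log Entry (\d+): (\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s*\n"
    r"Alert level (\d+): (\w+)\s*\n"
    r"Keyword: (\w+)"
)


def parse_entries(text):
    """Return the log entries as dicts, sorted oldest first by entry number."""
    entries = []
    for num, stamp, level, level_name, keyword in ENTRY_RE.findall(text):
        entries.append({
            "entry": int(num),
            "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
            "level": int(level),
            "level_name": level_name,
            "keyword": keyword,
        })
    return sorted(entries, key=lambda e: e["entry"])


entries = parse_entries(SAMPLE)
for e in entries:
    print(e["entry"], e["time"].time(), e["level_name"], e["keyword"])
```

Sorting by entry number rather than by timestamp keeps the order stable when several entries share the same second, as they do here.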

-----------------------
sysrev:

Cabinet firmware revision report

PROGRAMMABLE HARDWARE :

System Backplane :  GPM      FM       OSP
                    -------  -------  -------
                    1.003    1.002    1.002

PCI-X Backplane  :  LPM      HS
                    -------  -------
                    2.000    1.000

Core IO          :  Master    Slave (not installed)
                    --------  -------
                    2.011     0.000

                    LPM      PDHC
                    -------  -------
Cell 0           :  1.002    1.010
Cell 1           :  1.002    1.010
Cell 2           :  0.000    0.000 - not installed
Cell 3           :  0.000    0.000 - not installed


FIRMWARE:

Core IO
Master : A.008.005
Event Dict. : 1.020
Slave : A.000.000 - not installed
Event Dict. : 0.000

Cell 0
PDHC : A.003.031
Pri SFW : 6.044 (IA)
Sec SFW : 3.096 (IA)

Cell 1
PDHC : A.003.031
Pri SFW : 6.044 (IA)
Sec SFW : 3.096 (IA)

Cell 2 - not installed
PDHC : A.000.000
Pri SFW : 0.000
Sec SFW : 0.000

Cell 3 - not installed
PDHC : A.000.000
Pri SFW : 0.000
Sec SFW : 0.000
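
One quick check on a report like this is whether every installed cell carries the same firmware revisions. Below is a minimal sketch, assuming the FIRMWARE section has been pasted into a string in the layout shown above. The `cell_firmware` and `mismatches` helpers are hypothetical names made up for this illustration. Matching revisions, as seen here for Cell 0 and Cell 1, only rule out a firmware mismatch; they say nothing about the health of the hardware itself.

```python
import re

# Per-cell part of the FIRMWARE section from the sysrev output in this post.
SYSREV = """\
Cell 0
PDHC : A.003.031
Pri SFW : 6.044 (IA)
Sec SFW : 3.096 (IA)

Cell 1
PDHC : A.003.031
Pri SFW : 6.044 (IA)
Sec SFW : 3.096 (IA)

Cell 2 - not installed
PDHC : A.000.000
Pri SFW : 0.000
Sec SFW : 0.000
"""

# A cell header is "Cell N", optionally followed by " - not installed".
HEADER_RE = re.compile(r"Cell (\d+)( - not installed)?")


def cell_firmware(text):
    """Map each installed cell number to its {component: revision} dict."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.fullmatch(line)
        if header:
            if header.group(2):          # empty slot: ignore its lines
                current = None
            else:
                current = int(header.group(1))
                cells[current] = {}
            continue
        if current is not None and ":" in line:
            name, value = (part.strip() for part in line.split(":", 1))
            cells[current][name] = value
    return cells


def mismatches(cells):
    """Return {component: {cell: revision}} for components that differ."""
    names = set().union(*(fw.keys() for fw in cells.values()))
    diffs = {}
    for name in sorted(names):
        per_cell = {cell: fw.get(name) for cell, fw in cells.items()}
        if len(set(per_cell.values())) > 1:
            diffs[name] = per_cell
    return diffs


cells = cell_firmware(SYSREV)
print("installed cells:", sorted(cells))
print("mismatches:", mismatches(cells))
```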


console output:
Initializing SAL_ABI ...
Processing firmware memory tables ... Complete
Relocating PAL ... Complete
Processing platform I/O information ... Complete
Building ESI Table ... Complete
Loading ACPI tables ... Complete
Building EFI memory table ... Complete
Building PCI cache structure ... MCA event occurred ....

Unable to load SAL Data Area; cannot call OS_MCA.

sequencer: The MCA was NOT corrected. A cold boot is required.

Does anyone know what I should do?
4 REPLIES
tkc
Esteemed Contributor

Re: Cell halt

Try running the PS command from the MP prompt to check the status of the cell, cabinet, etc., and make sure cell 1 is OK. If everything seems fine, pull cell 1 out again and try booting with only cell 0. It looks like a cell 1 problem to me.
Phil uk
Honored Contributor

Re: Cell halt

Hi,

Can you confirm that the cells are of the same type and have the same CPUs, etc.?
Is the cell powered on?

Phil

Phil uk
Honored Contributor

Re: Cell halt

You could also try to get the machine back to a working state by removing the new cell, then obtain the MCA information from EFI and post it here.

SHELL>errdump mca

This may help to indicate what went wrong during bootup.
Khalil Hesam
Advisor

Re: Cell halt

Cell 0 was the problem, and replacing it solved everything.