HPE 9000 and HPE e3000 Servers

rp 3440 flashing red light and can't boot to Unix

 
Sakui
Frequent Advisor


Hi guys,

I have an rp 3440 cluster. This node restarted twice and came back up normally each time. It restarted again last night and is now blinking a red light.

Below are the logs from the console, after running the CL command.

firmware Version 45.11

Duplex Console IO Dependent Code (IODC)

revision 1


******* Unexpected HPMC. Processor HPA FFFFFFFF'FE780000 *******
GENERAL REGISTERS:
r00/03 00000000'00000000 00000000'01600000 00000000'0069E548 00000000'00000000
r04/07 00000000'40D2E6D0 00000000'434C38C0 00000000'00000000 00000000'00000000
r08/11 00000000'00000000 00000000'00000000 00000000'00000000 00000000'00000000
r12/15 00000000'00000000 00000000'00000000 00000000'00000000 00000000'00000000
r16/19 00000000'00000000 00000000'00000000 00000000'00000000 00000000'00000000
r20/23 00000000'00000001 00000000'434C3880 00000000'01600000 00000000'000008C0
r24/27 C4000000'00000000 FFFFFFFF'FFFFFFFF 00000000'40D6A580 00000000'00E4AE50
r28/31 FFFFFFFF'FFFFFFFF 400003FF'FFFF0F48 400003FF'FFFF0F78 00000000'40D2E6D0
CONTROL REGISTERS:
sr0/3 00000000'01451800 00000000'00000000 00000000'0BB4E400 00000000'00000000
sr4/7 00000000'00000000 00000000'03BC2400 00000000'03C8A800 00000000'00000000
pcq = 00000000'00000000.00000000'00034BE4 00000000'00000000.00000000'00034BE8
isr = 00000000'10240005 ior = 00000000'802010BC iir = 6AD40178 rctr = 7C139485




id reg cr8/cr9 00000000'00000000 00000000'00000000
pid reg cr12/cr13 00000000'00000000 00000000'00000000
ipsw = 00000000'0804001F iva = 00000000'0002A000 sar = 35 ccr = C0
tr0/3 00000000'01600000 00000000'00000000 00000000'00000004 00000000'7B01D028
tr4/7 00000000'00000003 0000007D'095D0780 00000000'001728C8 400003FF'FFFF0C78
eiem = C4000000'00000000 eirr = 00000000'00000000 itmr = 0000007D'09736006
cr1/4 00000000'00000000 00000000'00000000 00000000'00000000 00000000'00000000
cr5/7 00000000'00000000 00000000'00000000 00000000'00000000
MACHINE CHECK PARAMETERS:
Check Type = 80000000 CPU STATE = 9E000000 Cache Check = 54000000
TLB Check = 00000000 Bus Check = 00000000 PIM State = ? SIU Status = ????????
Assists = 00000003 Processor = 00000000
Slave Addr = 00000000'00000000 Master Addr = 00000000'00000000
HPMC, pcsq.pcoq = 0'0.0'34be4 , isr.ior = 0'10240005.0'802010bc
@(#) $Revision: vmunix: vw: -proj selectors: CUPI80_BL2000_1108 -c 'Vw for CUPI80_BL2000_1108 build' -- cupi80_bl2000_1108 'CUPI80_BL2000_1108' Wed Nov 8 19:24:56 PST 2000 $
High-priority machine check: (display==0xd910, flags==0x0)
* CPU State 0x9e000000
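For readers skimming the dump, here is a minimal Python sketch that pulls the MACHINE CHECK PARAMETERS fields into a dictionary and lists which of the "... Check" words are non-zero. The function names are made up for illustration, and the sketch does not attempt to decode what the individual bits mean; it only shows that, in this dump, Cache Check is the one non-zero check word while TLB Check and Bus Check are zero.

```python
import re

# Machine check lines copied verbatim from the HPMC dump in this post.
MACHINE_CHECK_TEXT = """\
Check Type = 80000000 CPU STATE = 9E000000 Cache Check = 54000000
TLB Check = 00000000 Bus Check = 00000000 PIM State = ? SIU Status = ????????
Assists = 00000003 Processor = 00000000
"""

# A field is a label made of letters and spaces, then '=', then one value token.
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*(\S+)")


def parse_fields(text):
    """Return a dict mapping each label to its raw value string."""
    return {name.strip(): value for name, value in FIELD_RE.findall(text)}


def nonzero_checks(fields):
    """Return the '... Check' fields whose hex value is not zero.

    Values that are not valid hex (for example '?') are skipped.
    No attempt is made to interpret the bits themselves.
    """
    result = {}
    for name, value in fields.items():
        if not name.endswith("Check"):
            continue
        try:
            number = int(value, 16)
        except ValueError:
            continue
        if number != 0:
            result[name] = number
    return result


if __name__ == "__main__":
    fields = parse_fields(MACHINE_CHECK_TEXT)
    for name, number in nonzero_checks(fields).items():
        print(f"{name} = {number:08X}")
```

Treat the result as a pointer for whoever reads the tombstone, not as a diagnosis by itself.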

This is the output of the SL -E command:

FW 1 0 0x160012D001E00000 010003000A000000 MPS_CPU_WAIT_OVER
SFW 1 1 0x3400093901E00000 000000000004000C BOOT_INVOKE_EARLY_SELFTESTS
SFW 1 0 0x160012D101E00000 0000000A0EEBB000 MPS_SET_SLAVE_TIMEOUT
SFW 1 0 0x160012CB01E00000 0100000000000000 MPS_SLAVE_WAKEUP_ALL
SFW 1 0 0x1600097C01E00000 01001000020000FF BOOT_MONARCH_SLAVE_CHECK
SFW 1 0 0x160012CF01E00000 010010000200000D MPS_CPU_WAITING
SFW 1 0 0x160012D001E00000 0100100002000000 MPS_CPU_WAIT_OVER
SFW 1 0 0x160012F201E00000 0000000000000001 CPU_BUS_INIT
SFW 1 0 0x160012D101E00000 00000005F5E10000 MPS_SET_SLAVE_TIMEOUT
SFW 1 0 0x160012D101E00000 0000000FA56EA000 MPS_SET_SLAVE_TIMEOUT
SFW 1 0 0x160012CB01E00000 0100000000000000 MPS_SLAVE_WAKEUP_ALL
SFW 1 0 0x16000A8701E00000 0000000000000001 EST_START
SFW 1 0 0x16000CC601E00000 C0DE000000000001 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 0000000000000000 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 0000000000000000 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 0000000000000000 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 0000000000000000 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 C0DE000000000003 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 F7E0800043C80003 CPU_API_DEBUG_INFO
SFW 1 0 0x16000CC601E00000 0C10000000000000 CPU_API_DEBUG_INFO
SFW 1 0 0x16000E3501E00000 0000000000000000 CPU_API_EARLY_TEST
SFW 1 0 0x16000E3501E00000 0000000000000001 CPU_API_EARLY_TEST
SFW 1 0 0x16000E3501E00000 0000000000000002 CPU_API_EARLY_TEST


# Location|Alert| Encoded Field | Data Field | Keyword / Timestamp
-------------------------------------------------------------------------------
534 BMC *3 0x2047B47962023FF0 641F647000100300 Type-02 107004 1077252
14 Feb 2008 17:24:50
533 SFW 0 *7 0xF680105E00E03FD0 000000003F900000 MC_HPMC_MONARCH_SELECTED
14 Feb 2008 17:24:49
532 SFW 0 2 0x57800F7300E03FB0 8400000001700000 ERR_CPU_CHECK_SUMMARY
14 Feb 2008 17:24:49
531 SFW 0 *7 0xE880035C00E03F90 000000003FA97C00 ERR_CHECK_HPMC
14 Feb 2008 17:24:49
530 SFW 0 2 0x5680109B00E03F70 000000000002A024 MC_BR_TO_OS_HPMC
14 Feb 2008 17:24:49
529 SFW 0 *7 0xF680105E00E03F50 000000003F900000 MC_HPMC_MONARCH_SELECTED
14 Feb 2008 17:24:49
528 SFW 0 2 0x57800F7300E03F30 8400000001700000 ERR_CPU_CHECK_SUMMARY
14 Feb 2008 17:24:49
527 SFW 0 *7 0xE880035C00E03F10 000000003FA97C00 ERR_CHECK_HPMC
14 Feb 2008 17:24:49
526 SFW 0 2 0x5680109B00E03EF0 000000000002A024 MC_BR_TO_OS_HPMC
14 Feb 2008 17:24:48
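A log like the one above gets long quickly, so here is a rough Python sketch for picking out the entries whose alert column carries a `*`. It assumes each entry is one line followed by a timestamp line, as in the paste above; the `LogEntry` type and `parse_event_log` function are hypothetical names, and what the `*` signifies is not stated in this thread, so the sketch only reports it as a flag.

```python
import re
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    number: int
    location: str
    flagged: bool      # True when the alert column starts with '*'
    alert_level: int
    encoded: str
    data: str
    keyword: str
    timestamp: str


# number, location, optional '*', alert level, encoded field, data field, keyword
ENTRY_RE = re.compile(
    r"^(\d+)\s+(.+?)\s+(\*?)(\d+)\s+(0x[0-9A-Fa-f]{16})\s+([0-9A-Fa-f]{16})\s+(.+)$"
)


def parse_event_log(text: str) -> List[LogEntry]:
    """Parse entry lines; the line after each entry is taken as its timestamp."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    entries = []
    i = 0
    while i < len(lines):
        match = ENTRY_RE.match(lines[i])
        if not match:
            i += 1
            continue
        timestamp = ""
        # Use the next line as the timestamp unless it is itself an entry.
        if i + 1 < len(lines) and not ENTRY_RE.match(lines[i + 1]):
            timestamp = lines[i + 1]
            i += 1
        number, location, star, level, encoded, data, keyword = match.groups()
        entries.append(LogEntry(int(number), location, star == "*",
                                int(level), encoded, data, keyword, timestamp))
        i += 1
    return entries


# Four entries copied from the log above.
SAMPLE = """\
534 BMC *3 0x2047B47962023FF0 641F647000100300 Type-02 107004 1077252
14 Feb 2008 17:24:50
533 SFW 0 *7 0xF680105E00E03FD0 000000003F900000 MC_HPMC_MONARCH_SELECTED
14 Feb 2008 17:24:49
532 SFW 0 2 0x57800F7300E03FB0 8400000001700000 ERR_CPU_CHECK_SUMMARY
14 Feb 2008 17:24:49
531 SFW 0 *7 0xE880035C00E03F90 000000003FA97C00 ERR_CHECK_HPMC
14 Feb 2008 17:24:49
"""

if __name__ == "__main__":
    for entry in parse_event_log(SAMPLE):
        if entry.flagged:
            print(entry.number, entry.location, entry.alert_level,
                  entry.keyword, entry.timestamp)
```

Timestamps are kept as plain strings so the sketch does not depend on locale settings for month names.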




lloc_pdc_pages: Relocating PDC from 0xfffffff0f0c00000 to 0x3f900000.
gate64: sysvec_vaddr = 0xc0002000 for 2 pages
NOTICE: autofs_link(): File system was registered at index 3.
NOTICE: cachefs_link(): File system was registered at index 5.
NOTICE: nfs3_link(): File system was registered at index 6.
[Osxinit]: Allocated ots_timeout_lock [401e0740]
igelan0: INITIALIZING HP PCI 1000Base-T Core at hardware path 0/1/2/0

System Console is on the Built-In Serial Interface
Logical volume 64, 0x3 configured as ROOT
Logical volume 64, 0x2 configured as SWAP
Logical volume 64, 0x2 configured as DUMP
Swap device table: (start & size given in 512-byte blocks)
entry 0 - major is 64, minor is 0x2; start = 0, size = 4194304
Starting the STREAMS daemons-phase 1
Checking root file system.
log replay in progress
replay complete - marking super-block as CLEAN


Output for the power supply and processor


[lanc-cg1] MP:CM> ps


PS
System Power state: On
Temperature : Normal


Power supplies | Fan
# State Type | States
-----------------------------------------------------------
0 Normal Type 0 | Normal
1 Normal Type 0 | Normal
2 - - | Normal
3 - - | Normal
4 - - | Normal
5 - - | Normal
6 - - | Normal



SS

System Processor Status:

Monarch Processor: 0

Processor Module 0: Installed and Configured
Processor Module 1: Not Installed

I have tried updating the firmware, but to no avail. At the BCH prompt, when I type sea to search for all bootable devices and then enter the device path for the CD-ROM, the red light starts flashing on the server. I also tried to boot into single-user mode, but I get the same flashing red light. What should I do next?











Grayh
Trusted Contributor

Re: rp 3440 flashing red light and can't boot to Unix

Could you attach the following files?

/var/tombstones/ts99

/var/tombstones/ts98
Grayh
Trusted Contributor

Re: rp 3440 flashing red light and can't boot to Unix

Hi,

Have you upgraded the following system firmware....

Also, if the server is up and running at the moment, could you send the output of the attached script?

Procedure for script execution:

If you use ftp, transfer the file in binary mode. Then copy the script into the working directory and run the following command.

# sh getsysinfo.sh -t

The output of this script is 'sysinfo.tgz' in the '/tmp' directory.

Sakui
Frequent Advisor

Re: rp 3440 flashing red light and can't boot to Unix


The server is still flashing the red light. It can't boot to Unix.
This is where it usually stops and the error LED comes on.

lloc_pdc_pages: Relocating PDC from 0xfffffff0f0c00000 to 0x3f900000.
gate64: sysvec_vaddr = 0xc0002000 for 2 pages
NOTICE: autofs_link(): File system was registered at index 3.
NOTICE: cachefs_link(): File system was registered at index 5.
NOTICE: nfs3_link(): File system was registered at index 6.
[Osxinit]: Allocated ots_timeout_lock [401e0740]
igelan0: INITIALIZING HP PCI 1000Base-T Core at hardware path 0/1/2/0



[lanc-cg1] MP:CM> sysrev


SYSREV

Current firmware revisions

MP FW : E.03.15
BMC FW : 03.50
System FW : 45.11
Grayh
Trusted Contributor

Re: rp 3440 flashing red light and can't boot to Unix

As seen from the chassis logs, the Reporting Entity ID is System Firmware - cabinet 0, cell slot 0, cpu 0.

The latest firmware versions are as follows:

MP:03.30
BMC:03.52
PDC:46.34
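To make the gap explicit, here is a small Python sketch that compares the revisions reported by sysrev earlier in the thread against the versions listed above. It rests on two assumptions that are not confirmed in the thread: that "PDC" here corresponds to the "System FW" line in the sysrev output, and that the leading "E." in the MP revision can be ignored when comparing. The helper names are made up for illustration.

```python
import re


def version_tuple(text):
    """Turn 'E.03.15' or '46.34' into a tuple of ints, e.g. (3, 15).

    Assumption: any letter prefix such as 'E.' carries no ordering
    information and can be dropped.
    """
    return tuple(int(part) for part in re.findall(r"\d+", text))


# Revisions reported by 'sysrev' on the failing node.
CURRENT = {"MP": "E.03.15", "BMC": "03.50", "System": "45.11"}

# Versions quoted as latest in this reply.
# Assumption: 'PDC' corresponds to the 'System FW' line in sysrev.
LATEST = {"MP": "03.30", "BMC": "03.52", "System": "46.34"}


def outdated(current, latest):
    """Return the component names whose current revision is older."""
    return [name for name in current
            if version_tuple(current[name]) < version_tuple(latest[name])]


if __name__ == "__main__":
    for name in outdated(CURRENT, LATEST):
        print(f"{name}: {CURRENT[name]} -> {LATEST[name]}")
```

Under those assumptions all three components come out as older than the listed versions.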

Fixes in BMC 3.52:
- Lower overall fan fail threshold percentage from 80 to 75 percent of expected speed.
- Lower fan fail RPM for Turbo Coolers to ~1000 RPM regardless of expected speed.
- BMC713: BMC stops responding to chassis, sensor commands.

****Recommended Action Plan*****
1. Upgrade to the latest version of the firmware.
2. If that doesn't resolve the issue, add the following patches and monitor the server.

11.11 Patch     11.23 Patch
PHSS_36165      PHSS_36166


Sakui
Frequent Advisor

Re: rp 3440 flashing red light and can't boot to Unix


Thanks for the time and effort. I can't upgrade the firmware. At the BCH prompt, when I try to search for all bootable devices, I get an error with a red light flashing, and then the server hangs. I have restarted the server over and over, but it still stops at the same place. I can't boot from the CD to install the firmware.
Below is the link to the firmware.
I suspect a hardware problem, but which device do you think is causing it?
I also can't boot to Unix.

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=1828449&prodTypeId=15351&prodSeriesId=2512013&swLang=13&taskId=135&swEnvOID=7#2950
Vamsidin
Frequent Advisor

Re: rp 3440 flashing red light and can't boot to Unix

Hi,

As I suspected, you will have to replace CPU 0 (AB354A, not sure of the part number). If you still have a contract with HP, log a call with HP at +4487084292990 to have CPU 0 replaced.

Downtime Required: Approx 1 hour.

Regards,
Mohan