HPE 9000 and HPE e3000 Servers

Re: CPU Replace & Re-configure

 
SOLVED
Steven Chen_1
Super Advisor

CPU Replace & Re-configure

Our L2000 server's CPU sequence is 0,3,1. One day a CPU was deconfigured, so I replaced CPU1, but I now see that CPU3 is the one deconfigured.

Then I installed a known-good CPU in the CPU3 slot and still see CPU3 deconfigured. Does that indicate the problem is on the system board or the PSM?

Thanks.
Steve
8 REPLIES
Thayanidhi
Honored Contributor
Solution

Re: CPU Replace & Re-configure

Once a processor has been deconfigured because of a fault, it stays deconfigured even after it is replaced.
You have to enable it again manually in BDC.

Go to the Configuration menu, where there is a command to reconfigure the processor. You then have to do a hard restart (power cycle) for the system to recognize the new processor.

First you have to identify the correct processor, based on the logs.

Hope you are replacing it with the same model/speed!

Do you have 2 PSMs and 4 processors?

Revert
Regds
TT

Attitude (not aptitude) determines altitude.
Steven Chen_1
Super Advisor

Re: CPU Replace & Re-configure

Thanks for the help.

How can we find out what caused the CPU to be deconfigured, even though we can restart it manually?

Steve
Thayanidhi
Honored Contributor

Re: CPU Replace & Re-configure

Most often the cause is recoverable errors (LPMC, Low Priority Machine Check) from that CPU. I have seen several instances where the on-chip cache goes bad.

See your /var/adm/syslog/syslog.log
(Hope you have not disabled EMS!)

Post the syslog if you are not able to find out which processor it is.
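
For anyone who wants to script that check, here is a minimal sketch that prints every syslog line mentioning LPMC. It assumes only that such entries contain the string "lpmc" in some letter case; the exact message format is not shown in this thread, and the name `find_lpmc_lines` is made up for illustration.

```python
import os

# Path quoted in the post above.
SYSLOG = "/var/adm/syslog/syslog.log"


def find_lpmc_lines(lines):
    """Return the lines that mention LPMC, ignoring letter case."""
    return [line.rstrip("\n") for line in lines if "lpmc" in line.lower()]


if __name__ == "__main__":
    if os.path.exists(SYSLOG):
        # errors="replace" keeps the scan going if the log has odd bytes.
        with open(SYSLOG, errors="replace") as handle:
            for hit in find_lpmc_lines(handle):
                print(hit)
    else:
        print("no syslog found at", SYSLOG)
```

An empty result does not prove the CPU is healthy; it only means nothing matching "lpmc" reached this particular file.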

Regds
TT
Steven Chen_1
Super Advisor

Re: CPU Replace & Re-configure

Thayanidhi,

syslog.log has no LPMC error. It only shows that 2 CPUs, at paths 160 and 162, are configured. From BDC, I see that CPU3 is stopped and deconfigured. Though a manual restart could bring the CPU back, my fears are:

1) Would this CPU go bad again?
2) Would the PSM go bad in this case?
3) Would the system board go bad in this case?

Any comments?

Much appreciated.

Steven
Steve
Thayanidhi
Honored Contributor

Re: CPU Replace & Re-configure

Hi,
A CPU does not get deconfigured without a reason.
What does the GSP log say?
Press Ctrl+B on the console, enter sl at the GSP> prompt, and press e for the error logs.

Generally a CPU gets deconfigured during a restart. Did you restart?

run
/etc/opt/resmon/lbin/moncheck
to find out whether your processor is monitored for LPMC. You should see something similar to the following:
/system/events/cpu/lpmc ... OK.
For /system/events/cpu/lpmc/cache_errors:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 4 (SERIOUS) Goto TCP; host=hostname port=49296
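
One way to read the sample output above is as a routing table keyed on event severity. The sketch below assumes the number after ">=" is a severity level (INFORMATION = 1, MAJOR WARNING = 3, SERIOUS = 4, as printed in the output). The levels for MINOR WARNING and CRITICAL are assumptions that do not appear in this thread, and the names `SEVERITY`, `ROUTES`, and `targets_for` are made up for illustration.

```python
# Severity levels. 1, 3 and 4 come from the sample output above;
# MINOR WARNING = 2 and CRITICAL = 5 are assumptions.
SEVERITY = {
    "INFORMATION": 1,
    "MINOR WARNING": 2,
    "MAJOR WARNING": 3,
    "SERIOUS": 4,
    "CRITICAL": 5,
}

# (minimum severity level, notification target) pairs copied from the output.
ROUTES = [(1, "TEXTLOG"), (3, "SYSLOG"), (3, "EMAIL"), (4, "TCP")]


def targets_for(severity_name):
    """Return the targets an event of the given severity would be sent to."""
    level = SEVERITY[severity_name]
    return [target for minimum, target in ROUTES if level >= minimum]


if __name__ == "__main__":
    for name in ("INFORMATION", "MAJOR WARNING", "SERIOUS"):
        print(name, "->", ", ".join(targets_for(name)))
```

Under this reading, an INFORMATION event would be written only to /var/opt/resmon/log/event.log and would not show up in syslog.log.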

Also check /var/tombstones/ts99, ts98, ts97, etc. for possible clues.

Regds
TT



Steven Chen_1
Super Advisor

Re: CPU Replace & Re-configure

The GSP error log indicates that processor #3 was removed from the configuration.

I looked at /var/opt/resmon/log/event.log and extracted the following:

******************************
Event data from monitor:

Event Time..........: Sun May 1 14:55:22 2005
Severity............: INFORMATION
Monitor.............: lpmc_em
Event #.............: 100501
System..............: rubble

Summary:

Module at Hard Physical Address = 0xfffffffffffa6000 : Cache Error(s)
detected on Processor 3.


Description of Error:

Parity error has been detected in either the Instruction or Data Memory
Cache (I-Cache or D-Cache). The operating system has recovered from the
error.

Probable Cause / Recommended Action:

Contact your HP support representative if an excessive number of these
errors are generated.

Additional Event Data:
System IP Address...: 172.16.16.26
Event Id............: 0x4275261a00000000
Monitor Version.....: B.01.00
Event Class.........: LPMC
Client Configuration File...........:
/var/stm/config/tools/monitor/default_lpmc_em.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
None
Additional System Data:
System Model Number.............: 9000/800/L2000-5X
EMS Version.....................: A.03.20
STM Version.....................: A.30.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/lpmc_em.htm#100501

v-v-v-v-v-v-v-v-v-v-v-v-v D E T A I L S v-v-v-v-v-v-v-v-v-v-v-v-v



Component Data:
HPA.....................: 0xfffffffffffa6000
Processor Number........: 3
Physical Device Path....: 166
Serial Number...........: 4272af443

Error Details:

LPMC Data:
Hmodel: 0x000005d9 HPA: 0xfffffffffffa6000
PIM Length: 175 Check Type Word: 0x80000000
HVerDep/CacheParity: 0000000000 Cache Check Word: 0x80000000

Check Type Register=0x80000000
An error has occurred in Cache System.
Cache Check Word=0x80000000 and Hversion Dependent Word=0000000000
Parity Error(s) detected in I-cache associative

*********************************
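
If event.log holds many records like the one above, a short script can tally LPMC events per processor. This is a minimal sketch, assuming every record starts with the line "Event data from monitor:" and uses the dotted "Key....: value" layout seen in the excerpt. The names `parse_event` and `count_by_processor` are made up for illustration, and `SAMPLE` is a shortened copy of the record above.

```python
from collections import Counter

# Fields of interest, spelled as in the excerpt above.
WANTED = ("Event Time", "Severity", "Monitor",
          "Processor Number", "Physical Device Path")

# Shortened copy of the record quoted in this post.
SAMPLE = """Event data from monitor:

Event Time..........: Sun May 1 14:55:22 2005
Severity............: INFORMATION
Monitor.............: lpmc_em

Component Data:
HPA.....................: 0xfffffffffffa6000
Processor Number........: 3
Physical Device Path....: 166
"""


def parse_event(text):
    """Extract the WANTED fields from one event record."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().rstrip(".").strip()  # drop the dotted padding
        if sep and key in WANTED and key not in fields:
            fields[key] = value.strip()
    return fields


def count_by_processor(log_text):
    """Count lpmc_em events per processor number in a whole log."""
    counts = Counter()
    for record in log_text.split("Event data from monitor:")[1:]:
        fields = parse_event(record)
        if fields.get("Monitor") == "lpmc_em" and "Processor Number" in fields:
            counts[fields["Processor Number"]] += 1
    return counts


if __name__ == "__main__":
    print(parse_event(SAMPLE))
    print(dict(count_by_processor(SAMPLE)))
```

To run it against the real file, read /var/opt/resmon/log/event.log into a string and pass that to `count_by_processor`.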

You're right, it is a cache memory issue. Should this CPU be replaced? How bad is it?

thanks.

Steve
timmy b.
Honored Contributor

Re: CPU Replace & Re-configure

Steven,
The problem is bad enough that the CPU has deconfigured itself. You want to get this CPU replaced, NOW. It isn't doing anything for you as long as it is having problems.

Good Luck - Tim
There are 10 kinds of people in this world: Those who understand Binary, and those who don't.
Thayanidhi
Honored Contributor

Re: CPU Replace & Re-configure

Steven,
Your CPU is bad in any case. Replace it ASAP.
Remember to reconfigure the processor in BDC after replacing it with the new CPU.

Regds
TT