HPE 9000 and HPE e3000 Servers

CPU failure: reboot possible

 
Mann_2
Frequent Advisor

Dear all,

One of our customers has an rp5450 with four 360 MHz CPUs and 8 GB of RAM. Six months ago we installed a new CPU in this server, and it ran fast and very well.

But now the newly installed CPU is no longer available to the system: when I use top, I can see only 3 CPUs.

The customer is now seeing poor performance.
So my question: is it possible to restart the server without boot problems?

I can't remove the CPU at the moment because the customer is not nearby.

Is it possible to reboot, or will the server refuse to start with the "dead" CPU?

Thanks in advance for your answers.
BR
Roland
jaivinder
Frequent Advisor

Re: CPU failure: reboot possible

Hi,

As you say, the output of top shows only three of the four CPUs.
Check the state of the fourth CPU using ioscan:
ioscan -fnC processor
The output will reveal whether the CPU is claimed or not.
If you reboot the system in this state, it should come up without any problem. Have you checked the console logs for the CPU failure?

First check the CPU state, then reboot if required.

Jaivinder
Mann_2
Frequent Advisor

Re: CPU failure: reboot possible

Hi Jaivinder,

ioscan shows that the CPU is claimed:

ioscan -fnC processor
Class      I  H/W Path  Driver     S/W State  H/W Type   Description
===================================================================
processor  0  160       processor  CLAIMED    PROCESSOR  Processor
processor  1  162       processor  CLAIMED    PROCESSOR  Processor
processor  2  164       processor  CLAIMED    PROCESSOR  Processor
processor  3  166       processor  CLAIMED    PROCESSOR  Processor
#

But in top I can see only 3 CPUs.
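
For anyone who wants to script this comparison, here is a minimal Python sketch. It assumes the ioscan output has been saved as text and that the rows use the column order shown above (class, instance, H/W path, driver, S/W state). The function names and the cpus_seen_by_top parameter are made up for illustration; the CPU count from top has to be supplied by hand.

```python
def claimed_processor_paths(ioscan_output):
    """Return the H/W paths of processor rows whose S/W state is CLAIMED."""
    paths = []
    for line in ioscan_output.splitlines():
        fields = line.split()
        # Data rows: class, instance, H/W path, driver, S/W state, ...
        if len(fields) >= 5 and fields[0] == "processor" and fields[4] == "CLAIMED":
            paths.append(fields[2])
    return paths


def unscheduled_cpu_count(ioscan_output, cpus_seen_by_top):
    """Claimed processors minus the CPU count reported by top (hypothetical helper)."""
    return len(claimed_processor_paths(ioscan_output)) - cpus_seen_by_top


# Sample text copied from the ioscan output in this thread.
SAMPLE_IOSCAN = """\
Class I H/W Path Driver S/W State H/W Type Description
processor 0 160 processor CLAIMED PROCESSOR Processor
processor 1 162 processor CLAIMED PROCESSOR Processor
processor 2 164 processor CLAIMED PROCESSOR Processor
processor 3 166 processor CLAIMED PROCESSOR Processor
"""

if __name__ == "__main__":
    print(claimed_processor_paths(SAMPLE_IOSCAN))
    print(unscheduled_cpu_count(SAMPLE_IOSCAN, cpus_seen_by_top=3))
```

A result greater than zero means that at least one processor is claimed by the driver but is not being used by the scheduler, which is the situation described here.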

I found this entry from yesterday in syslog.log:
Jul 10 16:55:17 HE_H1_DB EMS [2348]: ------ EMS Event Notification ------ Value: "MAJORWARNING (3)" for Resource: "/system/events/cpu/lpmc/cache_errors" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 153878530 -r /system/events/cpu/lpmc/cache_errors -n 153878562 -a
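
If many such entries need to be checked, the severity, the resource and the suggested resdata command can be pulled out of the line with a regular expression. The following is only a sketch in Python: it assumes the entry is available as a single unwrapped line in the wording shown above, and the function name parse_ems_notification is made up for illustration.

```python
import re

# Pattern for the EMS notification wording seen in this thread (an assumption:
# other EMS versions or monitors may word the message differently).
EMS_PATTERN = re.compile(
    r'Value: "(?P<severity>[^"]+)" for Resource: "(?P<resource>[^"]+)"'
    r'.*?obtain event details:\s*(?P<command>.+)$'
)


def parse_ems_notification(entry):
    """Return severity, resource and resdata command, or None if no match."""
    match = EMS_PATTERN.search(entry.strip())
    if match is None:
        return None
    return {
        "severity": match.group("severity"),
        "resource": match.group("resource"),
        "command": match.group("command").strip(),
    }


# The syslog.log entry from this thread, joined into one line.
SAMPLE_ENTRY = (
    'Jul 10 16:55:17 HE_H1_DB EMS [2348]: ------ EMS Event Notification ------ '
    'Value: "MAJORWARNING (3)" for Resource: "/system/events/cpu/lpmc/cache_errors" '
    '(Threshold: >= " 3") Execute the following command to obtain event details: '
    '/opt/resmon/bin/resdata -R 153878530 -r /system/events/cpu/lpmc/cache_errors '
    '-n 153878562 -a'
)

if __name__ == "__main__":
    info = parse_ems_notification(SAMPLE_ENTRY)
    print(info["severity"])
    print(info["resource"])
    print(info["command"])
```

The extracted command is the one to run on the server itself to see the full event details.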

What can I do in this case?
Roland
Torsten.
Acclaimed Contributor

Re: CPU failure: reboot possible

It looks like the system took all the load off this CPU because of cache errors.

If you reboot, stop at the boot menu.

Check whether the CPU is disabled (I guess it is).

Main Menu: Enter command or menu > co

---- Configuration Menu ------------------------------------------------------

     Command                             Description
     -------                             -----------
     AUto [BOot|SEArch|STart] [ON|OFF]   Display or set specified flag
     BootID [] []                        Display or set Boot Identifier
     BootINfo                            Display boot-related information
     BootTimer [0 - 200]                 Seconds allowed for boot attempt
     CPUconfig [] [ON|OFF]               Config/Deconfig processor
     DEfault                             Set the system to predefined values
     FastBoot [ON|OFF]                   Display or set boot tests execution
     ResTart [ON|OFF]                    Display or set the System Restart Policy
     PAth [PRI|ALT] []                   Display or modify a path
     SEArch [DIsplay|IPL] []             Search for boot devices
     TIme [c:y:m:d:h:m:[s]]              Read or set the real time clock in GMT

     BOot [PRI|ALT|]                     Boot from specified path
     DIsplay                             Redisplay the current menu
     HElp []                             Display help for specified command
     RESET                               Restart the system
     MAin                                Return to Main Menu


Use the CPUconfig command to check the CPU state and to deconfigure the CPU if needed.

Did you check the logs/mails for any related messages?

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

Mann_2
Frequent Advisor

Re: CPU failure: reboot possible

Hi Torsten,

Have you seen a CPU fail on an HP-UX machine before?

I would never have thought it possible for the RAM or a CPU in an HP-UX machine to be defective...

Tomorrow I will drive to the customer to see the error live and to disable the CPU.

Thanks for your help!

BR
Roland
Patrick Wallek
Honored Contributor

Re: CPU failure: reboot possible

I have seen defects in both RAM and CPUs on HP-UX machines. Any component in your system has the potential to fail.

Bharath PSB
Advisor

Re: CPU failure: reboot possible

Are you sure that all the CPUs are of the same type? I ask because if there is a mismatch between the CPUs' HVersion and CVersion, there is a risk that some of the processors will not boot. Can you please confirm this? To get this information, you may have to run "in pr" at the BCH prompt.
Mann_2
Frequent Advisor

Re: CPU failure: reboot possible

Hi Bharath,

Yes, I think so. The server ran very well with 4 CPUs for six months, fast and without problems.

But one of the CPUs stopped working 2 days ago.

Today I'll drive to this customer to install 4 new (refurbished) 440 MHz CPUs (at the moment they have 360 MHz ones), and I hope we won't get more problems than before :-)

We'll see :-)

Thanks for all your answers

BR
Roland
Marcel Burggraeve
Trusted Contributor

Re: CPU failure: reboot possible

If you're going to replace those CPUs with faster ones, make sure you have a document with you showing all the DIP switch settings on the system board for the different CPU speeds out there.