Integrity Servers

Peter Marais
Frequent Advisor

Superdome sx2000 hardware degraded

When I reboot a particular nPar on my Superdome, I receive a warning that my hardware is degraded:

Keyword = CELL_HW_DEGRADED

Description:

A dimm or CPU has failed and is not operational for the system. This event is emitted prior to determining if the cell should be integrated into the Partition.

Cause/Action:

A deconfigured dimm or cpu has been detected. Examine earlier events to isolate the problem.


Recommendation:

None

Yet when I look at the system via parstatus or stm, I cannot find any degraded memory or CPU.
My SEL also gives no further detail about what failed.

Any other suggestions for where I can look? Running HP-UX 11.23.
18 REPLIES
Torsten.
Acclaimed Contributor

Re: Superdome sx2000 hardware degraded

Use "info all" from EFI (Integrity servers) or BCH (PA-RISC), depending on the CPUs installed.

Hope this helps!
Regards
Torsten.

__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.

__________________________________________________
No support by private messages. Please ask the forum!

If you feel this was helpful please click the KUDOS! thumb below!   
Peter Marais
Frequent Advisor

Re: Superdome sx2000 hardware degraded

Thanks Torsten. Are there any Unix commands I can try? I do not have the luxury of bringing the system down.
Torsten.
Acclaimed Contributor

Re: Superdome sx2000 hardware degraded

You can check the memory with stm.

Hope this helps!
Regards
Torsten.

Matti_Kurkela
Honored Contributor

Re: Superdome sx2000 hardware degraded

If you have currently unused iCOD/iCAP CPUs in your system, the system is smart enough to "swap" a failed CPU with an unused one.

The rationale is apparently something like: you've paid for X CPUs, so the system will provide you with X CPUs as long as there are enough usable CPUs to do that.

MK
Andrew Rutter
Honored Contributor

Re: Superdome sx2000 hardware degraded

Hi,

Did you run the info tool in stm and look at the log for the memory or CPUs? Also try running the info tool on the system, usually the first item in the map in CSTM.

This should show any degraded/deconfigured items.

Andy
Peter Marais
Frequent Advisor

Re: Superdome sx2000 hardware degraded

Hi Guys

Thanks for all the responses.

As I stated in my opening post, stm reveals no deconfigured memory or CPUs. The only message I get is from the SEL on the MP, which shows the warning quoted in my opening post and marks the cell board with a w in VFP. No iCOD is running on this system.
machinfo also reports the correct number of CPUs and amount of memory.
Torsten.
Acclaimed Contributor

Re: Superdome sx2000 hardware degraded

So I would investigate the cells from MP with

MP:CM> ps

Hope this helps!
Regards
Torsten.

Peter Marais
Frequent Advisor

Re: Superdome sx2000 hardware degraded

Thanks Torsten, but ps only gives a power status. No joy with that.
Torsten.
Acclaimed Contributor

Re: Superdome sx2000 hardware degraded

I would expect an item in "failed" status.

However, there must be a related message in the diagnostic logs, perhaps in root's mailbox and in the MP logs.

Hope this helps!
Regards
Torsten.
