ProLiant Servers (ML,DL,SL)

DL585 G7 - Internal system health degraded

 
RuudApg
Occasional Advisor

Last week one of our servers crashed for no apparent reason. Neither HPSIM nor the Rimboard logged anything about it. The server has been restarted and works fine as far as I know, but since then the "Internal system health degraded" LED keeps flashing yellow. The Rimboard reports the same status but nothing more: no bad memory, processor, temperature or anything else can be seen.

What could be causing the trouble? And if it was just an unexplainable hiccup, can the LED be turned off?

Many thanks!

Kind regards, Ruud Baltissen
APG - Netherlands
3 REPLIES
David Claypool
Honored Contributor

Re: DL585 G7 - Internal system health degraded

What does it say in the IML?
RuudApg
Occasional Advisor

Re: DL585 G7 - Internal system health degraded

Hello David,

This morning I had the chance to power it down completely, including pulling the power plugs. After that the LED stayed off. Strangely enough, the IML showed data I hadn't seen before:

System Error 03/08/2011 07:51 03/08/2011 07:51 1 An Unrecoverable System Error (NMI) has occurred (System error code 0x00000032, 0x5DAD4350)
103 System Error 03/08/2011 07:51 03/08/2011 07:51 1 An Unrecoverable System Error (NMI) has occurred (System error code 0x00000032, 0xF188A722)
102 System Error 03/08/2011 07:51 03/08/2011 07:51 1 An Unrecoverable System Error (NMI) has occurred (System error code 0x00000032, 0x4686248D)
101 System Error 03/08/2011 07:51 03/08/2011 07:51 1 An Unrecoverable System Error (NMI) has occurred (System error code 0x00000032, 0x741816E1)
100 System Error 03/08/2011 07:51 03/08/2011 07:51 1 An Unrecoverable System Error (NMI) has occurred (System error code 0x00000032, 0x12625406)
27 CPU 03/08/2011 07:50 03/08/2011 07:50 1 Uncorrectable Machine Check Exception (Board 0, Processor 3, APIC ID 0x00000030, Bank 0x00000004, Status 0xF2000080'00020C0F, Address 0x00000000'00000000, Misc 0x00000000'00000000)
26 CPU 03/08/2011 07:50 03/08/2011 07:50 1 Uncorrectable Machine Check Exception (Board 0, Processor 4, APIC ID 0x00000041, Bank 0x00000004, Status 0xF2000280'00020C0F, Address 0x00000000'00000000, Misc 0x00000000'00000000)
25 CPU 03/08/2011 07:50 03/08/2011 07:50 1 Uncorrectable Machine Check Exception (Board 0, Processor 1, APIC ID 0x00000010, Bank 0x00000004, Status 0xF2000080'00020C0F, Address 0x00000000'00000000, Misc 0x00000000'00000000)

This is the bottom part of the list. The list goes on until 07:53 and shows only NMI errors, with one exception: the same processor-related message, but this time for processor 2, at some point during 07:52.

I hope you can make something of it. I can't :(

Many thanks in advance!


Kind regards, Ruud Baltissen
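
For anyone trying to read the Status values in the Machine Check Exception entries above, here is a minimal Python sketch that splits such a value into the architectural x86 machine-check flag bits (63 down to 57) and the two low 16-bit fields. The function name `decode_mce_status` and the dictionary keys are made up for illustration. The meaning of bits 31:16 and of the bank number is processor-specific and is not decoded here; the vendor's documentation would be needed for that.

```python
def decode_mce_status(text):
    """Split an IML-style MCE status string into flag bits and code fields.

    `text` looks like "0xF2000080'00020C0F"; the apostrophe only separates
    the upper and lower 32 bits and is dropped before parsing.
    """
    value = int(text.replace("'", ""), 16)

    # Architectural flag bits of the x86 machine-check status register.
    flag_bits = {
        "VAL": 63,    # register contains a valid error
        "OVER": 62,   # an earlier error was overwritten (overflow)
        "UC": 61,     # error was not corrected by hardware
        "EN": 60,     # reporting was enabled for this error
        "MISCV": 59,  # Misc register holds valid data
        "ADDRV": 58,  # Address register holds valid data
        "PCC": 57,    # processor context may be corrupt
    }
    decoded = {name: bool((value >> bit) & 1) for name, bit in flag_bits.items()}

    # Low 16 bits: MCA error code. Bits 31:16: processor-specific extension,
    # left undecoded here (assumption: see vendor documentation).
    decoded["mca_error_code"] = value & 0xFFFF
    decoded["extended_code"] = (value >> 16) & 0xFFFF
    return decoded


if __name__ == "__main__":
    for name, field in decode_mce_status("0xF2000080'00020C0F").items():
        print(name, hex(field) if isinstance(field, int) and not isinstance(field, bool) else field)
```

For the value logged for processors 1 and 3, this reports VAL, OVER, UC, EN and PCC set, with MISCV and ADDRV clear. The clear ADDRV bit is consistent with the all-zero Address and Misc fields in the log, and the set OVER bit suggests more than one error was recorded in that bank before it was read.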
PS-Support T-Systems
Occasional Contributor