
unexpected HPMC (Non Production Box)

 
Scott Frye_1
Super Advisor

unexpected HPMC (Non Production Box)

Our HP-UX 11i box has been rebooting since Saturday. I came in this morning to find our test box rebooting again and again, and I noticed the Unexpected HPMC message. I searched the forum archives and found it is most likely a HW issue. How do I determine what I'm looking at? I'm assuming it may be a CPU, but I was wondering if there is a way to tell for sure. **NOTE** this box never comes up far enough to give access to the logs.

Thanks

Scott
Robert-Jan Goossens_1
Honored Contributor

Re: unexpected HPMC (Non Production Box)

Hi Scott,

Can you log in to the GSP?

Type CTRL-B on the system console to enter GSP (Guardian Service Processor) mode.

Access the GSP "error" log using the GSP "SL" command.

Regards,
Robert-Jan

Steven E. Protter
Exalted Contributor

Re: unexpected HPMC (Non Production Box)

Yes, hardware.

HPMC
High Priority Machine Check

The system must reboot immediately.

If it keeps rebooting, it may be the CPU in a single-CPU box; with more than one CPU, the box would normally still come up on a remaining CPU.

To find out:

At the console, interrupt the boot at the 10-second prompt.

There is an IN command (help may get you what you need).

There are some basic hardware checks there. You can check the CPU, memory and networking.

Then you can call hardware support with confidence.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Scott Frye_1
Super Advisor

Re: unexpected HPMC (Non Production Box)

Robert-Jan,
I wish we had a GSP, but this is an old K class and it doesn't have one.

SEP,
I will interrupt the boot and report back.

The exact message is
Unexpected HPMC. Processor HPA FFFFFFFF'FFFA0000

Oh, and I also forgot to mention we are not paying for HW support so I guess we are on our own.

Thanks

Scott
Bill Hassell
Honored Contributor

Re: unexpected HPMC (Non Production Box)

No simple answer. The K-box lacks the sophisticated processor found in the GSP. You can try interrupting the boot process and see what diag commands may be present in the processor ROM code. There are offline diags available on the core/install CD. Since you'll need to spend some time learning about all this, you might start by booting into single-user mode. If that works, you can list the ts99 log, but you'll still need some help to interpret many of the registers. Another possibility is to use parts from a spare K-box. If the K-box has multiple processors, try pulling all but one. Of course, the bad one may be the one that's left, so swap that one too. Also reduce the RAM to just a couple of DIMMs.

However, if the problem is in the backplane or a power supply signal, you'll need to pay for a service call.


Bill Hassell, sysadmin
Scott Frye_1
Super Advisor

Re: unexpected HPMC (Non Production Box)

Also, this is a 4-CPU box with 4 GB of memory.
Scott Frye_1
Super Advisor

Re: unexpected HPMC (Non Production Box)

Shouldn't this be under sysadmin and not databases???
Steven E. Protter
Exalted Contributor
Solution

Re: unexpected HPMC (Non Production Box)

HP will fix this on a T&M basis, meaning time and materials.

It may well cost more than a hardware service contract, but you are not, strictly speaking, on your own.

Replacing a CPU is not the most difficult operation to attempt.

SEP
Stefan Stechemesser
Honored Contributor

Re: unexpected HPMC (Non Production Box)

Hi Scott,

Maybe you can catch the chassis codes that appear while the HPMC is being processed. Sometimes it is possible to determine the cause of the problem without the full HPMC dump.
The best way is to connect to the console, set the key switch to service mode, press CTRL-B to get to the access port and then enter "sl", which will display the last 50 chassis codes. Post them here, and maybe someone can tell something about the cause of the error.

best regards

Stefan
Scott Frye_1
Super Advisor

Re: unexpected HPMC (Non Production Box)

Thanks Stefan, trying it now.
Scott Frye_1
Super Advisor

Re: unexpected HPMC (Non Production Box)

Stefan,

The sl shows the front panel displays. The output is as follows.

WARN E000
SHUT D100
FLT 5508
FLT 5002
INIT CC02
RUN F102
INIT CEE0
INIT CEC0
INIT CE75
INIT CE70
*NEXT COLUMN*
WARN E000
INIT CB0B
FLT 540F
FLT 540F
INIT CC01
RUN F101
INIT CEE0
INIT CE0F
INIT CE74
INIT CE69
*NEXT COLUMN*
SHUT D200
INIT CB00
FLT 5108
FLT 5308
INIT CEF2
RUN F200
INIT CEDF
INIT CE78
INIT CE73
INIT CE64
*NEXT COLUMN*
INIT CBF8
SHUT D000
FLT 5208
FLT CBF0
INIT CEF0
RUN F100
INIT CEDB
INIT CE77
INIT CE72
INIT CE63
*NEXT COLUMN*
INIT CBF7
FLT CBFB
FLT 5508
INIT CC03
RUN F103
INIT CEE1
INIT CED0
INIT CE76
INIT CE71
INIT CE62

Hopefully, this is what you were looking for.

thanks

Scott
Stefan Stechemesser
Honored Contributor

Re: unexpected HPMC (Non Production Box)

Hi Scott,

The chassis codes tell me that it is most probably an I/O-related error, not a CPU or memory problem.
All CPUs except CPU #0 logged broadcast errors (5x08 with x = CPU number). CPU #0 is the exception because it is the one executing the self-tests.
CPU #0 logged 5002, which means "path error".
A path error happens when a memory or I/O access fails. In case of a memory error I would expect chassis codes beginning with 7, but there is nothing like that here.
So this is very likely an I/O-related error. I would normally check the I/O error log in the PIM information, but that is not possible because you cannot get to the BCH prompt.
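
For anyone wanting to apply the same reading to their own "sl" output, here is a rough Python sketch. It only encodes the three rules of thumb stated above (5002 = path error, 5x08 = broadcast error on CPU x, a leading 7 hints at memory); it is not an official HP chassis-code table, and the function names `classify` and `summarize` are made up for illustration.

```python
import re

# Rules paraphrased from this thread only; NOT an official HP code table.
def classify(code: str) -> str:
    """Return a rough meaning for one FLT chassis code (assumed rules)."""
    code = code.upper()
    if code == "5002":
        return "path error (memory or I/O access failed)"
    match = re.fullmatch(r"5([0-9A-F])08", code)
    if match:
        # The second hex digit is read as the CPU number (5x08 rule).
        return f"broadcast error, CPU #{int(match.group(1), 16)}"
    if code.startswith("7"):
        return "possible memory error"
    return "not covered by this sketch"

def summarize(log_text: str) -> dict:
    """Group the FLT lines of pasted 'sl' output by their rough meaning."""
    summary = {}
    for line in log_text.splitlines():
        parts = line.split()
        # Only fault lines are interpreted; INIT, RUN, SHUT, WARN are skipped.
        if len(parts) == 2 and parts[0] == "FLT":
            summary.setdefault(classify(parts[1]), []).append(parts[1])
    return summary

# A few lines copied from the output posted earlier in this thread.
sample = """FLT 5508
FLT 5002
INIT CC02
FLT 5108
FLT 5308
FLT 5208
FLT 540F
"""

for meaning, codes in summarize(sample).items():
    print(f"{meaning}: {', '.join(codes)}")
```

If nothing shows up under "possible memory error" while a path error is present, that matches the reasoning above for suspecting I/O first. Codes the sketch does not know are listed as such rather than guessed at.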

The only way to troubleshoot this (it should be done by a trained person, e.g. an HP CE ;-) ):

Build a minimal config (unplug all I/O cards except the Core I/O). If the system still does not boot to BCH, I would replace the Core I/O. You can also build an absolute minimal config (CPU #0, Core I/O, 2 DIMMs) and swap the CPUs with each other.
Once you get to BCH, you can check the I/O error log in the PIM output, or maybe you will already have found the cause by trial and error.

best regards

Stefan
Scott Frye_1
Super Advisor

Re: unexpected HPMC (Non Production Box)

Well, based on the error message we saw, "Unexpected HPMC. Processor HPA...", we thought it could be processor 0. We removed processor 0 and replaced it with processor 3, but still had the same problem. Stefan asked for the front panel codes. Based on his assessment, we now need to evaluate whether it is cost-effective to have someone come and fix it (which means paying for the part and the service call) or to just buy a refurbished A class. I had also posted a question about further upgrades to this K class. Since we are at 11i v1, we have reached the limit of upgrades. The K class is no longer supported for further upgrades (11i v2). This is a huge deciding point for me.

Thanks for all the expertise!

Scott