Operating System - Tru64 Unix

Re: hwmgr status comp (critical)

 
SOLVED
Ivan Ferreira
Honored Contributor

hwmgr status comp (critical)

When I run hwmgr status component on some of my servers, I get this output:

                  STATUS    ACCESS               INDICT
HWID:  HOSTNAME   SUMMARY   STATE     STATE      LEVEL    NAME
------------------------------------------------------------------------------
 200:  HOSTNAME   critical  online    available  high     CPU0
 201:  HOSTNAME   critical  online    available  high     CPU1

What does this mean? Why is the status critical? And what is the indict level?
Por que hacerlo dificil si es posible hacerlo facil? - Why do it the hard way, when you can do it the easy way?
3 REPLIES
Aaron Biver_2
Frequent Advisor
Solution

Re: hwmgr status comp (critical)

Ivan,
Most likely WSEA or its predecessor (Compaq Analyze) is installed to monitor the system, and it detected one or more hardware errors that make it think these two CPUs might be going bad. (Try "whereis wsea; whereis ca" to verify whether WSEA or Compaq Analyze is installed.)
WSEA has logic to indict hardware components like this, so this is the most likely explanation. Its logic is not infallible, though.
You should have /var/adm/binary.errlog analyzed using the wsea utility. If you have a service contract, they will take care of this.

If there is no service contract, it would probably be wise to look into the errors yourself, and possibly remove one or more of the CPUs before you start seeing real problems. In that case, I can give you a summary of wsea commands that will give you more information.
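
The whereis check suggested above can be wrapped in a small script. This is only a sketch: the function names whereis_found and report_analyzers are made up for illustration, and it assumes whereis prints "name: path ..." when it finds something and a bare "name:" when it does not.

```shell
#!/bin/sh
# Succeeds if a line of whereis output (read from stdin) lists at least
# one path, i.e. looks like "name: /some/path" rather than a bare "name:".
whereis_found() {
    awk 'NF > 1 { found = 1 } END { exit(found ? 0 : 1) }'
}

# Report which analysis tool, if any, whereis can locate.
# (Function name is hypothetical; wsea and ca are the names from the reply.)
report_analyzers() {
    for tool in wsea ca; do
        if whereis "$tool" | whereis_found; then
            echo "$tool: installed"
        else
            echo "$tool: not found"
        fi
    done
    # The binary error log that the analysis tool would be pointed at.
    ls -l /var/adm/binary.errlog
}
```

Calling report_analyzers on the affected server prints one line per tool, followed by the size and date of the error log.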

Harmanjit_1
Frequent Advisor

Re: hwmgr status comp (critical)

Hi,

I have experienced the same problem.
You can try the following commands:

# hwmgr -unindict -id 200
# hwmgr -unindict -id 201

200 and 201 are the hardware IDs (HWIDs) of the CPUs.

Sometimes an Alpha system leaves a CPU in this state after a crash.
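
If several components are affected, the HWIDs can be pulled out of the status listing rather than typed by hand. The following is a rough sketch, not a tested Tru64 procedure: the function names are hypothetical, and it assumes the listing has the layout shown in the original question, with the HWID in the first field and the status summary in the third (a host name without spaces). It only prints the unindict commands; it does not run them.

```shell
#!/bin/sh
# Print the HWID of every component whose status summary is "critical".
# Expects status-listing lines on stdin, for example:
#   200: HOSTNAME critical online available high CPU0
# Header and separator lines are skipped because field 1 is not "<number>:".
critical_hwids() {
    awk '$1 ~ /^[0-9]+:$/ && $3 == "critical" { print $1 + 0 }'
}

# Dry run: turn each HWID on stdin into the matching unindict command line.
# Nothing is executed; the commands are only printed for review.
print_unindict_cmds() {
    awk '{ print "hwmgr -unindict -id " $1 }'
}
```

On an affected server the status listing would be piped through both functions, for example "hwmgr status component | critical_hwids | print_unindict_cmds". Review the printed commands before running any of them, and check the error log first as advised in the earlier reply.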
Ivan Ferreira
Honored Contributor

Re: hwmgr status comp (critical)

Thanks for the responses.

More information is also available in:

Online Addition and Removal

man hwmgr_ops