Re: ems event #47

 
radi_1
Frequent Advisor

ems event #47

Hi,
I have an HP rp5470 server running HP-UX 11.11. Root's mail receives an EMS event reporting a power supply error; it suggests checking the power connections to the server power supplies and the AC source. I did that and found no problems there, and no error LEDs are lit on the two power supplies. The server is still running fine. The strange thing is that the event is reported at exactly the same time each day. Any suggestions?
Regards
never take simple matters for granted
5 REPLIES
generic_1
Respected Contributor

Re: ems event #47

Is there a third power supply on that model that is not powered up? Also make sure your STM, PDC, and diagnostics are up to date. It also would not hurt to check whether your facilities folks ran any power tests or had outages.
radi_1
Frequent Advisor

Re: ems event #47

Hi,
I will check for that third power supply. Yes, there have been many voltage outages recently. For now, though, I need to find the cause of this error so that I can stop it.
Thanks very much.
never take simple matters for granted
Zeev Schultz
Honored Contributor

Re: ems event #47

Hi

Run the following (it saves going to the machine to check physically that each P/S is in place):

echo "sel dev 1;info;wait;il" | /usr/sbin/cstm

The first page of the output has a table showing the power supplies and fans (present/not present).

Which EMS/STM version is installed? (Check with "swlist -l bundle | grep -i online".)
If it is earlier than Dec 2003, I suggest upgrading (on an 11i system this doesn't need a reboot anyway). Also install the EMS/STM patches.

rgds,

Zeev
So computers don't think yet. At least not chess computers. - Seymour Cray
generic_1
Respected Contributor

Re: ems event #47

If you have a third power supply and it is not plugged in, or if you had voltage problems, that could have tripped your alerts. I would correlate the event times with the voltage issues, just to see whether the connection looks plausible.
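
To do that correlation, something like the following could pull out just the event timestamps for comparison with the facilities outage records. This is only a sketch: the function name event_times is made up for illustration, and both the log path and the "Notification Time:" line format are assumptions about how the EMS hardware monitors log events on your system, so check them against your own event log or a saved copy of root's mail first.

```shell
#!/bin/sh
# List the notification time of every logged EMS event so the times can be
# compared with the facilities team's outage records.
# ASSUMPTION: each event carries a line of the form
#   "Notification Time: <date>"
event_times() {
    # $1 = path to the event log (or a saved copy of root's mail)
    sed -n 's/^Notification Time:[[:space:]]*//p' "$1"
}

# ASSUMPTION: default log location for the EMS hardware monitors;
# override by setting LOG in the environment.
LOG=${LOG:-/var/opt/resmon/log/event.log}
if [ -r "$LOG" ]; then
    event_times "$LOG"
fi
```

If the times line up with the outages, the first event probably marks when the condition was first detected.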
Andrew Merritt_2
Honored Contributor

Re: ems event #47

The events are presumably from the dm_core_hw monitor, and they should be identifying which power module they think is faulty, e.g.:

Device identification information:

Number of failed power supplies..........: 1
Location of failed power supplies........: compute cabinet
Failed power supply number(s)............: 0

What does the full event show?

They are logged at the same time of day because of the way the EMS Hardware Monitors work. They limit how often each event is reported, and for this event, 47 from dm_core_hw, the limit is once per day. So, once the condition that leads to the event has been detected and reported, it will be 24 hours before it gets reported again, even if it is detected every time the monitor polls (every 15 minutes for dm_core_hw).
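
As a rough illustration of that throttling, here is a toy simulation in plain sh. It is not the EMS implementation, just a model of "poll every 15 minutes, report at most once per 24 hours"; the function name simulate_reports and the 10:15 start time are made up for the example.

```shell
#!/bin/sh
# Toy model: a monitor polls every POLL minutes and finds the fault each
# time, but a report is sent only if LIMIT minutes have passed since the
# previous report.
simulate_reports() {
    poll=$1      # polling interval in minutes
    limit=$2     # minimum gap between reports in minutes
    start=$3     # minute of the day at which the fault is first detected
    days=$4      # number of days to simulate
    now=$start
    end=$((start + days * 1440))
    last=-1      # -1 means nothing has been reported yet
    while [ "$now" -lt "$end" ]; do
        if [ "$last" -lt 0 ] || [ $((now - last)) -ge "$limit" ]; then
            printf 'day %d %02d:%02d\n' \
                $((now / 1440)) $((now % 1440 / 60)) $((now % 60))
            last=$now
        fi
        now=$((now + poll))
    done
}

# Fault first detected at 10:15 (minute 615), simulated for three days.
simulate_reports 15 1440 615 3
# prints:
#   day 0 10:15
#   day 1 10:15
#   day 2 10:15
```

Of the 96 polls in each day only one produces a report, and it always falls at the time of day of the first detection, which matches what you are seeing.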

When did the events first start getting reported? Did any hardware or software changes happen to the system around that time?

I would echo the advice to make sure the OnlineDiags are up-to-date.

Andrew