ProLiant Servers (ML,DL,SL)

G5 ilo errors & server reboots

Muzz_1
Occasional Advisor

G5 ilo errors & server reboots

Hi There,

I had an odd problem this morning that appears to have affected all G5 servers on the local network. I've attached some entries from the iLO log and the event logs from an affected Windows server.

The sequence of errors repeats a number of times over the following 30 minutes, finally resulting in the server being reset, booting up, and complaining of an unexpected shutdown.

The problem appears to have affected all G5 servers regardless of the OS, as well as a number of internal JetDirect devices on the same network, which hung.

The background to all this was a problem with a switch on the network that caused intermittent connectivity. However, I am a little worried that the iLO cards appear to have been the cause of the G5 servers rebooting.

Has anybody seen anything like this before?

Regards

Murray
6 REPLIES
john_347
Regular Advisor

Re: G5 ilo errors & server reboots

Hi Murray,
I would upgrade the iLO firmware and the s driver. While we haven't had the same issue, we did get multiple sensor alerts from iLO until it was updated.

Sorry I can't be of more help.

John
marcus1234
Honored Contributor

Re: G5 ilo errors & server reboots

Muzz_1
Occasional Advisor

Re: G5 ilo errors & server reboots

Hi John & Mark,

This also affected Linux machines; see the extract from /var/log/messages below. Alongside this, the iLO log displays similar messages about ASR and, curiously, power.

Has anybody seen anything similar on Linux?

It is a pity that the advisory doesn't go into any more detail than an unexpected reboot; the circumstances that could cause this would be useful. :)

Nov 24 07:28:32 esx01 kernel: ipmi_kcs_sm: kcs hosed: IBF not ready in time
Nov 24 07:28:42 esx01 kernel: ipmi_kcs_sm: kcs hosed: IBF not ready in time
Nov 24 07:28:57 esx01 hpasmxld[7436]: OsKcsExecCmd: IPMI NetFN 0x36 CMD: 0x2 has timed out!
Nov 24 07:29:37 esx01 last message repeated 4 times
Nov 24 07:30:37 esx01 last message repeated 6 times
Nov 24 07:30:37 esx01 hpasmxld[7436]: OsKcsExecCmd: Expected Cmd (0x2), Msg ID (0x42472f)
Nov 24 07:30:37 esx01 hpasmxld[7436]: OsKcsExecCmd: Received Cmd (0x2), Msg ID (0x42472d)
Nov 24 07:30:37 esx01 hpasmxld[7436]: OsKcsExecCmd: IPMICTL_RECEIVE_MSG_TRUNC returned EAGAIN
Nov 24 07:31:46 esx01 kernel: ipmi_kcs_sm: kcs hosed: IBF not ready in time
Nov 24 07:34:15 esx01 last message repeated 2 times
Nov 24 07:35:25 esx01 last message repeated 2 times
Nov 24 07:37:21 esx01 kernel: ipmi_kcs_sm: kcs hosed: IBF not ready in time
Nov 24 07:47:18 esx01 syslogd 1.4.1: restart.
Nov 24 07:47:18 esx01 syslog: syslogd startup succeeded
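
In case it helps anyone compare hosts, here is a rough Python sketch for tallying these messages from a copy of /var/log/messages. It assumes the classic syslog layout shown in the extract above, and the function and category names (join_wrapped, count_kcs_events, kcs_hosed, ipmi_timeout) are made up for illustration; they are not part of any HP tool. It only counts log lines, so it says nothing about the cause.

```python
import re
from collections import Counter

# Classic syslog layout: "Mon DD HH:MM:SS host message"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+(.*)$")
REPEAT_RE = re.compile(r"^last message repeated (\d+) times?$")


def join_wrapped(lines):
    """Glue lines without a timestamp prefix onto the preceding record.

    Pieces are joined directly, which is good enough for the substring
    matching done below.
    """
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if LINE_RE.match(line) or not records:
            records.append(line)
        else:
            records[-1] += line
    return records


def count_kcs_events(lines):
    """Count 'kcs hosed' and IPMI timeout messages per host.

    A 'last message repeated N times' line is credited to the previous
    message logged by the same host.
    """
    counts = Counter()
    last_key = {}  # host -> category of that host's most recent message
    for record in join_wrapped(lines):
        m = LINE_RE.match(record)
        if not m:
            continue  # not a syslog record
        _, host, msg = m.groups()
        msg = msg.strip()
        rep = REPEAT_RE.match(msg)
        if rep:
            key = last_key.get(host)
            if key:
                counts[(host, key)] += int(rep.group(1))
            continue
        if "kcs hosed" in msg:
            key = "kcs_hosed"
        elif "has timed out" in msg:
            key = "ipmi_timeout"
        else:
            key = None
        last_key[host] = key
        if key:
            counts[(host, key)] += 1
    return counts


def report(path):
    """Print one line per (host, category) for the given log file."""
    with open(path, errors="replace") as fh:
        for (host, key), n in sorted(count_kcs_events(fh).items()):
            print(f"{host} {key} {n}")
```

Calling report("/var/log/messages") on each affected host (or on copies of the files) gives a quick per-host count to line up against the switch outage window.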
marcus1234
Honored Contributor

Re: G5 ilo errors & server reboots

OK Muzz, hm... regarding the JetDirect traffic:
BOOTP, ARP, and RARP traffic during or after a network issue can cause iLO issues under "special circumstances".

I would make sure all traffic on the network is secure, and look for vulnerabilities.

Start here: http://www.networkscanning.com/General-VSSF.html

Try netstat -abno on the Windows boxes and verify the PIDs against Task Manager.

Best to be on the safe side.

Could be a coincidence...

Hope this is of some benefit.
Enjoy :)

Muzz_1
Occasional Advisor

Re: G5 ilo errors & server reboots

Hi Mark,

That's interesting; there was a lot of ARP traffic at the time. Do you have any more information on iLO issues under these circumstances?

Bruno Fleury_1
Occasional Visitor

Re: G5 ilo errors & server reboots

Hi Murray,

Did you find out what happened that day when you had these errors on your ESX host?
Did you notice any broadcast storm initiated by this ESX host?

I would be interested to know.

I had the same error from an ESX host, and at the same time a broadcast storm was initiated by that host.

I'm not sure whether this is the cause or the consequence.

Thanks for your help.

Bruno