ProLiant Servers (ML,DL,SL)

Robert W. Eastman Jr._2
Frequent Advisor

iLO 2 Access all of a sudden not working on multiple servers

We are starting to experience an issue where iLO 2 suddenly stops functioning. Sometimes you can ping the iLO address, but most of the time you cannot. This is happening on DL380 G5s and ML350 G5s. Firmware is at least 1.60, since almost all of them are new servers as of the quarter end of 2008. The only thing that has really changed is that we updated HPSIM to 5.3. I can't see how communication from HPSIM to the servers would stop the iLO from functioning. It appears that restarting the server gets the iLO communicating again.

Is anyone else seeing this?
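
For anyone who wants to find out which iLOs have dropped off without pinging each one by hand, here is a minimal sketch in Python. It assumes the iLO web interface listens on TCP port 443 (the usual default, but check your own configuration). The host names in the usage note and the function names are made up for illustration.

```python
import socket


def ilo_reachable(host, port=443, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 443 is an assumption (the default HTTPS port of the iLO web
    interface); `connect` is injectable so the check can be tested offline.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
    conn.close()
    return True


def unreachable_ilos(hosts, **kwargs):
    """Return the subset of hosts whose iLO did not answer."""
    return [h for h in hosts if not ilo_reachable(h, **kwargs)]
```

Calling `unreachable_ilos(["ilo-office1.example", "ilo-office2.example"])` on a schedule would give a list of the iLOs that have gone quiet. A TCP connect is used instead of ICMP ping because it needs no special privileges, but it only shows that the port answers, not that the iLO is fully healthy.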
Dream On Alice This Ain't Wonderland
(NTFS) No Time For Stupidity
Bijl
Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

Hi Robert,
Quick question: what OS are you using?
Robert W. Eastman Jr._2
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

This is happening on Windows 2003 SP2 and also ESX 3.5 Update 1. I had another one stop functioning last night. Since the only thing that has really changed is the HPSIM version, I am beginning to wonder whether that is what's causing it.
Dream On Alice This Ain't Wonderland
(NTFS) No Time For Stupidity
Cookie_2
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

Hi Rob,

I'm not sure if HP SIM is causing this. But since it's an issue with iLO 2, I would suggest upgrading the firmware to 1.70.

Regards,
Cookie
Sometimes a Loser Wins!!
Robert W. Eastman Jr._2
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

I don't think HP SIM is causing this either; it's probably just a coincidence that I updated HPSIM and a week later it started happening.

I now have a total of about 30 servers, all on version 1.60 of the iLO2 firmware, whose iLOs are just disappearing. The only way to get them back, even just to update them, appears to be to power the server down and pull the power cords so the iLO itself actually loses power. After that we are able to access the iLO. Unfortunately we are in the midst of our busy season (tax) and we don't have technically competent people in our remote offices who can do what we need them to do. The biggest issue is that most of the affected servers are running ESX 3.5, so when we do have an issue with the server, without the iLO we have no way of power-cycling it to get it back up and running.

I guess my only alternative is a waiting game: wait until we have to reboot each server, then walk the users in the remote offices through unplugging each power cord so the iLO no longer receives power.

HP is still indicating that they have no known issues with 1.60, but hell, I can show them 30 :)
Dream On Alice This Ain't Wonderland
(NTFS) No Time For Stupidity
MacSWW
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

I'm also getting this exact same problem on 4 of our 5 Windows 2003 SP2 file servers. Interestingly, all 4 are DL360 G5's. The only one with a working iLO is a DL360 G3. Firmware is v1.61 and they've been working fine since they were installed at the end of last year. They've all stopped working in the last few weeks (earliest was on 4th Feb) and we've done nothing to the servers that might cause this.
The HP System Management page is showing the Interface Status as 'Not Responding' and the iLO port won't ping.
I've tried to update the firmware to 1.70, but it fails saying 'Unable to communicate with the management processor'.

There are also eventlog entries saying:
Remote Insight Agent: The Remote Insight Board/Integrated Lights-Out has detected a controller interface error.
[SNMP TRAP: 9006 in CPQSM2.MIB]

Does anyone know if there is a hard iLO2 reset we can try? I'm obviously trying to find a solution that doesn't involve taking the file servers down.
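
If you export the event log to text, the trap line quoted above can be picked out with a small script. This is only a sketch: it assumes the entries keep the exact `[SNMP TRAP: <number> in <MIB>]` shape shown in this post, and `find_traps` is a hypothetical helper name.

```python
import re

# Matches the bracketed tag as it appears in the quoted event-log entry,
# e.g. "[SNMP TRAP: 9006 in CPQSM2.MIB]". The format is assumed from that
# single sample and may differ on other agent versions.
TRAP_RE = re.compile(r"\[SNMP TRAP:\s*(\d+)\s+in\s+([\w.]+)\]")


def find_traps(log_text):
    """Return a list of (trap_number, mib_name) tuples found in log_text."""
    return [(int(num), mib) for num, mib in TRAP_RE.findall(log_text)]
```

Counting how many times trap 9006 appears per server, and when it first shows up, might help pin down the date each iLO stopped responding.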
Robert W. Eastman Jr._2
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

The only way we have found so far to get the iLO to start responding again in order to update it to 1.70 was to physically power off the server and have the plugs removed. This way the iLO no longer gets power and will reset. Simply shutting down the server does not work because power is still flowing to the iLO.

Hopefully your servers are local and you can do this. Unfortunately the majority of ours are remote offices with no one in the office that is technical.
Dream On Alice This Ain't Wonderland
(NTFS) No Time For Stupidity
Robert W. Eastman Jr._2
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

Jason, are you using HPSIM 5.3?
Dream On Alice This Ain't Wonderland
(NTFS) No Time For Stupidity
MacSWW
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

Hi Rob,

Thanks for the reply. Yes, we're running HPSIM 5.3, which was upgraded from 5.2 on 3rd Feb. The first of our iLO2 problems started on the 4th Feb, but cleared 5 minutes later. It then went down again on 9th and has stayed down since.

I'm reluctant to point the finger at SIM (although it does seem like quite a coincidence), but that's based on nothing but gut instinct as I honestly don't know enough about how it works yet.

Thankfully, all my servers are fairly local, so I can come in and power them down out of hours. Just out of interest, what happens after you upgrade to firmware 1.7? Does iLO2 miraculously start working again and, if so, for how long?

I'm going to try and do some more investigation this week, so I'll post any results.
Robert W. Eastman Jr._2
Frequent Advisor

Re: iLO 2 Access all of a sudden not working on multiple servers

Just as a test, I updated one server locally to firmware 1.70, and on another server I updated only the support pack, to PSP 8.15. On the PSP 8.15 server I had to have power completely removed in order to get the iLO back to a responsive state; once that was done I was able to access the iLO again. After about 3 days of running with just the PSP 8.15 update, that server's iLO became unresponsive again, while the server on firmware 1.70 is still up and functioning. It appears that 1.60 is the issue here.
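
If you keep an inventory of iLO firmware versions, a rough way to list the servers still below 1.70 is sketched here. The server names and the inventory dictionary are invented for illustration. Versions are assumed to be written with a two-digit minor part, as in this thread ("1.70", not "1.7").

```python
def parse_version(text):
    """Turn a string such as '1.60' or 'v1.61' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def needs_update(firmware, minimum="1.70"):
    """Return True if the firmware version is older than the minimum."""
    return parse_version(firmware) < parse_version(minimum)


# Hypothetical inventory: server name -> iLO 2 firmware version.
inventory = {"dl380-a": "1.60", "dl360-b": "v1.61", "ml350-c": "1.70"}

# Names of servers that would still need the firmware update.
to_update = sorted(name for name, fw in inventory.items() if needs_update(fw))
```

The 1.70 threshold comes only from what has been observed in this thread so far (1.60 and 1.61 affected, 1.70 still running), not from any statement by HP.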
Dream On Alice This Ain't Wonderland
(NTFS) No Time For Stupidity