BladeSystem - General

BL460c's hanging up

Ian McGann
Advisor

BL460c's hanging up

I have had 3 BL460c's hang on reboot in the past 2 days. They're stuck at a blank grey screen, and doing a reset or cold boot from iLO does no good.

In two instances, I came in the next morning and powered the servers on from iLO and they booted no problem. On the third server, it was during the day and reseating the server in the enclosure did the trick.

Two of the servers are just a few months old, one is about a year and a half old. We're running Windows Server 2003 Enterprise with SP2. All servers have iLO 1.50. The OA's are all 2.20. Oh, and all 3 servers are in different enclosures...

Has anybody seen this behavior before? Any suggestions are appreciated.

Thanks,
Ian
36 REPLIES
Ian McGann
Advisor

Re: BL460c's hanging up

Forgot to mention that there are no indications of any hardware problems/failures in the sys mgmt homepage, IML log, iLO log, or even in the Windows log.
WFHC-WI
Honored Contributor

Re: BL460c's hanging up

Remove the blades and take a look at the connectors that link the blade to the enclosure's midplane. We experienced a similar problem and found it to be a damaged pin.

Good luck!
Ian McGann
Advisor

Re: BL460c's hanging up

Thanks, I'll check it out! One thing, though: Did you see this happening consistently? I've since rebooted the servers and they booted fine..
Ian McGann
Advisor

Re: BL460c's hanging up

WFHC-WI,
The backs of the servers just have plugs; the only pins would be at the back of the enclosure. I shined a flashlight back there, but didn't see any obvious problems. Just for clarification, you're saying the pins on the midplane were bent?

I've attached a picture of the back of the server...
Raghuarch
Honored Contributor

Re: BL460c's hanging up

The pins on the back of the blade look fine to me. Normally, if you have a damaged midplane, the server will also get damaged when you insert it.

You mentioned you saw it once, right? Since you are stuck at a blank grey screen, I assume the server was able to power on. Did you physically check whether the server's front LEDs were green?

You can update your OA (I don't think it is related to the problem, but sometimes it does solve existing problems).
Download the latest 2.25 OA from the link below:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=329290&prodSeriesId=3188465&swItem=MTX-b75c040de0a842dcae2c96efb5&prodNameId=3188475&swEnvOID=2065&swLang=8&taskId=135&mode=3


I assume you have the latest ROM on the Blades:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=3288156&prodTypeId=15351&prodSeriesId=1842750&swLang=8&taskId=135&swEnvOID=2023#12212

Try updating the OA and ROM. Keep us posted.
Ian McGann
Advisor

Re: BL460c's hanging up

The ROM's on the servers were all updated yesterday (after the problem). But, the servers did boot cleanly once BEFORE updating the ROM, so.... ? I've run into one-time issues like this before, and am usually able to prove them as just a one-off, but one of these servers is a new DC and another a node in a major file cluster, so understandably, I'm a little nervous.

I'll update the OA's and go from there.
JKytsi
Honored Contributor

Re: BL460c's hanging up

Usually, if these standard tricks do not get the servers working again, I call HP and ask them to replace the mainboard.

You can find me from Twitter @JKytsi
Ian McGann
Advisor

Re: BL460c's hanging up

Thanks to all that have replied, but I still haven't found a solution. I had a fourth server have this problem last night and I had to call and have someone in the data center pull it out and reseat it in the enclosure.

All of my OA's have been upgraded to FW 2.25. I've talked to the local CE's and they'll be researching this and coming on site later in the day..

Any other thoughts?
Sean Brandon
Occasional Visitor

Re: BL460c's hanging up

Did you find a solution to this? We have the exact same issue. Thanks.
Eric_76
Regular Advisor

Re: BL460c's hanging up

"Also looking for solution"
We too are having the exact same problem.
We have multiple blades dropping off. The OA says the blade is OK and green, but physically the server shows an amber light and is nonresponsive. The only way to get it back is to unplug and re-plug the blade.
We have BL460c G1 blades with iLO 1.50 running W2K3 Std.
BladeSystem c7000 Enclosure, OA ver. 2.25
Ian McGann
Advisor

Re: BL460c's hanging up

Wow, I had totally forgotten about this thread. Has the problem gone away? Pretty much.. Do I know why? Not exactly...

In short, I battled back and forth with a local CE and HP Escalation on this. Upgrading to iLO 1.60 seems to have fixed the problem, however we did have one server with the same problem after the 1.60 update.

So, it seems that iLO 1.60 fixed the problem, but there is no explanation of why that one server still froze with it. HP has been unable to give me a reason other than "It seems as if the issue has been resolved."
Dan Silva
Occasional Visitor

Re: BL460c's hanging up

Hi, everyone:

Based on my experience, it appears iLO Firmware 1.60 should fix this problem even though it is not mentioned in the "fixes" page.

Download Drivers and Software: http://h20000.www2.hp.com/bizsupport/TechSupport/DriverDownload.jsp?lang=en&cc=us&prodNameId=3288156&taskId=135&prodTypeId=3709945&prodSeriesId=1842750&lang=en&cc=us

Choose your OS, then "Firmware - Lights Out Management".

If anyone has this hang after updating to 1.60 (or recently released 1.61), please post to this group and log a case with HP.

I do not believe a system board will fix the issue, so if you have a lot of servers that have this same issue (especially if they're in different enclosures), it's important you log a case with HP so it can be analyzed.

But, of course, try the iLO firmware update first and update the System ROM, ProLiant Support Pack, etc, to make the call center happy.
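For anyone with many blades to check, the comparison against iLO 1.60 can be scripted. The following is only a minimal sketch: the blade names and the inventory dictionary are hypothetical, the 1.60 threshold is simply the version suggested in this thread (not an official HP minimum), and version strings are assumed to be plain major.minor numbers, so "1.7" is read as equal to "1.70", which is how posters here use them.

```python
from decimal import Decimal

# Minimum iLO firmware suggested in this thread; not an official HP threshold.
MIN_ILO = Decimal("1.60")

# Hypothetical inventory: blade name -> iLO firmware version string.
inventory = {
    "enc1-bay3": "1.50",
    "enc2-bay7": "1.60",
    "enc3-bay1": "1.61",
    "enc3-bay9": "1.7",
}


def needs_ilo_update(version: str, minimum: Decimal = MIN_ILO) -> bool:
    """Return True if the version is numerically older than the minimum."""
    return Decimal(version) < minimum


# Blades that are still below the suggested iLO version.
outdated = sorted(name for name, ver in inventory.items() if needs_ilo_update(ver))
print(outdated)
```

Decimal is used instead of float so that the comparison is exact for two-digit minor versions. How the inventory gets populated (by hand from the OA, or exported some other way) is left open here.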

Regards,

Dan
Ian McGann
Advisor

Re: BL460c's hanging up

Thanks again, Dan.

Ian
Fabio_S
Advisor

Re: BL460c's hanging up

Hello,
same problem here. After reading this thread a couple of months ago, I updated the firmware on the blades' iLOs and the enclosure's OA.

But the issue appeared again yesterday when I tried to reboot one of the blades.
It seems like it can't start and hangs at the grey screen. Nothing works except physically unplugging and re-plugging the blade.

My feeling is that it's something hardware-related in the connection between the enclosure and the blades, since this happens on all blades (though not always).

Anyway, I've just updated the enclosure's OA, and the iLO of only the affected blade, to a new firmware release.

If the issue persists, I'll open a ticket with HP.

I'll let you know....
Ken Henault
Honored Contributor

Re: BL460c's hanging up

I have seen many strange issues like this fixed by reseating the OA tray. Not just the OAs, but the tray they sit in. This will require that the OAs are removed first, then the tray, then reseat everything. This can be done without disrupting the servers.

I've seen this fix so many strange issues that I now make it part of my install to reseat every OA tray during the install. They just seem to have an issue when in transit. Once reseated the issues don't come back.

I hope this helps with this issue.

Ken
Ken Henault
Infrastructure Architect
HP
rvilat
Occasional Visitor

Re: BL460c's hanging up

Hi,

We have the same problem here, running BL460c's in c7000 chassis. The servers have 2x E5430 CPUs and 32 GB RAM, running SUSE Linux 10 SP1.

When rebooting the servers, the server will shutdown ok, but does not reset or POST. ILO shows a blank screen and the VSP does not connect. Interestingly, the OA shows the power consumption as 509W compared to 200W when the blade runs normally. The problem is intermittent and we have had 8 out of 72 blades across 5 chassis fail so far.
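If the elevated power reading turns out to be a consistent symptom, it could be used to spot hung blades without opening each iLO console. The sketch below is only an illustration of that idea: the readings are made-up values in the same range as the 200W and 509W figures above, the 2x ratio is an arbitrary assumption, and this thread contains only one report of the power anomaly, so it is unconfirmed as an indicator.

```python
from statistics import median

# Hypothetical readings: bay number -> present power draw in watts,
# e.g. copied by hand from the OA power display.
readings = {1: 198.0, 2: 205.0, 3: 509.0, 4: 201.0}


def suspect_bays(watts_by_bay, ratio=2.0):
    """Return bays drawing at least `ratio` times the median reading."""
    baseline = median(watts_by_bay.values())
    return sorted(bay for bay, w in watts_by_bay.items() if w >= ratio * baseline)


print(suspect_bays(readings))
```

The median is used as the baseline so that a single hung blade does not pull the reference value upward the way a mean would.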

Reseating the blade has not always worked. Usually an HP engineer has come onsite and removed RAM, mezzanine cards, CPUs, etc. until it comes back to life.

HP's standard response has been to upgrade ILO/BIOS/OA/etc but the problem exists at all firmware versions. I do not think it has been addressed by HP.

Has anyone found a satisfactory solution to this?

thanks
rosh

Simon Grant IRL
Occasional Advisor

Re: BL460c's hanging up

I have had the same problem with 2 bl460c's. I rebooted one of them yesterday and it failed to reboot. No video via the port on the front and just the grey screen via ILO. As mentioned above no temperature data is reported for that blade in the OA console. I rebooted a 2nd today and had the same issue.

I have 5 BL460c's running ESX, all at ILO v1.5. OA is at v2.25

Has anyone gotten any further with the root causes? I am afraid to do anything with the remaining blades. Thanks to HA / VMotion all my VM's are still up and running, just on fewer hosts.



Ian McGann
Advisor

Re: BL460c's hanging up

99% of our issues stopped after we upgraded the servers to iLO v1.60.
rvilat
Occasional Visitor

Re: BL460c's hanging up

We were running iLO 1.70, OA 2.32 and BIOS 01/APR/08 when we had the problems.

Just upgraded to OA 2.41 and BIOS 02/NOV/08 and are doing a reboot test every 1hr to see if the problem can be replicated.
Simon Grant IRL
Occasional Advisor

Re: BL460c's hanging up

Overnight I upgraded the OA to v2.41 and the ILO to v1.7 on all blades, including the problem one.

Unless I am missing something I cannot upgrade the System BIOS on the blade that fails to boot, in its current state.

Let me know how the testing goes. Thanks.
Batesydw
Occasional Visitor

Re: BL460c's hanging up

I'm getting the same problem on a number of blades. We have 8x c7000 enclosures and have had the problem with the 4 different firmware versions we have run. I'm just about to go to OA 2.41, VC 2.01, blade bl460c1,5 11/nov/08 and iLO 1.70.

None of the firmware upgrades I've used to date have ever worked. It is still happening, and at random. Let's hope the latest works.

The problem is that the blade powers on but the iLO screen stays blank, etc. A call has been logged with HP.
rvilat
Occasional Visitor

Re: BL460c's hanging up

An update to our testing:

We rebooted 8 blades in 1 chassis every hour for 5 days. All of them worked ok with the latest firmware. So it looks promising that the issue may have been fixed.

However, this does not preclude it being caused by rebooting after the systems have been up a long time.
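To put a rough number on what a clean test run can show, the sketch below computes the largest per-reboot failure probability that is still consistent with seeing zero failures. It assumes every one of the 8 blades was rebooted 24 times a day for all 5 days and that reboots are independent of each other; both are assumptions, not details confirmed above.

```python
blades = 8
reboots_per_day = 24  # one reboot per hour (assumed to run around the clock)
days = 5

# Total clean reboots observed under the assumptions above.
n = blades * reboots_per_day * days


def upper_bound_failure_rate(clean_trials, confidence=0.95):
    """Largest failure probability p consistent with zero failures.

    Solves (1 - p) ** clean_trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_trials)


bound = upper_bound_failure_rate(n)
print(f"{n} clean reboots: per-reboot failure rate below {bound:.2%} at 95% confidence")
```

Under these assumptions the bound works out to roughly 0.3% per reboot. It says nothing about failures that only appear after long uptime, because every reboot in the test happened after about one hour of uptime.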

Has anyone else had the issue after upgrading to latest firmware?
tomcy
Occasional Visitor

Re: BL460c's hanging up

Having the same problems here across multiple c7000 enclosures. A server gets rebooted and it does not come back. It is stuck on a blank grey screen when you connect via the iLO. It appears to respond to the virtual power commands, but it never gets past the grey screen.

These are BL460C's running ilo 1.70 and also 1.60. OA's are running 2.25 and 2.41. Have seen this on 4 BL460C's in the past 7 days. Resetting the e-fuse from the OA does not fix the problem.

Case open with support but HP has not provided any solution so far.

rvilat
Occasional Visitor

Re: BL460c's hanging up

What BIOS version are you running?