BladeSystem - General
04-17-2009 06:47 AM
Re: Critical Error Redundancy Lost
A slight correction to my post above: I noticed that the 2.24 version referred to in earlier posts was for BL680c blades. That version was released on April 2nd this year.
There was a ROM flash released for BL465c G5 blades dated April 3rd (version 2009.03.12), which I hope is the equivalent. The "Fixes" section for this release states:
"Resolved an issue where the Low Power Halt State (AMD C1 Clock Ramping) option in the ROM Based Setup Utility (RBSU) does not properly disable this power management feature for AMD Opteron 2300-series processors. Previous revisions of the System ROM would not disable this functionality even when this option was configured for Disabled. This fix is not required for systems configured with older versions of AMD processors or if the Low Power Halt State was enabled (which is the default state)."
"Resolved an extremely intermittent issue that could cause the system to encounter a system boot hang with a red screen."
Maybe one of these is the fix we are looking for, although neither seems to reference it explicitly.
Dave.
04-17-2009 08:52 AM
Re: Critical Error Redundancy Lost
My observations in answer to Brit's questions:
1) In my case, no. The whacked-out "Power Allocated" figure on the one BL680c-G5 which caused our problem did not occur at power-up; it occurred several days later. The first indication was an amber alert on the blade with an associated "Degraded Status". Once this issue occurred, and for as long as it existed, any blades that had been powered down prior to the event could not be powered up (because the cabinet thinks at that point that there is insufficient power). Any blades that were running fine when the issue occurred and were later powered off could not be powered back on, for the same reason. As long as all blades in the cabinet had been running prior to the event, there has been no observed performance degradation or loss of server capabilities, provided they stay powered up. But you're playing with fire here: you are running in a crippled mode and run the risk that a critical production server could go down for some other reason, and then you would not be able to bring it back up without powering off the root-cause blade as well.
2) In my experience, it did not actually affect the operational performance of a running server at all, as long as that server stayed powered up. As in 1 above, though, once the event occurs you cannot power down another blade in the same cabinet and then power it back up without first shutting down, upgrading the firmware on, or resetting the cache on the root-cause blade.
3) My blades are all x86-based, so I have no experience with Itanium.
4) I upgraded my blades to 2.25 and did not see a recurrence of exactly the same problem. However, with 2.25 I DID have a degraded status event on the BL680c-G5 1.5 weeks after the upgrade, but without the associated "Power Allocated" issue. For this reason I consider it a different event with a different, as-yet undetermined root cause. Note that with this issue on 2.25, I was able to power a blade down and back up again without any problems.
I've had very long discussions with the HP Support folks, and I know that this issue has been getting a lot of scrutiny both at the Response Center and in the Labs.
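The behaviour described in points 1 and 2 can be pictured with a toy power-budget model. This is only a sketch of the symptom as reported in this thread, not the Onboard Administrator's actual algorithm; the `Enclosure` class, its method names, and all wattage figures are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Enclosure:
    """Toy model: a blade may power on only if its request fits the remaining budget."""
    capacity_watts: int
    # bay number -> watts currently allocated to that powered-on blade (hypothetical bookkeeping)
    allocated: dict = field(default_factory=dict)

    def available(self) -> int:
        return self.capacity_watts - sum(self.allocated.values())

    def power_on(self, bay: int, request_watts: int) -> bool:
        # Admission check happens only at power-on; running blades are never re-checked.
        if request_watts > self.available():
            return False
        self.allocated[bay] = request_watts
        return True

    def power_off(self, bay: int) -> None:
        self.allocated.pop(bay, None)


enc = Enclosure(capacity_watts=4000)   # illustrative figure
enc.power_on(1, 800)                   # the blade that later misbehaves
enc.power_on(2, 800)                   # a healthy blade

enc.allocated[1] = 5000                # simulate the bogus "Power Allocated" figure

# Bay 2 keeps running, but once it is powered off it cannot come back.
enc.power_off(2)
blocked = enc.power_on(2, 800)         # False: the budget appears exhausted

# Powering off the root-cause blade frees the bogus allocation.
enc.power_off(1)
recovered = enc.power_on(2, 800)       # True
```

In this model a running blade is unaffected because the check is made only at power-on, which matches what I observed: nothing degrades until you try to bring a blade back up.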
06-10-2009 02:51 AM
Re: Critical Error Redundancy Lost
There are newer OA firmware versions (they appeared at the end of May 2009) which should resolve those "status: degraded" problems. The latest version is 2.51 (May 28, 2009), and I'm about to install it in my blade enclosure soon.
Has anyone tried this firmware version already?
06-10-2009 05:09 AM
Re: Critical Error Redundancy Lost
Regarding my comment from Apr 16, 2009 08:40:05 GMT:
"ROM Version "I17 02/24/2009" for "ProLiant BL680c G5" solved the problem in our case"
This ROM solved the power allocation issue but introduced a new failure: both ports of the HBA get the same WWPN. HP has since removed this ROM.
The power allocation issue occurs on only two BL680c blades. It doesn't matter which enclosure they are plugged into.
The new ROM Version "I17 05/10/2009" solved the power allocation issue in my case and assigns different WWPNs to the HBA ports.
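For anyone wanting to check for the duplicate-WWPN symptom after a ROM flash, here is a minimal sketch. It assumes you have already collected (port name, WWPN) pairs by whatever means you normally use; the function name, the port names, and the WWPN values shown are all made up for illustration.

```python
from collections import defaultdict


def find_duplicate_wwpns(ports):
    """Return {normalized_wwpn: [port names]} for every WWPN seen on more than one port.

    `ports` is an iterable of (port_name, wwpn) pairs; colons and letter case are ignored.
    """
    seen = defaultdict(list)
    for name, wwpn in ports:
        key = wwpn.replace(":", "").lower()
        seen[key].append(name)
    return {wwpn: names for wwpn, names in seen.items() if len(names) > 1}


# Made-up example: both HBA ports report the same WWPN, as with the withdrawn ROM.
bad = find_duplicate_wwpns([
    ("hba0-port1", "10:00:00:00:00:00:00:01"),
    ("hba0-port2", "10:00:00:00:00:00:00:01"),
])

# Made-up example: each port has its own WWPN, as expected after the newer ROM.
good = find_duplicate_wwpns([
    ("hba0-port1", "10:00:00:00:00:00:00:01"),
    ("hba0-port2", "10:00:00:00:00:00:00:02"),
])
```

An empty result means every port has a distinct WWPN; any entry in the result lists the ports that share one.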