03-29-2010 12:38 PM
C7000 chassis "Degraded" then "OK" emails.
We have email notification set up on the Onboard Administrator (OA) so that it sends out alerts if something goes wrong on the server.
Four times in the last week, we have received an email from the OA that says the enclosure status has changed to "degraded", followed about 25 seconds later by another email that the enclosure status has changed to OK.
The problem is that there is nothing wrong that I can find, and the OA logs don't show anything happening at those times.
Three of the four times (Wednesday, Saturday, and Monday) were at 4:10pm; however, there was another one at 10:19 on Monday.
In addition to both OA logs, I've checked the IML logs on all the blades, and I can't find anything listed.
Any suggestions would be appreciated.
03-29-2010 04:50 PM
Re: C7000 chassis "Degraded" then "OK" emails.
ISSUE:
Onboard Administrator shows a BL460c server blade running RHEL 5.3 (x86) with a status of "Degraded", even though all the BladeSystem firmware is up to date.
SOLUTION:
On any server running Linux or VMware, please ensure the OS has the ProLiant Support Pack/HPASM agents installed, as the iLO driver and agents communicate the server health to the OA. If the ProLiant Support Pack/HPASM agents are absent, the server status will appear as degraded in the OA.
Let me know if there are updates.
03-29-2010 11:51 PM
Re: C7000 chassis "Degraded" then "OK" emails.
We have the exact same issue on several C7000 enclosures running OA 2.51 and OA 2.52. We could not determine the cause, as we also couldn't find any events in the OA syslogs. The same goes for the server syslogs: nothing matches the time of the events. Our customer updated some enclosures to OA 2.60, and the false alert mails seem to be gone. From the engineering point of view, however, this is not a known issue, and they believe there is an OA firmware desynchronization between the active and standby OA (I could not find any evidence for that).
I'm very interested in this issue, as until now I couldn't find anyone else having it. I just asked my customer to try rebooting both OAs and then reflashing them with the currently running OA version (2.51 or 2.52) instead of flashing to OA 2.60, because I want to be sure that the problem is really solved by the 2.60 firmware and not simply as a side effect of reflashing the OAs. Please keep me updated.
Kind regards
Jean-Denis
03-30-2010 05:08 AM
Re: C7000 chassis "Degraded" then "OK" emails.
Both OA modules are at 2.60, the iLO on each blade is at 1.79, and the Virtual Connect modules are at 2.30 (Ethernet) and 1.40 (Fibre Channel).
04-01-2010 06:28 AM
Re: C7000 chassis "Degraded" then "OK" emails.
What seems very strange in our case is that the alert mails disappeared after the customer updated from OA 2.5x to OA 2.60. I also asked my customer to do the following on one enclosure showing the issue with OA 2.51:
=> restart both OA modules.
For the moment the alert mails have not recurred on this single enclosure, but it's too early to say that this solved the problem. If I find something I will let you know. This is a crazy problem, as we cannot capture any relevant information from the enclosure.
Regards
Jean-Denis
04-02-2010 05:09 PM
Re: C7000 chassis "Degraded" then "OK" emails.
This is really odd. So far it's happened 5 times, and four of them were at exactly the same time of day, 4:10pm. The interval in days is not constant, though: 3/24, 3/27, 3/29, 4/2.
I can't figure out what might be happening at exactly that time, and it's really frustrating that it won't say what was "degraded".
Next week I'm going to try reflashing the OA firmware. I'm already at 2.60, but maybe flashing and rebooting the OA will reset something.
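To make the "same time of day, different interval" observation concrete, here is a minimal sketch that computes the gaps in days and the weekdays for the four 4:10pm alerts listed above. The year 2010 is taken from the post timestamps; the helper names are only illustrative.

```python
from datetime import date

# Dates of the 4:10pm alerts as listed in the post (year assumed from the post timestamps).
alert_dates = [date(2010, 3, 24), date(2010, 3, 27), date(2010, 3, 29), date(2010, 4, 2)]

# Fixed weekday names so the result does not depend on the system locale.
WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def day_gaps(dates):
    """Return the number of days between each pair of consecutive dates."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def weekday_names(dates):
    """Return the weekday name for each date, in the order given."""
    return [WEEKDAY_NAMES[d.weekday()] for d in dates]


gaps = day_gaps(alert_dates)
weekdays = weekday_names(alert_dates)
print("Gaps in days:", gaps)
print("Weekdays:", weekdays)
```

Neither the gaps nor the weekdays repeat, so a simple "every N days" or "every such-and-such weekday" job would not explain the pattern by itself; only the time of day is constant.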
04-05-2010 07:08 AM
Re: C7000 chassis "Degraded" then "OK" emails.
If I can verify that we meet the requirements with our other components, I'll go ahead and update to OA 3.00.
04-05-2010 02:53 PM
Re: C7000 chassis "Degraded" then "OK" emails.
04-06-2010 05:02 AM
Re: C7000 chassis "Degraded" then "OK" emails.
With all the other fixes listed, it wouldn't surprise me if it did resolve the issue, but I would still like to get some info on the minimum firmware needed in the other components. The link provided under "firmware dependency" still only references OA 2.6.
It happened again yesterday, at exactly 4:10pm, just like most of the other times.
04-06-2010 06:59 AM
Re: C7000 chassis "Degraded" then "OK" emails.
I agree with you. I don't believe that there is a known issue with OA 2.60: as I already said, I had the same kind of issue with OA 2.51 and OA 2.52, and after flashing some enclosures to OA 2.60 my customer no longer gets alert mails from them. So I believe that either rebooting both OAs or reflashing the OA firmware "resets something", with the result that the enclosures no longer send alert mails. Unlike you, we get the alerts on random days and at random times, and the delay between the Degraded and OK status is always less than one minute.
My customer was able to capture a SHOW ALL when one of his enclosures went to Degraded and, as usual, there is no entry in the OA1 and OA2 syslogs. The only thing he could notice is the following information returned by SHOW ENCLOSURE STATUS:
Enclosure:
    Status: Degraded
    Unit Identification LED: Off
    Diagnostic Status:
        Internal Data    OK
        Redundancy       Failed
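When comparing several captures like this, it can help to pull out the non-OK diagnostic entries automatically. The following is only a rough sketch: it assumes the output looks like the fragment quoted above (a "Status:" line, then name/value pairs after "Diagnostic Status:"); real SHOW ENCLOSURE STATUS output may contain more fields, and the function name is made up for illustration.

```python
# The fragment captured in this thread, used as sample input.
SAMPLE = """\
Enclosure:
Status: Degraded
Unit Identification LED: Off
Diagnostic Status:
Internal Data OK
Redundancy Failed
"""


def parse_enclosure_status(text):
    """Return (overall_status, diagnostics) from a status fragment.

    Assumes the layout shown in the thread: entries after the
    'Diagnostic Status:' line are 'name value' pairs without a colon.
    """
    status = None
    diagnostics = {}
    in_diagnostics = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Status:"):
            status = line.split(":", 1)[1].strip()
        elif line.startswith("Diagnostic Status:"):
            in_diagnostics = True
        elif in_diagnostics and ":" not in line:
            # The last word is the value, everything before it is the name.
            name, _, value = line.rpartition(" ")
            diagnostics[name.strip()] = value
    return status, diagnostics


status, diagnostics = parse_enclosure_status(SAMPLE)
failed = [name for name, value in diagnostics.items() if value != "OK"]
print("Enclosure status:", status)
print("Non-OK diagnostics:", failed)
```

For the capture above, the only non-OK entry is Redundancy, which is the one hint available about what was degraded.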
I asked for the SHOW OA STATUS output, but I'm not sure that my customer had time to capture the SHOW ALL output. I discussed this kind of error with OA engineering, and it does not seem to be a known issue. They said that it could be due to a firmware desynchronization between the two OAs, but that does not seem to be the case for our enclosures. I will let you know if I can identify what is going on.
It seems to be an OA-related problem (when there are redundant OAs in an enclosure), but not specific to a given OA version, as I had the issue with 2.51 and 2.52 and you have it with 2.60 (we no longer have the issue on the enclosures we flashed from 2.5x to 2.60). I also asked my customer to restart both OAs on an enclosure which had the issue, and so far the alert mails have not recurred.
Regards
Jean-Denis