11-24-2014 08:09 AM - edited 11-24-2014 08:11 AM
Catch 22 -- can't update because critical alert, can't clear alert because outdated
I've got multiple issues, surely the result of my own lack of experience with this system, and it's very frustrating.
I have a P4300 G2 with a few bad drives (in a cluster with another identical one and a 4330). It doesn't like new replacement drives (reports them "faulty") and I'm pretty sure it needs a patch or firmware update or something.
The CMC's "Upgrades" tab shows a few updates. (To be honest, I'm not confident that these will help, but I have to try.) When I try to install them, it fails with "Pre installation test failed because management group 'HPSANGroup' has the following issues: Critical Alarms exist. Canceling all further installations." I believe the only fix for these Critical Alarms is an update.
I did find some patches and firmware updates that look like they somehow get installed directly on the P4000-series nodes but I have no idea how. The patches and firmware I found:
http://h20565.www2.hp.com/hpsc/swd/public/readIndex?sp4ts.oid=4118705&swLangOid=8&swEnvOid=54
The files are .patch and .upgrade files and I can't find any advice on installing them.
Can someone help steer me in the right direction so I can bail myself out?
11-24-2014 08:23 AM
Re: Catch 22 -- can't update because critical alert, can't clear alert because outdated
You cannot update anything while the system is in a critical state; you first need to fix the disk issue.
What does "a few" bad drives mean?
Once the RAID set has lost too many drives, it is gone and you probably need to rebuild the node from scratch after the bad disks are replaced. Then you can sync the nodes and install updates.
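As a rough illustration of what "too many drives" means, here is a minimal sketch of generic RAID arithmetic. It is not specific to LeftHand OS or the CMC, the RAID level of the node in this thread is not stated, and the names `GUARANTEED_TOLERANCE` and `raid_set_survives` are made up for the example.

```python
# Guaranteed number of disk losses a single RAID set survives, by level.
# Generic RAID arithmetic (assumption: one RAID set, no hot spares);
# not taken from LeftHand OS or the CMC.
GUARANTEED_TOLERANCE = {
    "raid5": 1,   # single parity
    "raid6": 2,   # double parity
    "raid10": 1,  # worst case: the second loss hits the same mirror pair
}


def raid_set_survives(level: str, failed_disks: int) -> bool:
    """Return True if a RAID set at `level` is guaranteed to survive
    `failed_disks` disks dropping out of the set."""
    return failed_disks <= GUARANTEED_TOLERANCE[level]


if __name__ == "__main__":
    # Show the survive/lost outcome for 0 to 3 failed disks at each level.
    for level in ("raid5", "raid6", "raid10"):
        print(level, [raid_set_survives(level, n) for n in range(4)])
```

Only disks that have actually dropped out of the set count here; a disk that is still active but flagged as unhealthy is a separate case.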
Hope this helps!
Regards
Torsten.
__________________________________________________
There are only 10 types of people in the world -
those who understand binary, and those who don't.
__________________________________________________
11-24-2014 08:40 AM
Re: Catch 22 -- can't update because critical alert, can't clear alert because outdated
Three drives show "Faulty" health. Two are brand new, bought to replace old drives that went "Faulty". The new drives also report "Faulty". I do not know how to fix the disk issue; I've already tried replacing them with brand new disks. Is there something else I can do?
Timeline:
- Disk 7 goes "Faulty". Order new disk, proper HP Spare # 508011-001.
- While waiting for that to arrive, Disk 1 goes "Faulty". Order a second one.
- First replacement arrives. Install in bay 7. Status reports "Rebuilding" for 3 days, then "Active" with Health as "Faulty".
- Second replacement arrives. Install in bay 7, assuming first replacement was DOA. Status reports "Rebuilding" for 3 days, then "Active" with Health as "Faulty".
- Try first replacement disk in bay 1. Status reports "Rebuilding" for 3 days, then "Active" with Health as "Faulty". Either two brand new drives are bad, the node has failed, or the new drives are incompatible with my old firmware.
- Disk 3 goes "Faulty".
CMC reports "Safe to Remove" as "Yes" for all disks.
12-03-2014 06:41 AM
Re: Catch 22 -- can't update because critical alert, can't clear alert because outdated
Nobody has any further thoughts on this?
In the meantime, Disk 2 now shows "Faulty".
Miraculously, I managed to do enough cleanup to crowd everything onto the two good nodes and take the bad node offline. I can't sustain this, but I can hold it long enough to do something with the failing node.
Can anyone suggest how to proceed?
Maybe I should remove the failed/replaced disks, reconfigure the RAID on that node to operate on fewer disks (and present less storage), re-join the node to the management group (if it was necessary to remove it), do the updates (since there are no alarms anymore), then see if it can talk to the new replacement disks.
12-03-2014 06:53 AM
Re: Catch 22 -- can't update because critical alert, can't clear alert because outdated
Even though certain versions of LeftHand OS want to manage the firmware from within the OS, I would try cold-booting the system from the current SPP (2014.09) and updating all the firmware. For MB1000FAMYU disks, firmware version HPD7 is included.
Problem Fixed:
- This firmware prevents a rare condition that may occur during a WRITE SAME command sequence that may result in incorrect data being written to the hard drive. The WRITE SAME command may be used during RAID ARRAY parity initialization.
This could be a reason.
Hope this helps!
Regards
Torsten.
12-10-2014 06:21 AM
Re: Catch 22 -- can't update because critical alert, can't clear alert because outdated
Just FYI, an update...
I was able to acquire the SPP (thanks to a recent server purchase). I ran it and it appeared to update a ton of stuff. Still no good. I also received one more drive from my vendor and that one is good.
With all the updates I'm now getting better diagnostic info and it turns out that the bad drives are SMART predicted failure, not actual failures. Dumb SMART.
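For anyone reading later, that distinction matters when deciding how urgent a swap is. A minimal sketch of the idea follows; the names `DiskHealth`, `reduces_redundancy`, and `should_replace` are hypothetical and are not the CMC's status strings or any HP API.

```python
from enum import Enum


class DiskHealth(Enum):
    """Hypothetical health categories, for illustration only."""
    OK = "ok"
    PREDICTED_FAILURE = "predicted_failure"  # still working, but SMART expects it to fail
    FAILED = "failed"                        # no longer usable


def reduces_redundancy(health: DiskHealth) -> bool:
    """Only a disk that has actually failed takes redundancy away right now."""
    return health is DiskHealth.FAILED


def should_replace(health: DiskHealth) -> bool:
    """Anything other than a healthy disk is worth replacing."""
    return health is not DiskHealth.OK


if __name__ == "__main__":
    for health in DiskHealth:
        print(health.name, reduces_redundancy(health), should_replace(health))
```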
So, I think I really did get two DOA drives in a row and this whole thing has been an incredibly inconvenient wild goose chase. The good news is that I have done some serious, badly needed cleanup of LUNs.
Thanks everyone, I'll update this thread again when there's more news.