10-16-2013 12:47 PM - edited 10-16-2013 01:00 PM
RAID problem - one disk shows predictive failure and the hot spare stays rebuilding eternally
Hi,
I recently got an alert that one disk was marked as Predictive Failure. I expected the hot spare to take over, but it stays in the rebuilding state forever.
The server hangs twice a day and I am forced to restart it manually.
Server info:
Server Type : "ProLiant DL380 G6"
Raid Controller : HP Smart Array P410i Controller
Controller Firmware Version : 2.50
I see these events in Windows 2003:
User Action
Check the Power-On-Self-Test (POST) errors, and take corrective action as needed.
WBEM Indication Properties
AlertingElementFormat: 0 0 (Unknown)
AlertType: 5 0x5 (Device Alert)
Description: "Power-On-Self-Test (POST) errors occurred during the last system startup."
EventCategory: 4 0x4 (System Hardware)
EventID: "1"
EventTime: "20131015175126.346000+000"
ImpactedDomain: 4 0x4 (System)
IndicationIdentifier: "{33ACC83B-F598-4BAF-B5A4-8C358799C2E0}"
IndicationTime: "20131015145114.859000-180"
NetworkAddresses[0]: "172.18.4.5"
OSType: 69 0x45 (Microsoft Windows Server 2003)
OSVersion: "5.2.3790"
PerceivedSeverity: 5 0x5 (Major)
ProbableCause: 8 0x8 (Configuration/Customization Error)
ProbableCauseDescription: "POST Errors Occurred"
ProviderName: "HP POST"
ProviderVersion: "2.3.0.0"
RecommendedActions[0]: "Check the Power-On-Self-Test (POST) errors, and take corrective action as needed."
Summary: "POST errors occurred"
SystemCreationClassName: "HP_WinComputerSystem"
SystemFirmwareVersion[0]: "2009.10.01"
SystemFirmwareVersion[1]: "2009.10.01"
SystemGUID: "33343934-3932-4247-3830-303448444542"
SystemModel: "ProLiant DL380 G6"
SystemName: "DELPRYASU-FS004.DELPRYASU.delegations.cec.eu.int"
SystemProductID: "494329-B21"
SystemSerialNumber: "GB8004HDEB"
TIME_CREATED: 130263330863468280 0x1cec9cf2b604ef8
VariableNames[0]: "POST Error Code"
VariableNames[1]: "POST Error String"
VariableTypes[0]: 3 0x3 (uint8)
VariableTypes[1]: 1 0x1 (string)
VariableValues[0]: "14"
VariableValues[1]: "POST Error: 301-Keyboard Error"
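For anyone reading the timestamps in the event above, here is a minimal Python sketch that decodes them. It assumes that TIME_CREATED is a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC) and that EventTime and IndicationTime use the CIM datetime layout with a trailing UTC offset in minutes. The function names are made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed FILETIME epoch: 1601-01-01 00:00:00 UTC, counted in 100 ns ticks.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ticks: int) -> datetime:
    """Convert 100 ns ticks since 1601-01-01 UTC to an aware datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def parse_cim_datetime(value: str) -> datetime:
    """Parse 'yyyymmddHHMMSS.ffffff+UUU', where UUU is the UTC offset in minutes."""
    naive = datetime(
        int(value[0:4]), int(value[4:6]), int(value[6:8]),
        int(value[8:10]), int(value[10:12]), int(value[12:14]),
        int(value[15:21]),
    )
    offset = timedelta(minutes=int(value[21:25]))  # e.g. "+000" or "-180"
    return naive.replace(tzinfo=timezone(offset))


# Values copied from the first event in the post.
created = filetime_to_datetime(130263330863468280)
event_time = parse_cim_datetime("20131015175126.346000+000")
indication_time = parse_cim_datetime("20131015145114.859000-180")

print(created.isoformat())
print(event_time.isoformat())
print(indication_time.astimezone(timezone.utc).isoformat())
```

Under these assumptions, TIME_CREATED and EventTime land within a millisecond of each other on 2013-10-15 around 17:51 UTC, and the IndicationTime offset of -180 minutes corresponds to a local clock three hours behind UTC.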
User Action
Check the Power-On-Self-Test (POST) errors, and take corrective action as needed.
WBEM Indication Properties
AlertingElementFormat: 0 0 (Unknown)
AlertType: 5 0x5 (Device Alert)
Description: "Power-On-Self-Test (POST) errors occurred during the last system startup."
EventCategory: 4 0x4 (System Hardware)
EventID: "1"
EventTime: "20131015175126.649000+000"
ImpactedDomain: 4 0x4 (System)
IndicationIdentifier: "{FD4D333F-7981-4A25-95F7-BD8A1D077BBC}"
IndicationTime: "20131015145126.346000-180"
NetworkAddresses[0]: "172.18.4.5"
OSType: 69 0x45 (Microsoft Windows Server 2003)
OSVersion: "5.2.3790"
PerceivedSeverity: 5 0x5 (Major)
ProbableCause: 8 0x8 (Configuration/Customization Error)
ProbableCauseDescription: "POST Errors Occurred"
ProviderName: "HP POST"
ProviderVersion: "2.3.0.0"
RecommendedActions[0]: "Check the Power-On-Self-Test (POST) errors, and take corrective action as needed."
Summary: "POST errors occurred"
SystemCreationClassName: "HP_WinComputerSystem"
SystemFirmwareVersion[0]: "2009.10.01"
SystemFirmwareVersion[1]: "2009.10.01"
SystemGUID: "33343934-3932-4247-3830-303448444542"
SystemModel: "ProLiant DL380 G6"
SystemName: "DELPRYASU-FS004.DELPRYASU.delegations.cec.eu.int"
SystemProductID: "494329-B21"
SystemSerialNumber: "GB8004HDEB"
TIME_CREATED: 130263330866491380 0x1cec9cf2b8e6ff4
VariableNames[0]: "POST Error Code"
VariableNames[1]: "POST Error String"
VariableTypes[0]: 3 0x3 (uint8)
VariableTypes[1]: 1 0x1 (string)
VariableValues[0]: "92"
VariableValues[1]: "POST Error: 1792-Drive Array Reports Valid Data Found in Array Accelerator"
User Action
Check the Power-On-Self-Test (POST) errors, and take corrective action as needed.
WBEM Indication Properties
AlertingElementFormat: 0 0 (Unknown)
AlertType: 5 0x5 (Device Alert)
Description: "Power-On-Self-Test (POST) errors occurred during the last system startup."
EventCategory: 4 0x4 (System Hardware)
EventID: "1"
EventTime: "20131015175126.936000+000"
ImpactedDomain: 4 0x4 (System)
IndicationIdentifier: "{47E606F8-AFC1-4465-90DD-60998ADC054A}"
IndicationTime: "20131015145126.649000-180"
NetworkAddresses[0]: "172.18.4.5"
OSType: 69 0x45 (Microsoft Windows Server 2003)
OSVersion: "5.2.3790"
PerceivedSeverity: 5 0x5 (Major)
ProbableCause: 8 0x8 (Configuration/Customization Error)
ProbableCauseDescription: "POST Errors Occurred"
ProviderName: "HP POST"
ProviderVersion: "2.3.0.0"
RecommendedActions[0]: "Check the Power-On-Self-Test (POST) errors, and take corrective action as needed."
Summary: "POST errors occurred"
SystemCreationClassName: "HP_WinComputerSystem"
SystemFirmwareVersion[0]: "2009.10.01"
SystemFirmwareVersion[1]: "2009.10.01"
SystemGUID: "33343934-3932-4247-3830-303448444542"
SystemModel: "ProLiant DL380 G6"
SystemName: "DELPRYASU-FS004.DELPRYASU.delegations.cec.eu.int"
SystemProductID: "494329-B21"
SystemSerialNumber: "GB8004HDEB"
TIME_CREATED: 130263330869363325 0x1cec9cf2bba427d
VariableNames[0]: "POST Error Code"
VariableNames[1]: "POST Error String"
VariableTypes[0]: 3 0x3 (uint8)
VariableTypes[1]: 1 0x1 (string)
VariableValues[0]: "90"
VariableValues[1]: "POST Error: 1778-Drive Array Resuming Automatic Data Recovery Process"
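The three events above differ mainly in the POST error string, so a short script can condense a dump like this into just the codes and messages. This is only a sketch: the regular expression assumes the exact `VariableValues[1]: "POST Error: <code>-<text>"` layout shown in the events, and `extract_post_errors` is a name made up for this example.

```python
import re

# Matches lines such as:
#   VariableValues[1]: "POST Error: 301-Keyboard Error"
POST_ERROR_RE = re.compile(r'VariableValues\[1\]:\s*"POST Error:\s*(\d+)-(.*)"')


def extract_post_errors(log_text: str) -> list[tuple[int, str]]:
    """Return (code, message) pairs for every POST error line in the dump."""
    return [(int(code), message.strip())
            for code, message in POST_ERROR_RE.findall(log_text)]


# The three POST error lines copied from the events in the post.
sample = '''
VariableValues[1]: "POST Error: 301-Keyboard Error"
VariableValues[1]: "POST Error: 1792-Drive Array Reports Valid Data Found in Array Accelerator"
VariableValues[1]: "POST Error: 1778-Drive Array Resuming Automatic Data Recovery Process"
'''

errors = extract_post_errors(sample)
for code, message in errors:
    print(f"{code}: {message}")
```

Going only by the message text, the 1778 entry suggests the rebuild is resumed at each startup, which would fit a rebuild that never completes between the forced restarts.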
I don't really know what to do. I thought that if I remove the hard disk that is showing the predictive failure, the RAID might be forced to finish the rebuild process... :?
Any suggestions?
Thanks a lot in advance.
Jose
update
-----
I just connected to the management page and checked the RAID. It is weird: physically all the disks are green, but there is a caution sign in the logical view. Once I click on disk 4, I clearly see the errors. Should I just remove the disk, then?
10-16-2013 03:16 PM
Re: RAID problem - one disk shows predictive failure and the hot spare stays rebuilding eternally
Since you have already reset the system a couple of times, I don't see the harm in removing the predictive-failure drive at this point. However, I would have a replacement on hand. It seems the spare is having trouble rebuilding, so once the bad drive is removed the rebuild should be attempted using the spare. Adding the new drive may also let you configure another spare that can be used if the existing spare is bad as well.
-Bob
10-16-2013 05:01 PM
Re: RAID problem - one disk shows predictive failure and the hot spare stays rebuilding eternally
Since we are dealing with RAID 5, I would definitely run Disk2vhd first to get an image of the volumes.
I have often seen RAID 5 fail after some years: since all the disks are old, the odds are that one more will fail when a full parity rebuild is forced on the array.
Hot spare problems might actually mean that one more disk is having issues.
I'd always go with RAID 10 or RAID 6, but that's another discussion :)
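To put a rough number on the rebuild risk mentioned above, one simplified model looks only at unrecoverable read errors (UREs): a RAID 5 rebuild has to read every surviving disk in full, and a single unreadable sector can stall it. The sketch below is not a statement about this particular array. The disk count, disk size, and URE rate are all hypothetical values chosen for illustration, and the model ignores whole-drive failures caused by age or rebuild stress.

```python
import math


def p_read_error(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one URE while reading `bytes_read` bytes.

    Assumes independent bit errors at a constant rate (a simplification).
    Computes 1 - (1 - r)**bits in a numerically stable way.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


# Hypothetical array: 3 surviving disks of 300 GB each must be read in full.
surviving_disks = 3
disk_bytes = 300e9
risk = p_read_error(surviving_disks * disk_bytes)

print(f"Chance of hitting a URE during the rebuild: {risk:.1%}")
```

With these made-up numbers the estimate comes out at roughly 7%. Larger or older disks push it higher, which is part of the argument for RAID 6 or RAID 10 and for taking an image before forcing a rebuild.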