Re: HP NetServer LT6000r NetRaid Raid5 array rebui...
06-14-2007 08:01 AM
I have a NetServer LT6000r with an internal NetRaid controller which gives an error on rebuild of a failed member drive in a Raid5 array.
The Raid5 array consists of (4) 18.2 GB physical drives which created a logical drive size of 52GB. All drives were used as data space and there is no hot spare.
The physical drive # 0 channel 0 failed. I attempted a rebuild after swapping out the failed drive with a spare. I formatted the spare, then started the rebuild. The rebuild fails consistently at 56% completion. I have tried three different drives with the same results. One thing I noticed when checking properties of all the member drives is that drive #3 has (2) media errors.
Is this the reason the array will not re-build?
My question is: if I pull drive #3 out (the one with the 2 media errors) and replace it, will I lose any data? The Raid5 configuration is in degraded mode and it continues to run on the three remaining drives. Will the logical drive run on only two physical drives?
Is the correct recovery procedure to first replace the #3 drive with the media errors, do a rebuild then replace the failed #0 drive and do a second rebuild?
Will the above work, or am I already at the point of not being able to rebuild this array?
One additional fact: the above array is partitioned as the "system" volume and the "ora-home" volume, containing the OS and Oracle home respectively. My only other option, if nothing else works, is to image the volumes and re-create the array, so I would prefer an easier solution.
Thanks
Steve
06-14-2007 08:06 PM
Solution
Steve,
There is no way you can pull out another disk: the RAID array is already in degraded mode, and if you do so you will lose all data. A RAID5 array built with 4 disks needs a minimum of three disks to survive.
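To see why three of the four disks are the minimum, here is a minimal Python sketch of RAID5-style XOR parity on a single toy stripe. The block contents and the function names are made up for illustration; this is not how the Netraid firmware is actually written.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe):
    """Rebuild the one missing block (marked None) from the surviving blocks.

    Raises ValueError unless exactly one block is missing, because a single
    parity block can only stand in for a single lost block.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"cannot rebuild: {len(missing)} blocks missing")
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)

# One stripe across 4 disks: 3 data blocks plus 1 parity block (toy values).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Disk 0 gone, 3 disks left: its block can still be recomputed.
assert rebuild_missing([None, b"BBBB", b"CCCC", parity]) == b"AAAA"

# Disk 0 and disk 3 gone, 2 disks left: nothing can be recomputed.
try:
    rebuild_missing([None, b"BBBB", b"CCCC", None])
except ValueError as exc:
    print(exc)
```

With one block gone, the XOR of the other three gives it back. With two gone there is simply not enough information left, which is why pulling drive #3 from an already degraded array would lose the logical drive.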
The best option is to back up the data and reinstall the system, or image it.
The reason the rebuild fails is, as you figured out, a problem in another stripe of the RAID5 data, such that the controller can no longer reconstruct the full 4-disk-wide stripe. There is a chance that your data itself is not affected and that it is "only" a bad spot in an area that currently holds no user data, but the controller has no way of knowing that, since the RAID5 layer and the user data are abstracted from each other, so to speak. This is an industry-wide 'problem' and not specific to the Netraid controller.
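A rough simulation of that situation, as a sketch only: a toy 4-disk array of 10 stripes where disk 0 has failed and one block on disk 3 cannot be read. The array size, the block values, the fixed parity position (real RAID5 rotates parity across the disks) and the location of the bad block are all assumptions chosen for illustration, not values taken from the LT6000r.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(disks, failed, unreadable):
    """Rebuild disk `failed` stripe by stripe from the other disks.

    `unreadable` is a set of (disk, stripe) pairs that return a media error.
    Returns (rebuilt_blocks, stopped_at); stopped_at is None on success,
    otherwise the stripe number where the rebuild had to give up.
    """
    rebuilt = []
    for stripe in range(len(disks[0])):
        survivors = []
        for d, disk in enumerate(disks):
            if d == failed:
                continue
            if (d, stripe) in unreadable:
                # A second block of this stripe is gone: parity cannot help.
                return rebuilt, stripe
            survivors.append(disk[stripe])
        rebuilt.append(xor_blocks(survivors))
    return rebuilt, None

STRIPES = 10
# Disks 0-2 hold data, disk 3 holds parity (fixed position for simplicity).
data_disks = [[bytes([16 * d + s] * 4) for s in range(STRIPES)] for d in range(3)]
parity_disk = [xor_blocks([disk[s] for disk in data_disks]) for s in range(STRIPES)]
disks = data_disks + [parity_disk]

# Hypothetical media error on disk 3, stripe 5.
partial, stopped_at = rebuild(disks, failed=0, unreadable={(3, 5)})
print(f"rebuild stopped at stripe {stopped_at} of {STRIPES}")
```

The rebuild works through the stripes in order and stops at the same stripe every time, no matter which replacement drive is used, because the obstacle sits on a surviving disk and not on the new one. That matches a rebuild that always dies at the same percentage.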
If you run a full backup, which reads all the data on the disks, and it is successful, you can be almost certain the bad spot holds no user data (if it did, the backup should fail on a certain folder or file). If the backup fails on a certain file, try excluding it from the backup and see if that gets you through the full backup. If it does, then great: you were lucky to have spotted the issue at the file system level.
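If you want to find the affected file before running the real backup, something along these lines would do a read-only pass over a volume. It is a sketch, not a backup tool: the function name and the `opener` parameter are my own inventions, and it only touches file contents, not file system metadata or free space.

```python
import os

def find_unreadable_files(root, opener=open, chunk_size=1024 * 1024):
    """Read every file under `root` to the end and collect the ones that fail.

    Returns a list of (path, error_text) tuples. `opener` defaults to the
    built-in open(); it is a parameter only so the error path can be tried
    out without a real bad sector.
    """
    failures = []

    def note_walk_error(exc):
        # os.walk skips directories it cannot list unless told otherwise.
        failures.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=note_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with opener(path, "rb") as handle:
                    # Read in chunks so large files do not fill memory.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures
```

Calling `find_unreadable_files` on the root of each volume in turn and printing the result gives a list of files to exclude from the backup. An empty list suggests, but does not prove, that the bad spot is outside the file data.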
As a good practice, and to minimize the issue you ran into (a rebuild failing after a disk failure because of another issue on the stripe), make sure the Netraid consistency check is scheduled to run and check the RAID parity stripes on a weekly (default schedule) basis. This corrects eventual RAID stripe inconsistencies in areas that are almost never accessed.
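Conceptually, a consistency check walks every stripe and verifies that the parity block still equals the XOR of the data blocks. The sketch below shows that idea on toy data; it assumes the data blocks are treated as authoritative and the parity is recomputed, which is my assumption and not a description of what the Netraid utility does internally.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def consistency_check(stripes, fix=False):
    """Return the numbers of stripes whose parity does not match the data.

    Each stripe is a list of blocks with the parity block last (a toy layout).
    With fix=True the stale parity block is overwritten with the recomputed one.
    """
    inconsistent = []
    for number, stripe in enumerate(stripes):
        expected = xor_blocks(stripe[:-1])
        if stripe[-1] != expected:
            inconsistent.append(number)
            if fix:
                stripe[-1] = expected
    return inconsistent

stripes = [
    [b"\x01", b"\x02", b"\x04", b"\x07"],  # parity matches: 01 ^ 02 ^ 04 = 07
    [b"\x10", b"\x20", b"\x40", b"\x00"],  # stale parity: should be 70
]
print(consistency_check(stripes, fix=True))  # stripes found inconsistent
print(consistency_check(stripes))            # second pass after the fix
```

The practical benefit is that the check reads blocks that normal I/O rarely touches, so a problem shows up while the array still has all its disks and can repair itself, not in the middle of a rebuild.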
The Netraid monitor/consistency check is downloadable from here:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=296140&prodTypeId=329290&prodSeriesId=51930&swLang=13&taskId=135&swEnvOID=1005
HTH
Kris